Statistical Analysis and Reporting
A Jupyter Book to help you find and run statistical tests in both R and Python.
Based on your data, find the test you want to run (proportion, assumption, distribution, effect, etc.) and the language you want to run it in. The code snippet provided with each test is only an example, and each recipe has just one so far. Please do not blindly apply these snippets to every problem you encounter. In practice, there is sometimes more to be done.
Tests of Proportion Index

| Samples | Response Categories | N | Test in R | Test in Python |
|---|---|---|---|---|
| 1 | 2 | ≤200 | | |
| 1 | ≥2 | ≤200 | | |
| 1 | ≥2 | >200 | | |
| 2 | ≥2 | ≤200 | | |
| 2 | ≥2 | >200 | | |
| 2 | ≥2 | >200 | | |
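
As a rough illustration of the first row (one sample, two response categories, N ≤ 200), the sketch below computes a two-sided exact binomial test using only the Python standard library. Choosing the exact binomial test for that row is an assumption made here, and the function names and the counts are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_binomial_test(successes, n, p0=0.5):
    """Two-sided exact binomial p-value.

    Sums the probabilities of every outcome that is no more likely
    than the observed one under the null proportion p0.
    """
    observed = binom_pmf(successes, n, p0)
    # A small relative tolerance guards against floating-point ties.
    threshold = observed * (1 + 1e-7)
    total = 0.0
    for k in range(n + 1):
        prob = binom_pmf(k, n, p0)
        if prob <= threshold:
            total += prob
    return min(1.0, total)


# Hypothetical data: 21 of 30 participants preferred option A.
p_value = exact_binomial_test(21, 30, p0=0.5)
print(f"exact binomial p-value: {p_value:.4f}")
```

A small p-value suggests the observed split differs from the hypothesized proportion `p0`. For larger N or more than two categories, another row of the table is likely a better fit.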
Tests of Assumption Index

| Assumption | Context of Use | Test in R | Test in Python |
|---|---|---|---|
| Normality | t-test, ANOVA, LM, LMM | | |
| Normality | t-test, ANOVA, LM, LMM | | |
| Normality | t-test, ANOVA, LM, LMM | | |
| Normality | t-test, ANOVA, LM, LMM | | |
| Homoscedasticity (Homogeneity of Variance) | t-test, ANOVA, LM, LMM | | |
| Homoscedasticity (Homogeneity of Variance) | t-test, ANOVA, LM, LMM | | |
| Sphericity | Repeated Measures ANOVA | | |
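
For the normality rows, one quick check that needs nothing beyond the standard library is the Jarque-Bera test, which compares sample skewness and kurtosis with the values expected under a normal distribution. It is used here as a stand-in and is not necessarily one of the tests this book links to; the function name `jarque_bera` and the data are hypothetical. The p-value relies on the asymptotic chi-square distribution with 2 degrees of freedom, so it is only rough for small samples.

```python
from math import exp


def jarque_bera(values):
    """Return the Jarque-Bera statistic and its asymptotic p-value."""
    n = len(values)
    mean = sum(values) / n
    # Central moments (divided by n, as the test defines them).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    if m2 == 0:
        raise ValueError("all values are identical; the test is undefined")
    skewness = m3 / m2**1.5
    kurtosis = m4 / m2**2
    statistic = n / 6 * (skewness**2 + (kurtosis - 3) ** 2 / 4)
    # Survival function of a chi-square distribution with 2 degrees of freedom.
    p_value = exp(-statistic / 2)
    return statistic, p_value


# Hypothetical residuals from a fitted model.
residuals = [-1.2, 0.4, 0.1, -0.3, 0.9, -0.6, 0.2, 1.1, -0.8, 0.2]
jb, p = jarque_bera(residuals)
print(f"JB = {jb:.3f}, p = {p:.3f}")
```

A small p-value is evidence against normality. In the contexts listed above, the check is usually applied to model residuals rather than to the raw responses.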
Tests of Distributions Index

| Distribution | Parameterization | Test in R | Test in Python |
|---|---|---|---|
| Normal | mean (μ): | | |
| Lognormal | mean (μ): | | |
| Poisson | lambda (λ): | | |
| Negative Binomial | theta (θ): | | |
| Exponential | rate (λ): | | |
| Gamma | shape (α): | | |
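
Each distribution in this index is described by one or more parameters, and a goodness-of-fit check usually starts by estimating them from the sample. The sketch below shows simple estimators for a few of the listed parameters using only the standard library. The function name `estimate_parameters` and the sample are hypothetical, the lognormal μ is taken to be the mean on the log scale (an assumption), and the gamma shape uses a method-of-moments estimate rather than maximum likelihood.

```python
from math import log
from statistics import fmean, variance


def estimate_parameters(sample):
    """Estimate a few distribution parameters from strictly positive data."""
    if min(sample) <= 0:
        raise ValueError("this sketch expects strictly positive values")
    mean = fmean(sample)
    var = variance(sample)  # sample variance (n - 1 denominator)
    return {
        "normal_mu": mean,
        # Mean of the log-values, i.e. mu on the log scale.
        "lognormal_mu": fmean([log(x) for x in sample]),
        # The Poisson rate is estimated by the sample mean.
        "poisson_lambda": mean,
        # The exponential rate is the reciprocal of the sample mean.
        "exponential_rate": 1 / mean,
        # Method of moments: shape = mean^2 / variance.
        "gamma_shape": mean**2 / var,
    }


# Hypothetical task-completion times in seconds.
times = [1.8, 2.4, 0.9, 3.1, 1.2, 2.7, 1.5]
for name, value in estimate_parameters(times).items():
    print(f"{name}: {value:.3f}")
```

These estimates only describe the candidate distribution; whether the data actually follow it is what the tests in the table are for.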
Parametric Tests of Effect Index

| Samples | Levels | Between or Within Subjects | Test in R | Test in Python |
|---|---|---|---|---|
| 1 | 2 | Between | | |
| 1 | 2 | Within | | |
| 1 | ≥2 | Between | | |
| 1 | ≥2 | Within | | |
| ≥2 | ≥2 | Between | | |
| ≥2 | ≥2 | Between | | |
| ≥2 | ≥2 | Within | | |
| ≥2 | ≥2 | Within | | |
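
For the first row (2 levels, between subjects), the comparison typically reduces to a two-sample t statistic. The sketch below computes Welch's variant, which does not assume equal variances, together with its approximate degrees of freedom. Picking Welch's form is an assumption made here, and `welch_t` and the data are hypothetical. Turning the statistic into a p-value needs the t distribution, which a statistics library such as SciPy provides.

```python
from math import sqrt
from statistics import fmean, variance


def welch_t(a, b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard error contributed by each group.
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    t = (fmean(a) - fmean(b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a**2 / (len(a) - 1) + se2_b**2 / (len(b) - 1)
    )
    return t, df


# Hypothetical task times for two independent groups.
group_a = [12.1, 10.4, 11.8, 13.0, 9.9]
group_b = [14.2, 13.5, 15.1, 12.8, 14.9]
t_stat, dof = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

For the within-subjects row, the analogous statistic is computed on the per-participant differences rather than on two separate groups.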
Nonparametric Tests of Effect Index

| Samples | Levels | Between or Within Subjects | Test in R | Test in Python |
|---|---|---|---|---|
| 1 | 2 | Between | | |
| 1 | 2 | Within | | |
| 1 | ≥2 | Between | | |
| 1 | ≥2 | Within | | |
| ≥2 | ≥2 | Between | | |
| ≥2 | ≥2 | Between | Generalized Linear Model (GLM) | Generalized Linear Model (GLM) |
| ≥2 | ≥2 | Within | | |
| ≥2 | ≥2 | Within | Generalized Linear Mixed Model (GLMM) | Generalized Linear Mixed Model (GLMM) |
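
For the first row (2 levels, between subjects), a rank-based comparison of two independent groups is a common choice. The sketch below computes the Mann-Whitney U statistic with a normal-approximation p-value using only the standard library. Treating Mann-Whitney U as the test for that row is an assumption made here, the function names and data are hypothetical, and the approximation omits tie and continuity corrections, so it is rough for small or heavily tied samples.

```python
from math import sqrt
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Return the U statistic and a two-sided normal-approximation p-value."""
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    z = (u - mu) / sigma  # z <= 0 because u is the smaller of the two U values
    return u, min(1.0, 2 * NormalDist().cdf(z))


# Hypothetical Likert-style ratings from two independent groups.
ratings_a = [3, 4, 2, 5, 4, 3]
ratings_b = [5, 6, 6, 4, 7, 5]
u_stat, p_approx = mann_whitney_u(ratings_a, ratings_b)
print(f"U = {u_stat}, approximate p = {p_approx:.4f}")
```

With samples this small, an exact p-value from a statistics library is preferable to the normal approximation shown here.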
Generalized Linear (Mixed) Models Index

| Distribution | Typical Uses | Between or Within Subjects | Test in R | Test in Python |
|---|---|---|---|---|
| Normal | Linear Regression: equivalent to linear (mixed) model ( | Between | | |
| Normal | Linear Regression: equivalent to linear (mixed) model ( | Within | | |
| Binomial | Logistic Regression: dichotomous responses (e.g. nominal responses with two categories) | Between | | |
| Binomial | Logistic Regression: dichotomous responses (e.g. nominal responses with two categories) | Within | | |
| Multinomial | Multinomial Logistic Regression: polytomous responses (i.e. nominal responses with more than two categories) | Between | | |
| Multinomial | Multinomial Logistic Regression: polytomous responses (i.e. nominal responses with more than two categories) | Within | | |
| Ordinal | Ordinal Logistic Regression: ordinal responses (e.g. Likert scales) | Between | | |
| Ordinal | Ordinal Logistic Regression: ordinal responses (e.g. Likert scales) | Within | | |
| Poisson | Poisson Regression: counts, rare events (e.g. gesture recognition errors, 3-pointers per quarter, number of “F” grades) | Between | | |
| Poisson | Poisson Regression: counts, rare events (e.g. gesture recognition errors, 3-pointers per quarter, number of “F” grades) | Within | | |
| Zero-Inflated Poisson | Zero-Inflated Poisson Regression: counts, rare events with a large portion of zeroes | Between | | |
| Zero-Inflated Poisson | Zero-Inflated Poisson Regression: counts, rare events with a large portion of zeroes | Within | | |
| Negative Binomial | Negative Binomial Regression: same as Poisson but for use in the presence of overdispersion | Between | | |
| Negative Binomial | Negative Binomial Regression: same as Poisson but for use in the presence of overdispersion | Within | | |
| Zero-Inflated Negative Binomial | Zero-Inflated Negative Binomial Regression: same as Zero-Inflated Poisson but for use in the presence of overdispersion | Between | | |
| Zero-Inflated Negative Binomial | Zero-Inflated Negative Binomial Regression: same as Zero-Inflated Poisson but for use in the presence of overdispersion | Within | | |
| Gamma and Exponential | Gamma and Exponential Regression: exponentially distributed responses (e.g. income, wait times) | Between | | |
| Gamma and Exponential | Gamma and Exponential Regression: exponentially distributed responses (e.g. income, wait times) | Within | | |
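
Several of the count models above differ only in how they handle overdispersion and excess zeroes. The sketch below computes two quick descriptive diagnostics that can hint at which family to consider: the variance-to-mean ratio (about 1 under a Poisson distribution) and the observed share of zeroes compared with the share a Poisson distribution with the same mean would predict. The function name `count_diagnostics` and the data are hypothetical, and the check ignores predictors, so it is a first look rather than a substitute for comparing fitted models.

```python
from math import exp
from statistics import fmean, variance


def count_diagnostics(counts):
    """Summarize dispersion and zero frequency for a list of counts."""
    mean = fmean(counts)
    if mean == 0:
        raise ValueError("all counts are zero; the diagnostics are undefined")
    return {
        # Values well above 1 suggest overdispersion (consider negative binomial).
        "dispersion_ratio": variance(counts) / mean,
        "observed_zero_share": sum(1 for c in counts if c == 0) / len(counts),
        # P(Y = 0) for a Poisson distribution with the sample mean as its rate.
        "poisson_zero_share": exp(-mean),
    }


# Hypothetical error counts per session.
errors = [0, 0, 0, 1, 0, 4, 0, 7, 0, 2, 0, 0, 5, 0, 1]
for name, value in count_diagnostics(errors).items():
    print(f"{name}: {value:.3f}")
```

An observed zero share far above the Poisson prediction points toward the zero-inflated rows, while a large dispersion ratio without excess zeroes points toward the negative binomial rows.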