One Factor Parametric Tests - R
Independent-Samples t-test
Samples: 1
Levels: 2
Between or Within Subjects: Between
Reporting: “The mean of ‘a’ was 14.63 (SD = 2.13) and of ‘b’ was 11.01 (SD = 1.75). This difference was statistically significant according to an independent-samples t-test (t(58) = 7.18, p < .0001).”
# Example data
# df has subjects (S), one between-Ss factor (X) w/levels (a,b), and continuous response (Y)
df <- read.csv("data/1F2LBs.csv")
head(df, 20)
 | S | X | Y |
---|---|---|---|
 | <int> | <chr> | <dbl> |
1 | 1 | a | 13.211290 |
2 | 2 | b | 9.376966 |
3 | 3 | a | 9.832110 |
4 | 4 | b | 13.241823 |
5 | 5 | a | 15.290763 |
6 | 6 | b | 11.926719 |
7 | 7 | a | 16.513046 |
8 | 8 | b | 11.817017 |
9 | 9 | a | 14.405615 |
10 | 10 | b | 7.186540 |
11 | 11 | a | 14.620504 |
12 | 12 | b | 9.200230 |
13 | 13 | a | 13.619696 |
14 | 14 | b | 8.989190 |
15 | 15 | a | 10.055274 |
16 | 16 | b | 12.827598 |
17 | 17 | a | 16.136434 |
18 | 18 | b | 11.604144 |
19 | 19 | a | 12.451788 |
20 | 20 | b | 11.255572 |
df$S = factor(df$S) # Subject id is nominal (unused)
df$X = factor(df$X) # X is a 2-level factor
t.test(Y ~ X, data=df, var.equal=TRUE) # use var.equal=FALSE (Welch's t-test) if heteroscedastic
Two Sample t-test
data: Y by X
t = 7.1775, df = 58, p-value = 1.475e-09
alternative hypothesis: true difference in means between group a and group b is not equal to 0
95 percent confidence interval:
2.609128 4.627288
sample estimates:
mean in group a mean in group b
14.62769 11.00948
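The reporting sentence for this test quotes per-level means and standard deviations, and the `var.equal` comment leaves open how to choose between the pooled and Welch forms. The following is a minimal sketch of both steps. It assumes simulated data as a stand-in for `data/1F2LBs.csv` (the seed, group means, and SD are arbitrary choices, not the original data), and `var.test` is only one of several ways to check for unequal variances.

```r
# Simulated stand-in for data/1F2LBs.csv (assumption: 60 subjects, 30 per level)
set.seed(42)
df <- data.frame(
  S = factor(1:60),
  X = factor(rep(c("a", "b"), times = 30)),
  Y = rnorm(60, mean = rep(c(14.6, 11.0), times = 30), sd = 2)
)

# Descriptive statistics for the reporting sentence
means <- tapply(df$Y, df$X, mean)
sds   <- tapply(df$Y, df$X, sd)
print(round(rbind(mean = means, sd = sds), 2))

# F test for equality of the two group variances
vt <- var.test(Y ~ X, data = df)
print(vt$p.value)

# Pooled-variance (Student) and unpooled (Welch) t-tests
student <- t.test(Y ~ X, data = df, var.equal = TRUE)
welch   <- t.test(Y ~ X, data = df, var.equal = FALSE)
print(c(student_df = unname(student$parameter),
        welch_df   = unname(welch$parameter)))
```

The Welch degrees of freedom are never larger than the pooled value of n1 + n2 - 2, so when the two variances are similar the two tests give nearly the same result.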
Paired-Samples t-test
Samples: 1
Levels: 2
Between or Within Subjects: Within
Reporting: “The mean of ‘a’ was 13.15 (SD = 2.53) and of ‘b’ was 14.37 (SD = 2.16). This difference was statistically significant according to a paired-samples t-test (t(29) = -2.14, p < .05).”
# Example data
# df has subjects (S), one within-Ss factor (X) w/levels (a,b), and continuous response (Y)
df <- read.csv("data/1F2LWs.csv")
head(df, 20)
 | S | X | Y |
---|---|---|---|
 | <int> | <chr> | <dbl> |
1 | 1 | a | 9.348176 |
2 | 1 | b | 16.280812 |
3 | 2 | a | 12.797245 |
4 | 2 | b | 14.638421 |
5 | 3 | a | 11.757036 |
6 | 3 | b | 14.221410 |
7 | 4 | a | 13.037475 |
8 | 4 | b | 15.172092 |
9 | 5 | a | 14.934822 |
10 | 5 | b | 16.080187 |
11 | 6 | a | 10.762074 |
12 | 6 | b | 11.558915 |
13 | 7 | a | 14.699569 |
14 | 7 | b | 15.364241 |
15 | 8 | a | 12.476783 |
16 | 8 | b | 13.468385 |
17 | 9 | a | 12.761169 |
18 | 9 | b | 17.316534 |
19 | 10 | a | 16.532902 |
20 | 10 | b | 16.405975 |
library(reshape2) # for dcast
df$S = factor(df$S) # Subject id is nominal
df$X = factor(df$X) # X is a 2-level factor
df2 <- dcast(df, S ~ X, value.var="Y") # make wide-format table
t.test(df2$a, df2$b, paired=TRUE) # homoscedasticity is irrelevant for a paired-samples t-test
Paired t-test
data: df2$a and df2$b
t = -2.1363, df = 29, p-value = 0.04123
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
-2.39877224 -0.05222605
sample estimates:
mean difference
-1.225499
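A paired-samples t-test is equivalent to a one-sample t-test on the within-subject differences, which is why only the variance of the differences matters. The sketch below illustrates this in base R without `reshape2`. It assumes simulated data as a stand-in for `data/1F2LWs.csv`; the seed, effect sizes, and the `subj` offset are illustrative assumptions.

```r
# Simulated stand-in for data/1F2LWs.csv (assumption: 30 subjects, 2 levels each)
set.seed(7)
n <- 30
subj <- rnorm(n, mean = 0, sd = 2)  # per-subject offset shared by both levels
df <- data.frame(
  S = factor(rep(1:n, each = 2)),
  X = factor(rep(c("a", "b"), times = n)),
  Y = rep(c(13.0, 14.4), times = n) + rep(subj, each = 2) + rnorm(2 * n, sd = 1.5)
)

# Sort by subject so the a and b vectors line up pairwise
df <- df[order(df$S, df$X), ]
a <- df$Y[df$X == "a"]
b <- df$Y[df$X == "b"]

paired    <- t.test(a, b, paired = TRUE)  # paired-samples t-test
onesample <- t.test(a - b, mu = 0)        # one-sample t-test on the differences

print(c(paired = unname(paired$statistic), onesample = unname(onesample$statistic)))
```

Both calls report the same t statistic, degrees of freedom (n - 1), and p-value.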
One-Way ANOVA
Samples: 1
Levels: ≥2
Between or Within Subjects: Between
Reporting: “The mean of ‘a’ was 13.74 (SD = 2.84), of ‘b’ was 14.15 (SD = 2.65), and of ‘c’ was 9.08 (SD = 4.29). These differences were statistically significant according to a one-way ANOVA (F(2, 57) = 14.18, p < .0001).”
# Example data
# df has subjects (S), one between-Ss factor (X) w/levels (a,b,c), and continuous response (Y)
df <- read.csv("data/1F3LBs.csv")
head(df, 20)
 | S | X | Y |
---|---|---|---|
 | <int> | <chr> | <dbl> |
1 | 1 | a | 14.310439 |
2 | 2 | b | 17.390453 |
3 | 3 | c | 12.501365 |
4 | 4 | a | 17.943734 |
5 | 5 | b | 10.597671 |
6 | 6 | c | 9.652177 |
7 | 7 | a | 10.095838 |
8 | 8 | b | 15.324131 |
9 | 9 | c | 7.649627 |
10 | 10 | a | 13.517695 |
11 | 11 | b | 13.702848 |
12 | 12 | c | 19.033070 |
13 | 13 | a | 11.871676 |
14 | 14 | b | 12.177908 |
15 | 15 | c | 7.713374 |
16 | 16 | a | 11.698955 |
17 | 17 | b | 13.716288 |
18 | 18 | c | 9.492661 |
19 | 19 | a | 17.384638 |
20 | 20 | b | 13.506336 |
df$S = factor(df$S) # Subject id is nominal (unused)
df$X = factor(df$X) # X is a 3-level factor
m = aov(Y ~ X, data=df) # fit model
anova(m)
 | Df | Sum Sq | Mean Sq | F value | Pr(>F) |
---|---|---|---|---|---|
 | <int> | <dbl> | <dbl> | <dbl> | <dbl> |
X | 2 | 316.8775 | 158.43873 | 14.18067 | 1.00343e-05 |
Residuals | 57 | 636.8532 | 11.17286 | NA | NA |
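As a check on how the columns of the ANOVA table relate to one another, the F value and p-value can be recomputed from the printed sums of squares and degrees of freedom. This sketch uses only the rounded values from the table, so it should agree with the table up to rounding; the variable names are illustrative.

```r
# Values copied from the ANOVA table
ss_x   <- 316.8775; df_x   <- 2    # factor X
ss_res <- 636.8532; df_res <- 57   # residuals

ms_x   <- ss_x / df_x              # Mean Sq for X
ms_res <- ss_res / df_res          # Mean Sq for residuals
f_val  <- ms_x / ms_res            # F value
p_val  <- pf(f_val, df_x, df_res, lower.tail = FALSE)  # upper-tail probability, Pr(>F)

print(c(F = f_val, p = p_val))
```

The numerator and denominator degrees of freedom used here are the two numbers reported in F(2, 57).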
One-Way Repeated Measures ANOVA
Samples: 1
Levels: ≥2
Between or Within Subjects: Within
Reporting: “The mean of ‘a’ was 14.04 (SD = 2.98), of ‘b’ was 11.95 (SD = 1.98), and of ‘c’ was 11.40 (SD = 2.75). Mauchly’s test of sphericity indicated no sphericity violation (W = .926, p = .499), allowing for an uncorrected repeated measures ANOVA, which showed statistically significant differences (F(2, 38) = 6.57, p < .01).”
# Example data
# df has subjects (S), one within-Ss factor (X) w/levels (a,b,c), and continuous response (Y)
df <- read.csv("data/1F3LWs.csv")
head(df, 20)
 | S | X | Y |
---|---|---|---|
 | <int> | <chr> | <dbl> |
1 | 1 | a | 8.833569 |
2 | 1 | b | 11.849632 |
3 | 1 | c | 7.909041 |
4 | 2 | a | 15.361964 |
5 | 2 | b | 10.177478 |
6 | 2 | c | 13.245308 |
7 | 3 | a | 10.820555 |
8 | 3 | b | 9.240199 |
9 | 3 | c | 14.307353 |
10 | 4 | a | 13.089309 |
11 | 4 | b | 13.276904 |
12 | 4 | c | 11.683161 |
13 | 5 | a | 12.229721 |
14 | 5 | b | 10.442200 |
15 | 5 | c | 7.425539 |
16 | 6 | a | 15.471203 |
17 | 6 | b | 10.401901 |
18 | 6 | c | 14.725702 |
19 | 7 | a | 14.398177 |
20 | 7 | b | 10.755235 |
library(ez) # for ezANOVA
df$S = factor(df$S) # Subject id is nominal
df$X = factor(df$X) # X is a 3-level factor
m = ezANOVA(dv=Y, within=c(X), wid=S, type=3, data=df) # use c() for >1 factors
m$Mauchly # p<.05 indicates a sphericity violation
 | Effect | W | p | p<.05 |
---|---|---|---|---|
 | <chr> | <dbl> | <dbl> | <chr> |
2 | X | 0.9256272 | 0.4987981 | |
m$ANOVA # use if no violation
 | Effect | DFn | DFd | F | p | p<.05 | ges |
---|---|---|---|---|---|---|---|
 | <chr> | <dbl> | <dbl> | <dbl> | <dbl> | <chr> | <dbl> |
2 | X | 2 | 38 | 6.569937 | 0.003543683 | * | 0.1682495 |
# if there is a sphericity violation, report the Greenhouse-Geisser or Huynh-Feldt correction
p = match(m$Sphericity$Effect, m$ANOVA$Effect) # positions of within-Ss effects in m$ANOVA
m$Sphericity$GGe.DFn = m$Sphericity$GGe * m$ANOVA$DFn[p] # Greenhouse-Geisser DFs
m$Sphericity$GGe.DFd = m$Sphericity$GGe * m$ANOVA$DFd[p]
m$Sphericity$HFe.DFn = m$Sphericity$HFe * m$ANOVA$DFn[p] # Huynh-Feldt DFs (an HFe above 1 is conventionally treated as 1)
m$Sphericity$HFe.DFd = m$Sphericity$HFe * m$ANOVA$DFd[p]
m$Sphericity # show results
 | Effect | GGe | p[GG] | p[GG]<.05 | HFe | p[HF] | p[HF]<.05 | GGe.DFn | GGe.DFd | HFe.DFn | HFe.DFd |
---|---|---|---|---|---|---|---|---|---|---|---|
 | <chr> | <dbl> | <dbl> | <chr> | <dbl> | <dbl> | <chr> | <dbl> | <dbl> | <dbl> | <dbl> |
2 | X | 0.9307756 | 0.00446249 | * | 1.027836 | 0.003543683 | * | 1.861551 | 35.36947 | 2.055672 | 39.05776 |
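The corrected p-values in `m$Sphericity` can be reproduced by hand, which may clarify what the epsilon corrections do: both degrees of freedom are multiplied by epsilon before the F distribution is consulted. The sketch below uses the values printed above. It assumes that a Huynh-Feldt epsilon above 1 is treated as 1, which is consistent with `p[HF]` matching the uncorrected p in the output; `corrected_p` is a hypothetical helper, not part of `ez`.

```r
# Values copied from m$ANOVA and m$Sphericity
f_val <- 6.569937; dfn <- 2; dfd <- 38
gge <- 0.9307756   # Greenhouse-Geisser epsilon
hfe <- 1.027836    # Huynh-Feldt epsilon (exceeds 1 here)

# Hypothetical helper: p-value with both DFs scaled by epsilon, capped at 1
corrected_p <- function(f, dfn, dfd, eps) {
  eps <- min(eps, 1)
  pf(f, eps * dfn, eps * dfd, lower.tail = FALSE)
}

p_gg <- corrected_p(f_val, dfn, dfd, gge)
p_hf <- corrected_p(f_val, dfn, dfd, hfe)
print(c(p_GG = p_gg, p_HF = p_hf))
```

Under that assumption, the Huynh-Feldt degrees of freedom to report in this example would be the uncorrected 2 and 38 rather than the uncapped products.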
# the following also performs the equivalent repeated measures ANOVA, but does not address sphericity
m = aov(Y ~ X + Error(S/X), data=df)
summary(m)
Error: S
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 19 160.3 8.436
Error: S:X
Df Sum Sq Mean Sq F value Pr(>F)
X 2 78.12 39.06 6.57 0.00354 **
Residuals 38 225.93 5.95
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
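Because the `aov` call above does not address sphericity, Mauchly's test can also be run in base R on a wide-format response matrix. The following is a sketch under assumptions: the data are simulated as a stand-in for `data/1F3LWs.csv`, and `wide` and `mlmfit` are illustrative names. On the real data the W statistic would be expected to be comparable to `m$Mauchly`, but that is not demonstrated here.

```r
# Simulated stand-in for data/1F3LWs.csv (assumption: 20 subjects, 3 levels each)
set.seed(1)
n <- 20
df <- data.frame(
  S = factor(rep(1:n, each = 3)),
  X = factor(rep(c("a", "b", "c"), times = n)),
  Y = rnorm(3 * n, mean = rep(c(14.0, 12.0, 11.5), times = n), sd = 2.5)
)

# Wide matrix: one row per subject, one column per level of X
wide <- with(df, tapply(Y, list(S, X), mean))

# Multivariate linear model with an intercept only
mlmfit <- lm(wide ~ 1)

# Mauchly's test of sphericity for the within-subject contrasts
mt <- mauchly.test(mlmfit, X = ~1)
print(mt)
```

As with `m$Mauchly`, a p-value below .05 would indicate a sphericity violation.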