{
"cells": [
{
"cell_type": "markdown",
"id": "c47ebc4d-d172-4e70-9e16-cf6a0ba7153a",
"metadata": {},
"source": [
"# One Sample Tests of Proportion - R"
]
},
{
"cell_type": "markdown",
"id": "4ecefac5-3aac-4309-a413-d1e62eb0fc81",
"metadata": {},
"source": [
"## Binomial Test\n",
"\n",
"* **Samples:** `1`\n",
"* **Response Categories:** `2`\n",
"* **Exact?:** Yes, use with `N≤200`\n",
"* **Reporting:** \"Out of 60 outcomes, 19 were 'x' and 41 were 'y'. A two-sided exact binomial test indicated that these proportions were statistically significantly different from chance (p < .01).\""
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "c7fa707e-af25-4dab-a4f6-cddd99a5ca4e",
"metadata": {
"vscode": {
"languageId": "r"
}
},
"outputs": [
{
"data": {
"text/latex": [
"A data.frame: 20 × 2\n",
"\\begin{tabular}{r|ll}\n",
" & S & Y\\\\\n",
" & & \\\\\n",
"\\hline\n",
"\t1 & 1 & y\\\\\n",
"\t2 & 2 & y\\\\\n",
"\t3 & 3 & x\\\\\n",
"\t4 & 4 & y\\\\\n",
"\t5 & 5 & y\\\\\n",
"\t6 & 6 & x\\\\\n",
"\t7 & 7 & y\\\\\n",
"\t8 & 8 & x\\\\\n",
"\t9 & 9 & y\\\\\n",
"\t10 & 10 & y\\\\\n",
"\t11 & 11 & x\\\\\n",
"\t12 & 12 & y\\\\\n",
"\t13 & 13 & y\\\\\n",
"\t14 & 14 & y\\\\\n",
"\t15 & 15 & y\\\\\n",
"\t16 & 16 & x\\\\\n",
"\t17 & 17 & x\\\\\n",
"\t18 & 18 & y\\\\\n",
"\t19 & 19 & y\\\\\n",
"\t20 & 20 & x\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A data.frame: 20 × 2\n",
"\n",
"| | S <int> | Y <chr> |\n",
"|---|---|---|\n",
"| 1 | 1 | y |\n",
"| 2 | 2 | y |\n",
"| 3 | 3 | x |\n",
"| 4 | 4 | y |\n",
"| 5 | 5 | y |\n",
"| 6 | 6 | x |\n",
"| 7 | 7 | y |\n",
"| 8 | 8 | x |\n",
"| 9 | 9 | y |\n",
"| 10 | 10 | y |\n",
"| 11 | 11 | x |\n",
"| 12 | 12 | y |\n",
"| 13 | 13 | y |\n",
"| 14 | 14 | y |\n",
"| 15 | 15 | y |\n",
"| 16 | 16 | x |\n",
"| 17 | 17 | x |\n",
"| 18 | 18 | y |\n",
"| 19 | 19 | y |\n",
"| 20 | 20 | x |\n",
"\n"
],
"text/plain": [
" S Y\n",
"1 1 y\n",
"2 2 y\n",
"3 3 x\n",
"4 4 y\n",
"5 5 y\n",
"6 6 x\n",
"7 7 y\n",
"8 8 x\n",
"9 9 y\n",
"10 10 y\n",
"11 11 x\n",
"12 12 y\n",
"13 13 y\n",
"14 14 y\n",
"15 15 y\n",
"16 16 x\n",
"17 17 x\n",
"18 18 y\n",
"19 19 y\n",
"20 20 x"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Example data\n",
"# df is a long-format data table with columns for subject (S) and a 2-category outcome (Y)\n",
"df <- read.csv(\"data/0F0LBs_binomial.csv\")\n",
"head(df, 20)"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "9c5f3020-cc55-4064-bb7d-867f7d1f37dc",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\n",
"\tExact binomial test\n",
"\n",
"data: xt\n",
"number of successes = 19, number of trials = 60, p-value = 0.006218\n",
"alternative hypothesis: true probability of success is not equal to 0.5\n",
"95 percent confidence interval:\n",
" 0.2025755 0.4495597\n",
"sample estimates:\n",
"probability of success \n",
" 0.3166667 \n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"df$S <- factor(df$S) # Subject id is nominal (unused)\n",
"df$Y <- factor(df$Y) # Y is an outcome of 2 categories\n",
"xt <- xtabs( ~ Y, data=df) # counts per category; the first level ('x') is treated as the success count\n",
"binom.test(xt, p=0.5, alternative=\"two.sided\")"
]
},
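{
"cell_type": "markdown",
"id": "b1a7e2c4-binom-by-hand-note",
"metadata": {},
"source": [
"The two-sided p-value from `binom.test` can be reproduced by hand, which may help clarify what \"exact\" means here. The next cell is a minimal sketch, assuming the counts from the example above (19 'x' out of 60) and a null probability of 0.5. The names `n`, `k`, `p0`, `d`, and `p_manual` are illustrative and not part of the original analysis. It sums the binomial probabilities of every outcome that is no more likely than the observed one."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b1a7e2c4-binom-by-hand-code",
"metadata": {
"vscode": {
"languageId": "r"
}
},
"outputs": [],
"source": [
"n  <- 60   # number of trials (taken from the example output)\n",
"k  <- 19   # observed count of 'x' (taken from the example output)\n",
"p0 <- 0.5  # assumed null-hypothesis probability of 'x'\n",
"\n",
"# Probability of each possible count 0..n under the null\n",
"d <- dbinom(0:n, size = n, prob = p0)\n",
"\n",
"# Two-sided exact p-value: total probability of outcomes at most as likely\n",
"# as the observed one (the small relative tolerance guards against rounding)\n",
"p_manual <- sum(d[d <= d[k + 1] * (1 + 1e-7)])\n",
"\n",
"# Compare with the built-in test\n",
"all.equal(p_manual, binom.test(k, n, p = p0)$p.value)"
]
},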
{
"cell_type": "markdown",
"id": "aee35848-3f2c-4738-9e52-0056aa785564",
"metadata": {},
"source": [
"## Multinomial Test\n",
"\n",
"* **Samples:** `1`\n",
"* **Response Categories:** `≥2`\n",
"* **Exact?:** Yes, use with `N≤200`\n",
"* **Reporting:** \"Out of 60 outcomes, 17 were 'x', 8 were 'y', and 35 were 'z'. An exact multinomial test indicated that these proportions were statistically significantly different from chance (p < .0001).\""
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "d8b724ff-c01e-4aec-9247-926285b7a52e",
"metadata": {
"tags": [],
"vscode": {
"languageId": "r"
}
},
"outputs": [
{
"data": {
"text/latex": [
"A data.frame: 20 × 2\n",
"\\begin{tabular}{r|ll}\n",
" & S & Y\\\\\n",
" & & \\\\\n",
"\\hline\n",
"\t1 & 1 & z\\\\\n",
"\t2 & 2 & z\\\\\n",
"\t3 & 3 & z\\\\\n",
"\t4 & 4 & x\\\\\n",
"\t5 & 5 & z\\\\\n",
"\t6 & 6 & z\\\\\n",
"\t7 & 7 & z\\\\\n",
"\t8 & 8 & z\\\\\n",
"\t9 & 9 & z\\\\\n",
"\t10 & 10 & z\\\\\n",
"\t11 & 11 & z\\\\\n",
"\t12 & 12 & x\\\\\n",
"\t13 & 13 & z\\\\\n",
"\t14 & 14 & y\\\\\n",
"\t15 & 15 & z\\\\\n",
"\t16 & 16 & y\\\\\n",
"\t17 & 17 & x\\\\\n",
"\t18 & 18 & z\\\\\n",
"\t19 & 19 & z\\\\\n",
"\t20 & 20 & z\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A data.frame: 20 × 2\n",
"\n",
"| | S <int> | Y <chr> |\n",
"|---|---|---|\n",
"| 1 | 1 | z |\n",
"| 2 | 2 | z |\n",
"| 3 | 3 | z |\n",
"| 4 | 4 | x |\n",
"| 5 | 5 | z |\n",
"| 6 | 6 | z |\n",
"| 7 | 7 | z |\n",
"| 8 | 8 | z |\n",
"| 9 | 9 | z |\n",
"| 10 | 10 | z |\n",
"| 11 | 11 | z |\n",
"| 12 | 12 | x |\n",
"| 13 | 13 | z |\n",
"| 14 | 14 | y |\n",
"| 15 | 15 | z |\n",
"| 16 | 16 | y |\n",
"| 17 | 17 | x |\n",
"| 18 | 18 | z |\n",
"| 19 | 19 | z |\n",
"| 20 | 20 | z |\n",
"\n"
],
"text/plain": [
" S Y\n",
"1 1 z\n",
"2 2 z\n",
"3 3 z\n",
"4 4 x\n",
"5 5 z\n",
"6 6 z\n",
"7 7 z\n",
"8 8 z\n",
"9 9 z\n",
"10 10 z\n",
"11 11 z\n",
"12 12 x\n",
"13 13 z\n",
"14 14 y\n",
"15 15 z\n",
"16 16 y\n",
"17 17 x\n",
"18 18 z\n",
"19 19 z\n",
"20 20 z"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Example data\n",
"# df is a long-format data table with columns for subject (S) and an N-category outcome (Y)\n",
"df <- read.csv(\"data/0F0LBs_multinomial.csv\")\n",
"head(df, 20)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "0f24c392-cfb3-4c94-903d-5799a8a4ecc5",
"metadata": {
"vscode": {
"languageId": "r"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"P value (Prob) = 8.756e-05\n"
]
}
],
"source": [
"# install.packages(\"XNomial\")\n",
"library(XNomial) # import for xmulti\n",
"\n",
"df$S <- factor(df$S) # Subject id is nominal (unused)\n",
"df$Y <- factor(df$Y) # Y is an outcome of ≥2 categories\n",
"xt <- xtabs( ~ Y, data=df) # make counts\n",
"# Null hypothesis: all categories are equally likely\n",
"xmulti(xt, rep(1/length(xt), length(xt)), statName=\"Prob\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "1e5cfb31-bbed-4b9e-9299-d0c9d199b83c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\n",
"\tExact multinomial test\n",
"\n",
"data: $(df,Y)\n",
"p-value = 8.756e-05\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# This can also be shortened using the RVAideMemoire library\n",
"# install.packages(\"RVAideMemoire\")\n",
"# on Ubuntu you may have trouble installing, see: TODO\n",
"library(RVAideMemoire)\n",
"\n",
"multinomial.test(df$Y)"
]
},
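{
"cell_type": "markdown",
"id": "c3d9f051-multinom-enum-note",
"metadata": {},
"source": [
"For a small number of categories, an exact multinomial p-value can also be obtained by brute-force enumeration in base R, which may help show the idea behind the package calls above. The next cell is a rough sketch, not either package's actual implementation. It assumes the counts from the reporting example (17, 8, 35), equal null proportions, and that possible tables are ranked by their probability under the null (assumed here to be the ordering that `statName=\"Prob\"` requests). The names `obs`, `p0`, `grid`, `probs`, `p_obs`, and `p_exact` are illustrative."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c3d9f051-multinom-enum-code",
"metadata": {
"vscode": {
"languageId": "r"
}
},
"outputs": [],
"source": [
"obs <- c(x = 17, y = 8, z = 35)            # observed counts (from the reporting example)\n",
"n   <- sum(obs)                            # total number of outcomes\n",
"p0  <- rep(1 / length(obs), length(obs))   # assumed null: equal proportions\n",
"\n",
"# Enumerate every way to split n outcomes across three categories\n",
"grid <- expand.grid(a = 0:n, b = 0:n)\n",
"grid <- grid[grid$a + grid$b <= n, ]\n",
"grid$c <- n - grid$a - grid$b\n",
"\n",
"# Multinomial probability of each possible table under the null\n",
"probs <- apply(grid, 1, function(cnt) dmultinom(cnt, prob = p0))\n",
"p_obs <- dmultinom(obs, prob = p0)\n",
"\n",
"# Exact p-value: total probability of tables at most as likely as the observed one\n",
"# (the small relative tolerance keeps tied tables together despite rounding)\n",
"p_exact <- sum(probs[probs <= p_obs * (1 + 1e-7)])\n",
"p_exact"
]
},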
{
"cell_type": "markdown",
"id": "a6a1992e-7319-46eb-a1b4-3ada64b27eeb",
"metadata": {},
"source": [
"## One-Sample Pearson Chi-Squared Test\n",
"\n",
"* **Samples:** `1`\n",
"* **Response Categories:** `≥2`\n",
"* **Exact?:** No, use with `N>200`\n",
"* **Reporting:** \"Out of 60 outcomes, 17 were 'x', 8 were 'y', and 35 were 'z'. A one-sample Pearson Chi-Squared test indicated that these proportions were statistically significantly different from chance (χ²(2, N=60) = 18.90, p < .0001).\""
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "ba12d90e-8888-44ff-a7e1-33ac0c43f055",
"metadata": {},
"outputs": [
{
"data": {
"text/latex": [
"A data.frame: 20 × 2\n",
"\\begin{tabular}{r|ll}\n",
" & S & Y\\\\\n",
" & & \\\\\n",
"\\hline\n",
"\t1 & 1 & z\\\\\n",
"\t2 & 2 & z\\\\\n",
"\t3 & 3 & z\\\\\n",
"\t4 & 4 & x\\\\\n",
"\t5 & 5 & z\\\\\n",
"\t6 & 6 & z\\\\\n",
"\t7 & 7 & z\\\\\n",
"\t8 & 8 & z\\\\\n",
"\t9 & 9 & z\\\\\n",
"\t10 & 10 & z\\\\\n",
"\t11 & 11 & z\\\\\n",
"\t12 & 12 & x\\\\\n",
"\t13 & 13 & z\\\\\n",
"\t14 & 14 & y\\\\\n",
"\t15 & 15 & z\\\\\n",
"\t16 & 16 & y\\\\\n",
"\t17 & 17 & x\\\\\n",
"\t18 & 18 & z\\\\\n",
"\t19 & 19 & z\\\\\n",
"\t20 & 20 & z\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A data.frame: 20 × 2\n",
"\n",
"| | S <int> | Y <chr> |\n",
"|---|---|---|\n",
"| 1 | 1 | z |\n",
"| 2 | 2 | z |\n",
"| 3 | 3 | z |\n",
"| 4 | 4 | x |\n",
"| 5 | 5 | z |\n",
"| 6 | 6 | z |\n",
"| 7 | 7 | z |\n",
"| 8 | 8 | z |\n",
"| 9 | 9 | z |\n",
"| 10 | 10 | z |\n",
"| 11 | 11 | z |\n",
"| 12 | 12 | x |\n",
"| 13 | 13 | z |\n",
"| 14 | 14 | y |\n",
"| 15 | 15 | z |\n",
"| 16 | 16 | y |\n",
"| 17 | 17 | x |\n",
"| 18 | 18 | z |\n",
"| 19 | 19 | z |\n",
"| 20 | 20 | z |\n",
"\n"
],
"text/plain": [
" S Y\n",
"1 1 z\n",
"2 2 z\n",
"3 3 z\n",
"4 4 x\n",
"5 5 z\n",
"6 6 z\n",
"7 7 z\n",
"8 8 z\n",
"9 9 z\n",
"10 10 z\n",
"11 11 z\n",
"12 12 x\n",
"13 13 z\n",
"14 14 y\n",
"15 15 z\n",
"16 16 y\n",
"17 17 x\n",
"18 18 z\n",
"19 19 z\n",
"20 20 z"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Example data\n",
"# df is a long-format data table with columns for subject (S) and an N-category outcome (Y)\n",
"df <- read.csv(\"data/0F0LBs_multinomial.csv\")\n",
"head(df, 20)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "68734281-bbbd-4b96-948d-bbf74532d68d",
"metadata": {
"vscode": {
"languageId": "r"
}
},
"outputs": [
{
"data": {
"text/plain": [
"\n",
"\tChi-squared test for given probabilities\n",
"\n",
"data: xt\n",
"X-squared = 18.9, df = 2, p-value = 7.869e-05\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"df$S <- factor(df$S) # Subject id is nominal (unused)\n",
"df$Y <- factor(df$Y) # Y is an outcome of ≥2 categories\n",
"xt <- xtabs( ~ Y, data=df) # make counts\n",
"chisq.test(xt) # default null: all categories equally likely"
]
}
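,
{
"cell_type": "markdown",
"id": "e7f2a8b6-chisq-by-hand-note",
"metadata": {},
"source": [
"The statistic reported by `chisq.test` can be checked by hand. The next cell is a minimal sketch, assuming the counts from the reporting example (17, 8, 35) and equal expected proportions. The names `obs`, `expected`, `chi_sq`, `dof`, and `p_value` are illustrative. It computes the sum of (observed - expected)² / expected and looks up the upper tail of the chi-squared distribution with (number of categories - 1) degrees of freedom."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e7f2a8b6-chisq-by-hand-code",
"metadata": {
"vscode": {
"languageId": "r"
}
},
"outputs": [],
"source": [
"obs      <- c(x = 17, y = 8, z = 35)                    # observed counts (from the reporting example)\n",
"expected <- rep(sum(obs) / length(obs), length(obs))    # 20 per category under equal proportions\n",
"\n",
"# Pearson chi-squared statistic\n",
"chi_sq <- sum((obs - expected)^2 / expected)\n",
"\n",
"# Degrees of freedom: number of categories minus one\n",
"dof <- length(obs) - 1\n",
"\n",
"# Upper-tail probability of the chi-squared distribution\n",
"p_value <- pchisq(chi_sq, df = dof, lower.tail = FALSE)\n",
"\n",
"c(statistic = chi_sq, df = dof, p.value = p_value)"
]
}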
],
"metadata": {
"kernelspec": {
"display_name": "R",
"language": "R",
"name": "ir"
},
"language_info": {
"codemirror_mode": "r",
"file_extension": ".r",
"mimetype": "text/x-r-source",
"name": "R",
"pygments_lexer": "r",
"version": "4.2.2"
}
},
"nbformat": 4,
"nbformat_minor": 5
}