{ "cells": [ { "cell_type": "markdown", "id": "c47ebc4d-d172-4e70-9e16-cf6a0ba7153a", "metadata": {}, "source": [ "# One Sample Tests of Proportion - R" ] }, { "cell_type": "markdown", "id": "4ecefac5-3aac-4309-a413-d1e62eb0fc81", "metadata": {}, "source": [ "## Binomial Test\n", "\n", "* **Samples:** `1`\n", "* **Response Categories:** `2`\n", "* **Exact?:** Yes, use with `N≤200`\n", "* **Reporting:** \"Out of 60 outcomes, 19 were 'x' and 41 were 'y'. A two-sided exact binomial test indicated that these proportions were statistically significantly different from chance (p < .01).\"" ] }, { "cell_type": "code", "execution_count": 18, "id": "c7fa707e-af25-4dab-a4f6-cddd99a5ca4e", "metadata": { "vscode": { "languageId": "r" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
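<table>
<caption>A data.frame: 20 × 2</caption>
<thead>
<tr><th></th><th>S</th><th>Y</th></tr>
<tr><th></th><th>&lt;int&gt;</th><th>&lt;chr&gt;</th></tr>
</thead>
<tbody>
<tr><th>1</th><td>1</td><td>y</td></tr>
<tr><th>2</th><td>2</td><td>y</td></tr>
<tr><th>3</th><td>3</td><td>x</td></tr>
<tr><th>4</th><td>4</td><td>y</td></tr>
<tr><th>5</th><td>5</td><td>y</td></tr>
<tr><th>6</th><td>6</td><td>x</td></tr>
<tr><th>7</th><td>7</td><td>y</td></tr>
<tr><th>8</th><td>8</td><td>x</td></tr>
<tr><th>9</th><td>9</td><td>y</td></tr>
<tr><th>10</th><td>10</td><td>y</td></tr>
<tr><th>11</th><td>11</td><td>x</td></tr>
<tr><th>12</th><td>12</td><td>y</td></tr>
<tr><th>13</th><td>13</td><td>y</td></tr>
<tr><th>14</th><td>14</td><td>y</td></tr>
<tr><th>15</th><td>15</td><td>y</td></tr>
<tr><th>16</th><td>16</td><td>x</td></tr>
<tr><th>17</th><td>17</td><td>x</td></tr>
<tr><th>18</th><td>18</td><td>y</td></tr>
<tr><th>19</th><td>19</td><td>y</td></tr>
<tr><th>20</th><td>20</td><td>x</td></tr>
</tbody>
</table>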
\n" ], "text/latex": [ "A data.frame: 20 × 2\n", "\\begin{tabular}{r|ll}\n", " & S & Y\\\\\n", " & & \\\\\n", "\\hline\n", "\t1 & 1 & y\\\\\n", "\t2 & 2 & y\\\\\n", "\t3 & 3 & x\\\\\n", "\t4 & 4 & y\\\\\n", "\t5 & 5 & y\\\\\n", "\t6 & 6 & x\\\\\n", "\t7 & 7 & y\\\\\n", "\t8 & 8 & x\\\\\n", "\t9 & 9 & y\\\\\n", "\t10 & 10 & y\\\\\n", "\t11 & 11 & x\\\\\n", "\t12 & 12 & y\\\\\n", "\t13 & 13 & y\\\\\n", "\t14 & 14 & y\\\\\n", "\t15 & 15 & y\\\\\n", "\t16 & 16 & x\\\\\n", "\t17 & 17 & x\\\\\n", "\t18 & 18 & y\\\\\n", "\t19 & 19 & y\\\\\n", "\t20 & 20 & x\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "A data.frame: 20 × 2\n", "\n", "| | S <int> | Y <chr> |\n", "|---|---|---|\n", "| 1 | 1 | y |\n", "| 2 | 2 | y |\n", "| 3 | 3 | x |\n", "| 4 | 4 | y |\n", "| 5 | 5 | y |\n", "| 6 | 6 | x |\n", "| 7 | 7 | y |\n", "| 8 | 8 | x |\n", "| 9 | 9 | y |\n", "| 10 | 10 | y |\n", "| 11 | 11 | x |\n", "| 12 | 12 | y |\n", "| 13 | 13 | y |\n", "| 14 | 14 | y |\n", "| 15 | 15 | y |\n", "| 16 | 16 | x |\n", "| 17 | 17 | x |\n", "| 18 | 18 | y |\n", "| 19 | 19 | y |\n", "| 20 | 20 | x |\n", "\n" ], "text/plain": [ " S Y\n", "1 1 y\n", "2 2 y\n", "3 3 x\n", "4 4 y\n", "5 5 y\n", "6 6 x\n", "7 7 y\n", "8 8 x\n", "9 9 y\n", "10 10 y\n", "11 11 x\n", "12 12 y\n", "13 13 y\n", "14 14 y\n", "15 15 y\n", "16 16 x\n", "17 17 x\n", "18 18 y\n", "19 19 y\n", "20 20 x" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Example data\n", "# df is a long-format data table w/columns for subject (S) and 2-category outcome (Y)\n", "df <- read.csv(\"data/0F0LBs_binomial.csv\")\n", "head(df, 20)" ] }, { "cell_type": "code", "execution_count": 19, "id": "9c5f3020-cc55-4064-bb7d-867f7d1f37dc", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\n", "\tExact binomial test\n", "\n", "data: xt\n", "number of successes = 19, number of trials = 60, p-value = 0.006218\n", "alternative hypothesis: true probability of success is not equal to 0.5\n", "95 percent confidence 
interval:\n", " 0.2025755 0.4495597\n", "sample estimates:\n", "probability of success \n", " 0.3166667 \n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "df$S = factor(df$S) # Subject id is nominal (unused)\n", "df$Y = factor(df$Y) # Y is an outcome of 2 categories\n", "xt = xtabs( ~ Y, data=df) # make counts\n", "binom.test(xt, p=0.5, alternative=\"two.sided\")" ] }, { "cell_type": "markdown", "id": "aee35848-3f2c-4738-9e52-0056aa785564", "metadata": {}, "source": [ "## Multinomial Test\n", "\n", "* **Samples:** `1`\n", "* **Response Categories:** `≥2`\n", "* **Exact?:** Yes, use with `N≤200`\n", "* **Reporting:** \"Out of 60 outcomes, 17 were 'x', 8 were 'y', and 35 were 'z'. An exact multinomial test indicated that these proportions were statistically significantly different from chance (p < .0001).\"" ] }, { "cell_type": "code", "execution_count": 2, "id": "d8b724ff-c01e-4aec-9247-926285b7a52e", "metadata": { "tags": [], "vscode": { "languageId": "r" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
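<table>
<caption>A data.frame: 20 × 2</caption>
<thead>
<tr><th></th><th>S</th><th>Y</th></tr>
<tr><th></th><th>&lt;int&gt;</th><th>&lt;chr&gt;</th></tr>
</thead>
<tbody>
<tr><th>1</th><td>1</td><td>z</td></tr>
<tr><th>2</th><td>2</td><td>z</td></tr>
<tr><th>3</th><td>3</td><td>z</td></tr>
<tr><th>4</th><td>4</td><td>x</td></tr>
<tr><th>5</th><td>5</td><td>z</td></tr>
<tr><th>6</th><td>6</td><td>z</td></tr>
<tr><th>7</th><td>7</td><td>z</td></tr>
<tr><th>8</th><td>8</td><td>z</td></tr>
<tr><th>9</th><td>9</td><td>z</td></tr>
<tr><th>10</th><td>10</td><td>z</td></tr>
<tr><th>11</th><td>11</td><td>z</td></tr>
<tr><th>12</th><td>12</td><td>x</td></tr>
<tr><th>13</th><td>13</td><td>z</td></tr>
<tr><th>14</th><td>14</td><td>y</td></tr>
<tr><th>15</th><td>15</td><td>z</td></tr>
<tr><th>16</th><td>16</td><td>y</td></tr>
<tr><th>17</th><td>17</td><td>x</td></tr>
<tr><th>18</th><td>18</td><td>z</td></tr>
<tr><th>19</th><td>19</td><td>z</td></tr>
<tr><th>20</th><td>20</td><td>z</td></tr>
</tbody>
</table>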
\n" ], "text/latex": [ "A data.frame: 20 × 2\n", "\\begin{tabular}{r|ll}\n", " & S & Y\\\\\n", " & & \\\\\n", "\\hline\n", "\t1 & 1 & z\\\\\n", "\t2 & 2 & z\\\\\n", "\t3 & 3 & z\\\\\n", "\t4 & 4 & x\\\\\n", "\t5 & 5 & z\\\\\n", "\t6 & 6 & z\\\\\n", "\t7 & 7 & z\\\\\n", "\t8 & 8 & z\\\\\n", "\t9 & 9 & z\\\\\n", "\t10 & 10 & z\\\\\n", "\t11 & 11 & z\\\\\n", "\t12 & 12 & x\\\\\n", "\t13 & 13 & z\\\\\n", "\t14 & 14 & y\\\\\n", "\t15 & 15 & z\\\\\n", "\t16 & 16 & y\\\\\n", "\t17 & 17 & x\\\\\n", "\t18 & 18 & z\\\\\n", "\t19 & 19 & z\\\\\n", "\t20 & 20 & z\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "A data.frame: 20 × 2\n", "\n", "| | S <int> | Y <chr> |\n", "|---|---|---|\n", "| 1 | 1 | z |\n", "| 2 | 2 | z |\n", "| 3 | 3 | z |\n", "| 4 | 4 | x |\n", "| 5 | 5 | z |\n", "| 6 | 6 | z |\n", "| 7 | 7 | z |\n", "| 8 | 8 | z |\n", "| 9 | 9 | z |\n", "| 10 | 10 | z |\n", "| 11 | 11 | z |\n", "| 12 | 12 | x |\n", "| 13 | 13 | z |\n", "| 14 | 14 | y |\n", "| 15 | 15 | z |\n", "| 16 | 16 | y |\n", "| 17 | 17 | x |\n", "| 18 | 18 | z |\n", "| 19 | 19 | z |\n", "| 20 | 20 | z |\n", "\n" ], "text/plain": [ " S Y\n", "1 1 z\n", "2 2 z\n", "3 3 z\n", "4 4 x\n", "5 5 z\n", "6 6 z\n", "7 7 z\n", "8 8 z\n", "9 9 z\n", "10 10 z\n", "11 11 z\n", "12 12 x\n", "13 13 z\n", "14 14 y\n", "15 15 z\n", "16 16 y\n", "17 17 x\n", "18 18 z\n", "19 19 z\n", "20 20 z" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Example data\n", "# df is a long-format data table w/columns for subject (S) and N-category outcome (Y)\n", "df <- read.csv(\"data/0F0LBs_multinomial.csv\")\n", "head(df, 20)" ] }, { "cell_type": "code", "execution_count": 4, "id": "0f24c392-cfb3-4c94-903d-5799a8a4ecc5", "metadata": { "vscode": { "languageId": "r" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "P value (Prob) = 8.756e-05\n" ] } ], "source": [ "# install.packages(\"XNomial\")\n", "library(XNomial) # import for xmulti\n", "\n", "df$S = factor(df$S) # Subject 
id is nominal (unused)\n", "df$Y = factor(df$Y) # Y is an outcome of ≥2 categories\n", "xt = xtabs( ~ Y, data=df) # make counts\n", "xmulti(xt, rep(1/length(xt), length(xt)), statName=\"Prob\")" ] }, { "cell_type": "code", "execution_count": 5, "id": "1e5cfb31-bbed-4b9e-9299-d0c9d199b83c", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\n", "\tExact multinomial test\n", "\n", "data: $(df,Y)\n", "p-value = 8.756e-05\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# This can also be shortened using the RVAideMemoire library\n", "# install.packages(\"RVAideMemoire\")\n", "# on Ubuntu you may have trouble installing, see: TODO\n", "library(RVAideMemoire)\n", "\n", "multinomial.test(df$Y)" ] }, { "cell_type": "markdown", "id": "a6a1992e-7319-46eb-a1b4-3ada64b27eeb", "metadata": {}, "source": [ "## One-Sample Pearson Chi-Squared Test\n", "\n", "* **Samples:** `1`\n", "* **Response Categories:** `≥2`\n", "* **Exact?:** No, use with `N>200`\n", "* **Reporting:** \"Out of 60 outcomes, 17 were 'x', 8 were 'y', and 35 were 'z'. A one-sample Pearson Chi-Squared test indicated that these proportions were statistically significantly different from chance (χ²(2, N = 60) = 18.90, p < .0001).\"" ] }, { "cell_type": "code", "execution_count": 9, "id": "ba12d90e-8888-44ff-a7e1-33ac0c43f055", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
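<table>
<caption>A data.frame: 20 × 2</caption>
<thead>
<tr><th></th><th>S</th><th>Y</th></tr>
<tr><th></th><th>&lt;int&gt;</th><th>&lt;chr&gt;</th></tr>
</thead>
<tbody>
<tr><th>1</th><td>1</td><td>z</td></tr>
<tr><th>2</th><td>2</td><td>z</td></tr>
<tr><th>3</th><td>3</td><td>z</td></tr>
<tr><th>4</th><td>4</td><td>x</td></tr>
<tr><th>5</th><td>5</td><td>z</td></tr>
<tr><th>6</th><td>6</td><td>z</td></tr>
<tr><th>7</th><td>7</td><td>z</td></tr>
<tr><th>8</th><td>8</td><td>z</td></tr>
<tr><th>9</th><td>9</td><td>z</td></tr>
<tr><th>10</th><td>10</td><td>z</td></tr>
<tr><th>11</th><td>11</td><td>z</td></tr>
<tr><th>12</th><td>12</td><td>x</td></tr>
<tr><th>13</th><td>13</td><td>z</td></tr>
<tr><th>14</th><td>14</td><td>y</td></tr>
<tr><th>15</th><td>15</td><td>z</td></tr>
<tr><th>16</th><td>16</td><td>y</td></tr>
<tr><th>17</th><td>17</td><td>x</td></tr>
<tr><th>18</th><td>18</td><td>z</td></tr>
<tr><th>19</th><td>19</td><td>z</td></tr>
<tr><th>20</th><td>20</td><td>z</td></tr>
</tbody>
</table>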
\n" ], "text/latex": [ "A data.frame: 20 × 2\n", "\\begin{tabular}{r|ll}\n", " & S & Y\\\\\n", " & & \\\\\n", "\\hline\n", "\t1 & 1 & z\\\\\n", "\t2 & 2 & z\\\\\n", "\t3 & 3 & z\\\\\n", "\t4 & 4 & x\\\\\n", "\t5 & 5 & z\\\\\n", "\t6 & 6 & z\\\\\n", "\t7 & 7 & z\\\\\n", "\t8 & 8 & z\\\\\n", "\t9 & 9 & z\\\\\n", "\t10 & 10 & z\\\\\n", "\t11 & 11 & z\\\\\n", "\t12 & 12 & x\\\\\n", "\t13 & 13 & z\\\\\n", "\t14 & 14 & y\\\\\n", "\t15 & 15 & z\\\\\n", "\t16 & 16 & y\\\\\n", "\t17 & 17 & x\\\\\n", "\t18 & 18 & z\\\\\n", "\t19 & 19 & z\\\\\n", "\t20 & 20 & z\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "A data.frame: 20 × 2\n", "\n", "| | S <int> | Y <chr> |\n", "|---|---|---|\n", "| 1 | 1 | z |\n", "| 2 | 2 | z |\n", "| 3 | 3 | z |\n", "| 4 | 4 | x |\n", "| 5 | 5 | z |\n", "| 6 | 6 | z |\n", "| 7 | 7 | z |\n", "| 8 | 8 | z |\n", "| 9 | 9 | z |\n", "| 10 | 10 | z |\n", "| 11 | 11 | z |\n", "| 12 | 12 | x |\n", "| 13 | 13 | z |\n", "| 14 | 14 | y |\n", "| 15 | 15 | z |\n", "| 16 | 16 | y |\n", "| 17 | 17 | x |\n", "| 18 | 18 | z |\n", "| 19 | 19 | z |\n", "| 20 | 20 | z |\n", "\n" ], "text/plain": [ " S Y\n", "1 1 z\n", "2 2 z\n", "3 3 z\n", "4 4 x\n", "5 5 z\n", "6 6 z\n", "7 7 z\n", "8 8 z\n", "9 9 z\n", "10 10 z\n", "11 11 z\n", "12 12 x\n", "13 13 z\n", "14 14 y\n", "15 15 z\n", "16 16 y\n", "17 17 x\n", "18 18 z\n", "19 19 z\n", "20 20 z" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Example data\n", "# df is a long-format data table w/columns for subject (S) and N-category outcome (Y)\n", "df <- read.csv(\"data/0F0LBs_multinomial.csv\")\n", "head(df, 20)" ] }, { "cell_type": "code", "execution_count": 10, "id": "68734281-bbbd-4b96-948d-bbf74532d68d", "metadata": { "vscode": { "languageId": "r" } }, "outputs": [ { "data": { "text/plain": [ "\n", "\tChi-squared test for given probabilities\n", "\n", "data: xt\n", "X-squared = 18.9, df = 2, p-value = 7.869e-05\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "df$S 
= factor(df$S) # Subject id is nominal (unused)\n", "df$Y = factor(df$Y) # Y is an outcome of ≥2 categories\n", "xt = xtabs( ~ Y, data=df) # make counts\n", "chisq.test(xt)" ] } ], "metadata": { "kernelspec": { "display_name": "R", "language": "R", "name": "ir" }, "language_info": { "codemirror_mode": "r", "file_extension": ".r", "mimetype": "text/x-r-source", "name": "R", "pygments_lexer": "r", "version": "4.2.2" } }, "nbformat": 4, "nbformat_minor": 5 }