| Type: | Package | 
| Title: | Categorical Data Analysis | 
| Version: | 0.1.4 | 
| Author: | Nick Williams | 
| Maintainer: | Nick Williams <ntwilliams.personal@gmail.com> | 
| Description: | Includes wrapper functions around existing functions for the analysis of categorical data and introduces functions for calculating risk differences and matched odds ratios. R currently supports a wide variety of tools for the analysis of categorical data. However, many functions are spread across a variety of packages with differing syntax and poor compatibility with one another. prop_test() combines the functions binom.test(), prop.test() and BinomCI() into one output. prop_power() allows for power and sample size calculations for both balanced and unbalanced designs. riskdiff() is used for calculating risk differences and matched_or() is used for calculating matched odds ratios. For further information on methods used that are not documented in other packages see Nathan Mantel and William Haenszel (1959) <doi:10.1093/jnci/22.4.719> and Alan Agresti (2002) <ISBN:0-471-36093-7>. | 
| License: | MIT + file LICENSE | 
| Encoding: | UTF-8 | 
| LazyData: | true | 
| Imports: | epitools, DescTools, cli, magrittr, Hmisc, broom, rlang | 
| RoxygenNote: | 6.1.1 | 
| Suggests: | testthat, dplyr, forcats | 
| NeedsCompilation: | no | 
| Packaged: | 2019-06-14 13:52:19 UTC; niw4001 | 
| Repository: | CRAN | 
| Date/Publication: | 2019-06-14 14:10:03 UTC | 
Pipe operator
Description
See magrittr::%>% for details.
Usage
lhs %>% rhs
Matched pairs odds ratio and confidence interval
Description
Create odds ratio and confidence interval from matched pairs data.
Usage
matched_or(df, ...)
Arguments
df | 
 a dataframe with binary variables x and y or a 2 x 2 frequency table/matrix. If a table or matrix, x and y must be NULL. Used to select method.  | 
... | 
 further arguments passed to or from other methods.  | 
Details
The matched pairs odds ratio and confidence interval is the equivalent of calculating a Cochran-Mantel-Haenszel odds ratio where each pair is treated as a stratum.
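The equivalence stated above can be checked by hand. The following base R sketch uses made-up pair counts (not data from the package) and assumes the usual large-sample interval on the log scale; whether matched_or() computes exactly this interval is an assumption, not something documented here.

```r
# Hypothetical matched-pairs counts (illustrative only).
# Rows: was the case exposed? Columns: was the control exposed?
pairs <- matrix(c(10, 5, 15, 20), nrow = 2,
                dimnames = list(case = c("yes", "no"),
                                control = c("yes", "no")))

# Only the discordant pairs contribute to the matched odds ratio.
n10 <- pairs["yes", "no"]   # case exposed, control unexposed
n01 <- pairs["no", "yes"]   # case unexposed, control exposed
or_matched <- n10 / n01

# Assumed large-sample 95% interval on the log scale.
z <- qnorm(1 - 0.05 / 2)
ci <- exp(log(or_matched) + c(-1, 1) * z * sqrt(1 / n10 + 1 / n01))

# Mantel-Haenszel estimate with one stratum per pair (n = 2 per stratum):
# sum(a * d / n) / sum(b * c / n), where each stratum is the 2 x 2 table
# of role (case/control) by exposure (yes/no).
stratum <- function(case_exp, control_exp) {
  rbind(case = c(case_exp, 1 - case_exp),
        control = c(control_exp, 1 - control_exp))
}
types <- list(c(1, 1), c(1, 0), c(0, 1), c(0, 0))
counts <- c(pairs["yes", "yes"], n10, n01, pairs["no", "no"])
num <- 0
den <- 0
for (i in seq_along(types)) {
  s <- stratum(types[[i]][1], types[[i]][2])
  num <- num + counts[i] * s[1, 1] * s[2, 2] / 2
  den <- den + counts[i] * s[1, 2] * s[2, 1] / 2
}
or_mh <- num / den

c(matched = or_matched, mantel_haenszel = or_mh)
```

Concordant pairs add zero to both sums, which is why the stratified estimate collapses to the ratio of discordant pairs.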
Value
a list with class "matched_or" with the following components:
tab | 
 2x2 table used for calculating the matched-pairs odds ratio  | 
or | 
 dataframe with columns corresponding to matched-pairs OR, lower bound, and upper bound of CI  | 
conf.level | 
 specified confidence level  | 
Examples
set.seed(1)
gene <- data.frame(pair = seq_len(35),
                   ulcer = rbinom(35, 1, .7),
                   healthy = rbinom(35, 1, .4))
matched_or(gene, ulcer, healthy)
Matched pairs odds ratio from a data frame
Description
Create odds ratio and confidence interval from matched pairs data.
Usage
## S3 method for class 'data.frame'
matched_or(df, x, y, weight = NULL, alpha = 0.05,
  rev = c("neither", "rows", "columns", "both"), ...)
Arguments
df | 
 a dataframe with binary variables x and y.  | 
x | 
 binary vector, used as rows for frequency table and calculations.  | 
y | 
 binary vector, used as columns for frequency table and calculations.  | 
weight | 
 an optional vector of count weights.  | 
alpha | 
 level of significance for confidence interval.  | 
rev | 
 reverse order of cells. Options are "rows", "columns", "both", and "neither" (default).  | 
... | 
 further arguments passed to or from other methods.  | 
Value
a list with class "matched_or" with the following components:
tab | 
 2x2 table used for calculating the matched-pairs odds ratio  | 
or | 
 dataframe with columns corresponding to matched-pairs OR, lower bound, and upper bound of CI  | 
conf.level | 
 specified confidence level  | 
Examples
gene <- data.frame(pair = seq_len(35),
                   ulcer = rbinom(35, 1, .7),
                   healthy = rbinom(35, 1, .4))
matched_or(gene, ulcer, healthy)
Matched pairs odds ratio from a table
Description
Create odds ratio and confidence interval from matched pairs data.
Usage
## S3 method for class 'table'
matched_or(df, alpha = 0.05, rev = c("neither", "rows",
  "columns", "both"), ...)
Arguments
df | 
 a dataframe with binary variables x and y or a 2 x 2 frequency table/matrix.  | 
alpha | 
 level of significance for confidence interval.  | 
rev | 
 reverse order of cells. Options are "rows", "columns", "both", and "neither" (default).  | 
... | 
 further arguments passed to or from other methods.  | 
Value
a list with class "matched_or" with the following components:
tab | 
 2x2 table used for calculating the matched-pairs odds ratio  | 
or | 
 dataframe with columns corresponding to matched-pairs OR, lower bound, and upper bound of CI  | 
conf.level | 
 specified confidence level  | 
Examples
gene <- data.frame(pair = seq_len(35),
                   ulcer = rbinom(35, 1, .7),
                   healthy = rbinom(35, 1, .4))
gene_tab <- xtabs(~ ulcer + healthy, data = gene)
gene_tab %>% matched_or()
Power and sample size for 2 proportions
Description
Calculate power and sample size for comparison of 2 proportions for both balanced and unbalanced designs.
Usage
prop_power(n, n1, n2, p1, p2, fraction = 0.5, alpha = 0.05,
  power = NULL, alternative = c("two.sided", "one.sided"), odds.ratio,
  percent.reduction, ...)
Arguments
n | 
 total sample size.  | 
n1 | 
 sample size in group 1.  | 
n2 | 
 sample size in group 2.  | 
p1 | 
 group 1 proportion.  | 
p2 | 
 group 2 proportion.  | 
fraction | 
 fraction of total observations that are in group 1.  | 
alpha | 
 significance level/type 1 error rate.  | 
power | 
 desired power, between 0 and 1.  | 
alternative | 
 alternative hypothesis, one- or two-sided test.  | 
odds.ratio | 
 odds ratio comparing p2 to p1.  | 
percent.reduction | 
 percent reduction of p1 to p2.  | 
... | 
 further arguments passed to or from other methods.  | 
Details
Power calculations are done using the methods described in 'stats::power.prop.test', 'Hmisc::bsamsize', and 'Hmisc::bpower'.
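As a rough illustration of how the inputs relate, the base R sketch below derives p2 from a percent reduction and from an odds ratio, then calls stats::power.prop.test for a balanced design. It assumes odds.ratio means the odds of p2 divided by the odds of p1, as in Hmisc::bpower; it does not call prop_power() itself, and the variable names are illustrative.

```r
p1 <- 0.35

# p2 implied by a percent reduction from p1
percent_reduction <- 42.857
p2_from_reduction <- p1 * (1 - percent_reduction / 100)

# p2 implied by an odds ratio (assumed: odds of p2 over odds of p1)
odds_ratio <- 0.4642857
odds2 <- odds_ratio * p1 / (1 - p1)
p2_from_or <- odds2 / (1 + odds2)

# Both routes land on approximately the same p2
round(c(reduction = p2_from_reduction, odds_ratio = p2_from_or), 4)

# Balanced design: 220 total observations means 110 per group,
# because power.prop.test() takes the per-group sample size.
balanced <- stats::power.prop.test(n = 110, p1 = p1, p2 = 0.2)
balanced$power
```

Note that power.prop.test() only covers equal group sizes; unbalanced designs are the case that Hmisc::bpower and Hmisc::bsamsize address.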
Value
a list with class "prop_power" containing the following components:
n | 
 the total sample size  | 
n1 | 
 the sample size in group 1  | 
n2 | 
 the sample size in group 2  | 
p1 | 
 the proportion in group 1  | 
p2 | 
 the proportion in group 2  | 
power | 
 calculated or desired power  | 
sig.level | 
 level of significance  | 
See Also
[stats::power.prop.test], [Hmisc::bsamsize], [Hmisc::bpower]
Examples
prop_power(n = 220, p1 = 0.35, p2 = 0.2)
prop_power(p1 = 0.35, p2 = 0.2, fraction = 2/3, power = 0.85)
prop_power(p1 = 0.35, n = 220, percent.reduction = 42.857)
prop_power(p1 = 0.35, n = 220, odds.ratio = 0.4642857)
Tests for equality of proportions
Description
Conduct 1-sample tests of proportions and tests for equality of k proportions.
Usage
prop_test(x, ...)
Arguments
x | 
 a vector of counts, a one-dimensional table with two entries, or a two-dimensional table with 2 columns. Used to select method.  | 
... | 
 further arguments passed to or from other methods.  | 
Details
Calculations are done using the methods described in 'stats::binom.test()' and 'stats::prop.test()'.
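For orientation, the one-sample quantities can be reproduced with base R alone. This is a sketch of the underlying calculations for 7 successes in 50 trials against a null proportion of 0.2, not the package's implementation; the hand-written Wald interval and the names in the returned list are illustrative assumptions.

```r
x <- 7
n <- 50
p0 <- 0.2

# Wald interval computed by hand (assumed textbook form)
p_hat <- x / n
z <- qnorm(0.975)
wald_ci <- p_hat + c(-1, 1) * z * sqrt(p_hat * (1 - p_hat) / n)

# Pearson chi-squared test without continuity correction
chisq <- stats::prop.test(x, n, p = p0, correct = FALSE)

# Exact binomial test and Clopper-Pearson interval
exact <- stats::binom.test(x, n, p = p0)

list(wald_ci   = wald_ci,
     statistic = unname(chisq$statistic),
     p_value   = chisq$p.value,
     exact_ci  = as.numeric(exact$conf.int),
     exact_p   = exact$p.value)
```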
Value
a list with class "prop_test" containing the following components:
x | 
 number of successes  | 
n | 
 number of trials  | 
p | 
 null proportion  | 
statistic | 
 the value of Pearson's chi-squared test statistic  | 
p_value | 
 p-value corresponding to chi-squared test statistic  | 
df | 
 degrees of freedom  | 
method | 
 the method used to calculate the confidence interval  | 
method_ci | 
 confidence interval calculated using specified method  | 
exact_ci | 
 exact confidence interval  | 
exact_p | 
 p-value from exact test  | 
See Also
[stats::binom.test()], [stats::prop.test()]
Examples
prop_test(7, 50, method = "wald", p = 0.2)
prop_test(7, 50, method = "wald", p = 0.2, exact = TRUE)
prop_test(c(23, 24), c(50, 55))
vietnam <- data.frame(
   service = c(rep("yes", 2), rep("no", 2)),
   sleep = c(rep(c("yes", "no"), 2)),
   count = c(173, 160, 599, 851)
)
sleep <- xtabs(count ~ service + sleep, data = vietnam)
prop_test(sleep)
prop_test(vietnam, service, sleep, count)
Tests for equality of proportions
Description
Conduct 1-sample tests of proportions and tests for equality of k proportions.
Usage
## S3 method for class 'data.frame'
prop_test(x, pred, out, weight = NULL,
  rev = c("neither", "rows", "columns", "both"), method = c("wald",
  "wilson", "agresti-couli", "jeffreys", "modified wilson", "wilsoncc",
  "modified jeffreys", "clopper-pearson", "arcsine", "logit", "witting",
  "pratt"), alternative = c("two.sided", "less", "greater"),
  conf.level = 0.95, correct = FALSE, exact = FALSE, ...)
Arguments
x | 
 a dataframe containing the categorical variables pred and out.  | 
pred | 
 predictor/exposure, vector.  | 
out | 
 outcome, vector.  | 
weight | 
 an optional vector of count weights.  | 
rev | 
 reverse order of cells. Options are "rows", "columns", "both", and "neither" (default).  | 
method | 
 a character string indicating the method for calculating the confidence interval; default is "wald". Options are wald, wilson, agresti-couli, jeffreys, modified wilson, wilsoncc, modified jeffreys, clopper-pearson, arcsine, logit, witting, and pratt.  | 
alternative | 
 character string specifying the alternative hypothesis. Possible options are "two.sided" (default), "greater", or "less".  | 
conf.level | 
 confidence level for confidence interval, default is 0.95.  | 
correct | 
 a logical indicating whether Yates' continuity correction should be applied.  | 
exact | 
 a logical indicating whether to output exact p-value, ignored if k-sample test.  | 
... | 
 further arguments passed to or from other methods.  | 
Value
a list with class "prop_test" containing the following components:
x | 
 number of successes  | 
n | 
 number of trials  | 
p | 
 null proportion  | 
statistic | 
 the value of Pearson's chi-squared test statistic  | 
p_value | 
 p-value corresponding to chi-squared test statistic  | 
df | 
 degrees of freedom  | 
method | 
 the method used to calculate the confidence interval  | 
method_ci | 
 confidence interval calculated using specified method  | 
exact_ci | 
 exact confidence interval  | 
exact_p | 
 p-value from exact test  | 
Examples
vietnam <- data.frame(
   service = c(rep("yes", 2), rep("no", 2)),
   sleep = c(rep(c("yes", "no"), 2)),
   count = c(173, 160, 599, 851)
)
prop_test(vietnam, service, sleep, count)
Tests for equality of proportions
Description
Conduct 1-sample tests of proportions and tests for equality of k proportions.
Usage
## S3 method for class 'matrix'
prop_test(x, method = c("wald", "wilson",
  "agresti-couli", "jeffreys", "modified wilson", "wilsoncc",
  "modified jeffreys", "clopper-pearson", "arcsine", "logit", "witting",
  "pratt"), alternative = c("two.sided", "less", "greater"),
  conf.level = 0.95, correct = FALSE, exact = FALSE, ...)
Arguments
x | 
 a 2 x k matrix.  | 
method | 
 a character string indicating the method for calculating the confidence interval; default is "wald". Options are wald, wilson, agresti-couli, jeffreys, modified wilson, wilsoncc, modified jeffreys, clopper-pearson, arcsine, logit, witting, and pratt.  | 
alternative | 
 character string specifying the alternative hypothesis. Possible options are "two.sided" (default), "greater", or "less".  | 
conf.level | 
 confidence level for confidence interval, default is 0.95.  | 
correct | 
 a logical indicating whether Yates' continuity correction should be applied.  | 
exact | 
 a logical indicating whether to output exact p-value, ignored if k-sample test.  | 
... | 
 further arguments passed to or from other methods.  | 
Value
a list with class "prop_test" containing the following components:
x | 
 number of successes  | 
n | 
 number of trials  | 
p | 
 null proportion  | 
statistic | 
 the value of Pearson's chi-squared test statistic  | 
p_value | 
 p-value corresponding to chi-squared test statistic  | 
df | 
 degrees of freedom  | 
method | 
 the method used to calculate the confidence interval  | 
method_ci | 
 confidence interval calculated using specified method  | 
exact_ci | 
 exact confidence interval  | 
exact_p | 
 p-value from exact test  | 
Examples
matrix(c(23, 48, 76, 88), nrow = 2, ncol = 2) %>% prop_test()
Tests for equality of proportions
Description
Conduct 1-sample tests of proportions and tests for equality of k proportions.
Usage
## S3 method for class 'numeric'
prop_test(x, n, p = 0.5, method = c("wald", "wilson",
  "agresti-couli", "jeffreys", "modified wilson", "wilsoncc",
  "modified jeffreys", "clopper-pearson", "arcsine", "logit", "witting",
  "pratt"), alternative = c("two.sided", "less", "greater"),
  conf.level = 0.95, correct = FALSE, exact = FALSE, ...)
Arguments
x | 
 a vector of counts.  | 
n | 
 a vector of counts of trials.  | 
p | 
 a probability for the null hypothesis when testing a single proportion; ignored if comparing multiple proportions.  | 
method | 
 a character string indicating the method for calculating the confidence interval; default is "wald". Options are wald, wilson, agresti-couli, jeffreys, modified wilson, wilsoncc, modified jeffreys, clopper-pearson, arcsine, logit, witting, and pratt.  | 
alternative | 
 character string specifying the alternative hypothesis. Possible options are "two.sided" (default), "greater", or "less".  | 
conf.level | 
 confidence level for confidence interval, default is 0.95.  | 
correct | 
 a logical indicating whether Yates' continuity correction should be applied.  | 
exact | 
 a logical indicating whether to output exact p-value, ignored if k-sample test.  | 
... | 
 further arguments passed to or from other methods.  | 
Value
a list with class "prop_test" containing the following components:
x | 
 number of successes  | 
n | 
 number of trials  | 
p | 
 null proportion  | 
statistic | 
 the value of Pearson's chi-squared test statistic  | 
p_value | 
 p-value corresponding to chi-squared test statistic  | 
df | 
 degrees of freedom  | 
method | 
 the method used to calculate the confidence interval  | 
method_ci | 
 confidence interval calculated using specified method  | 
exact_ci | 
 exact confidence interval  | 
exact_p | 
 p-value from exact test  | 
Examples
prop_test(7, 50, method = "wald", p = 0.2)
prop_test(7, 50, method = "wald", p = 0.2, exact = TRUE)
Tests for equality of proportions
Description
Conduct 1-sample tests of proportions and tests for equality of k proportions.
Usage
## S3 method for class 'table'
prop_test(x, method = c("wald", "wilson",
  "agresti-couli", "jeffreys", "modified wilson", "wilsoncc",
  "modified jeffreys", "clopper-pearson", "arcsine", "logit", "witting",
  "pratt"), alternative = c("two.sided", "less", "greater"),
  conf.level = 0.95, correct = FALSE, exact = FALSE, ...)
Arguments
x | 
 a 2 x k table.  | 
method | 
 a character string indicating the method for calculating the confidence interval; default is "wald". Options are wald, wilson, agresti-couli, jeffreys, modified wilson, wilsoncc, modified jeffreys, clopper-pearson, arcsine, logit, witting, and pratt.  | 
alternative | 
 character string specifying the alternative hypothesis. Possible options are "two.sided" (default), "greater", or "less".  | 
conf.level | 
 confidence level for confidence interval, default is 0.95.  | 
correct | 
 a logical indicating whether Yates' continuity correction should be applied.  | 
exact | 
 a logical indicating whether to output exact p-value, ignored if k-sample test.  | 
... | 
 further arguments passed to or from other methods.  | 
Value
a list with class "prop_test" containing the following components:
x | 
 number of successes  | 
n | 
 number of trials  | 
p | 
 null proportion  | 
statistic | 
 the value of Pearson's chi-squared test statistic  | 
p_value | 
 p-value corresponding to chi-squared test statistic  | 
df | 
 degrees of freedom  | 
method | 
 the method used to calculate the confidence interval  | 
method_ci | 
 confidence interval calculated using specified method  | 
exact_ci | 
 exact confidence interval  | 
exact_p | 
 p-value from exact test  | 
Examples
vietnam <- data.frame(
     service = c(rep("yes", 2), rep("no", 2), rep("maybe", 2)),
     sleep = rep(c("yes", "no"), 3),
     count = c(173, 160, 599, 851, 400, 212)
)
xtabs(count ~ service + sleep, data = vietnam) %>% prop_test()
Risk difference
Description
Calculate the risk difference and its Wald confidence interval (95 percent by default).
Usage
riskdiff(df, ...)
Arguments
df | 
 a dataframe with binary variables x and y or a 2 x 2 frequency table/matrix. If a table or matrix, x and y must be NULL. Used to select method.  | 
... | 
 further arguments passed to or from other methods.  | 
Value
a list with class "rdiff" containing the following components:
rd | 
 risk difference  | 
conf.level | 
 specified confidence level  | 
ci | 
 calculated confidence interval  | 
p1 | 
 proportion one  | 
p2 | 
 proportion two  | 
tab | 
 2x2 table used for calculating risk difference  | 
Examples
trial <- data.frame(
  disease = c(rep("yes", 2), rep("no", 2)),
  treatment = c(rep(c("estrogen", "placebo"), 2)),
  count = c(751, 623, 7755, 7479))
riskdiff(trial, treatment, disease, count, rev = "columns")
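The Wald interval named in the description can be sketched in base R on the same trial data. This is an illustration under assumptions rather than the package's code: the direction of the difference (estrogen minus placebo) and the unpooled standard error are choices made here for the example.

```r
trial <- data.frame(
  disease = c(rep("yes", 2), rep("no", 2)),
  treatment = c(rep(c("estrogen", "placebo"), 2)),
  count = c(751, 623, 7755, 7479))

# Rows: treatment, columns: disease
tab <- xtabs(count ~ treatment + disease, data = trial)

n <- rowSums(tab)          # group sizes
p <- tab[, "yes"] / n      # risk of disease in each group

# Risk difference, taken here as estrogen minus placebo (assumed direction)
rd <- unname(p["estrogen"] - p["placebo"])

# Wald interval with an unpooled standard error
conf_level <- 0.95
se <- sqrt(sum(p * (1 - p) / n))
ci <- rd + c(-1, 1) * qnorm(1 - (1 - conf_level) / 2) * se

c(rd = rd, lower = ci[1], upper = ci[2])
```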
Risk difference
Description
Calculate the risk difference and its Wald confidence interval (95 percent by default).
Usage
## S3 method for class 'data.frame'
riskdiff(df, x = NULL, y = NULL, weight = NULL,
  conf.level = 0.95, rev = c("neither", "rows", "columns", "both"),
  ...)
Arguments
df | 
 a dataframe with binary variables x and y.  | 
x | 
 binary predictor/exposure, vector.  | 
y | 
 binary outcome, vector.  | 
weight | 
 an optional vector of count weights.  | 
conf.level | 
 confidence level for confidence interval, default is 0.95.  | 
rev | 
 reverse order of cells. Options are "rows", "columns", "both", and "neither" (default).  | 
... | 
 further arguments passed to or from other methods.  | 
Value
a list with class "rdiff" containing the following components:
rd | 
 risk difference  | 
conf.level | 
 specified confidence level  | 
ci | 
 calculated confidence interval  | 
p1 | 
 proportion one  | 
p2 | 
 proportion two  | 
tab | 
 2x2 table used for calculating risk difference  | 
Examples
trial <- data.frame(
  disease = c(rep("yes", 2), rep("no", 2)),
  treatment = c(rep(c("estrogen", "placebo"), 2)),
  count = c(751, 623, 7755, 7479))
riskdiff(trial, treatment, disease, count, rev = "columns")
Risk difference
Description
Calculate the risk difference and its Wald confidence interval (95 percent by default).
Usage
## S3 method for class 'matrix'
riskdiff(df, conf.level = 0.95, dnn = NULL,
  rev = c("neither", "rows", "columns", "both"), ...)
Arguments
df | 
 a 2 x 2 frequency matrix.  | 
conf.level | 
 confidence level for confidence interval, default is 0.95.  | 
dnn | 
 optional character vector of dimension names.  | 
rev | 
 reverse order of cells. Options are "rows", "columns", "both", and "neither" (default).  | 
... | 
 further arguments passed to or from other methods.  | 
Value
a list with class "rdiff" containing the following components:
rd | 
 risk difference  | 
conf.level | 
 specified confidence level  | 
ci | 
 calculated confidence interval  | 
p1 | 
 proportion one  | 
p2 | 
 proportion two  | 
tab | 
 2x2 table used for calculating risk difference  | 
Examples
matrix(c(12, 45, 69, 15), nrow = 2, ncol = 2) %>%
   riskdiff(dnn = c("New Drug", "Adverse Outcome"))
Risk difference
Description
Calculate the risk difference and its Wald confidence interval (95 percent by default).
Usage
## S3 method for class 'table'
riskdiff(df, conf.level = 0.95, rev = c("neither",
  "rows", "columns", "both"), ...)
Arguments
df | 
 a 2 x 2 frequency table.  | 
conf.level | 
 confidence level for confidence interval, default is 0.95.  | 
rev | 
 reverse order of cells. Options are "rows", "columns", "both", and "neither" (default).  | 
... | 
 further arguments passed to or from other methods.  | 
Value
a list with class "rdiff" containing the following components:
rd | 
 risk difference  | 
conf.level | 
 specified confidence level  | 
ci | 
 calculated confidence interval  | 
p1 | 
 proportion one  | 
p2 | 
 proportion two  | 
tab | 
 2x2 table used for calculating risk difference  | 
Examples
trial <- data.frame(
  disease = c(rep("yes", 2), rep("no", 2)),
  treatment = c(rep(c("estrogen", "placebo"), 2)),
  count = c(751, 623, 7755, 7479))
xtabs(count ~ treatment + disease, data = trial) %>% riskdiff()
Create 2 x k frequency tables
Description
Helper function for creating 2 x k frequency tables.
Usage
tavolo(df, ...)
Arguments
df | 
 a dataframe with binary variable y and categorical variable x or a 2 x k frequency table/matrix. If a table or matrix, x and y must be NULL. Used to select method.  | 
... | 
 further arguments passed to or from other methods.  | 
Value
tab | 
 2 x k frequency table  | 
Examples
trial <- data.frame(disease = c(rep("yes", 2), rep("no", 2)),
                    treatment = c(rep(c("estrogen", "placebo"), 2)),
                    count = c(751, 623, 7755, 7479))
tavolo(trial, treatment, disease, count)
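For comparison, a similar weighted frequency table can be built with base R's xtabs(), and the column order can be flipped by indexing. This is only a base R analogue of what a rev = "columns" option might produce; how tavolo() orders cells internally is not documented here, so treat the correspondence as an assumption.

```r
trial <- data.frame(disease = c(rep("yes", 2), rep("no", 2)),
                    treatment = c(rep(c("estrogen", "placebo"), 2)),
                    count = c(751, 623, 7755, 7479))

# Weighted cross-tabulation: rows = treatment, columns = disease.
# Character columns are converted to factors with alphabetical levels,
# so the disease columns come out as "no", "yes".
tab <- xtabs(count ~ treatment + disease, data = trial)

# Reverse the column order so that "yes" comes first
tab_rev <- tab[, rev(colnames(tab))]
tab_rev
```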
Create 2 x k frequency tables
Description
Helper function for creating 2 x k frequency tables.
Usage
## S3 method for class 'data.frame'
tavolo(df, x, y, weight = NULL, rev = c("neither",
  "rows", "columns", "both"), ...)
Arguments
df | 
 a dataframe with binary variable y and categorical variable x.  | 
x | 
 categorical predictor/exposure, vector.  | 
y | 
 binary outcome, vector.  | 
weight | 
 an optional vector of count weights.  | 
rev | 
 character string indicating whether to switch row or column order, possible options are "neither", "rows", "columns", or "both". The default is "neither".  | 
... | 
 further arguments passed to or from other methods.  | 
Value
tab | 
 2 x k frequency table  | 
Examples
trial <- data.frame(disease = c(rep("yes", 2), rep("no", 2)),
                    treatment = c(rep(c("estrogen", "placebo"), 2)),
                    count = c(751, 623, 7755, 7479))
tavolo(trial, treatment, disease, count)
Create 2 x k frequency tables
Description
Helper function for creating 2 x k frequency tables.
Usage
## S3 method for class 'matrix'
tavolo(df, dnn = NULL, rev = c("neither", "rows",
  "columns", "both"), ...)
Arguments
df | 
 a 2 x k frequency matrix.  | 
dnn | 
 optional character vector of dimension names.  | 
rev | 
 character string indicating whether to switch row or column order, possible options are "neither", "rows", "columns", or "both". The default is "neither".  | 
... | 
 further arguments passed to or from other methods.  | 
Value
tab | 
 2 x k frequency table  | 
Examples
tavolo(matrix(c(23, 45, 67, 12), nrow = 2, ncol = 2), rev = "both")
Create 2 x k frequency tables
Description
Helper function for creating 2 x k frequency tables.
Usage
## S3 method for class 'table'
tavolo(df, rev = c("neither", "rows", "columns", "both"),
  ...)
Arguments
df | 
 a 2 x k frequency table.  | 
rev | 
 character string indicating whether to switch row or column order, possible options are "neither", "rows", "columns", or "both". The default is "neither".  | 
... | 
 further arguments passed to or from other methods.  | 
Value
tab | 
 2 x k frequency table  | 
Examples
trial <- data.frame(disease = c(rep("yes", 3), rep("no", 3)),
                    treatment = rep(c("estrogen", "placebo", "other"), 2),
                    count = c(751, 623, 7755, 7479, 9000, 456))
xtabs(count ~ treatment + disease, data = trial) %>% tavolo(rev = "columns")