| Title: | A Workflow for Statistical Testing, Interpretation, and 'ggplot2'-Based Visualization |
| Version: | 0.8.2 |
| Author: | Imad EL BADISY [aut, cre] |
| Maintainer: | Imad EL BADISY <elbadisyimad@gmail.com> |
| Description: | Provides a unified workflow for choosing, running, interpreting, and visualizing common statistical tests. The package combines assumption checks, test selection, effect sizes, formatted results, plain-language interpretation, and 'ggplot2'-based statistical visualizations. Implemented methods follow standard references including Casella and Berger (2002, ISBN:9780534243128), Hollander et al. (2013, ISBN:9781118553299), Agresti (2013, ISBN:9780470463635), and Cohen (1988, ISBN:9780805802832). |
| License: | MIT + file LICENSE |
| Encoding: | UTF-8 |
| Depends: | R (≥ 4.1.0) |
| RoxygenNote: | 7.3.3 |
| Imports: | stats, cli, rlang, dplyr, tidyr, tidyselect, tibble, purrr, broom, ggplot2 |
| Suggests: | testthat (≥ 3.0.0), knitr, rmarkdown, car, effectsize, rstatix, patchwork |
| Config/testthat/edition: | 3 |
| VignetteBuilder: | knitr |
| NeedsCompilation: | no |
| Packaged: | 2026-06-26 18:39:54 UTC; imad-el-badisy |
| Repository: | CRAN |
| Date/Publication: | 2026-07-02 18:40:02 UTC |
Convert a testflow object to a one-row tibble
Description
Convert a testflow object to a one-row tibble
Usage
as_tibble(x, ...)
Arguments
x |
A testflow object. |
... |
Unused. |
Value
A one-row tibble with the workflow name, design, variables, recommended test, null hypothesis, statistic, degrees of freedom when available, p-value, confidence interval when available, effect-size fields, and decision text.
Simulate a small cardiovascular teaching dataset
Description
Simulate a small cardiovascular teaching dataset
Usage
make_cardio_data(n = 180, seed = 2026)
Arguments
n |
Number of rows. |
seed |
Random seed. |
Value
A tibble with example numeric and categorical variables.
Plot a testflow object
Description
Plot a testflow object
Usage
## S3 method for class 'testflow'
plot(x, title = NULL, subtitle = NULL, caption = NULL, ...)
Arguments
x |
A testflow object. |
title |
Optional plot title override. Defaults to the stored title. |
subtitle |
Optional plot subtitle override. Defaults to the stored subtitle. |
caption |
Optional plot caption override. Defaults to the stored caption. |
... |
Unused. |
Value
The ggplot object stored in the testflow object, with any
title, subtitle, or caption overrides applied. If the workflow was created
with plot = FALSE, returns NULL.
Print a testflow object
Description
Print a testflow object.
Usage
## S3 method for class 'testflow'
print(x, ...)
Arguments
x |
A testflow object. |
... |
Unused. |
Details
Console colors are enabled by default in interactive sessions. Use
options(testflow.cli_colors = FALSE) to disable colors, or
options(testflow.cli_colors = TRUE) to force colors in non-interactive
output.
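A minimal sketch of toggling the option described above. It uses only base R's options() and getOption(); the option name comes from this page, and the variable names are illustrative.

```r
# Turn colors off and keep the previous setting so it can be restored.
old <- options(testflow.cli_colors = FALSE)
colors_setting <- getOption("testflow.cli_colors")  # FALSE while the override is active

# ... print() or summary() calls on a testflow object would go here ...

# Restore whatever was set before (an unset option comes back as unset).
options(old)
```

Saving the return value of options() makes the change easy to undo, which matters in scripts and tests that should not leak settings.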
Value
The input testflow object, invisibly. Called for its side effect of
printing a formatted workflow summary to the console.
Return a ready-to-use testflow report
Description
Return a ready-to-use testflow report
Usage
report(x, ...)
Arguments
x |
A testflow object. |
... |
Unused. |
Value
A length-one character vector containing a plain-language summary of the workflow result, including the design, recommended test, p-value, effect size when reported, and null hypothesis when available.
Return a ready-to-use testflow report
Description
Return a ready-to-use testflow report
Usage
report_test(x)
Arguments
x |
A testflow object. |
Value
A length-one character vector containing the same report text as
report() for a testflow object.
Summarize a testflow object
Description
Summarize a testflow object.
Usage
## S3 method for class 'testflow'
summary(object, ...)
Arguments
object |
A testflow object. |
... |
Unused. |
Details
Console colors follow the same testflow.cli_colors option used by
print.testflow().
Value
A summary.testflow list containing the workflow metadata,
descriptives, assumptions, recommended test, primary and alternative test
results, post-hoc results when available, effect size, decision, and report
text.
Build a compact descriptive summary table
Description
Builds a compact table of descriptive summaries using a formula interface. Numeric variables are summarized as mean (SD); median [Q1, Q3]; n. Categorical variables are summarized as n (percent).
Usage
sumtab(
formula,
data,
p_value = FALSE,
overall = TRUE,
digits = 1,
p_digits = 3,
alpha = 0.05,
fisher_threshold = 5,
na.rm = TRUE
)
Arguments
formula |
A one-sided formula such as |
data |
A data frame. |
p_value |
Logical; add a p-value column when a grouping variable is supplied. |
overall |
Logical; include an overall summary column. |
digits |
Number of digits for summary statistics. |
p_digits |
Number of digits for formatted p-values. |
alpha |
Significance level used by automatic test selection. |
fisher_threshold |
Expected-count threshold for Fisher's exact test. |
na.rm |
Logical; remove missing values before summaries and tests. |
Details
When p_value = TRUE and a grouping variable is supplied,
sumtab() chooses the p-value test automatically. Numeric variables use
Student's t-test, Welch's t-test, or the Wilcoxon rank-sum test for two groups, and
one-way ANOVA, Welch's ANOVA, or the Kruskal-Wallis test for more than two groups.
Categorical variables use a chi-square test unless expected counts fall below
fisher_threshold, in which case Fisher's exact test is used.
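The categorical rule above can be sketched with base R alone. This is an illustration, not the package's code: choose_categorical_test() is a hypothetical helper, the rule is read here as "any expected cell count below the threshold", and stats::chisq.test() applies its default continuity correction to 2 x 2 tables, which the package may or may not do.

```r
# Hypothetical helper (not part of the package API).
choose_categorical_test <- function(x, g, fisher_threshold = 5) {
  tab <- table(x, g)
  # Expected counts under independence: row total * column total / n.
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  if (any(expected < fisher_threshold)) {
    res <- stats::fisher.test(tab)
    label <- "Fisher's exact test"
  } else {
    res <- stats::chisq.test(tab)
    label <- "Chi-square test"
  }
  list(test = label, p.value = res$p.value, min_expected = min(expected))
}

# Six observations: every expected count is 1.5, so Fisher's test is chosen.
small <- choose_categorical_test(
  x = c("a", "a", "b", "b", "a", "b"),
  g = c("u", "v", "u", "v", "u", "v")
)

# One hundred observations: every expected count is 25, so chi-square is chosen.
large <- choose_categorical_test(
  x = rep(c("a", "b"), each = 50),
  g = rep(c("u", "v"), times = 50)
)
```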
Value
A tibble with one row per numeric variable and one row per categorical level.
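The cell formats named in the description, mean (SD); median [Q1, Q3]; n for numeric variables and n (percent) for categorical ones, can be sketched as follows. Both helpers are hypothetical, and details such as the percent sign and the quantile type are assumptions rather than documented behavior.

```r
# Hypothetical helpers (not part of the package API).
summarize_numeric <- function(x, digits = 1, na.rm = TRUE) {
  fmt <- function(v) formatC(v, format = "f", digits = digits)
  q <- stats::quantile(x, probs = c(0.25, 0.5, 0.75), na.rm = na.rm, names = FALSE)
  paste0(
    fmt(mean(x, na.rm = na.rm)), " (", fmt(stats::sd(x, na.rm = na.rm)), "); ",
    fmt(q[2]), " [", fmt(q[1]), ", ", fmt(q[3]), "]; ",
    sum(!is.na(x))
  )
}

summarize_categorical <- function(x, digits = 1) {
  tab <- table(x)
  counts <- as.vector(tab)
  pct <- formatC(100 * counts / sum(counts), format = "f", digits = digits)
  stats::setNames(paste0(counts, " (", pct, "%)"), names(tab))
}

num_cell <- summarize_numeric(c(1, 2, 3, 4, 5, NA))
cat_cells <- summarize_categorical(c("F", "F", "M", "F"))
```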
Examples
dat <- make_cardio_data(80, seed = 1)
sumtab(~ age + sex | treatment, dat, p_value = TRUE)
Test association between two categorical variables
Description
Test association between two categorical variables
Usage
test_categorical(
formula,
data,
y = NULL,
alpha = 0.05,
fisher_threshold = 5,
plot = TRUE,
na.rm = TRUE
)
Arguments
formula |
A formula such as |
data |
A data frame, or a first categorical column when using data-first style. |
y |
Second categorical column. Optional when using formula style. |
alpha |
Significance level. |
fisher_threshold |
Expected-count threshold for Fisher's exact test. |
plot |
Logical; include a ggplot object. |
na.rm |
Logical; remove missing values. |
Value
A testflow object with class testflow_categorical. The object is
a list containing the cleaned data, categorical descriptives, assumption
checks, recommended association test, primary test result with null
hypothesis, alternative chi-square and Fisher results, effect size, optional
ggplot, original call, and report text.
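The effect size is not named on this page. Given the Cramér reference, a plausible candidate is Cramér's V, sketched here with base R as an assumption. cramers_v() is a hypothetical helper, and the chi-square statistic is taken without continuity correction.

```r
# Hypothetical helper (not part of the package API).
cramers_v <- function(tab) {
  chi2 <- unname(suppressWarnings(stats::chisq.test(tab, correct = FALSE))$statistic)
  n <- sum(tab)
  k <- min(nrow(tab), ncol(tab))
  # V = sqrt(chi-square / (n * (min(rows, cols) - 1)))
  sqrt(chi2 / (n * (k - 1)))
}

tab <- matrix(c(30, 10, 10, 30), nrow = 2)
v <- cramers_v(tab)  # chi-square is 20 and n is 80, so V = sqrt(20 / 80) = 0.5
```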
References
Pearson, K. (1900). On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philosophical Magazine, 50(302), 157-175.
Fisher, R. A. (1922). On the interpretation of chi-square from contingency tables, and the calculation of P. Journal of the Royal Statistical Society, 85(1), 87-94.
Cramér, H. (1946). Mathematical Methods of Statistics. Princeton University Press.
Test correlation between two numeric variables
Description
Test correlation between two numeric variables
Usage
test_correlation(
formula,
data,
y = NULL,
method = c("auto", "pearson", "spearman", "kendall"),
alpha = 0.05,
plot = TRUE,
na.rm = TRUE
)
Arguments
formula |
A formula such as |
data |
A data frame, or a first numeric column when using data-first style. |
y |
Second numeric column. Optional when using formula style. |
method |
Correlation method or |
alpha |
Significance level. |
plot |
Logical; include a ggplot object. |
na.rm |
Logical; remove missing values. |
Value
A testflow object with class testflow_correlation. The object is
a list containing the cleaned complete-case data, numeric descriptives,
assumption checks, recommended correlation method, primary correlation test
with null hypothesis, Pearson/Spearman/Kendall results, a correlation table,
effect size, optional ggplot, original call, and report text.
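A base R sketch of the Pearson/Spearman/Kendall comparison that the returned object is described as holding. The simulated data and the table layout are illustrative, and the rule behind method = "auto" is not documented here, so it is not reproduced.

```r
set.seed(1)
x <- rnorm(40)
y <- 0.6 * x + rnorm(40, sd = 0.8)

methods <- c("pearson", "spearman", "kendall")

# One row per method: the estimate (r, rho, or tau) and its p-value.
results <- do.call(rbind, lapply(methods, function(m) {
  ct <- stats::cor.test(x, y, method = m)
  data.frame(method = m, estimate = unname(ct$estimate), p.value = ct$p.value)
}))
results
```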
References
Pearson, K. (1895). Note on regression and inheritance in the case of two parents. Proceedings of the Royal Society of London, 58, 240-242.
Spearman, C. (1904). The proof and measurement of association between two things. The American Journal of Psychology, 15(1), 72-101.
Kendall, M. G. (1938). A new measure of rank correlation. Biometrika, 30(1/2), 81-93.
Test a correlation matrix
Description
Test a correlation matrix
Usage
test_correlation_matrix(
data,
vars,
method = c("spearman", "pearson", "kendall"),
alpha = 0.05,
plot = TRUE,
na.rm = TRUE
)
Arguments
data |
A data frame. |
vars |
Numeric columns. |
method |
Correlation method. |
alpha |
Significance level. |
plot |
Logical; include a ggplot object. |
na.rm |
Logical; remove missing values. |
Value
A testflow object with class testflow_correlation_matrix. The
object is a list containing the cleaned data, numeric descriptives,
screening assumptions, selected correlation-matrix method, pairwise
correlation matrix, pairwise p-value table, maximum absolute correlation as
an effect-size summary, optional heatmap ggplot, original call, and report
text.
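The pieces listed above (pairwise correlations, pairwise p-values, and the maximum absolute correlation) can be approximated with base R. This is a sketch on simulated data. The column names are made up, and whether the package adjusts p-values for multiple comparisons is not stated, so none is applied.

```r
set.seed(2)
dat <- data.frame(x1 = rnorm(30))
dat$x2 <- dat$x1 + rnorm(30, sd = 0.5)  # correlated with x1 by construction
dat$x3 <- rnorm(30)

# Pairwise Spearman correlation matrix.
r <- stats::cor(dat, method = "spearman", use = "pairwise.complete.obs")

# One row per unordered pair of variables, with its unadjusted p-value.
pairs <- utils::combn(names(dat), 2, simplify = FALSE)
p_table <- do.call(rbind, lapply(pairs, function(v) {
  ct <- stats::cor.test(dat[[v[1]]], dat[[v[2]]], method = "spearman")
  data.frame(var1 = v[1], var2 = v[2], rho = unname(ct$estimate), p.value = ct$p.value)
}))

# Largest off-diagonal correlation in absolute value.
max_abs_r <- max(abs(r[upper.tri(r)]))
```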
Run a factorial ANOVA workflow
Description
Run a factorial ANOVA workflow
Usage
test_factorial(
formula,
data,
factors = NULL,
alpha = 0.05,
type = 2,
plot = TRUE,
na.rm = TRUE
)
Arguments
formula |
A formula such as |
data |
A data frame, or the outcome column when using data-first style. |
factors |
Factor columns selected with tidyselect syntax. Optional when using formula style. |
alpha |
Significance level. |
type |
Type of ANOVA sums of squares. Currently a placeholder for future 'car' integration. |
plot |
Logical; include a ggplot object. |
na.rm |
Logical; remove missing values. |
Value
A testflow object with class testflow_factorial. The object is a
list containing the cleaned data, descriptive statistics, residual and
variance assumption checks, recommended factorial ANOVA, primary ANOVA term
result with null hypothesis, ANOVA table, effect size, optional ggplot,
original call, and report text.
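A base R sketch of a two-factor ANOVA table with partial eta squared, using the built-in ToothGrowth data. The effect size the package reports is not specified on this page, so partial eta squared is an assumption. stats::aov() gives sequential (Type I) sums of squares, which coincide with the other types here only because the design is balanced.

```r
tg <- datasets::ToothGrowth
tg$dose <- factor(tg$dose)

fit <- stats::aov(len ~ supp * dose, data = tg)
tab <- summary(fit)[[1]]

# Row names from summary.aov are space-padded, so trim them first.
terms <- trimws(rownames(tab))
ss <- tab[["Sum Sq"]]
ss_resid <- ss[terms == "Residuals"]

# Partial eta squared = SS_effect / (SS_effect + SS_residual).
effects <- terms != "Residuals"
partial_eta2 <- stats::setNames(ss[effects] / (ss[effects] + ss_resid), terms[effects])
partial_eta2
```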
References
Fisher, R. A. (1925). Statistical Methods for Research Workers. Oliver and Boyd.
Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Lawrence Erlbaum.
Compare a numeric outcome across more than two groups
Description
Compare a numeric outcome across more than two groups
Usage
test_groups(
formula,
data,
group = NULL,
alpha = 0.05,
posthoc = TRUE,
plot = TRUE,
na.rm = TRUE
)
Arguments
formula |
A formula such as |
data |
A data frame, or an outcome column when using data-first style. |
group |
Grouping column. Optional when using formula style. |
alpha |
Significance level. |
posthoc |
Logical; compute post-hoc comparisons. |
plot |
Logical; include a ggplot object. |
na.rm |
Logical; remove missing values. |
Value
A testflow object with class testflow_groups. The object is a
list containing the cleaned data, grouped descriptive statistics, assumption
checks, recommended omnibus test, primary test result with null hypothesis,
alternative omnibus results, post-hoc comparisons when requested, effect
size, optional ggplot, original call, and report text.
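To make the omnibus and post-hoc pieces concrete, here is a base R sketch on the built-in PlantGrowth data. It runs the three omnibus tests that sumtab() is documented to choose between, assuming test_groups() considers the same set, and uses Tukey's HSD as a stand-in post-hoc method, since the package's post-hoc procedure is not named here.

```r
pg <- datasets::PlantGrowth

classic <- stats::aov(weight ~ group, data = pg)
welch <- stats::oneway.test(weight ~ group, data = pg, var.equal = FALSE)
kw <- stats::kruskal.test(weight ~ group, data = pg)

# Side-by-side p-values for the three omnibus tests.
omnibus <- data.frame(
  test = c("One-way ANOVA", "Welch ANOVA", "Kruskal-Wallis"),
  p.value = c(summary(classic)[[1]][["Pr(>F)"]][1], welch$p.value, kw$p.value)
)

# Pairwise comparisons after the classic ANOVA (assumed post-hoc method).
posthoc <- stats::TukeyHSD(classic)$group
```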
Test multinomial goodness of fit
Description
Test multinomial goodness of fit
Usage
test_multinomial(
data,
outcome,
p = NULL,
alpha = 0.05,
plot = TRUE,
na.rm = TRUE
)
Arguments
data |
A data frame. |
outcome |
Categorical outcome column. |
p |
Expected probabilities, or |
alpha |
Significance level. |
plot |
Logical; include a ggplot object. |
na.rm |
Logical; remove missing values. |
Value
A testflow object with class testflow_multinomial. The object is
a list containing the cleaned data, categorical descriptives, assumption
checks, recommended goodness-of-fit test, primary chi-square result with null
hypothesis, pairwise binomial checks, effect-size summary, optional
ggplot, original call, and report text.
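A base R sketch of the chi-square goodness-of-fit test at the core of this workflow. The data are made up, and the equal-probability case shown is simply the default of stats::chisq.test(), not documented package behavior.

```r
outcome <- factor(c(rep("A", 30), rep("B", 50), rep("C", 20)))
counts <- table(outcome)

# Equal expected probabilities (the chisq.test default).
equal_fit <- stats::chisq.test(counts)

# User-supplied expected probabilities, one per category, summing to 1.
custom_fit <- stats::chisq.test(counts, p = c(A = 0.3, B = 0.5, C = 0.2))

equal_stat <- unname(equal_fit$statistic)    # 14 on 2 degrees of freedom
custom_stat <- unname(custom_fit$statistic)  # essentially 0: observed matches expected
```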
References
Pearson, K. (1900). On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philosophical Magazine, 50(302), 157-175.
Test one numeric sample against a reference value
Description
Test one numeric sample against a reference value
Usage
test_one_sample(
data,
outcome,
mu = 0,
alternative = c("two.sided", "less", "greater"),
alpha = 0.05,
plot = TRUE,
na.rm = TRUE
)
Arguments
data |
A data frame. |
outcome |
Numeric outcome column. |
mu |
Reference value. |
alternative |
Alternative hypothesis. |
alpha |
Significance level. |
plot |
Logical; include a ggplot object. |
na.rm |
Logical; remove missing values. |
Value
A testflow object with class testflow_one_sample. The object is
a list containing the cleaned data, descriptive statistics, assumption
checks, recommended test, primary test result with null hypothesis,
alternative test results, effect size, optional ggplot, original call, and
report text.
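A base R sketch of the two tests cited in the references, plus a one-sample Cohen's d. The data are invented, and treating Cohen's d as the reported effect size is an assumption.

```r
x <- c(5.4, 4.8, 6.2, 5.9, 6.5, 5.7, 5.3, 6.1)
mu <- 5

t_res <- stats::t.test(x, mu = mu)       # one-sample t-test
w_res <- stats::wilcox.test(x, mu = mu)  # Wilcoxon signed-rank test

# One-sample Cohen's d: standardized distance from the reference value.
cohens_d <- (mean(x) - mu) / stats::sd(x)
```

The t statistic equals cohens_d * sqrt(n), which makes explicit that the effect size does not grow with sample size while the test statistic does.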
References
Gosset, W. S. (1908). The probable error of a mean. Biometrika, 6(1), 1-25.
Wilcoxon, F. (1945). Individual comparisons by ranking methods. Biometrics Bulletin, 1(6), 80-83.
Detect numeric outliers
Description
Detect numeric outliers
Usage
test_outliers(
formula,
data,
group = NULL,
method = c("iqr", "mahalanobis", "both"),
plot = TRUE,
na.rm = TRUE
)
Arguments
formula |
Numeric columns to screen for outliers. |
data |
A data frame. |
group |
Optional grouping column. |
method |
Outlier method. |
plot |
Logical; include a ggplot object. |
na.rm |
Logical; remove missing values. |
Value
A testflow object with class testflow_outliers. The object is a
list containing the cleaned data, numeric descriptives, screening
assumptions, selected outlier-screening method, flagged IQR and/or
Mahalanobis rows, outlier-count summary, optional ggplot, original call,
and report text. This is a screening workflow, not a single hypothesis test.
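The two screening methods can be sketched in base R. Both helpers are hypothetical, and the cutoffs (1.5 times the IQR, and the 0.975 chi-square quantile for squared Mahalanobis distance) are common conventions assumed here, not values documented for the package.

```r
# Hypothetical helpers (not part of the package API).
flag_iqr <- function(x, k = 1.5) {
  q <- stats::quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  fence <- k * (q[2] - q[1])
  x < q[1] - fence | x > q[2] + fence
}

flag_mahalanobis <- function(df, level = 0.975) {
  d2 <- stats::mahalanobis(df, colMeans(df), stats::cov(df))
  d2 > stats::qchisq(level, df = ncol(df))
}

# Univariate: the value 40 lies far above the upper fence.
x <- c(10, 11, 12, 11, 10, 12, 11, 40)
iqr_rows <- which(flag_iqr(x))

# Multivariate: the last row contradicts the positive correlation of the rest.
set.seed(3)
a <- rnorm(50)
dat <- data.frame(a = c(a, 4), b = c(a + rnorm(50, sd = 0.3), -4))
maha_flags <- flag_mahalanobis(dat)
```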
Compare paired before and after numeric measurements
Description
Compare paired before and after numeric measurements
Usage
test_paired(
formula,
data,
after = NULL,
alternative = c("two.sided", "less", "greater"),
alpha = 0.05,
plot = TRUE,
na.rm = TRUE
)
Arguments
formula |
A formula such as |
data |
A data frame, or the before column when using data-first style. |
after |
After column. Optional when using formula style. |
alternative |
Alternative hypothesis. |
alpha |
Significance level. |
plot |
Logical; include a ggplot object. |
na.rm |
Logical; remove missing values. |
Value
A testflow object with class testflow_paired. The object is a
list containing the cleaned paired data, paired difference, descriptive
statistics, assumption checks, recommended paired test, primary test result
with null hypothesis, alternative test results, effect size, optional
ggplot, original call, and report text.
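A base R sketch of the paired comparison on invented before/after values. The direction of the difference (after minus before) and the use of Cohen's d for paired differences as the effect size are assumptions.

```r
before <- c(140, 152, 138, 147, 160, 155, 149, 143)
after  <- c(135, 148, 139, 140, 151, 149, 146, 135)

# The paired difference drives both tests.
diffs <- after - before

t_res <- stats::t.test(after, before, paired = TRUE)
w_res <- stats::wilcox.test(after, before, paired = TRUE)

# A paired t-test is a one-sample t-test on the differences.
t_on_diffs <- stats::t.test(diffs, mu = 0)

# Standardized mean difference of the paired differences.
d_z <- mean(diffs) / stats::sd(diffs)
```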
Test paired categorical measurements
Description
Test paired categorical measurements
Usage
test_paired_categorical(
data,
before,
after,
alpha = 0.05,
plot = TRUE,
na.rm = TRUE
)
Arguments
data |
A data frame. |
before |
Before categorical column. |
after |
After categorical column. |
alpha |
Significance level. |
plot |
Logical; include a ggplot object. |
na.rm |
Logical; remove missing values. |
Value
A testflow object with class testflow_paired_categorical. The
object is a list containing the cleaned paired categorical data, categorical
descriptives, assumption checks, McNemar test result, discordant-pair table,
optional ggplot, original call, and report text.
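A base R sketch of McNemar's test and the discordant pairs it depends on. The data are invented, and stats::mcnemar.test() applies a continuity correction by default, which the package may or may not use.

```r
before <- factor(c(rep("yes", 30), rep("no", 20)), levels = c("no", "yes"))
after <- factor(
  c(rep("yes", 22), rep("no", 8), rep("yes", 3), rep("no", 17)),
  levels = c("no", "yes")
)

tab <- table(before, after)

# Only the discordant cells (changed answers) inform the test.
yes_to_no <- tab["yes", "no"]  # 8
no_to_yes <- tab["no", "yes"]  # 3

mc <- stats::mcnemar.test(tab)
# With continuity correction: (|8 - 3| - 1)^2 / (8 + 3) = 16 / 11
```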
Test a one-sample proportion
Description
Test a one-sample proportion
Usage
test_proportion(
data,
outcome,
success,
p = 0.5,
alpha = 0.05,
plot = TRUE,
na.rm = TRUE
)
Arguments
data |
A data frame. |
outcome |
Categorical outcome column. |
success |
Value counted as success. |
p |
Reference probability. |
alpha |
Significance level. |
plot |
Logical; include a ggplot object. |
na.rm |
Logical; remove missing values. |
Value
A testflow object with class testflow_proportion. The object is
a list containing the cleaned data, categorical descriptives, assumption
checks, recommended exact or approximate one-sample proportion test, primary
test result with null hypothesis, alternative exact and approximate results,
observed proportion summary, optional ggplot, original call, and report
text.
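A base R sketch of the exact and approximate one-sample proportion tests. The data are invented. stats::binom.test() returns the Clopper-Pearson interval cited in the references, while the rule the package uses to recommend one test over the other is not documented here.

```r
outcome <- c(rep("event", 18), rep("none", 32))
success <- "event"

x <- sum(outcome == success)
n <- length(outcome)

exact <- stats::binom.test(x, n, p = 0.5)   # exact binomial test
approx <- stats::prop.test(x, n, p = 0.5)   # normal approximation with continuity correction

observed <- unname(exact$estimate)  # 18 / 50 = 0.36
ci_exact <- exact$conf.int          # Clopper-Pearson interval
```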
References
Clopper, C. J., & Pearson, E. S. (1934). The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika, 26(4), 404-413.
Run a repeated-measures workflow from wide data
Description
Run a repeated-measures workflow from wide data
Usage
test_repeated(
data,
measures,
id = NULL,
between = NULL,
alpha = 0.05,
plot = TRUE,
na.rm = TRUE
)
Arguments
data |
A data frame. |
measures |
Repeated numeric columns selected with tidyselect syntax. |
id |
Optional subject identifier. |
between |
Optional between-subject factor. |
alpha |
Significance level. |
plot |
Logical; include a ggplot object. |
na.rm |
Logical; remove missing values. |
Value
A testflow object with class testflow_repeated. The object is a
list containing long-form repeated-measures data, numeric descriptives,
assumption checks, recommended repeated-measures ANOVA or Friedman test,
primary test result with null hypothesis, alternative repeated-measures
results, post-hoc paired comparisons, effect size, optional ggplot,
original call, and report text.
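A base R sketch of the wide-to-long step and the Friedman test named above. The column names and values are invented, and the reshaping shown is one straightforward way to do it, not necessarily the package's.

```r
wide <- data.frame(
  id = 1:6,
  t1 = c(10, 12, 11, 14, 9, 13),
  t2 = c(12, 13, 13, 15, 11, 14),
  t3 = c(15, 15, 14, 18, 12, 17)
)
measures <- c("t1", "t2", "t3")

# Stack the repeated columns: one row per subject and occasion.
long <- data.frame(
  id = factor(rep(wide$id, times = length(measures))),
  time = factor(rep(measures, each = nrow(wide)), levels = measures),
  value = unlist(wide[measures], use.names = FALSE)
)

# Friedman test on the subjects-by-occasions matrix.
fried <- stats::friedman.test(as.matrix(wide[measures]))
fried_stat <- unname(fried$statistic)  # every subject ranks t1 < t2 < t3, giving 12
```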
References
Fisher, R. A. (1925). Statistical Methods for Research Workers. Oliver and Boyd.
Friedman, M. (1937). The use of ranks to avoid the assumption of normality implicit in the analysis of variance. Journal of the American Statistical Association, 32(200), 675-701.
Wilcoxon, F. (1945). Individual comparisons by ranking methods. Biometrics Bulletin, 1(6), 80-83.
Girden, E. R. (1992). ANOVA: Repeated Measures. Sage.
Test repeated categorical measurements
Description
Test repeated categorical measurements
Usage
test_repeated_categorical(
data,
measures,
id = NULL,
alpha = 0.05,
plot = TRUE,
na.rm = TRUE
)
Arguments
data |
A data frame. |
measures |
Repeated binary columns selected with tidyselect syntax. |
id |
Optional subject identifier. |
alpha |
Significance level. |
plot |
Logical; include a ggplot object. |
na.rm |
Logical; remove missing values. |
Value
A testflow object with class testflow_repeated_categorical. The
object is a list containing the cleaned repeated categorical data,
descriptive counts, assumption checks, Cochran Q test result with null
hypothesis, pairwise McNemar post-hoc comparisons, effect size, optional
ggplot, original call, and report text.
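Base R has no built-in Cochran Q test, so the statistic is sketched here from its textbook formula. cochran_q() is a hypothetical helper and the 0/1 data are invented.

```r
# Hypothetical helper (not part of the package API).
# Rows are subjects, columns are repeated binary measurements coded 0/1.
cochran_q <- function(m) {
  m <- as.matrix(m)
  k <- ncol(m)
  col_tot <- colSums(m)
  row_tot <- rowSums(m)
  total <- sum(m)
  q <- (k - 1) * (k * sum(col_tot^2) - total^2) / (k * total - sum(row_tot^2))
  list(
    statistic = q,
    df = k - 1,
    p.value = stats::pchisq(q, df = k - 1, lower.tail = FALSE)
  )
}

m <- matrix(
  c(1, 1, 0,
    1, 0, 0,
    1, 1, 1,
    1, 0, 0,
    0, 0, 0,
    1, 1, 0,
    1, 0, 1,
    1, 1, 0),
  ncol = 3, byrow = TRUE
)
res <- cochran_q(m)  # Q = 2 * (3 * 69 - 169) / (39 - 27) = 76 / 12
```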
References
Cochran, W. G. (1950). The comparison of percentages in matched samples. Biometrika, 37(3/4), 256-266.
McNemar, Q. (1947). Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika, 12(2), 153-157.
Run a repeated-measures workflow from long data
Description
Run a repeated-measures workflow from long data
Usage
test_repeated_long(
data,
outcome,
within,
id,
between = NULL,
alpha = 0.05,
plot = TRUE,
na.rm = TRUE
)
Arguments
data |
A data frame. |
outcome |
Numeric outcome column. |
within |
Within-subject time/condition column. |
id |
Subject identifier column. |
between |
Optional between-subject factor. |
alpha |
Significance level. |
plot |
Logical; include a ggplot object. |
na.rm |
Logical; remove missing values. |
Value
A testflow object with class testflow_repeated. The object is a
list containing the cleaned long-format data, numeric descriptives,
assumption checks, recommended repeated-measures ANOVA or Friedman test,
primary test result with null hypothesis, alternative repeated-measures
results, post-hoc paired comparisons, effect size, optional ggplot,
original call, and report text.
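A base R sketch of a one-way repeated-measures ANOVA on long data, followed by paired post-hoc comparisons. The data are invented, and Holm-adjusted paired t-tests are an assumed post-hoc choice. Note that pairwise.t.test() with paired = TRUE relies on rows being in the same subject order within each occasion.

```r
long <- data.frame(
  id = factor(rep(1:6, times = 3)),
  time = factor(rep(c("t1", "t2", "t3"), each = 6)),
  value = c(10, 12, 11, 14, 9, 13,
            12, 13, 13, 15, 11, 14,
            15, 15, 14, 18, 12, 17)
)

# The Error() term tests the within-subject factor against subject-by-time variation.
fit <- stats::aov(value ~ time + Error(id / time), data = long)
summary(fit)

# Paired comparisons between occasions, Holm-adjusted (assumed method).
posthoc <- stats::pairwise.t.test(
  long$value, long$time,
  paired = TRUE, p.adjust.method = "holm"
)
posthoc$p.value
```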
Compare a numeric outcome between two independent groups
Description
Compare a numeric outcome between two independent groups
Usage
test_two_groups(
formula,
data,
group = NULL,
alternative = c("two.sided", "less", "greater"),
alpha = 0.05,
plot = TRUE,
na.rm = TRUE
)
Arguments
formula |
A formula such as |
data |
A data frame, or an outcome column when using data-first style. |
group |
Two-level grouping column. Optional when using formula style. |
alternative |
Alternative hypothesis. |
alpha |
Significance level. |
plot |
Logical; include a ggplot object. |
na.rm |
Logical; remove missing values. |
Value
A testflow object with class testflow_two_groups. The object is
a list containing the cleaned data, descriptive statistics by group,
assumption checks, recommended test, primary test result with null
hypothesis, alternative test results, effect size, optional ggplot,
original call, and report text.
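A base R sketch of how a recommendation among the three usual two-group tests could be made, using the built-in ToothGrowth data. The decision rule (Shapiro-Wilk per group, then an F test for equal variances) is an assumption chosen for illustration. This page does not say which assumption checks the package uses.

```r
tg <- datasets::ToothGrowth
alpha <- 0.05

student <- stats::t.test(len ~ supp, data = tg, var.equal = TRUE)
welch <- stats::t.test(len ~ supp, data = tg)
wilcoxon <- stats::wilcox.test(len ~ supp, data = tg, exact = FALSE)

# Assumed rule: non-normal -> Wilcoxon; unequal variances -> Welch; else Student.
normal_ok <- all(tapply(tg$len, tg$supp, function(v) stats::shapiro.test(v)$p.value) > alpha)
equal_var <- stats::var.test(len ~ supp, data = tg)$p.value > alpha
recommended <- if (!normal_ok) {
  "Wilcoxon rank-sum test"
} else if (equal_var) {
  "Student's t-test"
} else {
  "Welch's t-test"
}

# Cohen's d with the pooled standard deviation (first level minus second).
grp <- split(tg$len, tg$supp)
n <- lengths(grp)
s2 <- vapply(grp, stats::var, numeric(1))
m <- vapply(grp, mean, numeric(1))
pooled_sd <- sqrt(sum((n - 1) * s2) / (sum(n) - 2))
cohens_d <- unname(m[1] - m[2]) / pooled_sd
```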