Title: | Model Wrappers for Multi-Level Models |
Version: | 1.0.0 |
Description: | Bindings for hierarchical regression models for use with the 'parsnip' package. Models include longitudinal generalized linear models (Liang and Zeger, 1986) <doi:10.1093/biomet/73.1.13>, and mixed-effect models (Pinheiro and Bates) <doi:10.1007/978-1-4419-0318-1_1>. |
License: | MIT + file LICENSE |
URL: | https://github.com/tidymodels/multilevelmod, http://multilevelmod.tidymodels.org/ |
Depends: | parsnip (≥ 1.0.0), R (≥ 2.10) |
Imports: | dplyr, lme4, purrr, rlang, tibble, withr |
Suggests: | covr, gee, ggplot2, knitr, nlme, rmarkdown, spelling, testthat, tidymodels |
VignetteBuilder: | knitr |
Config/Needs/website: | tidymodels/tidymodels, tidyverse/tidytemplate |
Encoding: | UTF-8 |
Language: | en-US |
LazyData: | true |
RoxygenNote: | 7.2.0 |
NeedsCompilation: | no |
Packaged: | 2022-06-17 11:38:17 UTC; hannah |
Author: | Max Kuhn |
Maintainer: | Hannah Frick <hannah@rstudio.com> |
Repository: | CRAN |
Date/Publication: | 2022-06-17 12:00:02 UTC |
parsnip methods for hierarchical models
Description
multilevelmod allows users to use the parsnip package to fit certain hierarchical models (e.g., linear, logistic, and Poisson regression). The package relies on the formula method to specify the random effects.
Details
As an example, the package includes simulated longitudinal data in which subjects were measured at several time points. The outcome is a count, and the predictors are the time point and an additional numeric covariate.
We can fit the model using lme4::glmer():
library(tidymodels)
library(multilevelmod)
library(poissonreg) # currently required for poisson_reg()

# The lme4 package is required for this model.

tidymodels_prefer()

# Split out two subjects to show how prediction works
data_train <-
  longitudinal_counts %>%
  filter(!(subject %in% c("1", "2")))

data_new <-
  longitudinal_counts %>%
  filter(subject %in% c("1", "2"))

# Fit the model
count_mod <-
  poisson_reg() %>%
  set_engine("glmer") %>%
  fit(y ~ time + x + (1 | subject), data = data_train)

count_mod
## parsnip model object
##
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: poisson  ( log )
## Formula: y ~ time + x + (1 | subject)
##    Data: data
##       AIC       BIC    logLik  deviance  df.resid
##  4474.553  4494.104 -2233.277  4466.553       976
## Random effects:
##  Groups  Name        Std.Dev.
##  subject (Intercept) 0.9394
## Number of obs: 980, groups:  subject, 98
## Fixed Effects:
## (Intercept)         time            x
##     -0.5946       1.5145       0.2395
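Because the model uses a log link, exponentiating a fixed-effect estimate gives the multiplicative change in the expected count for a one-unit increase in that predictor, holding the subject's random intercept fixed. The following is a minimal sketch using the estimates printed in the output; the object names fixed_effects and rate_ratios are illustrative and are not part of the package.

```r
# Fixed-effect estimates copied from the printed model output (log scale)
fixed_effects <- c("(Intercept)" = -0.5946, time = 1.5145, x = 0.2395)

# Exponentiate to move from the log scale to the count (rate ratio) scale
rate_ratios <- exp(fixed_effects[c("time", "x")])
print(round(rate_ratios, 3))
```

Values above 1 indicate that the expected count increases with the predictor.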
When making predictions, the basic predict() method does the trick:
count_mod %>% predict(data_new)
## # A tibble: 20 × 1
##    .pred
##    <dbl>
##  1  1.19
##  2  1.42
##  3  1.65
##  4  1.83
##  5  2.04
##  6  2.66
##  7  2.96
##  8  3.43
##  9  3.94
## 10  4.64
## 11  2.21
## 12  2.60
## 13  2.97
## 14  3.38
## 15  4.16
## 16  4.90
## 17  5.45
## 18  6.20
## 19  7.55
## 20  8.64
Author(s)
Maintainer: Hannah Frick hannah@rstudio.com (ORCID)
Authors:
Max Kuhn max@rstudio.com (ORCID)
Other contributors:
RStudio [copyright holder, funder]
See Also
Useful links:
https://github.com/tidymodels/multilevelmod
http://multilevelmod.tidymodels.org/
GEE fitting function
Description
Custom fitting function that adds the cluster variable to the GEE function call made by parsnip.
Usage
gee_fit(formula, data, family = gaussian, ...)
Arguments
formula | Normal formula but uses the |
data | Modeling data |
family | a family object: a list of functions and expressions for defining link and variance functions. Families supported in gee are |
... | For additional parameters |
Details
gee() always prints out warnings and output even when silent = TRUE. gee_fit() will never produce output, even if silent = FALSE.
Also, because of issues with the gee() function, a supplementary call to glm() is needed to get the rank and QR decomposition objects so that predict() can be used.
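To illustrate which pieces a glm() fit supplies, here is a small base R sketch. It is not the source of gee_fit(); the toy data and the names toy and glm_fit are hypothetical, and the example only shows that a glm object carries rank and qr components.

```r
# Toy clustered count data (hypothetical; not shipped with the package)
set.seed(1)
toy <- data.frame(
  id = rep(1:10, each = 5),
  x  = rnorm(50)
)
toy$y <- rpois(50, lambda = exp(0.2 + 0.5 * toy$x))

# A plain glm() on the same mean model; the clustering is ignored here
glm_fit <- glm(y ~ x, family = poisson, data = toy)

# Components that prediction code for linear-model style objects can use
print(glm_fit$rank)       # number of estimable coefficients
print(class(glm_fit$qr))  # QR decomposition of the weighted design matrix
```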
Value
A gee object
Simulated longitudinal Poisson counts
Description
Simulated longitudinal Poisson counts
Details
These are simulated data of 100 subjects each with 10 time points and an additional numeric covariate. The linear predictor has a random standard normal intercept per subject, a time coefficient of 1.50, and a covariate coefficient of 0.25.
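Data with this structure could be generated along the following lines. This is a sketch rather than the code used to create longitudinal_counts: the spacing of the time points, the distribution of the covariate, the absence of a fixed intercept, and the seed are all assumptions, and the name sim is illustrative.

```r
set.seed(123)

n_subjects <- 100
n_times    <- 10

# One standard normal random intercept per subject
subject_intercept <- rnorm(n_subjects)

sim <- data.frame(
  subject = factor(rep(seq_len(n_subjects), each = n_times)),
  # Assumption: time points equally spaced on (0, 1]
  time    = rep(seq_len(n_times) / n_times, times = n_subjects),
  # Assumption: covariate drawn from a standard normal distribution
  x       = rnorm(n_subjects * n_times)
)

# Linear predictor on the log scale (assumption: no fixed intercept)
eta <- subject_intercept[as.integer(sim$subject)] +
  1.50 * sim$time +
  0.25 * sim$x

# Poisson counts with mean exp(eta)
sim$y <- rpois(nrow(sim), lambda = exp(eta))

str(sim)
```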
Value
longitudinal_counts | a tibble |
Examples
data(longitudinal_counts)
str(longitudinal_counts)
Measurement systems analysis data
Description
Measurement systems analysis data
Details
A biological assay (i.e., a lab test) was run twice on each of 56 separate samples. The goal is to measure what percentage of the total variation in the results is related to the measurement system and how much is attributable to the true systematic (sample-to-sample) difference.
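The split of the total variation can be sketched with a balanced one-way ANOVA using method-of-moments estimates. This is a simpler stand-in for a random-intercept model, not the package's approach, and it runs on toy data rather than msa_data: the standard deviations, the seed, and the column names id and result are all hypothetical.

```r
set.seed(42)

n_samples <- 56  # number of samples, as in the description
n_reps    <- 2   # each sample is run twice

# Hypothetical standard deviations used only to generate toy data
sd_sample <- 2.0  # true sample-to-sample variation
sd_assay  <- 0.5  # measurement-system (replicate) variation

sample_effect <- rnorm(n_samples, sd = sd_sample)
toy <- data.frame(
  id     = factor(rep(seq_len(n_samples), each = n_reps)),
  result = rep(sample_effect, each = n_reps) +
    rnorm(n_samples * n_reps, sd = sd_assay)
)

# Mean squares for a balanced one-way layout
sample_means <- tapply(toy$result, toy$id, mean)
ms_within    <- mean(tapply(toy$result, toy$id, var))
ms_between   <- n_reps * var(sample_means)

# Method-of-moments variance components (truncated at zero)
var_assay  <- ms_within
var_sample <- max((ms_between - ms_within) / n_reps, 0)

# Percentage of total variation attributable to the measurement system
pct_assay <- 100 * var_assay / (var_assay + var_sample)
print(round(pct_assay, 1))
```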
Value
msa_data | a tibble |
Examples
data(msa_data)
str(msa_data)
Imipramine longitudinal data
Description
Imipramine longitudinal data
Details
These data are from a longitudinal clinical trial for depression. The outcome is the change in depression scores week-to-week. The endogenous column is an indicator for whether the subject fit the WHO Depression Scale classification of endogenous. The imipramine and desipramine columns are measurements of plasma levels for both substances.
Value
riesby | a tibble |
Source
Reisby, N., Gram, L.F., Bech, P. et al. Imipramine: Clinical effects and pharmacokinetic variability. Psychopharmacology 54, 263-272 (1977).
Examples
data(riesby)
str(riesby)