Help for package multifamm

Type:

Package

Title:

Multivariate Functional Additive Mixed Models

Version:

0.1.1

Description:

An implementation of multivariate functional additive mixed models (multiFAMM), see Volkmann et al. (2021, <doi:10.48550/arXiv.2103.06606>). It builds on methods developed for univariate sparse functional regression models and multivariate functional principal component analysis. This package contains the function to run a multiFAMM and some convenience functions useful when working with large models. An additional package on GitHub contains more convenience functions to reproduce the analyses of the corresponding paper (https://github.com/alexvolkmann/multifammPaper).

License:

GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]

Encoding:

UTF-8

LazyData:

true

RoxygenNote:

7.1.1

Depends:

R (≥ 3.5.0)

Imports:

data.table, funData, MFPCA (≥ 1.3-2), mgcv, sparseFLMM (> 0.3.0), stats, zoo

NeedsCompilation:

Packaged:

2021-09-24 10:04:19 UTC; alex

Author:

Alexander Volkmann [aut, cre]

Maintainer:

Alexander Volkmann <alexandervolkmann8@gmail.com>

Repository:

CRAN

Date/Publication:

2021-09-28 09:00:02 UTC

Compute the Number of FPCs needed

Description

This is an internal function. The function takes all the information needed to calculate how many FPCs are needed to reach the pre-specified cutoff level.

Usage

compute_var(sigma_sq, values, norms_sq, mfpc_cutoff)

Arguments

sigma_sq

Vector containing the estimated variances on each dimension.

values

List containing the multivariate Eigenvalues for each variance component.

norms_sq

Vector containing the squared norms to be used as weights on the Eigenvalues.

mfpc_cutoff

Pre-specified level of explained variance of results of MFPCA.
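
The idea of a cutoff on explained variance can be illustrated with a minimal sketch. It is not the internal code of compute_var(): it ignores the error variances (sigma_sq) and the weights (norms_sq), and the eigenvalues below are made up.

```r
# Hypothetical eigenvalues of one variance component, in decreasing order.
eigenvalues <- c(4.0, 2.5, 1.5, 1.0, 0.6, 0.4)

# Share of the total variance explained by the first k components.
explained <- cumsum(eigenvalues) / sum(eigenvalues)

# Smallest number of components whose cumulative share reaches the cutoff.
mfpc_cutoff <- 0.95
n_fpc <- which(explained >= mfpc_cutoff)[1]
n_fpc
```

In the package, the error variances and squared norms additionally enter the calculation, so the number of FPCs selected there can differ from this unweighted count.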

Conduct the MFPCA

Description

This is an internal function contained in the multiFAMM function. This step uses the information from the univariate FLMMs for the MFPCA. It also allows a simple weighting scheme of the MFPCA.

Usage

conduct_mfpca(mfpca_info, mfpc_weight)

Arguments

mfpca_info

Object containing all the necessary information for the MFPCA. List as given by the output of prepare_mfpca().

mfpc_weight

TRUE if the estimated univariate error variance is to be used as weights in the scalar product of the MFPCA.

Details

Currently, it is possible to conduct a non-weighted MFPCA (default) as well as an MFPCA that uses the estimated univariate error variances as weights.

Extract Model Components to be Compared

Description

This is an internal function that helps to compare different models. The models resulting from a multiFAMM() call are typically very big. This function extracts the main information from a model so that a smaller R object can be saved.

Usage

extract_components(model, dimnames)

Arguments

model

multiFAMM model object from which to extract the information.

dimnames

Names of the dimensions of the model.

Details

So far the grid is fixed to be on [0,1].

Value

A list with the following elements

error_var: A list containing the following elements
- model_weights: Model weights used in the final multiFAMM.
- modelsig2: Estimate of sigma squared in the final model.
- uni_vars: Univariate estimates of sigma squared.
eigenvals: List containing the estimated eigenvalues.
fitted_curves: multiFunData object containing the fitted curves.
eigenfcts: multiFunData object containing the estimated eigenfunctions.
cov_preds: multiFunData object containing the estimated covariate effects.
ran_preds: List containing multiFunData objects of the predicted random effects.
scores: List containing matrices of the estimated scores.
meanfun: multiFunData object containing the estimated mean function.
var_info: List containing all eigenvalues and univariate norms before the MFPC pruning step
- eigenvals: Vector of all multivariate eigenvalues.
- uni_norms: List of univariate norms of all eigenfunctions.

Extract Model Components to be Compared from Univariate Model

Description

Usage

extract_components_uni(model)

Arguments

model

Univariate multiFAMM model object from which to extract the information.

Details

So far the grid is fixed to be on [0,1].

Value

A list with the following elements

error_var: A list containing the following elements
- model_weights: Model weights used in the final multiFAMM.
- modelsig2: Estimate of sigma squared in the final model.
- uni_vars: Univariate estimates of sigma squared.
eigenvals: List containing the estimated eigenvalues.
fitted_curves: multiFunData object containing the fitted curves.
eigenfcts: multiFunData object containing the estimated eigenfunctions.
cov_preds: multiFunData object containing the estimated covariate effects.
ran_preds: List containing multiFunData objects of the predicted random effects.
scores: List containing matrices of the estimated scores.

Extract Variance Information from MFPCA Object

Description

This is an internal function contained in the multiFAMM function. This step extracts the information on the total variation in the data (multi- and univariate).

Usage

extract_var_info(MFPC = MFPC)

Arguments

MFPC

MFPCA object from which to extract multivariate eigenvalues and univariate norms.

Multivariate Functional Additive Mixed Model Regression

Description

This is the main function of the package and fits the multivariate functional additive regression model with potentially nested or crossed functional random intercepts.

Usage

multiFAMM(data, fRI_B = FALSE, fRI_C = FALSE, nested = FALSE,
  bs = "ps", bf_mean = 8, bf_covariates = 8, m_mean = c(2, 3),
  covariate = FALSE, num_covariates = NULL, covariate_form = NULL,
  interaction = FALSE, which_interaction = matrix(NA), bf_covs, m_covs,
  var_level = 1, use_famm = FALSE, save_model_famm = FALSE,
  one_dim = NULL, mfpc_weight = FALSE, mfpc_cutoff = 0.95,
  number_mfpc = NULL, mfpc_cut_method = c("total_var", "unidim"),
  final_method = c("w_bam", "bam", "gaulss"), weight_refit = FALSE,
  verbose = TRUE, ...)

Arguments

data

Data.table that contains the information with some fixed variable names, see Details.

fRI_B

Boolean for including functional random intercept for individual (B in Cederbaum). Defaults to FALSE.

fRI_C

Boolean for including functional random intercept for word (C in Cederbaum). Defaults to FALSE.

nested

TRUE to specify a model with nested functional random intercepts for the first and second grouping variable and a smooth error curve. Defaults to FALSE.

bs

Spline basis function, only tested for "ps" (as in sparseFLMM).

bf_mean

Basis dimension for functional intercept (as in sparseFLMM).

bf_covariates

Basis dimension for all covariates (as in sparseFLMM).

m_mean

Order of penalty for basis function (as in sparseFLMM).

covariate

Covariate effects (as in sparseFLMM).

num_covariates

Number of covariates included in the model (as in sparseFLMM).

covariate_form

Vector of strings for type of covariate (as in sparseFLMM).

interaction

TRUE if there are interactions between covariates (as in sparseFLMM). Defaults to FALSE.

which_interaction

Symmetric matrix specifying the interaction terms (as in sparseFLMM).

bf_covs

Vector of marginal basis dimensions for fRI covariance estimation (as in sparseFLMM).

m_covs

List of marginal orders for the penalty in fRI covariance estimation (as in sparseFLMM).

var_level

Pre-specified level of explained variance on each dimension (as in sparseFLMM). Defaults to including all non-negative Eigenvalues.

use_famm

Re-estimate the mean in FAMM context (as in sparseFLMM) - overwritten by one_dim.

save_model_famm

Give out the FAMM model object (as in sparseFLMM) - overwritten by one_dim.

one_dim

Specify the name of the dimension if sparseFLMM is to be computed only on one dimension.

mfpc_weight

TRUE if the estimated univariate error variance is to be used as weights in the scalar product of the MFPCA.

mfpc_cutoff

Pre-specified level of explained variance of results of MFPCA. Defaults to 0.95.

number_mfpc

List containing the number of mfPCs needed for each variance component e.g. list("E" = x, "B" = y).

mfpc_cut_method

Method to determine the level of explained variance

total_var: (weighted) sum of variation over the dimensions.
unidim: separate on each dimension.

final_method

Function used for estimation of final model to allow for potential heteroscedasticity ("w_bam", "bam", "gaulss").

weight_refit

Get the weights for the weighted bam by first refitting the model under an independence assumption but with mfpc basis functions. Defaults to FALSE.

verbose

Print progress of the multifamm. Defaults to TRUE.

...

Additional arguments to be passed to (mainly) the underlying sparseFLMM function.

Details

The function expands the method proposed by Fabian Scheipl to incorporate the variance decomposition developed by Cederbaum et al. (2016). To account for the correlation between the dimensions, the MFPCA approach by Happ and Greven (2016) is applied.

The data set has to be of the following format:

y_vec (numeric): vector of response values
t (numeric): observation point locations
n_long (integer): curve identification
subject_long (integer): subject identification (NEEDS TO BE SPECIFIED)
word_long (integer): word identification
combi_long (integer): repetition
dim (factor): level of the dimension
covariate.1 (numeric): potential covariate(s) named with trailing 1,2,3,...
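
The column layout above can be sketched with a small made-up data set. The values, the number of curves, and the choice to reuse n_long across dimensions are assumptions for illustration only; the object is built as a data.frame so that the sketch runs in base R.

```r
# Toy design: 2 dimensions x 2 subjects x 2 repetitions, 5 points per curve.
set.seed(1)
design <- expand.grid(combi_long = 1:2, subject_long = 1:2,
                      dim = c("aco", "epg"))

# Assumption: a curve id per subject-repetition pair, shared across dimensions.
design$n_long <- (design$subject_long - 1L) * 2L + design$combi_long

# Repeat every design row once per observation point.
n_points <- 5
toy <- design[rep(seq_len(nrow(design)), each = n_points), ]
toy$t <- rep(seq(0, 1, length.out = n_points), times = nrow(design))
toy$y_vec <- rnorm(nrow(toy))                          # made-up responses
toy$word_long <- 1L                                    # a single word level
toy$covariate.1 <- as.numeric(toy$subject_long == 2)   # made-up covariate
rownames(toy) <- NULL

# With the data.table package installed, convert before fitting:
# toy <- data.table::as.data.table(toy)
str(toy)
```

A data set this small only shows the expected column names and types; it is not meant to yield a sensible model fit.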

It is possible to introduce weights for the final estimation of the multiFAMM. Currently, it is only implemented to use the inverse of the univariate measurement error estimates as weights. Note that negative values of variance estimates are set to zero in fast symmetric additive covariance smoothing. In order to still include weights, zero-values are substituted by values of the smallest positive variance estimate.
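
The substitution rule described above can be sketched as follows. This is a minimal base R illustration with made-up variance estimates, not the code used inside multiFAMM().

```r
# Hypothetical univariate error variance estimates; one was set to zero.
uni_vars <- c(aco = 0.04, epg = 0)

# Replace zero estimates by the smallest positive estimate.
smallest_positive <- min(uni_vars[uni_vars > 0])
uni_vars_fixed <- ifelse(uni_vars > 0, uni_vars, smallest_positive)

# Inverse-variance weights, one per dimension.
weights <- 1 / uni_vars_fixed
weights
```

Without the substitution, the zero estimate would produce an infinite weight for that dimension.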

Value

A list with five elements

the final multivariate FAMM
the sparseFLMM output for each of the dimensions
information on the untruncated MFPCA results
the truncated MFPC output
the data used to fit the model.

Examples


# subset of the phonetic data (very small subset, no meaningful results can
# be expected and no random effects other than the random smooth should be
# included in the model)

data(phonetic_subset)

m <- multiFAMM(data = phonetic_subset, covariate = TRUE, num_covariates = 2,
               covariate_form = c("by", "by"), interaction = TRUE,
               which_interaction = matrix(c(FALSE, TRUE, TRUE, FALSE),
               nrow = 2, ncol = 2), bf_covs = c(5), m_covs = list(c(2, 3)),
               mfpc_cut_method = "total_var", final_method = "w_bam")

Phonetic data

Description

The data are part of a large study on consonant assimilation, which is the phenomenon that the articulation of two consonants becomes phonetically more alike when they occur consecutively in fluent speech. The data set contains the audio signals of nine different speakers who each repeated the same sixteen German target words five times. In addition to these acoustic signals, the data set also contains the electropalatographic data. The target words are bisyllabic noun-noun compound words which contain the two abutting consonants of interest, s and sh, in either order. Consonant assimilation is accompanied by a complex interplay of language-specific, perceptual and articulatory factors. The aim of the study was to investigate the assimilation of the two consonants as a function of their order (either first s, then sh or vice-versa), syllable stress (stressed or unstressed) and vowel context, i.e. which vowels are immediately adjacent to the target consonants of interest. The vowels are either of the form ia or ai. For more details, see references below.

Usage

phonetic

Format

A data.frame with 50644 observations and 12 variables:

dim: Factor for identifying the acoustic (aco) and electropalatographic (epg) dimensions.
subject_long: Unique identification number for each speaker.
word_long: Unique identification number for each target word.
combi_long: Number of the repetition of the combination of the corresponding speaker and target word.
y_vec: The response values for each observation point.
n_long: Unique identification number for each curve.
t: The observation point locations.
covariate.1: Order of the consonants, reference category first /s/ then /sh/.
covariate.2: Stress of the final syllable of the first compound, reference category 'stressed'.
covariate.3: Stress of the initial syllable of the second compound, reference category 'stressed'.
covariate.4: Vowel context, reference category ia.
word_names_long: Names of the target words

Source

Pouplier, Marianne and Hoole, Philip (2016): Articulatory and Acoustic Characteristics of German Fricative Clusters, Phonetica, 73(1), 52–78.

Cederbaum, Pouplier, Hoole, Greven (2016): Functional Linear Mixed Models for Irregularly or Sparsely Sampled Data. Statistical Modelling, 16(1), 67-88.

Jona Cederbaum (2019). sparseFLMM: Functional Linear Mixed Models for Irregularly or Sparsely Sampled Data. R package version 0.3.0. https://CRAN.R-project.org/package=sparseFLMM

Phonetic data (subset)

Description

A small subset of the phonetic data set phonetic with observations from two speakers and two items only. This will not produce meaningful results but can be used as a toy data set when testing the code. The variables are as in the full data set, see phonetic.

Usage

phonetic_subset

Format

A data.frame with 1336 observations and 12 variables.

Source

Pouplier, Marianne and Hoole, Philip (2016): Articulatory and Acoustic Characteristics of German Fricative Clusters, Phonetica, 73(1), 52–78.

Cederbaum, Pouplier, Hoole, Greven (2016): Functional Linear Mixed Models for Irregularly or Sparsely Sampled Data. Statistical Modelling, 16(1), 67-88.

Jona Cederbaum (2019). sparseFLMM: Functional Linear Mixed Models for Irregularly or Sparsely Sampled Data. R package version 0.3.0. https://CRAN.R-project.org/package=sparseFLMM

Predict The Mean Function For the FPC Plots

Description

This is an internal function that helps to interpret the FPCs. It extracts the mean function with all covariates set to 0.5. This is useful in combination with the estimated FPCs because one can then add suitable multiples of an FPC to this function and subtract them from it.

Usage

predict_mean(model, multi = TRUE, dimnames = c("aco", "epg"))

Arguments

model

multiFAMM model or list of univariate models for which to predict the mean.

multi

Indicator if it is a multiFAMM model (TRUE) or a list of univariate models.

dimnames

Vector of strings containing the names of the dimensions.

Value

A multiFunData object.
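
How a mean function is combined with an FPC can be sketched in base R. The curves and the eigenvalue below are made up and plain numeric vectors stand in for the multiFunData objects; the multiple 2 * sqrt(eigenvalue) is a common choice for such plots, not something prescribed by predict_mean().

```r
# Evaluation grid on [0, 1] and made-up curves for a single dimension.
grid <- seq(0, 1, length.out = 101)
mean_curve <- sin(2 * pi * grid)            # stands in for the predicted mean
eigenfunction <- sqrt(2) * cos(pi * grid)   # stands in for an estimated FPC
eigenvalue <- 0.25                          # stands in for its eigenvalue

# Add and subtract a multiple of the FPC around the mean.
shift <- 2 * sqrt(eigenvalue) * eigenfunction
upper <- mean_curve + shift
lower <- mean_curve - shift

# Plot the three curves to see the mode of variation, e.g.:
# matplot(grid, cbind(mean_curve, upper, lower), type = "l", lty = c(1, 2, 3))
```

The distance between the upper and lower curves shows where along the grid the FPC contributes most of its variation.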

Prepare Information Necessary for MFPCA

Description

This is an internal function contained in the multiFAMM function. This step uses the information from the univariate FLMMs for the MFPCA. It also allows a simple weighting scheme of the MFPCA.

Usage

prepare_mfpca(model_list, fRI_B, mfpc_weight)

Arguments

model_list

List containing sparseFLMM objects for each dimension as given by the output of apply_sparseFLMM()

fRI_B

Boolean for including functional random intercept for individual (B in Cederbaum). Defaults to FALSE.

mfpc_weight

TRUE if the estimated univariate error variance is to be used as weights in the scalar product of the MFPCA.

Prune the MFPC object to include only a prespecified level of explained var

Description

This is an internal function contained in the multiFAMM function. This function takes the MFPCA object and decides how many functional principal components are to be included in the model.

Usage

prune_mfpc(MFPC, mfpc_cutoff, model_list, mfpc_cut_method, number_mfpc,
  mfpca_info)

Arguments

MFPC

List containing MFPC objects for each variance component as given by the function conduct_mfpca()

mfpc_cutoff

Pre-specified level of explained variance of results of MFPCA. Defaults to 0.95.

model_list

List containing sparseFLMM objects for each dimension as given by the output of apply_sparseFLMM()

mfpc_cut_method

Method to determine the level of explained variance

total_var: (weighted) sum of variation over the dimensions.
unidim: separate on each dimension.

number_mfpc

List containing the number of mfPCs needed for each variance component e.g. list("E" = x, "B" = y).

mfpca_info

Object containing all the necessary information for the MFPCA. List as given by the output of prepare_mfpca().

Refit the model under an independence assumption

Description

This is an internal function. It refits the model under an independence assumption, now with the basis functions from the MFPCA. The goal is to extract an estimate of the error variances.

Usage

refit_for_weights(formula, data, model_list)

Arguments

formula

Formula to fit the final model.

data

Data that contains all the variables specified in formula.

model_list

List containing sparseFLMM objects for each dimension

Snooker data

Description

The data are part of a study on the impact of a muscular training program on snooker technique. 25 recreational snooker players were split into a treatment group (receiving instructions for a training program) and a control group (no training program). The data set contains the movement trajectories of the snooker players in two sessions (before and after the training period), where each snooker player repeated a snooker shot of maximal force six times. The interest lies in the movement of hand, elbow, and shoulder on a two-dimensional grid (called X and Y). The trajectories are normalized on a [0,1] time grid and the beginning of the hand trajectories is centered at the origin.

Usage

snooker

Format

A data.frame with 56910 observations and 11 variables:

y_vec: The response values for each observation point.
t: The observation point locations.
n_long: Unique identification number for each curve.
subject_long: Unique identification number for each snooker player.
word_long: Integer specifying the session. 1: Before the training, 2: After the training.
dim: Factor for identifying the univariate dimensions.
combi_long: Number of the repetition of the snooker shot.
covariate.1: Skill level of the snooker player. 0: Unskilled, 1: Skilled.
covariate.2: Group of the snooker player. 0: Control group, 1: Treatment group.
covariate.3: Session indicator. 0: Before the treatment, 1: After the treatment.
covariate.4: Interaction of group and session, i.e. the treatment effect indicator.
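
The relation between the group, session, and interaction indicators can be sketched as follows, assuming the interaction is coded as the product of the two dummy variables. The four rows are made up and cover each group-session combination once.

```r
# One made-up row per combination of group and session.
players <- data.frame(
  covariate.2 = c(0, 0, 1, 1),   # 0: control group, 1: treatment group
  covariate.3 = c(0, 1, 0, 1)    # 0: before the treatment, 1: after
)

# Assumed coding: the interaction is 1 only for treated players after training.
players$covariate.4 <- players$covariate.2 * players$covariate.3
players
```

Under this coding, covariate.4 is nonzero only for observations of the treatment group in the second session, which is why it acts as the treatment effect indicator.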

Source

Enghofer, T. (2014). Überblick über die Sportart Snooker, Entwicklung eines Muskeltrainings und Untersuchung dessen Einflusses auf die Stoßtechnik. Unpublished Zulassungsarbeit, Technische Universität München.