Encoding: | UTF-8 |
Type: | Package |
Title: | Testing and Plotting Procedures for Biostatistics |
Version: | 0.9-83-12 |
Date: | 2025-07-12 |
Imports: | ade4 (≥ 1.7-8), boot, car, FactoMineR, graphics, grDevices, lme4 (≥ 1.0-4), MASS, nnet, pls, pspearman, stats, utils, vegan (≥ 2.4-3) |
Suggests: | ape, betareg, dgof, emmeans (≥ 1.11.2), EMT, FSA, glmmTMB, labdsv, mixOmics, MuMIn, mvnormtest, ordinal, RGCCA, statmod, survival |
Description: | Contains miscellaneous functions useful in biostatistics, mostly univariate and multivariate testing procedures with a special emphasis on permutation tests. Many functions intend to simplify user's life by shortening existing procedures or by implementing plotting functions that can be used with as many methods from different packages as possible. |
Enhances: | glmmADMB |
Additional_repositories: | http://r-forge.r-project.org |
License: | GPL-2 |
LazyLoad: | yes |
NeedsCompilation: | no |
Packaged: | 2025-07-15 05:24:13 UTC; maherve |
Author: | Maxime HERVE [aut, cre] |
Maintainer: | Maxime HERVE <maxime.herve@univ-rennes.fr> |
Repository: | CRAN |
Date/Publication: | 2025-07-15 06:00:02 UTC |
Testing and Plotting Procedures for Biostatistics
Description
Contains miscellaneous functions useful in biostatistics, mostly univariate and multivariate testing procedures with a special emphasis on permutation tests. Many functions intend to simplify user's life by shortening existing procedures or by implementing plotting functions that can be used with as many methods from different packages as possible.
Details
Package: | RVAideMemoire |
Type: | Package |
Version: | 0.9-83-12 |
Date: | 2025-07-12 |
License: | GPL-2 |
LazyLoad: | yes |
Author(s)
Maxime HERVE
Maintainer: Maxime HERVE <maxime.herve@univ-rennes.fr>
References
Document : "Aide-memoire de statistique appliquee a la biologie - Construire son etude et analyser les resutats a l'aide du logiciel R" (available on CRAN)
Anova Tables for Cumulative Link (Mixed) Models
Description
These functions are methods for Anova
to calculate type-II or type-III analysis-of-deviance tables for model objects produced by clm
and clmm
. Likelihood-ratio tests are calculated in both cases.
Usage
## S3 method for class 'clm'
Anova(mod, type = c("II", "III", 2, 3), ...)
## S3 method for class 'clmm'
Anova(mod, type = c("II", "III", 2, 3), ...)
Arguments
mod |
|
type |
type of test, |
... |
additional arguments to |
Details
See help of the Anova
for a detailed explanation of what "type II" and "typ III" mean.
Value
See Anova
.
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Cross validation
Description
Performs cross validation with correspondence discriminant analyses.
Usage
CDA.cv(X, Y, repet = 10, k = 7, ncomp = NULL, method = c("mahalanobis",
"euclidian"))
Arguments
X |
a data frame of dependent variables (typically contingency or presence-absence table). |
Y |
factor giving the groups. |
repet |
an integer giving the number of times the whole procedure has to be repeated. |
k |
an integer giving the number of folds (can be re-set internally if needed). |
ncomp |
an integer giving the number of components to be used for prediction. If |
method |
criterion used to predict class membership. See |
Details
The training sets are generated in respect to the relative proportions of the levels of Y
in the original data set (see splitf
).
Value
repet |
number of times the whole procedure was repeated. |
k |
number of folds. |
ncomp |
number of components used. |
method |
criterion used to classify individuals of the test sets. |
groups |
levels of |
models.list |
list of of models generated ( |
NMC |
Classification error rates ( |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
require(ade4)
data(perthi02)
## Not run: CDA.cv(perthi02$tab,perthi02$cla)
Significance test for CDA
Description
Performs a significance test for correspondence discriminant analysis. See Details.
Usage
CDA.test(X, fact, ncomp = NULL, ...)
Arguments
X |
a data frame of dependent variables (typically contingency or presence-absence table). |
fact |
factor giving the groups. |
ncomp |
an integer giving the number of components to be used for the test. If |
... |
other arguments to pass to |
Details
CDA consists in two steps: building a correspondence analysis (CA) on X
, then using row coordinates on all CA components as input variables for a linear discriminant analysis. CDA.test
builds the intermediate CA, then uses the first ncomp
components to test for an effect of fact
. If 1 component is used the test is an ANOVA, if more than 1 component are used the test is a MANOVA.
Value
An ANOVA or MANOVA table.
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
require(ade4)
data(perthi02)
CDA.test(perthi02$tab,perthi02$cla)
Cross validation
Description
Performs cross validation with DIABLO (block.plsda
or block.splsda
).
Usage
DIABLO.cv(x, method = c("mahalanobis.dist", "max.dist", "centroids.dist"),
validation = c("Mfold", "loo"), k = 7, repet = 10, ...)
Arguments
x |
an object of class |
method |
criterion used to predict class membership. See |
validation |
a character giving the kind of (internal) validation to use. See |
k |
an integer giving the number of folds (can be re-set internally if needed). |
repet |
an integer giving the number of times the whole procedure has to be repeated. |
... |
other arguments to pass to |
Details
The function uses the weighted predicted classification error rate (see perf
).
Value
repet |
number of times the whole procedure was repeated. |
k |
number of folds. |
validation |
kind of validation used. |
ncomp |
number of components used. |
method |
criterion used to classify individuals of the test sets. |
NMC.mean |
mean classification error rate (based on |
NMC.se |
standard error of the classification error rate (based on |
Author(s)
Maxime HERVE <mx.herve@gmail.com>
See Also
block.plsda
, block.splsda
, perf
Examples
## Not run:
require(mixOmics)
data(nutrimouse)
data <- list(gene=nutrimouse$gene,lipid=nutrimouse$lipid,Y=nutrimouse$diet)
DIABLO <- block.plsda(X=data,indY=3)
DIABLO.cv(DIABLO)
## End(Not run)
Significance test based on cross-validation
Description
Performs a permutation significance test based on cross-validation with DIABLO (block.plsda
or block.splsda
).
Usage
DIABLO.test(x, method = c("mahalanobis.dist", "max.dist", "centroids.dist"),
validation = c("Mfold", "loo"), k = 7, nperm = 999, progress = TRUE, ...)
Arguments
x |
an object of class |
method |
criterion used to predict class membership. See |
validation |
a character giving the kind of (internal) validation to use. See |
k |
an integer giving the number of folds (can be re-set internally if needed). |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
... |
other arguments to pass to |
Details
The function uses the weighted predicted classification error rate (see perf
).
Value
method |
a character string indicating the name of the test. |
data.name |
a character string giving the name of the data, plus additional information. |
statistic |
the value of the test statistics (classification error rate). |
permutations |
the number of permutations. |
p.value |
the p-value of the test. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
block.plsda
, block.splsda
, perf
Examples
## Not run:
require(mixOmics)
data(nutrimouse)
data <- list(gene=nutrimouse$gene,lipid=nutrimouse$lipid,Y=nutrimouse$diet)
DIABLO <- block.plsda(X=data,indY=3)
DIABLO.test(DIABLO)
## End(Not run)
G-test for binary variables
Description
Performs a G-test for comparing response probabilities (i.e. when the response variable is a binary variable). The function is in fact a wrapper to the G-test for comparison of proportions on a contingency table. If the p-value of the test is significant, the function performs pairwise comparisons by using G-tests.
Usage
G.bintest(formula, data, alpha = 0.05, p.method = "fdr")
Arguments
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
alpha |
significance level to compute pairwise comparisons. |
p.method |
method for p-values correction. See help of |
Details
If the response is a 0/1 variable, the probability of the '1' group is tested. In any other cases, the response is transformed into a factor and the probability of the second level is tested.
Since a G-test is an approximate test, an exact test is preferable when the number of individuals is small (200 is a reasonable minimum). See fisher.bintest
in that case.
Value
method.test |
a character string giving the name of the global test computed. |
data.name |
a character string giving the name(s) of the data. |
alternative |
a character string describing the alternative hypothesis. |
estimate |
the estimated probabilities. |
null.value |
the value of the difference in probabilities under the null hypothesis, always 0. |
statistic |
test statistics. |
parameter |
test degrees of freedom. |
p.value |
p-value of the global test. |
alpha |
significance level. |
p.adjust.method |
method for p-values correction. |
p.value.multcomp |
data frame of pairwise comparisons result. |
method.multcomp |
a character string giving the name of the test computed for pairwise comparisons. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
response <- c(rep(0:1,c(40,60)),rep(0:1,c(55,45)),rep(0:1,c(65,35)))
fact <- gl(3,100,labels=LETTERS[1:3])
G.bintest(response~fact)
Pairwise comparisons after a G-test
Description
Performs pairwise comparisons after a global G-test.
Usage
G.multcomp(x, p.method = "fdr")
Arguments
x |
numeric vector (counts). |
p.method |
method for p-values correction. See help of |
Details
Since a G-test is an approximate test, an exact test is preferable when the number of individuals is small (200 is a reasonable minimum). See multinomial.multcomp
in that case.
Value
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.adjust.method |
method for p-values correction. |
p.value |
table of results. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
G.test
, multinomial.test
, multinomial.multcomp
Examples
counts <- c(49,30,63,59)
G.test(counts)
G.multcomp(counts)
G-test
Description
Perfoms a G-test on a contingency table or a vector of counts.
Usage
G.test(x, p = rep(1/length(x), length(x)))
Arguments
x |
a numeric vector or matrix (see Details). |
p |
theoretical proportions (optional). |
Details
If x
is matrix, it must be constructed like this:
- 2 columns giving number of successes (left) and fails (right)
- 1 row per population.
The function works as chisq.test
:
- if x
is a vector and theoretical proportions are not given, equality of counts is tested
- if x
is a vector and theoretical proportions are given, equality of counts to theoretical counts (given by theoretical proportions) is tested
- if x
is a matrix with two columns, equality of proportion of successes between populations is tested.
- if x
is a matrix with more than two columns, independence of rows and columns is tested.
Since a G-test is an approximate test, an exact test is preferable when the number of individuals is small (200 is a reasonable minimum). See multinomial.test
in that case with a vector, fisher.test
with a matrix.
Value
method |
name of the test. |
statistic |
test statistics. |
parameter |
test degrees of freedom. |
p.value |
p-value. |
data.name |
a character string giving the name(s) of the data. |
observed |
the observed counts. |
expected |
the expected counts under the null hypothesis. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
chisq.test
, multinomial.test
, fisher.test
G.multcomp
, G.theo.multcomp
, pairwise.G.test
Examples
counts <- c(49,30,63,59)
G.test(counts)
Pairwise comparisons after a G-test for given probabilities
Description
Performs pairwise comparisons after a global G-test for given probabilities.
Usage
G.theo.multcomp(x, p = rep(1/length(x), length(x)), p.method = "fdr")
Arguments
x |
numeric vector (counts). |
p |
theoretical proportions. |
p.method |
method for p-values correction. See help of |
Details
Since a G-test is an approximate test, an exact test is preferable when the number of individuals is small (200 is a reasonable minimum). See multinomial.theo.multcomp
in that case.
Value
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
observed |
observed counts. |
expected |
expected counts. |
p.adjust.method |
method for p-values correction. |
statistic |
statistics of each test. |
p.value2 |
corrected p-values. |
p.value |
data frame of results. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
G.test
, multinomial.test
, multinomial.theo.multcomp
Examples
counts <- c(49,30,63,59)
p.theo <- c(0.2,0.1,0.45,0.25)
G.test(counts,p=p.theo)
G.theo.multcomp(counts,p=p.theo)
Significance test for GPA
Description
Performs a permutation significance test based on total variance explained for Generalized Procrustes Analysis. The function uses GPA
.
Usage
GPA.test(df, group, tolerance = 10^-10, nbiteration = 200, scale = TRUE,
nperm = 999, progress = TRUE)
Arguments
df |
a data frame with n rows (individuals) and p columns (quantitative varaibles), in which all data frames are combined. |
group |
a vector indicating the number of variables in each group (i.e. data frame). |
tolerance |
a threshold with respect to which the algorithm stops, i.e. when the difference between the GPA loss function at step n and n+1 is less than |
nbiteration |
the maximum number of iterations until the algorithm stops. |
scale |
logical, if |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
Details
Rows of each data frame are randomly and independently permuted.
The function deals with the limitted floating point precision, which can bias calculation of p-values based on a discrete test statistic distribution.
Value
method |
a character string indicating the name of the test. |
data.name |
a character string giving the name(s) of the data, plus additional information. |
statistic |
the value of the test statistics. |
permutations |
the number of permutations. |
p.value |
the p-value of the test. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
References
Wakeling IN, Raats MM and MacFie HJH (1992) A new significance test for consensus in Generalized Procrustes Analysis. Journal of Sensory Studies 7:91-96.
See Also
Examples
require(FactoMineR)
data(wine)
## Not run: GPA.test(wine[,-(1:2)],group=c(5,3,10,9,2))
Type II permutation test for constrained multivariate analyses
Description
This function is a wrapper to anova.cca(...,by="terms")
but performs type II tests (whereas anova.cca
performs type I).
Usage
MVA.anova(object, ...)
Arguments
object |
|
... |
additional arguments to |
Details
See anova.cca
for detailed explanation of what is done. The only difference with anova.cca
is that MVA.anova
performs type II tests instead of type I.
See example of adonis.II
for the difference between type I (sequential) and type II tests.
Value
a data frame of class "anova"
.
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Biplot of multivariate analyses
Description
Displays a biplot of a multivariate analysis. This just consists in superimposing a score plot and a correlation circle (plus centroids of factor levels in constrained analyses, RDA or CCA). The correlation circle is adjusted to fit the size of the score plot.
Usage
MVA.biplot(x, xax = 1, yax = 2, scaling = 2, sco.set = c(12, 1, 2),
cor.set = c(12, 1, 2), space = 1, ratio = 0.9, weights = 1,
constraints = c("nf", "n", "f", NULL), sco.args = list(),
cor.args = list(), f.col = 1, f.cex = 1)
Arguments
x |
a multivariate analysis (see Details). |
xax |
the horizontal axis. |
yax |
the vertical axis. |
scaling |
type of scaling (see |
sco.set |
scores to be displayed, when several sets are available (see |
cor.set |
correlations to be displayed, when several sets are available (see |
space |
space to use, when several are available (see |
ratio |
constant for adjustement of correlations to the size of the score plot ( |
weights |
only used with constrained analyses (RDA or CCA) where some constraints are factors. Individual weights, used to calculate barycenter positions. |
constraints |
only used with constrained analyses (RDA or CCA). Type of constraints to display: quantitative ( |
sco.args |
list containing optional arguments to pass to |
cor.args |
list containing optional arguments to pass to |
f.col |
color(s) used for barycenters in case of a constraint analysis (RDA or CCA) containing factor constraint(s). Can be a unique value, a vector giving one color per constraint or a vector giving one color per barycenter (all factors confounded). |
f.cex |
size(s) used for barycenters in case of a constraint analysis (RDA or CCA) containing factor constraint(s). Can be a unique value, a vector giving one size per constraint or a vector giving one size per barycenter (all factors confounded). |
Details
This function should not be use directly. Prefer the general MVA.plot
, to which all arguments can be passed.
All multivariate analyses covered by MVA.corplot
can be used for biplots.
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Examples
require(vegan)
data(iris)
RDA <- rda(iris[,1:4]~Species,data=iris)
MVA.plot(RDA,"biplot",cor.args=list(col="purple"),ratio=0.8,f.col=c("red","green","blue"))
Cross model validation
Description
Performs cross model validation (2CV) with different PLS analyses.
Usage
MVA.cmv(X, Y, repet = 10, kout = 7, kinn = 6, ncomp = 8, scale = TRUE,
model = c("PLSR", "CPPLS", "PLS-DA", "PPLS-DA", "PLS-DA/LDA", "PLS-DA/QDA",
"PPLS-DA/LDA", "PPLS-DA/QDA"), crit.inn = c("RMSEP", "Q2", "NMC"),
Q2diff = 0.05, lower = 0.5, upper = 0.5, Y.add = NULL, weights = rep(1, nrow(X)),
set.prior = FALSE, crit.DA = c("plug-in", "predictive", "debiased"), ...)
Arguments
X |
a data frame of independent variables. |
Y |
the dependent variable(s): numeric vector, data frame of quantitative variables or factor. |
repet |
an integer giving the number of times the whole 2CV procedure has to be repeated. |
kout |
an integer giving the number of folds in the outer loop (can be re-set internally if needed). |
kinn |
an integer giving the number of folds in the inner loop (can be re-set internally if needed). Cannot be |
ncomp |
an integer giving the maximal number of components to be tested in the inner loop (can be re-set depending on the size of the train sets). |
scale |
logical indicating if data should be scaled (see Details). |
model |
the model to be fitted (see Details). |
crit.inn |
the criterion to be used to choose the number of components in the inner loop. Root Mean Square Error of Prediction ( |
Q2diff |
the threshold to be used if the number of components is chosen according to Q2. The next component is added only if it makes the Q2 increase more than |
lower |
a vector of lower limits for power optimisation in CPPLS or PPLS-DA (see |
upper |
a vector of upper limits for power optimisation in CPPLS or PPLS-DA (see |
Y.add |
a vector or matrix of additional responses containing relevant information about the observations, in CPPLS or PPLS-DA (see |
weights |
a vector of individual weights for the observations, in CPPLS or PPLS-DA (see |
set.prior |
only used when a second analysis (LDA or QDA) is performed. If |
crit.DA |
criterion used to predict class membership when a second analysis (LDA or QDA) is used. See |
... |
other arguments to pass to |
Details
Cross model validation is detailed is Szymanska et al (2012). Some more details about how this function works:
- when a discriminant analysis is used ("PLS-DA"
, "PPLS-DA"
, "PLS-DA/LDA"
, "PLS-DA/QDA"
, "PPLS-DA/LDA"
or "PPLS-DA/QDA"
), the training sets (test set itself in the inner loop, test+validation sets in the outer loop) are generated in respect to the relative proportions of the levels of Y
in the original data set (see splitf
).
- "PLS-DA"
is considered as PLS2 on a dummy-coded response. For a PLS-DA based on the CPPLS algorithm, use "PPLS-DA"
with lower
and upper
limits of the power parameters set to 0.5
.
- if a second analysis is used ("PLS-DA/LDA"
, "PLS-DA/QDA"
, "PPLS-DA/LDA"
or "PPLS-DA/QDA"
), a LDA or QDA is built on scores of the first analysis (PLS-DA or PPLS-DA) also in the inner loop. The classification error rate, based on this second analysis, is used to choose the number of components.
If scale = TRUE
, the scaling is done as this:
- for each step of the outer loop (i.e. kout
steps), the rest set is pre-processed by centering and unit-variance scaling. Means and standard deviations of variables in the rest set are then used to scale the test set.
- for each step of the inner loop (i.e. kinn
steps), the training set is pre-processed by centering and unit-variance scaling. Means and standard deviations of variables in the training set are then used to scale the validation set.
Value
model |
model used. |
type |
type of model used. |
repet |
number of times the whole 2CV procedure was repeated. |
kout |
number of folds in the outer loop. |
kinn |
number of folds in the inner loop. |
crit.inn |
criterion used to choose the number of components in the inner loop. |
crit.DA |
criterion used to classify individuals of the test and validation sets. |
Q2diff |
threshold used if the number of components is chosen according to Q2. |
groups |
levels of |
models.list |
list of of models generated ( |
models1.list |
list of of (P)PLS-DA models generated ( |
models2.list |
list of of LDA/QDA models generated ( |
RMSEP |
RMSEP computed from the models used in the outer loops ( |
Q2 |
Q2 computed from the models used in the outer loops ( |
NMC |
Classification error rate computed from the models used in the outer loops ( |
confusion |
Confusion matrices computed from the models used in the outer loops ( |
pred.prob |
Probability of each individual of being of each level of |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
References
Szymanska E, Saccenti E, Smilde AK and Westerhuis J (2012) Double-check: validation of diagnostic statistics for PLS-DA models in metabolomics studies. Metabolomics (2012) 8:S3-S16.
See Also
predict.MVA.cmv
, mvr
, lda
, qda
Examples
require(pls)
require(MASS)
# PLSR
data(yarn)
## Not run: MVA.cmv(yarn$NIR,yarn$density,model="PLSR")
# PPLS-DA coupled to LDA
data(mayonnaise)
## Not run: MVA.cmv(mayonnaise$NIR,factor(mayonnaise$oil.type),model="PPLS-DA/LDA",crit.inn="NMC")
Correlations of multivariate analyses
Description
Returns correlations of a multivariate analysis.
Usage
MVA.cor(x, xax = 1, yax = 2, set = c(12, 1, 2), space = 1, ...)
Arguments
x |
a multivariate analysis (see Details). |
xax |
axis or axes for which to extract correlations. |
yax |
axis for which to extract correlations (ignored if |
set |
variables to be displayed, when several sets are available (see Details). |
space |
variables to be displayed, when several spaces are available (see Details). |
... |
not used. |
Details
Many multivariate analyses are supported, from various packages:
- sPCA: spca
.
- IPCA: ipca
.
- sIPCA: sipca
.
- PLS-DA (PLS2 on a dummy-coded factor): plsda
. X space only.
- sPLS-DA (sPLS2 on a dummy-coded factor): splsda
. X space only.
- CPPLS: mvr
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set. X space only.
- PLSR: mvr
, pls
, plsR
(plsRglm package). Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set. X space only.
- sPLSR: pls
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set. X space only.
- PLS-GLR: plsRglm
(plsRglm package). Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set. Correlations are computed with Y on the link scale.
- PCR: mvr
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- CDA: discrimin
, discrimin.coa
.
- NSCOA: dudi.nsc
. For NSCOA there is no real correlation, but the classical representation of columns is arrows. This is why MVA.corplot was made able to deal with this analysis.
- CCA: cca
, pcaiv
. Constraints (only quantitative constraints are extracted) in constrained space only.
- Mix analysis: dudi.mix
, dudi.hillsmith
. Only quantitative variables are displayed.
- RDA (or PCAIV): pcaiv
, pcaivortho
, rda
. With rda
, space 1 is constrained space, space 2 is unconstrained space. Only constrained space is available with pcaiv
, the opposite for pcaivortho
. Set 1 is constraints (only quantitative constraints are extracted), set 2 is dependent variables (only set 2 is available for pcaivortho
). If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- CCorA: CCorA
, rcc
. Space 1 is X, space 2 is Y. With rcc
a third space is available, in which coordinates are means of X and Y coordinates. In this third space, set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- rCCorA: rcc
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates. In space 3, set 1 is X and set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- CIA: coinertia
. Space 1 is X, space 2 is Y, space 3 is a "common" space where X and Y scores are normed. In space 3, set 1 is X and set 2 is Y. If set=12
in space 3 (default), fac
is not available and pch
,cex
, col
, lws
can be defined differently for each set.
- GPA: GPA
. Only the consensus ordination can be displayed.
- 2B-PLS: pls
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates. In space 3, set 1 is X and set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- 2B-sPLS: pls
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates. In space 3, set 1 is X and set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- rGCCA: wrapper.rgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- sGCCA: wrapper.sgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- DIABLO: block.plsda
, block.splsda
. Space can be 1 to n, the number of blocks (i.e. datasets).
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Correlation circle of multivariate analyses
Description
Displays a correlation circle of a multivariate analysis.
Usage
MVA.corplot(x, xax = 1, yax = 2, thresh = 0, fac = NULL, set = c(12, 1, 2), space = 1,
xlab = NULL, ylab = NULL, main = NULL, circle = TRUE, intcircle = 0.5, points = TRUE,
ident = TRUE, arrows = TRUE, labels = NULL, main.pos = c("bottomleft", "topleft",
"bottomright", "topright"), main.cex = 1.3, legend = FALSE, legend.pos = c("topleft",
"topright", "bottomleft", "bottomright"), legend.title = NULL, legend.lab = NULL,
pch = 16, cex = 1, col = 1, lwd = 1, drawintaxes = TRUE, add = FALSE, add.const = 1,
keepmar = FALSE)
Arguments
x |
a multivariate analysis (see Details). |
xax |
the horizontal axis. |
yax |
the vertical axis. This can be set to |
thresh |
threshold (in absolute value of the correlation coefficient) of variables to be plotted. |
fac |
an optional factor defining groups of variables. |
set |
variables to be displayed, when several sets are available (see Details). |
space |
variables to be displayed, when several spaces are available (see Details). |
xlab |
legend of the horizontal axis. If |
ylab |
only used for two-dimensional graphs. Legend of the vertical axis. If |
main |
optional title of the graph. |
circle |
only used for two-dimensional graphs. Logical indicating if the circle of radius 1 should be plotted. |
intcircle |
only used for two-dimensional graphs. Vector of one or several values indicating radii of circles to be plotted inside the main circle. Can be set to |
points |
only used for two-dimensional graphs. If |
ident |
only used for two-dimensional graphs when |
arrows |
only used if |
labels |
names of the variables. If |
main.pos |
position of the title, if |
main.cex |
size of the title, if |
legend |
only used for two-dimensional graphs. Logical indicating if a legend should be added to the graph. |
legend.pos |
position of the legend, if |
legend.title |
optional title of the legend, if |
legend.lab |
legend labels, if |
pch |
symbol(s) used for points, when points are displayed (see |
cex |
size of the points and/or of the variable names. For two-dimensional graphs: if |
col |
color(s) used for points and/or variable names. If |
lwd |
only used if arrows are displayed. Width of arrows. If |
drawintaxes |
logical indicating if internal axes should be drawn. |
add |
only used for two-dimensional graphs. Logical indicating if the correlation circle should be added to an existing graph. |
add.const |
only used for two-dimensional graphs and if |
keepmar |
only used for two-dimensional graphs. Logical indicating if margins defined by MVA.corplot should be kept after plotting (necessary in some cases when |
Details
This function should not be use directly. Prefer the general MVA.plot
, to which all arguments can be passed.
Many multivariate analyses are supported, from various packages:
- sPCA: spca
.
- IPCA: ipca
.
- sIPCA: sipca
.
- PLS-DA (PLS2 on a dummy-coded factor): plsda
. X space only.
- sPLS-DA (sPLS2 on a dummy-coded factor): splsda
. X space only.
- CPPLS: mvr
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set. X space only.
- PLSR: mvr
, pls
, plsR
(plsRglm package). Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set. X space only.
- sPLSR: pls
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set. X space only.
- PLS-GLR: plsRglm
(plsRglm package). Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set. Correlations are computed with Y on the link scale.
- PCR: mvr
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- CDA: discrimin
, discrimin.coa
.
- NSCOA: dudi.nsc
. For NSCOA there is no real correlation, but the classical representation of columns is arrows. This is why MVA.corplot was made able to deal with this analysis.
- CCA: cca
, pcaiv
. Constraints (only quantitative constraints are extracted) in constrained space only.
- Mix analysis: dudi.mix
, dudi.hillsmith
. Only quantitative variables are displayed.
- RDA (or PCAIV): pcaiv
, pcaivortho
, rda
. With rda
, space 1 is constrained space, space 2 is unconstrained space. Only constrained space is available with pcaiv
, the opposite for pcaivortho
. Set 1 is constraints (only quantitative constraints are extracted), set 2 is dependent variables (only set 2 is available for pcaivortho
). If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- db-RDA: capscale
, dbrda
. Constraints (only quantitative constraints are extracted) in constrained space only.
- CCorA: CCorA
, rcc
. Space 1 is X, space 2 is Y. With rcc
a third space is available, in which coordinates are means of X and Y coordinates. In this third space, set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- rCCorA: rcc
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates. In space 3, set 1 is X and set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- CIA: coinertia
. Space 1 is X, space 2 is Y, space 3 is a "common" space where X and Y scores are normed. In space 3, set 1 is X and set 2 is Y. If set=12
in space 3 (default), fac
is not available and pch
,cex
, col
, lws
can be defined differently for each set.
- PCIA: procuste
. Set 1 is X, set 2 is Y.
- 2B-PLS: pls
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates. In space 3, set 1 is X and set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- 2B-sPLS: pls
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates. In space 3, set 1 is X and set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- rGCCA: wrapper.rgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- sGCCA: wrapper.sgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- DIABLO: block.plsda
, block.splsda
. Space can be 1 to n, the number of blocks (i.e. datasets).
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Examples
require(ade4)
data(olympic)
PCA <- dudi.pca(olympic$tab,scannf=FALSE)
MVA.plot(PCA,"corr")
Cross validation
Description
Performs cross validation with different PLS and/or discriminant analyses.
Usage
MVA.cv(X, Y, repet = 10, k = 7, ncomp = 8, scale = TRUE, model = c("PLSR",
"CPPLS", "PLS-DA", "PPLS-DA", "LDA", "QDA", "PLS-DA/LDA", "PLS-DA/QDA",
"PPLS-DA/LDA", "PPLS-DA/QDA"), lower = 0.5, upper = 0.5, Y.add = NULL,
weights = rep(1, nrow(X)), set.prior = FALSE, crit.DA = c("plug-in",
"predictive", "debiased"), ...)
Arguments
X |
a data frame of independent variables. |
Y |
the dependent variable(s): numeric vector, data frame of quantitative variables or factor. |
repet |
an integer giving the number of times the whole procedure has to be repeated. |
k |
an integer giving the number of folds (can be re-set internally if needed). |
ncomp |
an integer giving the number of components to be used for all models except LDA and QDA (can be re-set depending on the size of the train sets). |
scale |
logical indicating if data should be scaled (see Details). |
model |
the model to be fitted (see Details). |
lower |
a vector of lower limits for power optimisation in CPPLS or PPLS-DA (see |
upper |
a vector of upper limits for power optimisation in CPPLS or PPLS-DA (see |
Y.add |
a vector or matrix of additional responses containing relevant information about the observations, in CPPLS or PPLS-DA (see |
weights |
a vector of individual weights for the observations, in CPPLS or PPLS-DA (see |
set.prior |
only used when a LDA or QDA is performed (coupled or not with a PLS model). If |
crit.DA |
criterion used to predict class membership when a LDA or QDA is used. See |
... |
other arguments to pass to |
Details
When a discriminant analysis is used ("PLS-DA"
, "PPLS-DA"
, "LDA"
, "QDA"
, "PLS-DA/LDA"
, "PLS-DA/QDA"
, "PPLS-DA/LDA"
or "PPLS-DA/QDA"
), the training sets are generated in respect to the relative proportions of the levels of Y
in the original data set (see splitf
).
"PLS-DA"
is considered as PLS2 on a dummy-coded response. For a PLS-DA based on the CPPLS algorithm, use "PPLS-DA"
with lower
and upper
limits of the power parameters set to 0.5
.
If scale = TRUE
, the scaling is done as this: for each step of the validation loop (i.e. k
steps), the training set is pre-processed by centering and unit-variance scaling. Means and standard deviations of variables in the training set are then used to scale the test set.
Value
model |
model used. |
type |
type of model used. |
repet |
number of times the whole procedure was repeated. |
k |
number of folds. |
ncomp |
number of components used. |
crit.DA |
criterion used to classify individuals of the test sets. |
groups |
levels of |
models.list |
list of of models generated ( |
models1.list |
list of of (P)PLS-DA models generated ( |
models2.list |
list of of LDA/QDA models generated ( |
RMSEP |
RMSEP vales ( |
Q2 |
Q2 values ( |
NMC |
Classification error rates ( |
confusion |
Confusion matrices ( |
pred.prob |
Probability of each individual of being of each level of |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
predict.MVA.cmv
, mvr
, lda
, qda
Examples
require(pls)
require(MASS)
# PLSR
data(yarn)
## Not run: MVA.cv(yarn$NIR,yarn$density,model="PLSR")
# PPLS-DA coupled to LDA
data(mayonnaise)
## Not run: MVA.cv(mayonnaise$NIR,factor(mayonnaise$oil.type),model="PPLS-DA/LDA")
Loadings of multivariate analyses
Description
Returns loadings of a multivariate analysis.
Usage
MVA.load(x, xax = 1, yax = 2, set = c(12, 1, 2), space = 1, ...)
Arguments
x |
a multivariate analysis (see Details). |
xax |
axis or axes for which to extract loadings. |
yax |
axis for which to extract loadings (ignored if |
set |
variables to be displayed, when several sets are available (see Details). |
space |
variables to be displayed, when several spaces are available (see Details). |
... |
not used. |
Details
Many multivariate analyses are supported, from various packages:
- PCA: prcomp
, princomp
, dudi.pca
, rda
, pca
, pca
.
- sPCA: spca
.
- IPCA: ipca
.
- sIPCA: sipca
.
- PLS-DA (PLS2 on a dummy-coded factor): plsda
. X space only.
- sPLS-DA (sPLS2 on a dummy-coded factor): splsda
. X space only.
- CPPLS: mvr
. X space only.
- PLSR: mvr
, pls
, plsR
(plsRglm package). X space only.
- sPLSR: pls
. X space only.
- PLS-GLR: plsRglm
(plsRglm package).
- PCR: mvr
.
- CDA: discrimin
, discrimin.coa
.
- NSCOA: dudi.nsc
.
- MCA: dudi.acm
.
- Mix analysis: dudi.mix
, dudi.hillsmith
.
- PCIA: procuste
. Set 1 is X, set 2 is Y.
- RDA (or PCAIV): pcaiv
, pcaivortho
, rda
. With rda
, space 1 is constrained space, space 2 is unconstrained space. Only constrained space is available with pcaiv
, the opposite for pcaivortho
.
- CCorA: rcc
. Space 1 is X, space 2 is Y.
- rCCorA: rcc
. Space 1 is X, space 2 is Y.
- CIA: coinertia
. Space 1 is X, space 2 is Y.
- 2B-PLS: pls
. Space 1 is X, space 2 is Y.
- 2B-sPLS: pls
. Space 1 is X, space 2 is Y.
- rGCCA: wrapper.rgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- sGCCA: wrapper.sgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- DIABLO: block.plsda
, block.splsda
. Space can be 1 to n, the number of blocks (i.e. datasets).
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Loading plot of multivariate analyses
Description
Displays a loading plot of a multivariate analysis.
Usage
MVA.loadplot(x, xax = 1, yax = 2, fac = NULL, set = c(12, 1, 2), space = 1, map = TRUE,
xlab = NULL, ylab = NULL, main = NULL, points = TRUE, ident = TRUE, links = TRUE,
line = TRUE, labels = NULL, main.pos = c("bottomleft", "topleft","bottomright",
"topright"), main.cex = 1.3, legend = FALSE, legend.pos = c("topleft", "topright",
"bottomleft", "bottomright"), legend.title = NULL, legend.lab = NULL, pch = 16,
cex = 1, col = 1, lwd = 1, lty = 1, drawextaxes = TRUE, drawintaxes = TRUE, xlim = NULL,
ylim = NULL)
Arguments
x |
a multivariate analysis (see Details). |
xax |
the horizontal axis. |
yax |
the vertical axis. This can be set to |
fac |
only used for one-dimensional graphs. An optional factor defining groups of variables. |
set |
variables to be displayed, when several sets are available (see Details). |
space |
variables to be displayed, when several spaces are available (see Details). |
map |
logical indicating if a two-dimensional ( |
xlab |
only used for two-dimensional graphs. Legend of the horizontal axis. If |
ylab |
legend of the vertical axis. If |
main |
optional title of the graph. |
points |
only used for two-dimensional graphs. If |
ident |
logical indicating if variable names should be displayed. Only used when |
links |
only used for two-dimensional graphs when |
line |
only used for one-dimensional graphs when |
labels |
only used if |
main.pos |
only used for one-dimensional graphs. Position of the title, if |
main.cex |
size of the title, if |
legend |
logical indicating if a legend should be added to the graph. |
legend.pos |
position of the legend, if |
legend.title |
optional title of the legend, if |
legend.lab |
legend labels, if |
pch |
only used for two-dimensional graphs. Symbol(s) used for points, when points are displayed (see |
cex |
size of the points and/or of the variable names. For two-dimensional graphs: if |
col |
color(s) used for points, variable names and/or lines/sticks. For one-dimensional graphs, can be a vector of length one or a vector giving one value per line. For two-dimensional graphs: if |
lwd |
width of lines. For one-dimensional graphs, can be a vector of length one or a vector giving one value per line. For two-dimensional graphs: if |
lty |
only used for one-dimensional graphs. Can be a vector of length one or a vector giving one value per line. |
drawextaxes |
logical indicating if external axes should be drawn. |
drawintaxes |
only used for two-dimensional graphs. Logical indicating if internal axes should be drawn. |
xlim |
only used in two-dimensional graphs. Limits of the horizontal axis. If |
ylim |
limits of the vertical axis. If |
Details
This function should not be use directly. Prefer the general MVA.plot
, to which all arguments can be passed.
Many multivariate analyses are supported, from various packages:
- PCA: prcomp
, princomp
, dudi.pca
, rda
, pca
, pca
.
- sPCA: spca
.
- IPCA: ipca
.
- sIPCA: sipca
.
- PLS-DA (PLS2 on a dummy-coded factor): plsda
. X space only.
- sPLS-DA (sPLS2 on a dummy-coded factor): splsda
. X space only.
- CPPLS: mvr
. X space only.
- PLSR: mvr
, pls
, plsR
(plsRglm package). X space only.
- sPLSR: pls
. X space only.
- PLS-GLR: plsRglm
(plsRglm package).
- PCR: mvr
.
- CDA: discrimin
, discrimin.coa
.
- NSCOA: dudi.nsc
.
- MCA: dudi.acm
.
- Mix analysis: dudi.mix
, dudi.hillsmith
.
- PCIA: procuste
. Set 1 is X, set 2 is Y.
- RDA (or PCAIV): pcaiv
, pcaivortho
, rda
. With rda
, space 1 is constrained space, space 2 is unconstrained space. Only constrained space is available with pcaiv
, the opposite for pcaivortho
.
- CCorA: rcc
. Space 1 is X, space 2 is Y.
- rCCorA: rcc
. Space 1 is X, space 2 is Y.
- CIA: coinertia
. Space 1 is X, space 2 is Y.
- 2B-PLS: pls
. Space 1 is X, space 2 is Y.
- 2B-sPLS: pls
. Space 1 is X, space 2 is Y.
- rGCCA: wrapper.rgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- sGCCA: wrapper.sgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- DIABLO: block.plsda
, block.splsda
. Space can be 1 to n, the number of blocks (i.e. datasets).
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Examples
require(ade4)
data(olympic)
PCA <- dudi.pca(olympic$tab,scannf=FALSE)
MVA.plot(PCA,"load")
Paired plot of multivariate analyses
Description
Displays a paired plot (i.e. a score plot of paired points) of a multivariate analysis.
Usage
MVA.pairplot(x, xax = 1, yax = 2, pairs = NULL, scaling = 2, space = 1, fac = NULL,
xlab = NULL, ylab = NULL, main = NULL, ident = TRUE, labels = NULL, cex = 0.7, col = 1,
lwd = 1, main.pos = c("bottomleft", "topleft", "bottomright", "topright"),
main.cex = 1.3, legend = FALSE, legend.pos = c("topleft", "topright", "bottomleft",
"bottomright"), legend.title = NULL, legend.lab = NULL, drawextaxes = TRUE,
drawintaxes = TRUE, xlim = NULL, ylim = NULL)
Arguments
x |
a multivariate analysis (see Details). |
xax |
the horizontal axis. |
yax |
the vertical axis. Cannot be |
pairs |
two-level factor identifying paired individuals (in the same order in both sets of points). Can be omitted with multivariate analyses where two sets of points are available in the same space (see |
scaling |
type of scaling. Only available with some analyses performed with the |
space |
scores to be displayed, when several spaces are available (see Details of |
fac |
an optional factor defining groups pairs. |
xlab |
legend of the horizontal axis. If |
ylab |
legend of the vertical axis. If |
main |
optional title of the graph. |
ident |
logical indicating if variable names should be displayed. |
labels |
names of the individuals. If |
cex |
size of the labels. If |
col |
color(s) used for arrows and labels. If |
lwd |
width of arrows. If |
main.pos |
position of the title, if |
main.cex |
size of the title, if |
legend |
logical indicating if a legend should be added to the graph. |
legend.pos |
position of the legend, if |
legend.title |
optional title of the legend, if |
legend.lab |
legend labels, if |
drawextaxes |
logical indicating if external axes should be drawn.. |
drawintaxes |
logical indicating if internal axes should be drawn. |
xlim |
limits of the horizontal axis. If |
ylim |
limits of the vertical axis. If |
Details
This function should not be use directly. Prefer the general MVA.plot
, to which all arguments can be passed.
All multivariate analyses supported by MVA.scoreplot
can be used for a paired plot.
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Examples
require(ade4)
data(macaca)
PCIA <- procuste(macaca$xy1,macaca$xy2)
MVA.plot(PCIA,"pairs")
Plotting of multivariate analyses
Description
Displays several kinds of plots for multivariate analyses.
Usage
MVA.plot(x, type = c("scores", "loadings", "correlations", "biplot", "pairs",
"trajectories"), ...)
Arguments
x |
a multivariate analysis (see Details). |
type |
the type of plot to be displayed: score plot (default), loading plot, correlation circle, biplot, score plot showing paired samples or score plot showing trajectories, respectively. |
... |
arguments to be passed to subfunctions. See Details. |
Details
Different subfunctions are used depending on the type of plot to be displayed: MVA.scoreplot
, MVA.loadplot
, MVA.corplot
, MVA.biplot
, MVA.pairplot
or MVA.trajplot
. These functions should not be used directly (everything can be done with the general MVA.plot
) but for convenience, arguments and analyses supported are detailed in separate help pages.
Warning: the use of attach
before running a multivariate analysis can prevent MVA.plot
to get the values it needs, and make it fail.
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Score plot of multivariate analyses
Description
Displays a score plot of a multivariate analysis.
Usage
MVA.scoreplot(x, xax = 1, yax = 2, scaling = 2, set = c(12, 1, 2), space = 1,
byfac = TRUE, fac = NULL, barycenters = TRUE, stars = TRUE, contours = FALSE,
dhist = TRUE, weights = 1, xlab = NULL, ylab = NULL, main = NULL, pch = 16,
cex = 1, col = 1, points = TRUE, labels = NULL, main.pos = c("bottomleft",
"topleft", "bottomright", "topright"), main.cex = 1.3, fac.lab = NULL,
fac.cex = 1, legend = FALSE, legend.pos = c("topleft", "topright", "bottomleft",
"bottomright"), legend.title = NULL, legend.lab = NULL, legend.cex = 1,
drawextaxes = TRUE, drawintaxes = TRUE, xlim = NULL, ylim = NULL,
keepmar = FALSE)
Arguments
x |
a multivariate analysis (see Details). |
xax |
the horizontal axis. |
yax |
the vertical axis. This can be set to |
scaling |
type of scaling. Only available with some analyses performed with the |
set |
scores to be displayed, when several sets are available (see Details). |
space |
scores to be displayed, when several spaces are available (see Details). |
byfac |
only used with MCA and mix analyses (see Details). If |
fac |
an optional factor defining groups of individuals. |
barycenters |
only used if |
stars |
only used if |
contours |
only used if |
dhist |
only used in the one-dimensional case. If |
weights |
individual weights, used to calculate barycenter positions (see |
xlab |
legend of the horizontal axis. If |
ylab |
legend of the vertical axis. If |
main |
optional title of the graph. Can be a vector of several values for MCA and mix analyses when |
pch |
symbol(s) used for points, when points are displayed (see |
cex |
size of the points or of the labels (see |
col |
color(s) used for points or labels (see |
points |
only used for two-dimensional graphs. If |
labels |
used in two-dimensional graphs when |
main.pos |
position of the title, if |
main.cex |
size of the title, if |
fac.lab |
only used if |
fac.cex |
only used if |
legend |
logical indicating if a legend should be added to the graph. Available for two-dimensional graphs and for density histograms in the one-dimensional case (see |
legend.pos |
position of the legend, if |
legend.title |
optional title of the legend, if |
legend.lab |
legend labels, if |
legend.cex |
size of legend labels, if |
drawextaxes |
logical indicating if external axes should be drawn. Available for two-dimensional graphs and for density histograms in the one-dimensional case (see |
drawintaxes |
logical indicating if internal axes should be drawn. |
xlim |
limits of the horizontal axis. If |
ylim |
only used in two-dimensional graphs. Limits of the vertical axis. If |
keepmar |
only used in two-dimensional graphs. Logical indicating if margins defined by MVA.scoreplot should be kept after plotting (necessary for biplots). |
Details
This function should not be use directly. Prefer the general MVA.plot
, to which all arguments can be passed.
Many multivariate analyses are supported, from various packages:
- PCA: prcomp
, princomp
(if scores=TRUE
), dudi.pca
, rda
, pca
, pca
. scaling
can be defined for rda
(see scores.rda
).
- sPCA: spca
.
- IPCA: ipca
.
- sIPCA: sipca
.
- PCoA: cmdscale
(with at least on non-default argument), dudi.pco
, wcmdscale
(with at least one non-default argument), capscale
, pco
, pcoa
.
- nMDS: monoMDS
, metaMDS
, nmds
, isoMDS
.
- PLS-DA (PLS2 on a dummy-coded factor): plsda
. X space only.
- sPLS-DA (sPLS2 on a dummy-coded factor): splsda
. X space only.
- CPPLS: mvr
. X space only.
- PLSR: mvr
, pls
, plsR
(plsRglm package). X space only.
- sPLSR: pls
. X space only.
- PLS-GLR: plsRglm
(plsRglm package).
- PCR: mvr
.
- CDA: discrimin
, discrimin.coa
.
- NSCOA: dudi.nsc
.
- MCA: dudi.acm
.
- Mix analysis: dudi.mix
, dudi.hillsmith
.
- COA: dudi.coa
, cca
. Set 1 is rows, set 2 is columns. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set. scaling
can be defined for cca
(see scores.cca
).
- DCOA: dudi.dec
. Set 1 is rows, set 2 is columns. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- PCIA: procuste
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- Procrustean superimposition: procrustes
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- GPA: GPA
. Only the consensus ordination can be displayed.
- DPCoA: dpcoa
. Set 1 is categories, set 2 is collections. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- RDA (or PCAIV): pcaiv
, pcaivortho
, rda
. With rda
, space 1 is constrained space, space 2 is unconstrained space. Only constrained space is available with pcaiv
, the opposite for pcaivortho
. scaling
can be defined for rda
(see scores.rda
).
- db-RDA (or CAP): capscale
, dbrda
. Space 1 is constrained space, space 2 is unconstrained space.
- CCA: pcaiv
, cca
. With rda
, space 1 is constrained space, space 2 is unconstrained space. Only constrained space is available with pcaiv
. Set 1 is rows, set 2 is columns. scaling
can be defined for cca
(see scores.cca
).
- CCorA: CCorA
, rcc
. Space 1 is X, space 2 is Y. With rcc
a third space is available, in which coordinates are means of X and Y coordinates.
- rCCorA: rcc
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates.
- CIA: coinertia
. Space 1 is X, space 2 is Y, space 3 is a "common" space where X and Y scores are normed. In space 3, set 1 is X and set 2 is Y. If set=12
in space 3 (default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- 2B-PLS: pls
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates.
- 2B-sPLS: pls
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates.
- rGCCA: rgcca
, wrapper.rgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- sGCCA: rgcca
, wrapper.sgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- DIABLO: block.plsda
, block.splsda
. Space can be 1 to n, the number of blocks (i.e. datasets).
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Examples
data(iris)
PCA <- prcomp(iris[,1:4])
MVA.plot(PCA,"scores")
MVA.plot(PCA,"scores",fac=iris$Species,col=1:3,pch=15:17)
Scores of multivariate analyses
Description
Returns scores of a multivariate analysis.
Usage
MVA.scores(x, xax = 1, yax = 2, scaling = 2, set = c(12, 1, 2), space = 1, ...)
Arguments
x |
a multivariate analysis (see Details). |
xax |
axis or axes for which to extract scores. |
yax |
axis for which to extract scores (ignored if |
scaling |
type of scaling. Only available with some analyses performed with the |
set |
scores to be displayed, when several sets are available (see Details). |
space |
scores to be displayed, when several spaces are available (see Details). |
... |
not used. |
Details
Many multivariate analyses are supported, from various packages:
- PCA: prcomp
, princomp
(if scores=TRUE
), dudi.pca
, rda
, pca
, pca
. scaling
can be defined for rda
(see scores.rda
).
- sPCA: spca
.
- IPCA: ipca
.
- sIPCA: sipca
.
- PCoA: cmdscale
(with at least on non-default argument), dudi.pco
, wcmdscale
(with at least one non-default argument), capscale
, pco
, pcoa
.
- nMDS: monoMDS
, metaMDS
, nmds
, isoMDS
.
- PLS-DA (PLS2 on a dummy-coded factor): plsda
. X space only.
- sPLS-DA (sPLS2 on a dummy-coded factor): splsda
. X space only.
- CPPLS: mvr
. X space only.
- PLSR: mvr
, pls
, plsR
(plsRglm package). X space only.
- sPLSR: pls
. X space only.
- PLS-GLR: plsRglm
(plsRglm package).
- PCR: mvr
.
- CDA: discrimin
, discrimin.coa
.
- NSCOA: dudi.nsc
.
- MCA: dudi.acm
.
- Mix analysis: dudi.mix
, dudi.hillsmith
.
- COA: dudi.coa
, cca
. Set 1 is rows, set 2 is columns. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set. scaling
can be defined for cca
(see scores.cca
).
- DCOA: dudi.dec
. Set 1 is rows, set 2 is columns. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- PCIA: procuste
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- Procrustean superimposition: procrustes
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- GPA: GPA
. Only the consensus ordination can be displayed.
- DPCoA: dpcoa
. Set 1 is categories, set 2 is collections. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- RDA (or PCAIV): pcaiv
, pcaivortho
, rda
. With rda
, space 1 is constrained space, space 2 is unconstrained space. Only constrained space is available with pcaiv
, the opposite for pcaivortho
. scaling
can be defined for rda
(see scores.rda
).
- db-RDA (or CAP): capscale
, dbrda
. Space 1 is constrained space, space 2 is unconstrained space.
- CCA: pcaiv
, cca
. With rda
, space 1 is constrained space, space 2 is unconstrained space. Only constrained space is available with pcaiv
. Set 1 is rows, set 2 is columns. scaling
can be defined for cca
(see scores.cca
).
- CCorA: CCorA
, rcc
. Space 1 is X, space 2 is Y. With rcc
a third space is available, in which coordinates are means of X and Y coordinates.
- rCCorA: rcc
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates.
- CIA: coinertia
. Space 1 is X, space 2 is Y, space 3 is a "common" space where X and Y scores are normed. In space 3, set 1 is X and set 2 is Y. If set=12
in space 3 (default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- 2B-PLS: pls
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates.
- 2B-sPLS: pls
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates.
- rGCCA: rgcca
, wrapper.rgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- sGCCA: rgcca
, wrapper.sgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- DIABLO: block.plsda
, block.splsda
. Space can be 1 to n, the number of blocks (i.e. datasets).
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Synthesis quality of multivariate analyses
Description
Gives a simple estimator of the quality of the (descriptive) synthesis performed by a wide range of multivariate analyses.
Usage
MVA.synt(x, rows = 5)
Arguments
x |
a multivariate analysis (see Details). |
rows |
maximum number of axes to print in the output. |
Details
Many multivariate analyses are supported, from various packages.
- PCA: prcomp
, princomp
, dudi.pca
, rda
, pca
, pca
: % of total variance explained by each axis.
- sPCA: spca
: % of total variance explained by each axis.
- IPCA: ipca
: kurtosis of each axis.
- sIPCA: sipca
: kurtosis of each axis.
- PCoA: cmdscale
(with eig=TRUE
), dudi.pco
, wcmdscale
(with eig=TRUE
), capscale
, pco
, pcoa
: % of total variance explained by each axis.
- nMDS: monoMDS
, metaMDS
, nmds
, isoMDS
: stress.
- RDA: pcaiv
, pcaivortho
, rda
: % of constrained and unconstrained total variance, % of constrained variance explained by constrained axes (pcaiv
and rda
), % of unconstrained variance explained by unconstrained axes (pcaivortho
and rda
).
- db-RDA (or CAP): capscale
, dbrda
: % of constrained and unconstrained total variance, % of constrained variance explained by constrained axes, % of unconstrained variance explained by unconstrained axes.
- COA: dudi.coa
, cca
: % of total inertia explained by each axis.
- CCA: pcaiv
, cca
: % of constrained and unconstrained total inertia, % of constrained inertia explained by constrained axes, % of unconstrained inertia explained by unconstrained axes (cca
only).
- CPPLS: mvr
: % of X and Y variances explained by each axis.
- PLSR: mvr
, plsR
(plsRglm package): % of X and Y variances explained by each axis (only Y for the moment with plsR
).
- 2B-PLS: pls
: % of X/Y square covariance explained by each pair of axes, correlation between each pair of axes (canonical correlations).
- CCorA: CCorA
, rcc
: correlation between each pair of axes (canonical correlations).
- rCCorA: rcc
: correlation between each pair of axes (canonical correlations).
- PCR: mvr
: % of X and Y variances explained by each axis.
- MCA: dudi.acm
: % of total inertia explained by each axis.
- Mix analysis: dudi.mix
, dudi.hillsmith
: % of total inertia explained by each axis.
- GPA: GPA
: % of consensus and residual variance, % of total variance exlained by each axis, % of consensus variance explained by each axis, % of residual variance coming from each group of variables.
- RGCCA: rgcca
, wrapper.rgcca
: % of total intra-block variance explained by each axis, correlation between each pair of axes (canonical correlations).
- DIABLO: block.plsda
, block.splsda
: % of total intra-block variance explained by each axis, correlation between each pair of axes (canonical correlations).
- CIA: coinertia
: RV coefficient, % of co-inertia explained by each pair of axes, correlation between each pair of axes (canonical correlations).
- PCIA: procuste
: m2.
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Examples
data(iris)
PCA <- prcomp(iris[,1:4])
MVA.synt(PCA)
Significance test based on cross (model) validation
Description
Performs a permutation significance test based on cross (model) validation with different PLS and/or discriminant analyses. See MVA.cv
and MVA.cmv
for more details about how cross (model) validation is performed.
Usage
MVA.test(X, Y, cmv = FALSE, ncomp = 8, kout = 7, kinn = 6, scale = TRUE,
model = c("PLSR", "CPPLS", "PLS-DA", "PPLS-DA", "LDA", "QDA", "PLS-DA/LDA",
"PLS-DA/QDA", "PPLS-DA/LDA","PPLS-DA/QDA"), Q2diff = 0.05, lower = 0.5,
upper = 0.5, Y.add = NULL, weights = rep(1, nrow(X)), set.prior = FALSE,
crit.DA = c("plug-in", "predictive", "debiased"), p.method = "fdr",
nperm = 999, progress = TRUE, ...)
Arguments
X |
a data frame of independent variables. |
Y |
the dependent variable(s): numeric vector, data frame of quantitative variables or factor. |
cmv |
a logical indicating if the values (Q2 or NMC) should be generated through cross-validation (classical K-fold process) or cross model validation (inner + outer loops). |
ncomp |
an integer giving the number of components to be used to generate all submodels (cross-validation) or the maximal number of components to be tested in the inner loop (cross model validation). Can be re-set internally if needed. Does not concern LDA and QDA. |
kout |
an integer giving the number of folds (cross-validation) or the number of folds in the outer loop (cross-model validation). Can be re-set internally if needed. |
kinn |
an integer giving the number of folds in the inner loop (cross model validation only). Can be re-set internally if needed. Cannot be |
scale |
logical indicating if data should be scaled. See help of |
model |
the model to be fitted. |
Q2diff |
the threshold to be used if the number of components is chosen according to Q2 (cross model validation only). |
lower |
a vector of lower limits for power optimisation in CPPLS or PPLS-DA (see |
upper |
a vector of upper limits for power optimisation in CPPLS or PPLS-DA (see |
Y.add |
a vector or matrix of additional responses containing relevant information about the observations, in CPPLS or PPLS-DA (see |
weights |
a vector of individual weights for the observations, in CPPLS or PPLS-DA (see |
set.prior |
only used when a LDA or QDA is performed (coupled or not with a PLS model). If |
crit.DA |
criterion used to predict class membership when a LDA or QDA is used. See |
p.method |
method for p-values correction. See help of |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
... |
other arguments to pass to |
Details
When Y
consists in quantitative response(s), the null hypothesis is that each response is not predicted better than what would happen by chance. In this case, Q2 is used as the test statistic. When Y
contains several responses, a p-value is computed for each response and p-values are corrected for multiple testing.
When Y
is a factor, the null hypothesis is that the factor has no discriminant ability. In this case, the classification error rate (NMC) is used as the test statistic.
Whatever the response, the reference value of the test statistics is obtained by averaging 20 values coming from independently performed cross (model) validation on the original data.
The function deals with the limitted floating point precision, which can bias calculation of p-values based on a discrete test statistic distribution.
Value
method |
a character string indicating the name of the test. |
data.name |
a character string giving the name(s) of the data, plus additional information. |
statistic |
the value of the test statistics. |
permutations |
the number of permutations. |
p.value |
the p-value of the test. |
p.adjust.method |
a character string giving the method for p-values correction. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
References
Westerhuis J, Hoefsloot HCJ, Smit S, Vis DJ, Smilde AK, van Velzen EJJ, van Duijnhoven JPM and van Dorsten FA (2008) Assessment of PLSDA cross validation. Metabolomics 4:81-89.
See Also
Examples
require(pls)
require(MASS)
# PLSR
data(yarn)
## Not run: MVA.test(yarn$NIR,yarn$density,cmv=TRUE,model="PLSR")
# PPLS-DA coupled to LDA
data(mayonnaise)
## Not run: MVA.test(mayonnaise$NIR,factor(mayonnaise$oil.type),model="PPLS-DA/LDA")
Trajectory plot of multivariate analyses
Description
Displays a trajectory plot (i.e. a score plot with trajectories linking defined points) of a multivariate analysis.
Usage
MVA.trajplot(x, xax = 1, yax = 2, trajects, trajlab = NULL, scaling = 2,
set = c(12, 1, 2), space = 1, xlab = NULL, ylab = NULL, main = NULL,
pch = 16, cex = 1, trajlab.cex = 1, col = 1, lwd = 1, lty = 1,
points = TRUE, allpoints = TRUE, arrows = TRUE, labels = NULL,
main.pos = c("bottomleft", "topleft", "bottomright", "topright"),
main.cex = 1.3, legend = FALSE, legend.pos = c("topleft", "topright",
"bottomleft", "bottomright"), legend.title = NULL, legend.lab = NULL,
legend.cex = 1, drawextaxes = TRUE, drawintaxes = TRUE, xlim = NULL,
ylim = NULL)
Arguments
x |
a multivariate analysis (see Details). |
xax |
the horizontal axis. |
yax |
the vertical axis. Cannot be |
trajects |
vector or list of vectors identifying trajectories. Each vector should give the number of the individuals to be linked, ordered from the first to the last one. |
trajlab |
optional traject labels. |
scaling |
type of scaling. Only available with some analyses performed with the |
set |
scores to be displayed, when several sets are available (see Details of |
space |
scores to be displayed, when several spaces are available (see Details of |
xlab |
legend of the horizontal axis. If |
ylab |
legend of the vertical axis. If |
main |
optional title of the graph. |
pch |
symbols used for points. Can be a vector giving one value per trajectory (and a last one for non-linked points if |
cex |
size of the labels. Can be a vector giving one value per trajectory (and a last one for non-linked points if |
trajlab.cex |
size of trajectory labels. Can be a vector giving one value per trajectory. |
col |
color(s) used for arrows and labels. If |
lwd |
width of trajectory segments. Can be a vector giving one value per trajectory. |
lty |
type of trajectory segments. Can be a vector giving one value per trajectory. |
points |
logical indicating if points should be displayed. If |
allpoints |
logical indicating if points which do not belong to any trajectory should be drawn. |
arrows |
logical indicating if trajectories should be oriented with arrows. |
labels |
names of the individuals. If |
main.pos |
position of the title, if |
main.cex |
size of the title, if |
legend |
logical indicating if a legend should be added to the graph. |
legend.pos |
position of the legend, if |
legend.title |
optional title of the legend, if |
legend.lab |
legend labels, if |
legend.cex |
size of legend labels, if |
drawextaxes |
logical indicating if external axes should be drawn.. |
drawintaxes |
logical indicating if internal axes should be drawn. |
xlim |
limits of the horizontal axis. If |
ylim |
limits of the vertical axis. If |
Details
This function should not be use directly. Prefer the general MVA.plot
, to which all arguments can be passed.
All multivariate analyses supported by MVA.scoreplot
can be used for a paired plot.
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Examples
require(ade4)
data(olympic)
PCA <- dudi.pca(olympic$tab,scannf=FALSE)
MVA.plot(PCA,"traject",trajects=list(1:10,25:30),col=c(2,3,1),trajlab=c("T1","T2"))
Odds-ratio (multinomial regression)
Description
Computes the odds ratios and their confidence interval for a predictor of a model fitted with multinom
.
Usage
OR.multinom(model, variable, conf.level = 0.95)
Arguments
model |
object of class |
variable |
any predictor present in |
conf.level |
confidence level. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Variable Importance in the Projection (VIP)
Description
Returns VIP score of each X-variable in a PLS-DA (obtained from plsda
).
Usage
PLSDA.VIP(model, graph = FALSE)
Arguments
model |
object of class |
graph |
logical: should VIP scores be displayed? |
Value
tab |
table of results. |
sup1 |
name of X-variables having a VIP score > 1. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
if (require(mixOmics)) {
data(yeast)
model.PLSDA <- plsda(t(yeast$data),yeast$strain.cond)
PLSDA.VIP(model.PLSDA)
}
Type II permutation MANOVA using distance matrices
Description
This function is a wrapper to adonis2
but performs type II tests (whereas adonis2
performs type I).
Usage
adonis.II(formula, data, ...)
Arguments
formula |
a typical model formula such as |
data |
the data frame from which |
... |
additional arguments to |
Details
See adonis2
for detailed explanation of what is done. The only difference with adonis2
is that adonis.II
performs type II tests instead of type I.
Value
a data frame of class "anova"
.
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Examples
require(vegan)
data(dune)
data(dune.env)
# Compare:
adonis2(dune~Management*A1,by="terms",data=dune.env,permutations=99)
adonis2(dune~A1*Management,by="terms",data=dune.env,permutations=99)
# With:
adonis.II(dune~Management*A1,data=dune.env,permutations=99)
adonis.II(dune~A1*Management,data=dune.env,permutations=99)
Back-transformation of EMMeans
Description
Back-transforms EMMeans (produced by emmeans
) when the model was built on a transformed response variable. This is typically the case when a LM(M) with log(x+1) as response variable gives a better fitting than a GLM(M) for count data, or when a beta regression takes as response a variable on the [0;1] interval that has been rescaled to the (0;1) interval using p.beta
.
Usage
back.emmeans(emm, transform = c("log", "logit", "sqrt", "4rt", "inverse", "p.beta"),
base = exp(1), add = 0, n = NULL, C = 2, ord = FALSE, decreasing = TRUE)
Arguments
emm |
object returned by |
transform |
transformation applied to the response variable before building the model on which |
base |
the base with respect to which the logarithm transformation was computed (if |
add |
value to be added to x before computing the transformation, if needed (e.g. |
n |
total number of observations in the initial data set. Only used with |
C |
number of categories from which initial continuous proportions were computed. Only used with |
ord |
logical indicating if back-transformed EMMeans should be ordered. |
decreasing |
logical indicating in which order back-transformed EMMeans should be ordered, if |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
require(emmeans)
set.seed(1149)
tab <- data.frame(
response <- c(rpois(30,0),rpois(30,2),rpois(30,4)),
fact <- gl(3,30,labels=LETTERS[1:3])
)
model <- lm(log(response+1)~fact,data=tab)
EMM <- emmeans(model,~fact)
back.emmeans(EMM,transform="log",add=1)
Bootstrap
Description
Simplified version of the boot
function.
Usage
bootstrap(x, fun, nrep = 1000, conf.level = 0.95, ...)
Arguments
x |
numeric vector. |
fun |
function to be used for computation ( |
nrep |
number of replicates. |
conf.level |
confidence level for confidence interval. |
... |
additional arguments to |
Details
See help of the boot
function for more explanations.
Value
method |
the character string |
data.name |
a character string giving the name of the data. |
estimate |
the estimated original value |
conf.level |
confidence level for confidence interval. |
rep |
number of replicates. |
conf.int |
limits of the confidence interval. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
# Confidence interval of a mean
samp <- sample(1:50,10,replace=TRUE)
bootstrap(samp,function(x,i) mean(x[i]))
# Confidence interval of the standard error of the mean
bootstrap(samp,function(x,i) sd(x[i])/sqrt(length(x[i])))
Histogram for factor levels
Description
Draws a histogram of a numeric variable per level of a factor.
Usage
byf.hist(formula, data, sep = FALSE, density = TRUE, xlab = NULL, ylab = NULL,
col = NULL)
Arguments
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
sep |
logical. If |
density |
logical. If |
xlab |
label for x-axis (name of the response variable as default). |
ylab |
label for y-axis ("Density" or "Frequency" as default, depending on the type of histogram). |
col |
color(s) used for density curves or bars. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
data(iris)
byf.hist(Sepal.Length~Species,data=iris)
QQ-plot for factor levels
Description
Draws a multivariate QQ-plot of numeric variables per level of a factor.
Usage
byf.mqqnorm(formula, data)
Arguments
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
data(iris)
byf.mqqnorm(as.matrix(iris[,1:4])~Species,data=iris)
Shapiro-Wilk test for factor levels
Description
Performs a multivariate Shapiro-Wilk test on numeric variables per level of a factor.
Usage
byf.mshapiro(formula, data)
Arguments
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
Value
method |
name of the test. |
data.name |
a character string giving the names of the data. |
tab |
table of results. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
byf.mqqnorm
, mshapiro.test
, qqPlot
Examples
data(iris)
byf.mshapiro(as.matrix(iris[,1:4])~Species,data=iris)
QQ-plot for factor levels
Description
Draws a QQ-plot of a numeric variable per level of a factor.
Usage
byf.qqnorm(formula, data, ...)
Arguments
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
... |
other arguments to pass to |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
link[RVAideMemoire]{byf.shapiro}
, qqPlot
Examples
data(iris)
byf.qqnorm(Sepal.Length~Species,data=iris)
Shapiro-Wilk test for factor levels
Description
Performs a Shapiro-Wilk test on a numeric variable per level of a factor.
Usage
byf.shapiro(formula, data)
Arguments
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
Value
method |
name of the test. |
data.name |
a character string giving the names of the data. |
tab |
table of results. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
data(iris)
byf.shapiro(Sepal.Length~Species,data=iris)
Cumulative Distribution Function of a known discrete distribution
Description
Returns an object similar to what is produced by ecdf
, but based on a known discrete distribution.
Usage
cdf.discrete(x, dist = c("binom", "geom", "hyper", "nbinom", "pois"), ...)
Arguments
x |
numeric vector of the observations. |
dist |
character string naming a discrete distribution ( |
... |
parameters of the distribution specified by |
Details
The function is intended to be used in goodness-of-fits tests for discrete distributions, such as proposed in the dgof
package.
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Examples
if(require(dgof)) {
set.seed(1124)
resp <- rpois(20,2)
cvm.test(resp,cdf.discrete(resp,"pois",2))
}
Expected counts for comparison of response probabilities to given values
Description
Returns expected counts before comparing response probabilities (i.e. when the response variable is a binary variable) to given values by a chi-squared test. The function is in fact a wrapper to the chi-squared test for comparison of proportions to given values on a contingency table.
Usage
chisq.bin.exp(formula, data, p, graph = FALSE)
Arguments
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
p |
theoretical probabilities. |
graph |
logical. If |
Details
The function returns how many counts can be < 5 to respect Cochran's rule (80% of counts must be >= 5).
Value
p.theo |
theoretical probabilities. |
mat |
contingency table of expected counts. |
cochran |
number of counts which can be < 5. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
prop.test
, chisq.theo.bintest
, mosaicplot
Examples
response <- c(rep(0:1,c(40,60)),rep(0:1,c(55,45)),rep(0:1,c(65,35)))
fact <- gl(3,100,labels=LETTERS[1:3])
p.theo <- c(0.5,0.45,0.2)
chisq.bin.exp(response~fact,p=p.theo)
Pearson's Chi-squared test for binary variables
Description
Performs a Pearson's Chi-squared test for comparing response probabilities (i.e. when the response variable is a binary variable). The function is in fact a wrapper to the chi-squared test for comparison of proportions on a contingency table. If the p-value of the test is significant, the function performs pairwise comparisons by using Pearson's Chi-squared tests.
Usage
chisq.bintest(formula, data, correct = TRUE, alpha = 0.05, p.method = "fdr")
Arguments
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
correct |
a logical indicating whether to apply continuity correction when computing the test statistic for 2 by 2 tables. See help of |
alpha |
significance level to compute pairwise comparisons. |
p.method |
method for p-values correction. See help of |
Details
If the response is a 0/1 variable, the probability of the '1' group is tested. In any other cases, the response is transformed into a factor and the probability of the second level is tested.
Since a chi-squared test is an approximate test, an exact test is preferable when the number of individuals is small (200 is a reasonable minimum). See fisher.bintest
in that case.
Value
method.test |
a character string giving the name of the global test computed. |
data.name |
a character string giving the name(s) of the data. |
alternative |
a character string describing the alternative hypothesis. |
estimate |
the estimated probabilities. |
null.value |
the value of the difference in probabilities under the null hypothesis, always 0. |
statistic |
test statistics. |
parameter |
test degrees of freedom. |
p.value |
p-value of the global test. |
alpha |
significance level. |
p.adjust.method |
method for p-values correction. |
p.value.multcomp |
data frame of pairwise comparisons result. |
method.multcomp |
a character string giving the name of the test computed for pairwise comparisons. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
response <- c(rep(0:1,c(40,60)),rep(0:1,c(55,45)),rep(0:1,c(65,35)))
fact <- gl(3,100,labels=LETTERS[1:3])
chisq.bintest(response~fact)
Expected counts for comparison of proportions to given values
Description
Returns expected counts before comparing proportions to given values by a chi-squared test.
Usage
chisq.exp(data, p, graph = FALSE)
Arguments
data |
contingency table. |
p |
theoretical proportions. |
graph |
logical. If |
Details
The function returns how many counts can be < 5 to respect Cochran's rule (80% of counts must be >= 5).
Value
p.theo |
theoretical proportions. |
mat |
contingency table of expected counts. |
cochran |
number of counts which can be < 5. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
prop.test
, chisq.test
, mosaicplot
Examples
proportions <- sample(c(0,1),200,replace=TRUE)
populations <- sample(LETTERS[1:3],200,replace=TRUE)
tab.cont <- table(populations,proportions)
p.theo <- c(0.4,0.5,0.7)
chisq.exp(tab.cont,p=p.theo)
Pairwise comparisons after a chi-squared goodness-of-fit test
Description
Performs pairwise comparisons after a global chi-squared goodness-of-fit test.
Usage
chisq.multcomp(x, p.method = "fdr")
Arguments
x |
numeric vector (counts). |
p.method |
method for p-values correction. See help of |
Details
Since a chi-squared test is an approximate test, an exact test is preferable when the number of individuals is small (200 is a reasonable minimum). See multinomial.multcomp
in that case.
Value
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.adjust.method |
method for p-values correction. |
p.value |
table of results. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
chisq.test
, multinomial.test
, multinomial.multcomp
Examples
counts <- c(49,30,63,59)
chisq.test(counts)
chisq.multcomp(counts)
Pearson's Chi-squared test for comparison of response probabilities to given values
Description
Performs a Pearson's Chi-squared test for comparing response probabilities (i.e. when the response variable is a binary variable) to given values. The function is in fact a wrapper to the chi-squared test for comparison of proportions to given values on a contingency table.
Usage
chisq.theo.bintest(formula, data, p)
Arguments
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
p |
theoretical probabilities. |
Details
If the response is a 0/1 variable, the probability of the '1' group is tested. In any other cases, the response is transformed into a factor and the probability of the second level is tested.
Value
method.test |
a character string giving the name of the test. |
data.name |
a character string giving the name(s) of the data. |
alternative |
a character string describing the alternative hypothesis, always two-sided. |
estimate |
the estimated probabilities. |
null.value |
the theoretical probabilities. |
statistic |
test statistics. |
parameter |
test degrees of freedom. |
p.value |
p-value of the test. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
prop.test
, chisq.bin.exp
, prop.bin.multcomp
Examples
response <- c(rep(0:1,c(40,60)),rep(0:1,c(55,45)),rep(0:1,c(65,35)))
fact <- gl(3,100,labels=LETTERS[1:3])
p.theo <- c(0.5,0.45,0.2)
chisq.theo.bintest(response~fact,p=p.theo)
Pairwise comparisons after a chi-squared test for given probabilities
Description
Performs pairwise comparisons after a global chi-squared test for given probabilities.
Usage
chisq.theo.multcomp(x, p = rep(1/length(x), length(x)), p.method = "fdr")
Arguments
x |
numeric vector (counts). |
p |
theoretical proportions. |
p.method |
method for p-values correction. See help of |
Details
Since a chi-squared test is an approximate test, an exact test is preferable when the number of individuals is small (200 is a reasonable minimum). See multinomial.theo.multcomp
in that case.
Value
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
observed |
observed counts. |
expected |
expected counts. |
p.adjust.method |
method for p-values correction. |
statistic |
statistics of each test. |
p.value2 |
corrected p-values. |
p.value |
data frame of results. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
chisq.test
, multinomial.test
, multinomial.theo.multcomp
Examples
counts <- c(49,30,63,59)
p.theo <- c(0.2,0.1,0.45,0.25)
chisq.test(counts,p=p.theo)
chisq.theo.multcomp(counts,p=p.theo)
Cochran's Q test
Description
Performs the Cochran's Q test for unreplicated randomized block design experiments with a binary response variable and paired data. If the p-value of the test is significant, the function performs pairwise comparisons by using the Wilcoxon sign test.
Usage
cochran.qtest(formula, data, alpha = 0.05, p.method = "fdr")
Arguments
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
alpha |
significance level to compute pairwise comparisons. |
p.method |
method for p-values correction. See help of |
Details
If the response is a 0/1 variable, the probability of the '1' group is tested. In any other cases, the response is transformed into a factor and the probability of the second level is tested.
Value
method.test |
a character string giving the name of the global test computed. |
data.name |
a character string giving the name(s) of the data. |
alternative |
a character string describing the alternative hypothesis. |
estimate |
the estimated probabilities. |
null.value |
the value of the difference in probabilities under the null hypothesis, always 0. |
statistic |
test statistics (Pearson's Chi-squared test only). |
parameter |
test degrees of freedom (Pearson's Chi-squared test only). |
p.value |
p-value of the global test. |
alpha |
significance level. |
p.adjust.method |
method for p-values correction. |
p.value.multcomp |
data frame of pairwise comparisons result. |
method.multcomp |
a character string giving the name of the test computed for pairwise comparisons. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Examples
response <- c(0,1,1,0,0,1,0,1,1,1,1,1,0,0,1,1,0,1,0,1,1,0,0,1,0,1,1,0,0,1)
fact <- gl(3,1,30,labels=LETTERS[1:3])
block <- gl(10,3,labels=letters[1:10])
cochran.qtest(response~fact|block)
Condition number of the Hessian matrix of a multinomial log-linear model
Description
Computes the condition number of the Hessian matrix of a model fitted with multinom
.
Usage
cond.multinom(model)
Arguments
model |
object of class |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Coordinates of projected points
Description
Returns the coordinates of a set of points when orthogonally projected on a new axis.
Usage
coord.proj(coord,slp)
Arguments
coord |
2-column data frame or matrix giving the original coordinates (left column: x, right column: y). |
slp |
slope of the new axis. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Examples
data(iris)
# Original coordinates
plot(Petal.Length~Sepal.Length,pch=16,col=as.numeric(iris$Species),data=iris)
# New axis
abline(-6,1.6)
# Coordinates on new axis
new.coord <- coord.proj(iris[,c("Sepal.Length","Petal.Length")],1.6)
stripchart(new.coord~Species,data=iris,col=1:3)
Comparison of 2 Pearson's linear correlation coefficients
Description
Performs the test for equality of 2 Pearson's correlation coefficients. If the difference is not significant, the function returns the common coefficient, its confidence interval and performs the test for equality to a given value.
Usage
cor.2comp(var1, var2, var3, var4, alpha = 0.05, conf.level = 0.95, theo = 0)
Arguments
var1 |
numeric vector (first variable of the first correlation). |
var2 |
numeric vector (second variable of the first correlation). |
var3 |
numeric vector (first variable of the second correlation). |
var4 |
numeric vector (second variable of the second correlation). |
alpha |
significance level. |
conf.level |
confidence level. |
theo |
theoretical coefficient. |
Value
method.test |
a character string giving the name of the global test computed. |
data.name |
a character string giving the name(s) of the data. |
statistic |
test statistics. |
p.value |
p-value for comparison of the 2 coefficients. |
null.value |
the value of the difference in coefficients under the null hypothesis, always 0. |
alternative |
a character string describing the alternative hypothesis. |
estimate |
the estimated correlation coefficients. |
alpha |
significance level. |
conf.level |
confidence level. |
common.name |
a character string explaining the elements of the table below. |
common |
data frame of results if the coefficients are not significantly different (common coefficient). |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
cor1.var1 <- 1:30+rnorm(30,0,2)
cor1.var2 <- 1:30+rnorm(30,0,3)
cor2.var1 <- (-1):-30+rnorm(30,0,2)
cor2.var2 <- (-1):-30+rnorm(30,0,3)
cor.2comp(cor1.var1,cor1.var2,cor2.var1,cor2.var2)
Equality of a Pearson's linear correlation coefficient to a given value
Description
Performs a test for equality of a Pearson's linear correlation coefficient to a given value.
Usage
cor.conf(var1, var2, theo)
Arguments
var1 |
numeric vector (first variable). |
var2 |
numeric vector (second variable). |
theo |
theoretical value. |
Value
method |
a character string giving the name of the test. |
data.name |
a character string giving the name(s) of the data. |
statistic |
test statistics. |
p.value |
p-value of the test. |
null.value |
the value of the theoretical coefficient. |
alternative |
a character string describing the alternative hypothesis. |
estimate |
the estimated correlation coefficient. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
var1 <- 1:30+rnorm(30,0,4)
var2 <- 1:30+rnorm(30,0,4)
cor.conf(var1,var2,theo=0.5)
Comparison of several Pearson's linear correlation coefficients
Description
Performs comparisons of several Pearson's linear correlation coefficients. If no difference, the function returns the common correlation coefficient, its confidence interval and test for its equality to a given value. If the difference is significant, the functions performs pairwise comparisons between coefficients.
Usage
cor.multcomp(var1, var2, fact, alpha = 0.05, conf.level = 0.95, theo = 0,
p.method = "fdr")
Arguments
var1 |
numeric vector (first variable). |
var2 |
numeric vector (second variable). |
fact |
factor (groups). |
alpha |
significance level. |
conf.level |
confidence level. |
theo |
theoretical coefficient. |
p.method |
method for p-values correction. See help of |
Value
method.test |
a character string giving the name of the global test computed. |
data.name |
a character string giving the name(s) of the data. |
statistic |
test statistics. |
parameter |
test degrees of freedom. |
p.value |
p-value for comparison of the coefficients. |
null.value |
the value of the difference in coefficients under the null hypothesis, always 0. |
alternative |
a character string describing the alternative hypothesis. |
estimate |
the estimated correlation coefficients. |
alpha |
significance level. |
conf.level |
confidence level. |
p.adjust.method |
method for p-values correction. |
p.value.multcomp |
data frame of pairwise comparisons result. |
common.name |
a character string explaining the elements of the table below. |
common |
data frame of results if the coefficients are not significantly different (common coefficient). |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
var1 <- c(1:15+rnorm(15,0,4),1:15+rnorm(15,0,1),1:15+rnorm(15,0,8))
var2 <- c(-1:-15+rnorm(15,0,4),1:15+rnorm(15,0,1),1:15+rnorm(15,0,8))
fact <- gl(3,15,labels=LETTERS[1:3])
cor.multcomp(var1,var2,fact)
var3 <- c(1:15+rnorm(15,0,1),1:15+rnorm(15,0,3),1:15+rnorm(15,0,2))
cor.multcomp(var1,var3,fact)
Significance test for the covariance between two datasets
Description
Performs a permutation test based on the sum of square covariance between variables of two datasets, to test wether the (square) covariance is higher than expected under random association between the two datasets. The test is relevent parallel to a 2B-PLS.
Usage
cov.test(X, Y, scale.X = TRUE, scale.Y = TRUE, nperm = 999, progress = TRUE)
Arguments
X |
a numeric vector, matrix or data frame. |
Y |
a numeric vector, matrix or data frame. |
scale.X |
logical, if |
scale.Y |
logical, if |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
Details
The function deals with the limitted floating point precision, which can bias calculation of p-values based on a discrete test statistic distribution.
Value
method |
a character string indicating the name of the test. |
data.name |
a character string giving the name(s) of the data, plus additional information. |
statistic |
the value of the test statistics. |
permutations |
the number of permutations. |
p.value |
the p-value of the test. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Martingale residuals of a Cox model
Description
Plots martingale residuals of a Cox model against fitted values, to check for log-linearity of covariates.
Usage
cox.resid(model)
Arguments
model |
a |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>, based on an idea of John Fox.
References
Fox, J. 2002 Cox Proportional-Hazards Regression for Survival Data.
See Also
Examples
# 'kidney' dataset of package 'survival'
require(survival)
data(kidney)
model <- coxph(Surv(time,status)~age+factor(sex),data=kidney)
cox.resid(model)
Cramer's association coefficient
Description
Computes the Cramer's association coefficient between 2 nominal variables.
Usage
cramer(x, y)
Arguments
x |
a contingency table ('matrix' or 'table' object). |
y |
ignored if |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Examples
var1 <- sample(LETTERS[1:3],30,replace=TRUE)
var2 <- sample(letters[1:3],30,replace=TRUE)
cramer(var1,var2)
# or cramer(table(var1,var2))
Cramer's association coefficient
Description
Computes the Cramer's association coefficient between 2 nominal variables, its confidence interval (by bootstraping) and tests for its significance.
Usage
cramer.test(x, y, nrep = 1000, conf.level = 0.95)
Arguments
x |
a contingency table ('matrix' or 'table' object). |
y |
ignored if |
nrep |
number of replicates for bootstraping. |
conf.level |
confidence level. |
Value
method |
name of the test. |
statistic |
test statistics. |
parameter |
test degrees of freedom. |
p.value |
test p-value. |
data.name |
a character string giving the names of the data. |
estimate |
Cramer's coefficient. |
conf.level |
confidence level. |
rep |
number of replicates. |
conf.int |
confidence interval. |
alternative |
a character string giving the alternative hypothesis, always |
null.value |
the value of the association measure under the null hypothesis, always 0. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
var1 <- sample(LETTERS[1:3],30,replace=TRUE)
var2 <- sample(letters[1:3],30,replace=TRUE)
cramer.test(var1,var2)
# or cramer.test(table(var1,var2))
Coefficient of variation
Description
Computes the coefficient of variation of a vector.
Usage
cv(x, abs = TRUE, pc = TRUE)
Arguments
x |
numeric vector. |
abs |
logical. If |
pc |
logical. If |
Details
The function deals with missing values.
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Examples
cv(rnorm(30))
Dendrogram and number of groups to be chosen
Description
Draws a dendrogram and an additional bar plot helping to choose the number of groups to be retained (based on the dendrogram).
Usage
dendro.gp(dend)
Arguments
dend |
a dendrogram obtained from |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
data(iris)
distances <- dist(iris[,1:4],method="euclidian")
dendro <- hclust(distances,method="ward.D2")
dendro.gp(dendro)
Deprecated functions in RVAideMemoire package
Description
Functions that are not usable anymore, and will be entirely removed from the package in future versions.
Usage
back.lsmeans(...)
byf.normhist(...)
cor.sparse(...)
CvM.test(...)
DA.confusion(...)
DA.valid(...)
DA.var(...)
dunn.test(...)
fc.multcomp(...)
friedman.rating.test(...)
kruskal.rating.test(...)
pairwise.manova(...)
pairwise.to.groups(...)
pairwise.wilcox.rating.test(...)
plot1comp.ind(...)
plot1comp.var(...)
PLSDA.ncomp(...)
PLSDA.test(...)
rating.lsmeans(...)
s.corcircle2(...)
scat.mix.categorical(...)
scat.mix.numeric(...)
scatter.coa2(...)
wilcox.paired.rating.multcomp(...)
wilcox.rating.signtest(...)
wilcox.rating.test(...)
Arguments
... |
previous arguments. |
Details
back.lsmeans
and rating.lsmeans
are replaced by back.emmeans
and rating.emmeans
. More generally, stop using package lsmeans
and change to package emmeans
, its new version.
byf.normhist
was not very useful and byf.hist
does nearly the same job.
cor.sparse
is replaced by the more generic MVA.plot
.
CvM.test
did actually not perform a Cramer-von Mises test but an alternative Cramer test. Use cramer.test
from package cramer
directly, on which CvM.test
was based.
DA.confusion
and DA.valid
are replaced by the more generic MVA.cmv
and MVA.cv
.
DA.var
is replaced by the more generic MVA.synt
.
dunn.test
is not very useful, prefer dunnTest
from package FSA
.
fc.multcomp
is not useful anymore since emtrends
(package emmeans) does the same job in a much more powerful manner (see argument var
of emtrends
).
friedman.rating.test
, kruskal.rating.test
, wilcox.rating.test
, wilcox.rating.signtest
, pairwise.wilcox.rating.test
and wilcox.paired.rating.multcomp
can be problematic with ratings (in which ties and zeroes are very frequent). The use of CLM(M)s (via clm
and clmm
) is recommended.
pairwise.manova
is not useful anymore since emmeans
(package emmeans
) does the same job in a much more powerful manner (on "mlm"
objects, created by lm
and not manova
)
pairwise.to.groups
was not very useful.
plot1comp.ind
, plot1comp.var
, s.corcircle2
, scat.mix.categorical
, scat.mix.numeric
and scatter.coa2
are replaced by the more generic MVA.plot
.
PLSDA.ncomp
was not really useful and mvr
does nearly the same job.
PLSDA.test
is replaced by the more generic MVA.test
.
Dummy responses
Description
Creates a matrix of dummy responses from a factor. Needed in some multivariate analyses.
Usage
dummy(f, simplify = TRUE)
Arguments
f |
vector (internally transformed into factor). |
simplify |
logical indicating if the last column of the response matrix should be removed (to avoid model overfitting). |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Examples
fac <- gl(3,5,labels=LETTERS[1:3])
dummy(fac)
Empirical logistic transformation
Description
Empirical logistic transformation for logistic models with data showing (quasi-)complete separation. The function is intended to be used as a link function in GLM(M)s.
Usage
elogis()
Author(s)
Formula from McCullagh & Nelder in their seminal book 'Generalized Linear Models'. R code from Eric Wajnberg & Jean-Sebastien Pierre.
Examples
# An example with 3 groups and complete separation (from E. Wajnberg)
tab <- data.frame(case=letters[1:3],yes=c(25,30,0),no=c(1,0,20))
tab
## Not run:
mod <- glm(cbind(yes,no)~case,family=binomial(link=elogis()),data=tab)
# Warnings are normal
summary(mod)
## End(Not run)
Fisher's exact test for binary variables
Description
Performs a Fisher's exact test for comparing response probabilities (i.e. when the response variable is a binary variable). The function is in fact a wrapper to the Fisher's exact test for count data. If the p-value of the test is significant, the function performs pairwise comparisons by using Fisher's exact tests.
Usage
fisher.bintest(formula, data, alpha = 0.05, p.method = "fdr")
Arguments
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
alpha |
significance level to compute pairwise comparisons. |
p.method |
method for p-values correction. See help of |
Details
If the response is a 0/1 variable, the probability of the '1' group is tested. In any other cases, the response is transformed into a factor and the probability of the second level is tested.
Since chi-squared and G tests are approximate tests, exact tests are preferable when the number of individuals is small (200 is a reasonable minimum).
Value
method.test |
a character string giving the name of the global test computed. |
data.name |
a character string giving the name(s) of the data. |
alternative |
a character string describing the alternative hypothesis. |
estimate |
the estimated probabilities. |
null.value |
the value of the difference in probabilities under the null hypothesis, always 0. |
p.value |
p-value of the global test. |
alpha |
significance level. |
p.adjust.method |
method for p-values correction. |
p.value.multcomp |
data frame of pairwise comparisons result. |
method.multcomp |
a character string giving the name of the test computed for pairwise comparisons. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
response <- c(0,0,0,0,0,0,1,0,0,0,0,0,1,0,1,1,1,0,0,1,1,1,1,1,1,0,0,1,1,1)
fact <- gl(3,10,labels=LETTERS[1:3])
fisher.bintest(response~fact)
Pairwise comparisons using Fisher's exact test
Description
Performs pairwise comparisons after a comparison of proportions or after a test for independence of 2 categorical variables, by using a Fisher's exact test.
Usage
fisher.multcomp(tab.cont, p.method = "fdr")
Arguments
tab.cont |
contingency table. |
p.method |
method for p-values correction. See help of |
Details
Since chi-squared and G tests are approximate tests, exact tests are preferable when the number of individuals is small (200 is a reasonable minimum).
Value
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.adjust.method |
method for p-values correction. |
p.value |
table of results of pairwise comparisons. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
chisq.test
, prop.test
, fisher.test
Examples
# 2-column contingency table: comparison of proportions
tab.cont1 <- matrix(c(17,23,12,24,20,10),ncol=2,dimnames=list(c("Control",
"Treatment1","Treatment2"),c("Alive","Dead")),byrow=TRUE)
fisher.test(tab.cont1)
fisher.multcomp(tab.cont1)
# 3-column contingency table: independence test
tab.cont2 <- as.table(matrix(c(25,10,12,6,15,14,9,16,9),ncol=3,dimnames=list(c("fair",
"dark","russet"),c("blue","brown","green"))))
fisher.test(tab.cont2)
fisher.multcomp(tab.cont2)
Fligner-Policello test
Description
Performs a Fligner-Policello test of the null that the medians in the two groups (samples) are the same.
Usage
fp.test(x, ...)
## Default S3 method:
fp.test(x, y, delta = 0, alternative = "two.sided", ...)
## S3 method for class 'formula'
fp.test(formula, data, subset, ...)
Arguments
x |
a numeric vector of data values. |
y |
a numeric vector of data values. |
delta |
null difference in medians tested. |
alternative |
a character string specifying the alternative hypothesis, must be one of |
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
subset |
an optional vector specifying a subset of observations to be used. |
... |
further arguments to be passed to or from other methods. |
Details
The Fligner-Policello test does not assume that the shape of the distribution is similar in two groups, contrary to the Mann-Whitney-Wilcoxon test. However, it assumes that the the distributions are symmetric.
Value
statistic |
test statistics. |
p.value |
p-value of the test. |
alternative |
a character string describing the alternative hypothesis. |
method |
a character string indicating the name of the test. |
data.name |
a character string giving the names of the data. |
null.value |
the specified hypothesized value of the median difference. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
x <- rpois(20,3)
y <- rpois(20,5)
fp.test(x,y)
Individual contributions in regression
Description
Computes difference in regression parameters when each individual is dropped, expressed in proportion of the whole regression coefficients. The function deals with lm
(including glm) and least.rect
models.
Usage
ind.contrib(model, print.diff = FALSE, graph = TRUE, warning=25)
Arguments
model |
model (of class |
print.diff |
logical. If |
graph |
logical. If |
warning |
level of graphical warning. |
Value
coefficients |
coefficients of each computed regression. |
coefficients.diff |
difference in coefficients between each computed regression and the whole regression. |
coefficients.prop |
difference in coefficients expressed in proportion of the whole regression coefficients. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
x <- 1:30
y <- 1:30+rnorm(30,0,4)
model1 <- lm(y~x)
model2 <- least.rect(y~x)
ind.contrib(model1)
ind.contrib(model2)
Least rectangles linear regression
Description
Fits a least rectangle linear regression, possibly for each level of a factor.
Usage
least.rect(formula, data, conf.level = 0.95, theo = 1, adj = TRUE)
Arguments
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
conf.level |
confidence level. |
theo |
theoretical value of the slope. If several regression are fitted, the same value is used for all comparisons of slope vs. theoretical value. |
adj |
logical indicating if, in case of several regressions fitted, confidence intervals and p-values should be Bonferroni-corrected for multiple testing. |
Value
coefficients |
regression parameters. |
residuals |
residuals. |
fitted.values |
fitted values. |
call |
the matched call. |
model |
the model frame used. |
conf.level |
confidence level. |
conf.int |
confidence interval of regression parameters. |
theo |
theoretical value of the slope. |
comp |
data frame of results for equality of the slope(s) to the theoretical value. |
corr |
data frame of results for significativity of the correlation coefficient(s). |
multiple |
logical, |
adj |
logical, |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Examples
x <- 1:30+rnorm(30,0,3)
y <- 1:30+rnorm(30,0,3)
regression1 <- least.rect(y~x)
summary(regression1)
x2 <- c(1:30,1:30)
y2 <- c(1:30+rnorm(30,0,3),seq(10,22,12/29)+rnorm(30,0,3))
fact <- gl(2,30,labels=LETTERS[1:2])
regression2 <- least.rect(y2~x2|fact)
summary(regression2)
Slope of a hand-defined line
Description
Returns the slope of a line defined by selecting two points on a graph.
Usage
loc.slp()
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Graphical adujstment of a simple binary logistic regression to data
Description
Cuts the data into intervals, compute the response probability and its standard error for each interval and add the results to the regression curve. No test is performed but this permits to have a graphical idea of the adjustment of the model to the data.
Usage
logis.fit(model, int = 5, ...)
Arguments
model |
|
int |
number of intervals. |
... |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
x <- 1:50
y <- c(rep(0,18),sample(0:1,14,replace=TRUE),rep(1,18))
model <- glm(y~x,family=binomial)
plot(x,y)
lines(x,model$fitted)
logis.fit(model)
Creating a nls model for logistic regression from fitted values of a glm model
Description
Adds some noise to the fitted values of a glm
model to create a nls
model for logistic regression (creating a nls
model from exact fitted values can not be done, see help of nls
).
Usage
logis.noise(model, intensity = 25)
Arguments
model |
|
intensity |
intensity of the noise: lower the value, bigger the noise. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
x <- 1:50
y <- c(rep(0,18),sample(0:1,14,replace=TRUE),rep(1,18))
model <- glm(y~x,family=binomial)
y2 <- logis.noise(model)
# Then model2 <- nls(y2~SSlogis(...))
Mode
Description
Computes the mode of a vector. The function makes the difference between continuous and discontinuous variables (which are made up of integers only). By extention, it also gives the most frequent value in a character vector or a factor.
Usage
mod(x)
Arguments
x |
vector (numeric, character or factor). |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
# Continuous variable
x <- rnorm(100)
mod(x)
# Discontinuous variable
y <- rpois(100,2)
mod(y)
# Character vector
z <- sample(LETTERS[1:3],20,replace=TRUE)
mod(z)
Mood's median test
Description
Performs a Mood's median test to compare medians of independent samples.
Usage
mood.medtest(x, ...)
## Default S3 method:
mood.medtest(x, g, exact = NULL, ...)
## S3 method for class 'formula'
mood.medtest(formula, data, subset, ...)
Arguments
x |
a numeric vector of data values. |
g |
a vector or factor object giving the group for the corresponding elements of |
exact |
a logical indicating whether an exact p-value should be computed. |
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
subset |
an optional vector specifying a subset of observations to be used. |
... |
further arguments to be passed to or from other methods. |
Details
If exact=NULL
, a Fisher's exact test is used if the number of data values is < 200; otherwise a chi-square test is used, with Yates continuity correction if necessary.
Value
method |
a character string indicating the name of the test. |
data.name |
a character string giving the name(s) of the data. |
statistic |
the value the chi-squared test statistic (in case of a chis-square test). |
parameter |
the degrees of freedom of the approximate chi-squared distribution of the test statistic (in case of a chis-square test). |
p.value |
the p-value of the test. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Examples
set.seed(1716)
response <- c(rnorm(10,3,1.5),rnorm(10,5.5,2))
fact <- gl(2,10,labels=LETTERS[1:2])
mood.medtest(response~fact)
Multivariate normality QQ-Plot
Description
Draws a QQ-plot to assess multivariate normality.
Usage
mqqnorm(x, main = "Multi-normal Q-Q Plot")
Arguments
x |
a data frame or a matrix of numeric variables (each column giving a variable). |
main |
title of the graph. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
x <- 1:30+rnorm(30)
y <- 1:30+rnorm(30,1,3)
mqqnorm(cbind(x,y))
Shapiro-Wilk multivariate normality test
Description
Performs a Shapiro-Wilk test to asses multivariate normality. This is a slightly modified copy of the mshapiro.test
function of the package mvnormtest, for internal convenience.
Usage
mshapiro.test(x)
Arguments
x |
a data frame or a matrix of numeric variables (each column giving a variable). |
Value
method |
name of the test. |
data.name |
a character string giving the names of the data. |
statistic |
test statistics of the test. |
p.value |
p-value of the test. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr> from the work of Slawomir Jarek
See Also
Examples
x <- 1:30+rnorm(30)
y <- 1:30+rnorm(30,1,3)
mshapiro.test(cbind(x,y))
Pairwise comparisons after an exact multinomial test
Description
Performs pairwise comparisons after a global exact multinomial test. These comparisons are performed using exact binomial tests.
Usage
multinomial.multcomp(x, p.method = "fdr")
Arguments
x |
numeric vector (counts). Can also be a factor; in that case |
p.method |
method for p-values correction. See help of |
Details
Since chi-squared and G tests are approximate tests, exact tests are preferable when the number of individuals is small (200 is a reasonable minimum).
An exact multinomial test with two groups is strictly the same than an exact binomial test.
Value
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.adjust.method |
method for p-values correction. |
p.value |
table of results. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
counts <- c(5,15,23)
multinomial.test(counts)
multinomial.multcomp(counts)
Exact multinomial test
Description
Perfoms an exact multinomial test on a vector of counts.
Usage
multinomial.test(x, p = rep(1/length(x), length(x)))
Arguments
x |
numeric vector (counts). Can also be a factor; in that case |
p |
theoretical proportions (optional). |
Details
The function works as chisq.test
or G.test
:
- if theoretical proportions are not given, equality of counts is tested
- if theoretical proportions are given, equality of counts to theoretical counts (given by theoretical proportions) is tested.
Since chi-squared and G tests are approximate tests, exact tests are preferable when the number of individuals is small (200 is a reasonable minimum).
Be aware that the calculation time increases with the number of individuals (i.e. the sum of x
) and the number of groups (i.e. the length of x
).
An exact multinomial test with two groups is strictly the same as an exact binomial test.
Value
method |
name of the test. |
p.value |
p-value. |
data.name |
a character string giving the name(s) of the data. |
observed |
the observed counts. |
expected |
the expected counts under the null hypothesis. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr> based on multinomial.test
See Also
chisq.test
, G.test
, binom.test
, multinomial.multcomp
, multinomial.theo.multcomp
Examples
counts <- c(5,15,23)
multinomial.test(counts)
Pairwise comparisons after an exact multinomial test for given probabilities
Description
Performs pairwise comparisons after a global exact multinomial test for given probabilities. These comparisons are performed using exact binomial tests.
Usage
multinomial.theo.multcomp(x, p = rep(1/length(x), length(x)), prop = FALSE,
p.method = "fdr")
Arguments
x |
numeric vector (counts). Can also be a factor; in that case |
p |
theoretical proportions. |
prop |
logical indicating if results should be printed as counts ( |
p.method |
method for p-values correction. See help of |
Details
Since chi-squared and G tests are approximate tests, exact tests are preferable when the number of individuals is small (200 is a reasonable minimum).
An exact multinomial test with two groups is strictly the same than an exact binomial test.
Value
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
observed |
observed counts. |
expected |
expected counts. |
p.adjust.method |
method for p-values correction. |
p.value2 |
corrected p-values. |
p.value |
data frame of results. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
counts <- c(5,15,23)
p.theo <- c(0.2,0.5,0.3)
multinomial.test(counts,p=p.theo)
multinomial.theo.multcomp(counts,p=p.theo)
Univariate correlation test for multiple variables
Description
Performs correlation tests between one variable and a series of other variables, and corrects p-values.
Usage
multtest.cor(mult.var, uni.var, method = "pearson", p.method = "fdr",
ordered = TRUE)
## S3 method for class 'multtest.cor'
plot(x, arrows = TRUE, main = NULL, pch = 16,
cex = 1, col = c("red", "orange", "black"), labels = NULL, ...)
Arguments
mult.var |
data frame containing a series of numeric variables. |
uni.var |
numeric variable (vector). |
method |
a character string indicating which correlation coefficient is to be computed. See help of |
p.method |
method for p-values correction. See help of |
ordered |
logical indicating if variables should be ordered based on correlation values. |
x |
object returned from |
arrows |
logical indicating if arrows should be plotted. If |
main |
optional title of the graph. |
pch |
symbol(s) used for points, when points are displayed (see |
cex |
size of points and labels (see help of |
col |
vector of three colors: first for variables with P < 0.05, second for variables with 0.05 < P < 0.1, third for variables with P > 0.1. Recycled if only one value. |
labels |
names of the variables. If |
... |
not used. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
data(iris)
# Original coordinates
plot(Petal.Length~Sepal.Length,pch=16,col=as.numeric(iris$Species),data=iris)
# New axis
abline(-6,1.6)
# Coordinates on new axis
new.coord <- coord.proj(iris[,c("Sepal.Length","Petal.Length")],1.6)
# Correlation between the whole dataset and new coordinates
mult.cor <- multtest.cor(iris[,1:4],new.coord)
plot(mult.cor)
Univariate comparison of groups for multiple variables
Description
Performs group comparisons for multiple variables using parametric, permutational or rank tests, and corrects p-values. Gives also group means and standards errors for each variable.
Usage
multtest.gp(tab, fac, test = c("param", "perm", "rank"),
transform = c("none", "sqrt", "4rt", "log"), add = 0, p.method = "fdr",
ordered = TRUE, ...)
## S3 method for class 'multtest.gp'
plot(x, signif = FALSE, alpha = 0.05,
vars = NULL, xlab = "Group", ylab = "Mean (+/- SE) value",
titles = NULL, groups = NULL, ...)
Arguments
tab |
data frame containing response variables. |
fac |
factor defining groups to compare. |
test |
type of test to use: parametric (default), permutational (non parametric) or rank-based (non parametric). See Details. |
transform |
transformation to apply to response variables before testing (none by default; |
add |
value to add to response variables before a log-transformation. |
p.method |
method for p-values correction. See help of |
ordered |
logical indicating if variables should be ordered based on p-values. |
x |
object returned from |
signif |
logical indicating if only variables with significant P-value should be plotted. |
alpha |
significance threshold. |
vars |
numeric vector giving variables to plot (rows of |
xlab |
legend of the x axis. |
ylab |
legend of the y axis |
titles |
titles of the graphs (name of the variables by default). |
groups |
names of the bars (levels of |
... |
additional arguments to testing functions in |
Details
In case of parametric tests, t-tests or ANOVAs are used depending on the number of groups (2 or more, respectively). In case of permutational tests: permutational t-tests or permutational ANOVAs. In case of rank-based tests: Mann-Whitney-Wilcoxon or Kruskal-Wallis tests.
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
data(iris)
mult <- multtest.gp(iris[,1:4],iris$Species)
plot(mult)
Univariate comparison of groups for multiple binary variables
Description
Performs group comparisons for multiple binary variables using a parametric or an exact test, and corrects p-values. Gives also group proportions and standards errors for each variable.
Usage
multtest.gp.bin(tab, fac, test = c("LRT", "Fisher"),
p.method = "fdr", ordered = TRUE, ...)
## S3 method for class 'multtest.gp.bin'
plot(x, signif = FALSE, alpha = 0.05,
vars = NULL, xlab = "Group", ylab = "Mean (+/- SE) proportion",
titles = NULL, groups = NULL, ...)
Arguments
tab |
data frame containing response variables. |
fac |
factor defining groups to compare. |
test |
type of test to use: likelihood ratio test based on binomial GLM ( |
p.method |
method for p-values correction. See help of |
ordered |
logical indicating if variables should be ordered based on p-values. |
x |
object returned from |
signif |
logical indicating if only variables with significant P-value should be plotted. |
alpha |
significance threshold. |
vars |
numeric vector giving variables to plot (rows of |
xlab |
legend of the x axis. |
ylab |
legend of the y axis |
titles |
titles of the graphs (name of the variables by default). |
groups |
names of the bars (levels of |
... |
additional arguments to testing functions in |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
multtest.gp
, Anova
, fisher.test
Re-computation of an ordination using given row weights
Description
Re-computes an ordination using given row weights (possibly extracted from a correspondence analysis). The function is intended to be used prior to coinertia
when row weights have to be equalized.
Usage
ord.rw(ord, CA = NULL, rw = NULL)
Arguments
ord |
an ordination to re-compute. Must come from the |
CA |
an optional correspondence analysis from which row weights should be extracted. Must come from |
rw |
an optional vector of row weights. Used only is |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Estimation of overdispersion with glmer
models
Description
Estimates residual deviance and residual degrees of freedom to check for overdispersion with glmer
models. This function is directly comming from http://glmm.wikidot.com/faq
.
Usage
overdisp.glmer(model)
Arguments
model |
a model fitted by |
Author(s)
Ben Bolker
See Also
Examples
require(lme4)
# Example from the 'glmer' function
gm1 <- glmer(cbind(incidence,size-incidence)~period+(1|herd),
family="binomial",data=cbpp)
overdisp.glmer(gm1)
Rescaling of a [0,1] variable into the (0,1) interval (and vice-versa)
Description
The function uses the formula presented in Douma & Weedon (2019). It is primarily intended to be used in beta regression (regression for continuous proportions) when data contain zeroes and/or ones, but can be applied to any variable initially bounded in the [0,1] interval when rescaling is necessary. The function can also perform back-transformation.
Usage
p.beta(p, n = length(p), C = 2, back = FALSE)
Arguments
p |
numeric vector of values in the [0,1] interval. |
n |
total number of observations in the initial data set. Not very useful when the transformation is applied to the initial data set, but needed when back-transformation is applied from predicted values. |
C |
number of categories from which |
back |
logical. If |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr> from the following paper: Douma JC & Weedon JT (2019) Analysing continuous proportions in ecology and evolution: A practical introduction to beta and Dirichlet regression. Methods in Ecology and Evolution, 10: 1412-1430
Examples
# A fictive example with four animals performing a behavioral choice-test where time
# can be spent in three branches (total time 20 min)
(tab <- data.frame(Individual=c("Ind1","Ind2","Ind3","Ind4"),Branch1=c(0,12,20,4),
Branch2=c(8,4,0,6),Branch3=c(12,4,0,10)))
# Raw proportions of time spent in branch 1:
(p1 <- tab$Branch1/rowSums(tab[,-1]))
# Scaled proportions:
p.beta(p1,C=3)
Pairwise comparisons for CDA
Description
Performs pairwise comparisons between group levels with corrections for multiple testing, using CDA.test
.
Usage
pairwise.CDA.test(X, fact, ncomp = NULL, p.method = "fdr", ...)
Arguments
X |
a data frame of dependent variables (typically contingency or presence-absence table). |
fact |
factor giving the groups. |
ncomp |
an integer giving the number of components to be used for the test. If |
p.method |
method for p-values correction. See help of |
... |
other arguments to pass to |
Details
See CDA.test
.
Value
method |
a character string indicating what type of tests were performed. |
data.name |
a character string giving the name(s) of the data. |
p.value |
table of results. |
p.adjust.method |
method for p-values correction. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
require(ade4)
data(perthi02)
CDA.test(perthi02$tab,perthi02$cla)
pairwise.CDA.test(perthi02$tab,perthi02$cla)
Pairwise comparisons for proportions using G-tests
Description
Performs pairwise comparisons between pairs of proportions with correction for multiple testing.
Usage
pairwise.G.test(x, p.method = "fdr")
Arguments
x |
matrix with 2 columns giving the counts of successes and failures, respectively. |
p.method |
method for p-values correction. See help of |
Details
Since a G-test is an approximate test, an exact test is preferable when the number of individuals is small (200 is a reasonable minimum). See fisher.multcomp
in that case.
Value
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.adjust.method |
method for p-values correction. |
p.value |
table of results. |
See Also
Examples
x <- matrix(c(44,56,36,64,64,40),ncol=2,dimnames=list(c("Control","Treatment1","Treatment2"),
c("Alive","Dead")),byrow=TRUE)
G.test(x)
pairwise.G.test(x)
Pairwise permutation tests based on cross (model) validation
Description
Performs pairwise comparisons between group levels with corrections for multiple testing, using MVA.test
.
Usage
pairwise.MVA.test(X, fact, p.method = "fdr", cmv = FALSE, ncomp = 8,
kout = 7, kinn = 6, model = c("PLS-DA", "PPLS-DA", "LDA", "QDA",
"PLS-DA/LDA", "PLS-DA/QDA", "PPLS-DA/LDA","PPLS-DA/QDA"),
nperm = 999, progress = TRUE, ...)
Arguments
X |
a data frame of independent variables. |
fact |
grouping factor. |
p.method |
method for p-values correction. See help of |
cmv |
a logical indicating if the test statistic (NMC) should be generated through cross-validation (classical K-fold process) or cross model validation (inner + outer loops). |
ncomp |
an integer giving the number of components to be used to generate all submodels (cross-validation) or the maximal number of components to be tested in the inner loop (cross model validation). Can be re-set internally if needed. Does not concern LDA and QDA. |
kout |
an integer giving the number of folds (cross-validation) or the number of folds in the outer loop (cross-model validation). Can be re-set internally if needed. |
kinn |
an integer giving the number of folds in the inner loop (cross model validation only). Can be re-set internally if needed. Cannot be |
model |
the model to be fitted. |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
... |
other arguments to pass to |
Details
The function deals with the limitted floating point precision, which can bias calculation of p-values based on a discrete test statistic distribution.
Value
method |
a character string indicating what type of tests were performed. |
data.name |
a character string giving the name(s) of the data. |
p.value |
table of results. |
p.adjust.method |
method for p-values correction. |
permutations |
number of permutations. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
require(pls)
data(mayonnaise)
# PPLS-DA
## Not run: pairwise.MVA.test(mayonnaise$NIR,factor(mayonnaise$oil.type),model="PPLS-DA")
# The function needs a long calculation time!
Pairwise comparisons of groups displayed on a factorial map
Description
Performs pairwise comparisons between group levels with corrections for multiple testing. Tests are computed using factorfit
.
Usage
pairwise.factorfit(ord, fact, xax = 1, yax = 2, nperm = 999,
p.method = "fdr", ...)
Arguments
ord |
any multivariate analysis handled by |
fact |
grouping factor. |
xax |
first axis of the factorial map. |
yax |
second axis of the factorial map. |
nperm |
number of permutations. |
p.method |
method for p-values correction. See help of |
... |
optional further agruments to |
Value
method |
a character string giving the name of the test. |
data.name |
a character string giving the name(s) of the data and the number of permutations. |
p.value |
table of results. |
p.adjust.method |
method for p-values correction. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
require(vegan)
data(iris)
PCA <- rda(iris[,1:4])
MVA.plot(PCA,fac=iris$Species,col=1:3)
# Global test
envfit(PCA~Species,data=iris)
# Pairwise comparisons
# (not enough permutations here but faster to run)
pairwise.factorfit(PCA,iris$Species,nperm=49)
Pairwise Mood's median tests
Description
Performs pairwise comparisons between group levels with corrections for multiple testing.
Usage
pairwise.mood.medtest(resp, fact, exact = NULL, p.method = "fdr")
Arguments
resp |
response vector. |
fact |
grouping factor. |
exact |
a logical indicating whether exact p-values should be computed. |
p.method |
method for p-values correction. See help of |
Details
If exact=NULL
, Fisher's exact tests are used if the number of data values is < 200; otherwise chi-square tests are used (with Yates continuity correction).
Value
method |
a character string indicating the name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.value |
table of results. |
p.adjust.method |
method for p-values correction. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
set.seed(0904)
response <- c(rnorm(10),rnorm(10,0.8),rnorm(10,2))
fact <- gl(3,10,labels=LETTERS[1:3])
mood.medtest(response~fact)
pairwise.mood.medtest(response,fact)
Pairwise permutation MANOVAs
Description
Performs pairwise comparisons between group levels with corrections for multiple testing. These pairwise comparisons are relevant after a permutation MANOVA, such as performed by adonis2
.
Usage
pairwise.perm.manova(resp, fact, test = c("Pillai", "Wilks",
"Hotelling-Lawley", "Roy", "Spherical"), nperm = 999,
progress = TRUE, p.method = "fdr", F = FALSE, R2 = FALSE)
Arguments
resp |
response. Either a matrix (one column per variable; objects of class |
fact |
grouping factor. |
test |
choice of test statistic when |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
p.method |
method for p-values correction. See help of |
F |
should the table of F values be returned? |
R2 |
should the table of R2 values be returned? For tests based on distance matrices only. |
Details
If resp
is a matrix, a classical MANOVA is performed and the distribution of the (pseudo-)F is computed through permutations. The function deals with the limitted floating point precision, which can bias calculation of p-values based on a discrete test statistic distribution.
If resp
is a distance matrix, adonis2
is used to perform each comparison.
Value
method |
a character string giving the name of the test. |
data.name |
a character string giving the name(s) of the data and the number of permutations. |
p.value |
table of results. |
p.adjust.method |
method for p-values correction. |
F.value |
table of F values (if required). |
R2.value |
table of R2 values (if required). |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
require(vegan)
data(iris)
# permutation MANOVA
adonis2(iris[,1:4]~Species,data=iris,method="euclidean")
# Pairwise comparisons
# (not enough permutations here but faster to run)
pairwise.perm.manova(iris[,1:4],iris$Species,nperm=49)
# or
pairwise.perm.manova(dist(iris[,1:4],"euclidean"),iris$Species,nperm=49)
Pairwise permutation t tests
Description
Performs pairwise comparisons between group levels with corrections for multiple testing.
Usage
pairwise.perm.t.test(resp, fact, p.method = "fdr", paired = FALSE,
alternative = c("two.sided","less", "greater"), nperm = 999,
progress = TRUE)
Arguments
resp |
response vector. |
fact |
grouping factor. |
p.method |
method for p-values correction. See help of |
paired |
a logical indicating whether you want paired (permutation) t-tests. |
alternative |
a character string specifying the alternative hypothesis, must be one of |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
Details
The function deals with the limitted floating point precision, which can bias calculation of p-values based on a discrete test statistic distribution.
Value
method |
a character string indicating what type of t-tests were performed. |
data.name |
a character string giving the name(s) of the data. |
p.value |
table of results. |
p.adjust.method |
method for p-values correction. |
permutations |
number of permutations. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
set.seed(1203)
response <- c(rnorm(5),rpois(5,0.5),rnorm(5,2,1))
fact <- gl(3,5,labels=LETTERS[1:3])
# Not enough permutations here but it runs faster
# permutation ANOVA
perm.anova(response~fact,nperm=49)
# Pairwise comparisons
pairwise.perm.t.test(response,fact,nperm=49)
Pairwise permutation F tests
Description
Performs pairwise comparisons between group levels with corrections for multiple testing.
Usage
pairwise.perm.var.test(resp, fact, p.method = "fdr",
alternative = c("two.sided","less", "greater"), nperm = 999,
progress = TRUE)
Arguments
resp |
response vector. |
fact |
grouping factor. |
p.method |
method for p-values correction. See help of |
alternative |
a character string specifying the alternative hypothesis, must be one of |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
Details
The function deals with the limitted floating point precision, which can bias calculation of p-values based on a discrete test statistic distribution.
Value
method |
a character string giving the name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.value |
table of results. |
p.adjust.method |
method for p-values correction. |
permutations |
number of permutations. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
set.seed(0921)
response <- c(rnorm(10),rpois(10,0.2),rnorm(10,,2))
fact <- gl(3,10,labels=LETTERS[1:3])
# Not enough permutations here but it runs faster
# permutation Bartlett test
perm.bartlett.test(response~fact,nperm=49)
# Pairwise comparisons
pairwise.perm.var.test(response,fact,nperm=49)
Pairwise F tests
Description
Performs pairwise comparisons between group levels with corrections for multiple testing.
Usage
pairwise.var.test(resp, fact, p.method = "fdr",
alternative = c("two.sided","less", "greater"))
Arguments
resp |
response vector. |
fact |
grouping factor. |
p.method |
method for p-values correction. See help of |
alternative |
a character string specifying the alternative hypothesis, must be one of |
Value
method |
a character string giving the name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.value |
table of results. |
p.adjust.method |
method for p-values correction. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
require(graphics)
# Bartlett test
bartlett.test(count~spray,data=InsectSprays)
# Pairwise comparisons
pairwise.var.test(InsectSprays$count,InsectSprays$spray)
(Semi-)Partial correlation
Description
Computes the (semi-)partial correlation of x
and y
, controlling for z
.
Usage
pcor(x, y, z, semi = FALSE, use = "complete.obs", method = c("pearson",
"spearman"))
Arguments
x |
a numeric vector. |
y |
a numeric vector. |
z |
a numeric vector, matrix, data frame or list giving the controlling variables. For matrices, variables must be placed in columns. |
semi |
logical. If |
use |
same as |
method |
same as |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
pcor.test
for confidence intervals (and tests).
Examples
set.seed(1444)
x <- 1:30
y <- 1:30+rnorm(30,0,2)
z1 <- runif(30,0,4)
z2 <- 30:1+rnorm(30,0,3)
pcor(x,y,z1)
pcor(x,y,list(z1,z2))
Tests for (semi-)partial association/correlation between paired samples
Description
Tests for (semi-)partial association between paired samples while controlling for other variables, using one of Pearson's product moment correlation coefficient or Spearman's rho.
Usage
pcor.test(x, y, z, semi = FALSE, conf.level = 0.95, nrep = 1000,
method = c("pearson", "spearman"))
Arguments
x |
a numeric vector. |
y |
a numeric vector. |
z |
a numeric vector, matrix, data frame or list giving the controlling variables. For matrices, variables must be placed in columns. |
semi |
logical. If |
conf.level |
confidence level for confidence interval.. |
nrep |
number of replicates for computation of the confidence interval of a Spearman's rank correlation coefficient (by bootstraping). |
method |
a character string indicating which correlation coefficient is to be used for the test. One of "pearson" or "spearman". |
Details
If method
is "pearson"
and if there are at least 4+k complete series of observation (where k is the number of controlling variables), an asymptotic confidence interval of the correlation coefficient is given based on Fisher's Z transform.
If method
is "spearman"
, the p-value is computed through the AS89 algorithm if the number of complete series of observation is less than 10, otherwise via the asymptotic t approximation (in both cases the pspearman
function is used). A confidence interval of the correlation coefficient, computed by bootstraping, is given.
Value
data.name |
a character string giving the name(s) of the data. |
alternative |
a character string describing the alternative hypothesis, always two-sided. |
method |
a character string indicating how the association was measured. |
conf.int |
a condidence interval for the measure of association. |
statistic |
the value of the test statistic. |
parameter |
the degrees of freedom of the test (only for a Pearson's correlation coefficient). |
p.value |
the p-value of the test. |
estimate |
the estimated measure of association, with name |
null.value |
he value of the association measure under the null hypothesis, always 0. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
set.seed(1444)
x <- 1:30
y <- 1:30+rnorm(30,0,2)
z1 <- runif(30,0,4)
z2 <- 30:1+rnorm(30,0,3)
pcor.test(x,y,z1)
pcor.test(x,y,list(z1,z2))
Permutation Analysis of Variance
Description
Performs a permutation analysis of variance for 1 to 3 factors. For 2 and 3 factors, experiment design must be balanced. For 2 factors, the factors can be crossed with or without interaction, or nested. The second factor can be a blocking (random) factor. For 3 factors, design is restricted to 2 fixed factors crossed (with or without interaction) inside blocks (third factor).
Usage
perm.anova(formula, nest.f2 = c("fixed", "random"), data, nperm = 999,
progress = TRUE)
Arguments
formula |
a formula of the form |
nest.f2 |
in case of 2 nested factors, precision is needed if the nested factor (factor2) is |
data |
an optional data frame containing the variables in the formula |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
Details
For 2 factors, the formula can be:
response ~ factor1 + factor2
for 2 fixed factors without interaction
response ~ factor1 * factor2
for 2 fixed factors with interaction
response ~ factor1 / factor2
for 2 fixed factors with factor2 nested into factor1 (if factor2 is a random factor, argument nest.f2
must be changed from "fixed"
(default) to "random"
)
response ~ factor1 | factor2
for 1 fixed factor (factor1) and 1 blocking (random) factor (factor2).
For 3 factors, the formula can only be:
response ~ factor1 + factor2 | factor3
or
response ~ factor1 * factor2 | factor3
. The 2 factors are here fixed and crossed inside each level of the third, blocking (random), factor.
The function deals with the limitted floating point precision, which can bias calculation of p-values based on a discrete test statistic distribution.
Value
a data frame of class "anova"
.
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Examples
set.seed(1203)
response <- c(rnorm(12),rpois(12,0.5),rnorm(12,2,1))
fact1 <- gl(3,12,labels=LETTERS[1:3])
fact2 <- gl(3,1,36,labels=letters[1:3])
fact3 <- gl(6,6,labels=letters[1:6])
block <- gl(2,6,36,labels=letters[1:2])
# Not enough permutations here but faster to run
# 2 crossed fixed factors with interaction
perm.anova(response~fact1*fact2,nperm=49)
# 2 nested fixed factors
perm.anova(response~fact1/fact2,nperm=49)
# 2 nested factors, fact2 being random
perm.anova(response~fact1/fact3,nest.f2="random",nperm=49)
# 1 fixed factor and 1 blocking (random) factor
perm.anova(response~fact1|block,nperm=49)
Permutation Bartlett's test of homogeneity of variances
Description
Performs a permutation Bartlett's test of homogeneity of k variances.
Usage
perm.bartlett.test(formula, data, nperm = 999, progress = TRUE)
Arguments
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
Details
The function deals with the limitted floating point precision, which can bias calculation of p-values based on a discrete test statistic distribution.
Value
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
statistic |
test statistics of the parametric test. |
permutations |
number of permutations. |
p.value |
p-value of the permutation test. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
response <- c(rnorm(12),rpois(12,1),rnorm(12,2,1))
fact <- gl(3,12,labels=LETTERS[1:3])
perm.bartlett.test(response~fact)
Permutation Pearson's correlation test
Description
Performs a permutation Pearson's product-moment correlation test.
Usage
perm.cor.test(x, y, alternative = c("two.sided", "less", "greater"),
nperm = 999, progress = TRUE)
Arguments
x , y |
numeric vectors of data values. |
alternative |
a character string specifying the alternative hypothesis, must be one of |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
Details
The function deals with the limitted floating point precision, which can bias calculation of p-values based on a discrete test statistic distribution.
Value
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
statistic |
test statistics of the parametric test. |
permutations |
number of permutations. |
p.value |
p-value of the permutation test. |
estimate |
the estimated correlation coefficient. |
alternative |
a character string describing the alternative hypothesis. |
null.value |
the value of the association measure under the null hypothesis, always 0. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
x <- rnorm(50)
y <- runif(50)
perm.cor.test(x,y)
Permutation Student's t-test
Description
Performs a permutation Student's t-test.
Usage
perm.t.test(x, ...)
## Default S3 method:
perm.t.test(x, y, paired = FALSE, ...)
## S3 method for class 'formula'
perm.t.test(formula, data, alternative = c("two.sided", "less", "greater"),
paired = FALSE, nperm = 999, progress = TRUE, ...)
Arguments
x |
a numeric vector of data values. |
y |
a numeric vector of data values. |
paired |
a logical indicating whether you want a paired t-test. |
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
alternative |
a character string specifying the alternative hypothesis, must be one of |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
... |
further arguments to be passed to or from other methods. |
Details
The function deals with the limitted floating point precision, which can bias calculation of p-values based on a discrete test statistic distribution.
Value
statistic |
test statistics of the parametric test. |
permutations |
number of permutations. |
p.value |
p-value of the permutation test. |
estimate |
the estimated mean or difference in means depending on whether it was a paired or not paired test. |
alternative |
a character string describing the alternative hypothesis. |
method |
a character string indicating what type of t-test was performed. |
data.name |
a character string giving the name(s) of the data. |
null.value |
the specified hypothesized value of the mean difference, always 0. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
response <- c(rnorm(5),rnorm(5,2,1))
fact <- gl(2,5,labels=LETTERS[1:2])
# Not enough permutations here but faster to run
# Unpaired test
perm.t.test(response~fact,nperm=49)
# Paired test
perm.t.test(response~fact,paired=TRUE,nperm=49)
Permutation F test to compare two variances
Description
Performs a permutation F test to compare two variances.
Usage
perm.var.test(x, ...)
## Default S3 method:
perm.var.test(x, y, ...)
## S3 method for class 'formula'
perm.var.test(formula, data, alternative = c("two.sided", "less",
"greater"), nperm = 999, progress = TRUE, ...)
Arguments
x |
a numeric vector of data values. |
y |
a numeric vector of data values. |
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
alternative |
a character string specifying the alternative hypothesis, must be one of |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
... |
further arguments to be passed to or from other methods. |
Details
The function deals with the limitted floating point precision, which can bias calculation of p-values based on a discrete test statistic distribution.
Value
method |
name of the test. |
statistic |
test statistics of the parametric test. |
permutations |
number of permutations. |
p.value |
p-value of the permutation test. |
estimate |
the ratio of the two variances. |
alternative |
a character string describing the alternative hypothesis. |
data.name |
a character string giving the name(s) of the data. |
null.value |
the ratio of population variances under the null hypothesis, always 1. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
response <- c(rpois(8,1),rpois(8,3))
fact <- gl(2,8,labels=LETTERS[1:2])
perm.var.test(response~fact)
Simple analysis of model residuals
Description
Plots residuals of a model against fitted values and for some models a QQ-plot of these residuals. Optionally, a Shapiro-Wilk test can be performed on residuals. The function deals with lm
(including glm
, lmList
, lmList
, glm.nb
, mlm
and manova
), lmer
, glmer
, glmmPQL
, glmmadmb
, lme
, gls
, nls
, nlsList
, survreg
, least.rect
, betareg
or glmmTMB
models.
Usage
plotresid(model, shapiro = FALSE)
Arguments
model |
an object of class |
shapiro |
logical. If |
Details
Response residuals are used for linear models, non linear models and generalized linear models based on an identity link (except with "quasi"
distributions where response residuals are used only if variance="constant"
). Pearson or studentized residuals are used whenever there is a link function which is not identity (and with "quasi"
distributions when variance
is not "constant"
), except for betareg
models where standardized weighted residuals 2 are used (see residuals.betareg
).
QQ-plots and Shapiro-Wilk tests are available whenever the model is based on a Gaussian distribution (and with "quasi"
distributions when variance="constant"
).
With a mlm
or manova
model, only a multivariate QQ-plot is drawn. The test performed when shapiro=TRUE
is a Shapiro-Wilk test for multivariate normality.
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
lm
, lmList
, lmList
, glm
, glm.nb
, manova
, lmer
, glmer
, lmer
, glmer.nb
, lme
, glmmPQL
, glmmadmb
, glmmTMB
, gls
, nls
, nlsList
, survreg
, least.rect
, betareg
, qresiduals
, qqPlot
, shapiro.test
, mqqnorm
, mshapiro.test
Survivor curve
Description
Plots the survivor curve (log(survivors) against time) of a dataset to check for constancy of hazard.
Usage
plotsurvivors(x, status = rep(1, length(x)))
Arguments
x |
time to event. |
status |
status (1: event observed, 0: event not observed). |
Value
n |
initial number of individuals. |
time |
time of events. |
alive |
number of survivors at each time. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Examples
# 'kidney' dataset of package 'survival'
require(survival)
data(kidney)
plotsurvivors(kidney$time,kidney$status)
Predict method for cross-validated CDA submodels
Description
Predicts response based on CDA (correspondence discriminant analysis) submodels generated by cross validation. The predicted class is given with its probability (computed from the values predicted by all submodels).
Usage
## S3 method for class 'CDA.cv'
predict(object, newdata, type = c("max", "all"), method = c("mahalanobis",
"euclidian"), ...)
Arguments
object |
object of class inheriting from |
newdata |
vector, matrix or data frame giving new individuals (one row per individual). |
type |
should the probability of the most probable class be given ( |
method |
criterion used to predict class membership. See |
... |
further arguments to be passed to or from other methods. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Predict method for cross-validated submodels
Description
Predicts response based on submodels generated by cross (model) validation. For regression models (PLSR and CPPLS), the predicted value is given with its confidence interval. For discriminant analyses, the predicted class is given with its probability (computed from the values predicted by all submodels).
Usage
## S3 method for class 'MVA.cv'
predict(object, newdata, conf.level = 0.95, type.DA = c("max", "all"),
crit.DA = c("plug-in", "predictive", "debiased"), ...)
## S3 method for class 'MVA.cmv'
predict(object, newdata, conf.level = 0.95, type.DA = c("max", "all"),
crit.DA = c("plug-in", "predictive", "debiased"), ...)
Arguments
object |
object of class inheriting from |
newdata |
vector, matrix or data frame giving new individuals (one row per individual). |
conf.level |
confidence level for prediction of a quantitative response. |
type.DA |
for discriminant analyses, should the probability of the most probable class be given ( |
crit.DA |
criterion used to predict class membership when a LDA or QDA is used. See |
... |
further arguments to be passed to or from other methods. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Predict method for CDA
Description
Predicts class of the grouping factor based on a Correspondence Discriminant Analysis (performed using discrimin.coa
).
Usage
## S3 method for class 'coadisc'
predict(object, newdata, method = c("mahalanobis", "euclidian"), ...)
Arguments
object |
object of class inheriting from |
newdata |
contingency table (either a |
method |
distance metric to be used for prediction. In all cases the predicted class corresponds to the minimum distance between the new individual and the centroid of each class. Default is Mahalanobis distance. |
... |
further arguments to be passed to or from other methods. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
require(ade4)
data(perthi02)
CDA <- discrimin.coa(perthi02$tab,perthi02$cla,scan=FALSE)
new <- matrix(c(17,45,32,17,17,52,28,29,6,10,7,7,7,5,10,4,37,34,23,9),ncol=20)
predict(CDA,new)
Pairwise comparisons after a test for given probabilities
Description
Performs pairwise comparisons after a global test for given response probabilities (i.e. when the response variable is a binary variable), by using exact binomial tests. The function is in fact a wrapper to pairwise comparisons of proportions to given values on a contingency table.
Usage
prop.bin.multcomp(formula, data, p, p.method = "fdr")
Arguments
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
p |
theoretical probabilities. |
p.method |
method for p-values correction. See help of |
Details
If the response is a 0/1 variable, the probability of the '1' group is tested. In any other cases, the response is transformed into a factor and the probability of the second level is tested.
Value
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
observed |
observed probabilities. |
expected |
expected probabilities. |
p.adjust.method |
method for p-values correction. |
p.value2 |
corrected p-values. |
p.value |
table or results of pairwise comparisons. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
prop.multcomp
, chisq.theo.bintest
Examples
response <- c(rep(0:1,c(40,60)),rep(0:1,c(55,45)),rep(0:1,c(65,35)))
fact <- gl(3,100,labels=LETTERS[1:3])
p.theo <- c(0.5,0.45,0.2)
chisq.theo.bintest(response~fact,p=p.theo)
prop.bin.multcomp(response~fact,p=p.theo)
Pairwise comparisons after a test for given proportions
Description
Performs pairwise comparisons after a global test for given proportions, by using exact binomial tests.
Usage
prop.multcomp(x, p, p.method = "fdr")
Arguments
x |
contingency table. |
p |
theoretical proportions. |
p.method |
method for p-values correction. See help of |
Value
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
observed |
observed proportions. |
expected |
expected proportions. |
p.adjust.method |
method for p-values correction. |
p.value2 |
corrected p-values. |
p.value |
table or results of pairwise comparisons. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
proportions <- sample(c(0,1),200,replace=TRUE)
populations <- sample(LETTERS[1:3],200,replace=TRUE)
tab.cont <- table(populations,proportions)
p.theo <- c(0.4,0.5,0.7)
prop.test(tab.cont,p=p.theo)
prop.multcomp(tab.cont,p=p.theo)
Proportions and standard errors
Description
Computes proportions (and their standard errors) when the number of classes is >= 2, based on predicted values of a model. The function is intended to be used parallel to a multinomial log-linear model.
Usage
prop.multinom(x)
Arguments
x |
either a factor or a matrix with K columns giving the counts for each of the K classes. |
Details
The proportions can be computed through the predict
function applied on a multinomial log-linear model (see multinom
). However, standard errors (or confidence intervals) cannot be obtained this way. The present function uses differents GLMs (in each case considering one category vs. the sum of all others) to obtain proportions and standard errors. Overdispersion is taken into account by default, using a quasibinomial law in all GLMs built.
Value
probs |
the calculated proportions. |
se |
the calculated standard errors. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
response <- data.frame(A=c(2,2,4,0,2,14,6,16,0,0),
B=c(2,0,0,0,6,2,10,6,0,0),
C=c(12,6,0,6,2,0,0,0,0,0),
D=c(0,0,0,14,0,0,0,0,2,0),
E=c(0,0,0,0,0,0,0,0,16,15))
prop.multinom(response)
Wald tests for comparison of proportions
Description
Performs pairwise comparisons of proportions when the number of classes is >= 2 with corrections for multiple testing.
Usage
prop.multinom.test(x, p.method = "fdr")
Arguments
x |
either a factor or a matrix with K columns giving the counts for each of the K classes. |
p.method |
method for p-values correction. See help of |
Details
The function builds multinomial log-linear models (using multinom
) and applies Wald tests to compare the intercepts to 0. All necessary models (each time using a different reference level/class) are built to get p-values of all possible comparisons among levels/classes.
Value
method |
a character string indicating the name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.adjust.method |
method for p-values correction. |
p.value |
table of results. |
z.tab |
table of z values. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
response <- factor(rep(LETTERS[1:4],c(20,40,42,13)))
table(response)/length(response)
prop.multinom.test(response)
QAIC(c) calculation with GLMs of the quasi- family
Description
Allows for the calculation of QAIC(c) with glm
models built with a quasibinomial
or quasipoisson
distribution. The function is directly coming from QAIC.
Usage
quasibinomial.QAIC(link = "logit")
quasipoisson.QAIC(link = "log")
Arguments
link |
Author(s)
This function is only intended to make the suggestion of Kamil Barton in QAIC available.
Examples
# See ?QAIC from the MuMIn package
EMMeans for Cumulative Link (Mixed) Models
Description
Extracts EMMeans (produced by emmeans
) from Cumulative Link (Mixed) Models (produced by clm
or clmm
), with different possible formats.
Usage
rating.emmeans(emm, type = c("prob", "cumprob", "class1", "class2"), level = 0.9)
Arguments
emm |
|
type |
type of output to be returned: |
level |
used only for type |
Details
A factor named cut
must have been called in emmeans
, to compute EMMeans per cut point (i.e. rating). Additionally, the argument mode
of emmeans
must have been set to "linear.predictor". Finally, the call to emmeans
is typically like emmeans(model,~factor|cut,mode="linear.predictor")
where factor
is the factor (or interaction) giving levels for which EMMeans have to be computed.
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
if (require("ordinal",quietly=TRUE) & require("emmeans",quietly=TRUE)) {
model <- clm(rating~contact*temp,data=wine)
EMM <- emmeans(model,~contact:temp|cut,mode="linear.predictor")
# Probabilities
rating.emmeans(EMM)
# Cumulative probabilities
rating.emmeans(EMM,type="cumprob")
# Most probable rating
rating.emmeans(EMM,type="class1")
}
Observed rating frequencies
Description
Computes observed rating frequencies per level of a factor, in various formats.
Usage
rating.prob(x, g, type = c("prob", "cumprob", "class"))
Arguments
x |
ordered factor (ratings). |
g |
factor giving groups to be compared. |
type |
type of output to be returned: |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Examples
if (require("ordinal",quietly=TRUE)) {
data(wine,package="ordinal")
# Frequencies
rating.prob(wine$rating,wine$contact:wine$temp)
# Cumulative frequencies
rating.prob(wine$rating,wine$contact:wine$temp,type="cumprob")
# Most frequent rating
rating.prob(wine$rating,wine$contact:wine$temp,type="class")
}
Confidence intervals of a simple linear regression
Description
Computes and add to a graph the confidence interval of a simple regression line or of individual values.
Usage
reg.ci(model, conf.level = 0.95, type = c("mean", "ind"), ...)
Arguments
model |
|
conf.level |
confidence level. |
type |
interval type : |
... |
other agruments. See help of |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
x <- 1:50
y <- 1:50+rnorm(50,0,4)
regression <- lm(y~x)
plot(x,y)
abline(regression)
reg.ci(regression,type="mean",col="red")
reg.ci(regression,type="ind",col="blue")
"Correlation" of variables to axes in MCA or mix analyses
Description
Represents the "correlation" of variables to axes in a MCA (from dudi.acm
) or a mix analysis (from dudi.hillsmith
or dudi.mix
).
Usage
scat.cr(dudi.obj, axis = 1)
Arguments
dudi.obj |
object obtained from |
axis |
axis to be represented (the first by default). |
Details
For quantitative variables, the squared correlation coefficient is displayed. For ordered factors, the squared multiple correlation coefficient is displayed. For unordered factors, the correlation ratio is displayed.
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>, based on an idea of Stephane Champely.
See Also
dudi.acm
, dudi.hillsmith
, dudi.mix
Examples
require(ade4)
# Fictive dataset
age <- sample(15:60,50,replace=TRUE)
sex <- sample(c("M","F"),50,replace=TRUE)
size <- sample(155:190,50,replace=TRUE)
hair <- sample(c("Fair","Dark","Russet"),50,replace=TRUE)
eyes <- sample(c("Blue","Green","Brown"),50,replace=TRUE)
weight <- sample(50:85,50,replace=TRUE)
hand <- sample(c("Left.handed","Right.handed"),50,replace=TRUE)
tab <- data.frame(age,sex,size,weight,hand,eyes,hair,stringsAsFactors=TRUE)
amix <- dudi.hillsmith(tab,scannf=FALSE,nf=2)
scat.cr(amix)
Standard error
Description
Computes the standard error of a mean or of a proportion.
Usage
se(x, y = NULL)
Arguments
x |
numeric vector or number of successes. |
y |
number of trials. If |
Details
The function deals with missing values.
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Examples
# Standard error of a mean
se(rnorm(30))
# Standard error of a proportion
se(9,25)
Sequence generation
Description
Generates a regular sequence from the minimum to the maximum of a vector.
Usage
seq2(x, int = 999)
Arguments
x |
numeric vector. |
int |
number of values to be generated ( |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
seq2(rnorm(30))
Confidence interval of a Spearman's rank correlation coefficient
Description
Computes the confidence interval of a Spearman's rank correlation coefficient by bootstraping.
Usage
spearman.ci(var1, var2, nrep = 1000, conf.level = 0.95)
Arguments
var1 |
numeric vector (first variable). |
var2 |
nuermic verctor (second variable). |
nrep |
number of replicates for bootstraping. |
conf.level |
confidence level of the interval. |
Value
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
conf.level |
confidence level. |
rep |
number of replicates. |
estimate |
Spearman's rank correlation coefficient. |
conf.int |
confidence interval. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
var1 <- sample(1:50,15,replace=TRUE)
var2 <- sample(1:50,15,replace=TRUE)
spearman.ci(var1,var2)
Comparison of several Spearman's rank correlation coefficients
Description
Computes Bonferroni-adjusted confidence intervals of a series of Spearman's rank correlation coefficients, for multiple comparisons. Confidence intervals are computed by bootstraping.
Usage
spearman.cor.multcomp(var1, var2, fact, alpha = 0.05, nrep = 1000)
Arguments
var1 |
numeric vector (first variable). |
var2 |
numeric vector (second variable). |
fact |
factor (groups). |
alpha |
significance level. |
nrep |
number of replicates for bootstraping. |
Details
Confidence intervals which do not overlap indicate correlation coefficients significantly different at alpha
.
Value
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
tab |
data frame of correlation coefficients with confidence intervals |
alpha |
significance level. |
nrep |
number of replicates for bootstraping. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
set.seed(1510)
var1 <- c(1:15+rnorm(15,0,2),1:15+rnorm(15,0,2),1:15+rnorm(15,0,2))
var2 <- c(-1:-15+rnorm(15,0,2),1:15+rnorm(15,0,2),1:15+rnorm(15,0,2))
fact <- gl(3,15,labels=LETTERS[1:3])
spearman.cor.multcomp(var1,var2,fact)
# B and C similar but different from A
Divide into groups respecting relative proportions
Description
Divides a data frame randomly, but respecting the relative proportions of levels of a factor in the original data frame. Each subset has roughly the same number of individuals, and the same relative proportions in respect to levels of the given factor.
Usage
splitf(set, fac, k)
Arguments
set |
a data frame containing values to be divided into groups. |
fac |
a reference factor giving the relative proportions to be respected in each subset of |
k |
an integer giving the number of subsets to be generated. |
Value
A list of subsets of set
.
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
data(iris)
iris2 <- iris[c(1:50,51:80,101:120),]
# Proportions to be respected
table(iris2$Species)/nrow(iris2)
# Splitting
result <- splitf(iris2,iris2$Species,3)
# All subsets have the same size
lapply(result,nrow)
# And respect the initial proportions
lapply(result,function(x) table(x$Species)/nrow(x))
Standardization of a data frame based on another data frame
Description
Centers and scales a data frame. See Details.
Usage
stand(tab, ref.tab=NULL, center=NULL, scale=NULL)
Arguments
tab |
data frame to scale. |
ref.tab |
optional reference data frame, from which centering and scaling parameters are obtained (see Details). |
center |
optional vector of centering parameters (one per column of |
scale |
optional vector of scaling parameters (one per column of |
Details
If ref.tab
is not NULL
, centering and scaling parameters are looked for into this data frame. If it has a "scaled:center"
attribute, this one is used to center tab
. Otherwise means of ref.tab
's columns are used. The same happens for scaling parameters (with the "scaled:scale"
attribute and standard deviations).
If ref.tab
is NULL
, values of center
and scale
are used to standardize tab
.
If ref.tab
and center
are NULL
, means of tab
's columns are used for centering. If ref.tab
and scale
are NULL
, standard deviations of tab
's columns are used for scaling.
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
data(iris)
set.seed(1131)
iris.samp <- iris[sample(1:150,10),1:4]
# Centering parameters of the complete dataset
attr(scale(iris[,1:4]),"scaled:center")
# Centering parameters of the reduced dataset
attr(scale(iris.samp),"scaled:center")
# Standardization based on the reduced dataset only
attr(stand(iris.samp),"scaled:center")
# Standardization based on the complete dataset
attr(stand(iris.samp,iris[,1:4]),"scaled:center")
Significance tests of coefficients (multinomial regression)
Description
Tests for significance of coefficients associated with a given predictor of a model fitted with multinom
. Wald tests are used. All coefficients are generated and tested through the building of models using different reference classes (for the response but also for qualitative predictors with more than 2 levels).
Usage
test.multinom(model, variable)
Arguments
model |
object of class |
variable |
any predictor present in |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Synthesis quality of multivariate analyses
Description
Converts some ordinations performed with the vegan
package to objects compatible with coinertia
.
Usage
to.dudi(ord)
Arguments
ord |
an ordination (see Details). |
Details
The function supports:
- PCA computed from rda
. If data were scaled (prior to the analysis or using scale
of rda
) it is assumed that is was with the standard deviation using n-1
; As in dudi.pca
, to.dudi
rescales the data with the standard deviation using n
.
- PCoA computed from wcmdscale
, capscale
or dbrda
.
- CA computed from cca
.
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
User defined contrasts for EMMeans
Description
Returns a function usable by emmeans
for user defined contrasts.
Usage
user.cont(cont)
Arguments
cont |
any matrix of contrasts (see 'Details'). |
Details
In these matrices, each line is a comparison (= contrast) and each colum is a level of the factor. Rules for writing contrasts are:
- levels not involved in the comparison must have a null value
- levels to be compared must have opposite signs
- levels can be grouped (for example 2 -1 -1 give a comparison of the first level against the group composed by the two others)
- the sum of all values of a contrast must be null.
Value
user.cont.emmc |
the function to be called by |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
require(car)
require(emmeans)
tab <- data.frame(
response <- c(rpois(30,1),rpois(30,3),rpois(30,10)) ,
fact <- gl(3,30,labels=LETTERS[1:3])
)
model <- glm(response~fact,family="poisson",data=tab)
Anova(model)
mat <- matrix(c(1,-1,0,0,1,-1,2,-1,-1),nrow=3,byrow=TRUE,dimnames=list(1:3,levels(fact)))
mat
cont.emmc <- user.cont(mat)
EMM <- emmeans(model,~fact)
contrast(EMM,"cont")
Wald tests for comparison of proportions to theoretical values
Description
Performs pairwise comparisons of proportions to theoretical values.
Usage
wald.ptheo.multinom.test(x, p, p.method = "fdr")
Arguments
x |
either a factor or a matrix with K columns giving the counts for each of the K classes. |
p |
theoretical proportions. |
p.method |
method for p-values correction. See help of |
Details
The function builds K logistic regressions (in each case considering one class vs. the sum of all others) and uses wald.ptheo.test
to test the hypothesis that the proportion of this class is equal to p[K]
.
Value
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
observed |
observed proportions. |
expected |
theoretical proportions. |
p.adjust.method |
method for p-values correction. |
statistic |
statistics of each test. |
p.value2 |
corrected p-values. |
p.value |
data frame of results. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
wald.ptheo.test
, prop.multinom
Examples
response <- factor(rep(LETTERS[1:4],c(20,40,42,13)))
wald.ptheo.multinom.test(response,p=c(0.15,0.25,0.3,0.3))
Wald test for comparison of a proportion to a theoretical value
Description
Performs a Wald test for comparison of a proportion to a theoretical value.
Usage
wald.ptheo.test(y, blocks = NULL, p = 0.5)
Arguments
y |
either a binary response (numeric vector or factor, with only two possible values except NA) or a two-column matrix with the columns giving the numbers of successes (left) and failures (right). |
blocks |
optional blocking (random) factor. |
p |
hypothesized probability of success. |
Details
The function builds a logistic (mixed) regression and applies a Wald test to compare the estimated value of the intercept to its theoretical value under H0. Eventual overdispersion is taken into account, by using a quasi-binomial law in case of no blocks or by introducing an individual-level random factor if blocks are present.
If the response is a 0/1 vector, the probability of the '1' group is tested. With other vectors, the response is transformed into a factor and the probability of the second level is tested.
If the response is a two-column matrix, the probability of the left column is tested.
If the reponse is a vector and no blocking factor is present, the exact binomial test performed by binom.test
should be preferred since it is an exact test, whereas the Wald test is an approximate test.
Value
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
statistic |
test statistics of the test. |
p.value |
p-value of the test. |
estimate |
the estimated proportion (calculated without taking into account the blocking factor, if present). |
alternative |
a character string describing the alternative hypothesis, always |
null.value |
the value of the proportion under the null hypothesis. |
parameter |
the degrees of freedom for the t-statistic, only whith overdispersion and no blocks. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
set.seed(2006)
response <- sample(0:1,60,replace=TRUE)
# Comparison to p=0.5
wald.ptheo.test(response)
# Comparison to p=0.8
wald.ptheo.test(response,p=0.8)
# With a blocking factor
require(lme4)
blocks <- gl(3,20)
wald.ptheo.test(response,blocks)
Non parametric pairwise comparisons for paired data
Description
Performs non parametric pairwise comparisons of paired samples by Wilcoxon signed rank tests for paired data.
Usage
wilcox.paired.multcomp(formula, data, p.method = "fdr")
Arguments
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
p.method |
method for p-values correction. See help of |
Value
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
method |
a character string indicating the name of the test. |
p.adjust.method |
method for p-values correction. |
p.value |
table of results of pairwise comparisons. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
pairwise.wilcox.test
, wilcox.test
Examples
response <- c(rnorm(10,0,3),rnorm(10,5,3),rnorm(10,8,2))
fact <- gl(3,10,labels=LETTERS[1:3])
block <- gl(10,1,30,labels=letters[1:10])
friedman.test(response~fact|block)
wilcox.paired.multcomp(response~fact|block)
Wilcoxon sign test
Description
Performs a Wilcoxon sign test to compare medians of two paired samples or one median to a given value.
Usage
wilcox.signtest(x, ...)
## Default S3 method:
wilcox.signtest(x, y = NULL, mu = 0, conf.level = 0.95, ...)
## S3 method for class 'formula'
wilcox.signtest(formula, data, subset, ...)
Arguments
x |
a numeric vector of data values. |
y |
an optional numeric vector of data values (for paired two-sample test). |
mu |
theoretical median (one-sample test) or theoretical median of |
conf.level |
confidence level of the interval. |
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
subset |
an optional vector specifying a subset of observations to be used. |
... |
further arguments to be passed to or from other methods. |
Details
If zeroes (i.e. null differences with mu
) are present, the median of the data different from mu
is tested in the one-sample situation; the median of the x-y
differences different from mu
in the two-sample situation.
Value
method |
a character string indicating the name of the test. |
data.name |
a character string giving the name(s) of the data. |
null.value |
the specified hypothesized value of the median or median difference depending on the test performed. |
p.value |
the p-value of the test. |
alternative |
a character string giving the alternative hypothesis, always |
estimate |
the estimated median or median of |
conf.int |
a confidence interval for the median tested. |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
See Also
Examples
set.seed(1706)
response <- c(rnorm(7,3,1.5),rnorm(7,5.5,2))
# Comparison of 2 samples
fact <- gl(2,7,labels=LETTERS[1:2])
wilcox.signtest(response~fact)
# Comparison to a given value
theo <- 4
wilcox.signtest(response,mu=theo)
Weighted arithmetic mean
Description
Computes the weighted arithmetic mean of a vector.
Usage
wmean(x, w = rep(1, length(x)), na.rm = TRUE)
Arguments
x |
numeric vector. |
w |
numeric vector of weights. |
na.rm |
a logical value indicating whether |
Author(s)
Maxime HERVE <maxime.herve@univ-rennes1.fr>
Examples
mean(1:10)
wmean(1:10,w=10:1)