Title: | Model Based Clustering for Mixed Data |
Version: | 1.2.1 |
Description: | Model-based clustering of mixed data (i.e. data which consist of continuous, binary, ordinal or nominal variables) using a parsimonious mixture of latent Gaussian variable models. |
Depends: | R (≥ 3.3.2) |
Imports: | ggplot2, mclust, reshape2, MASS, msm, mvtnorm, parallel, truncnorm, viridis, stats |
License: | GPL-2 |
LazyData: | true |
RoxygenNote: | 6.0.1 |
NeedsCompilation: | no |
Packaged: | 2017-05-08 16:35:03 UTC; damien |
Author: | Damien McParland [aut, cre], Isobel Claire Gormley [aut] |
Maintainer: | Damien McParland <damien.mcparland@ucd.ie> |
Repository: | CRAN |
Date/Publication: | 2017-05-08 17:19:20 UTC |
Model based clustering for mixed data: clustMD
Description
Model-based clustering of mixed data (i.e. data that consist of continuous, binary, ordinal or nominal variables) using a parsimonious mixture of latent Gaussian variable models.
Author(s)
Damien McParland
Damien McParland <damien.mcparland@ucd.ie> Isobel Claire Gormley <claire.gormley@ucd.ie>
References
McParland, D. and Gormley, I.C. (2016). Model based clustering for mixed data: clustMD. Advances in Data Analysis and Classification, 10 (2):155-169.
Byar prostate cancer data set.
Description
A data set consisting of variables of mixed type measured on a group of prostate cancer patients. Patients have either stage 3 or stage 4 prostate cancer.
Usage
Byar
Format
A data frame with 475 observations on the following 15 variables.
Age
a numeric vector indicating the age of the patient.
Weight
a numeric vector indicating the weight of the patient.
Performance.rating
an ordinal variable indicating how active the patient is: 0 - normal activity, 1 - in bed less than 50% of daytime, 2 - in bed more than 50% of daytime, 3 - confined to bed.
Cardiovascular.disease.history
a binary variable indicating if the patient has a history of cardiovascular disease: 0 - no, 1 - yes.
Systolic.Blood.pressure
a numeric vector indicating the systolic blood pressure of the patient in units of ten.
Diastolic.blood.pressure
a numeric vector indicating the diastolic blood pressure of the patient in units of ten.
Electrocardiogram.code
a nominal variable indicating the electrocardiogram code: 0 - normal, 1 - benign, 2 - rhythmic disturbances and electrolyte changes, 3 - heart blocks or conduction defects, 4 - heart strain, 5 - old myocardial infarct, 6 - recent myocardial infarct.
Serum.haemoglobin
a numeric vector indicating the serum haemoglobin levels of the patient measured in g/100ml.
Size.of.primary.tumour
a numeric vector indicating the estimated size of the patient's primary tumour in centimeters squared.
Index.of.tumour.stage.and.histolic.grade
a numeric vector indicating the combined index of tumour stage and histologic grade of the patient.
Serum.prostatic.acid.phosphatase
a numeric vector indicating the serum prostatic acid phosphatase levels of the patient in King-Armstrong units.
Bone.metastases
a binary vector indicating the presence of bone metastasis: 0 - no, 1 - yes.
Stage
the stage of the patient's prostate cancer.
Observation
a patient ID number.
SurvStat
the post-trial survival status of the patient: 0 - alive, 1 - dead from prostatic cancer, 2 - dead from heart or vascular disease, 3 - dead from cerebrovascular accident, 4 - dead from pulmonary embolus, 5 - dead from other cancer, 6 - dead from respiratory disease, 7 - dead from other specific non-cancer cause, 8 - dead from other unspecified non-cancer cause, 9 - dead from unknown cause.
Source
Byar, D.P. and Green, S.B. (1980). The choice of treatment for cancer patients based on covariate information: applications to prostate cancer. Bulletin du Cancer 67: 477-490.
Hunt, L. and Jorgensen, M. (1999). Mixture model clustering using the multimix program. Australian and New Zealand Journal of Statistics 41: 153-171.
E-step of the (MC)EM algorithm
Description
Internal function.
Usage
E.step(N, G, D, CnsIndx, OrdIndx, zlimits, mu, Sigma, Y, J, K, norms, nom.ind.Z,
patt.indx, pi.vec, model, perc.cut)
Arguments
N |
number of observations. |
G |
number of mixture components. |
D |
dimension of the latent data. |
CnsIndx |
the number of continuous variables. |
OrdIndx |
the sum of the number of continuous and ordinal (including binary) variables. |
zlimits |
the truncation points for the latent data. |
mu |
a D x G matrix of means. |
Sigma |
a D x D x G array of covariance parameters. |
Y |
an N x J data matrix. |
J |
the number of observed variables. |
K |
the number of levels for each variable. |
norms |
a matrix of standard normal deviates. |
nom.ind.Z |
the latent dimensions corresponding to each nominal variable. |
patt.indx |
a list of length equal to the number of observed response patterns. Each entry of the list details the observations for which that response pattern was observed. |
pi.vec |
mixing weights. |
model |
the covariance model fitted to the data. |
perc.cut |
threshold parameters. |
Value
Output required for clustMD function.
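E.step is internal, but the idea behind an E-step can be illustrated in the simplest setting. The following is a minimal sketch, assuming only continuous variables and diagonal covariances; `e_step_sketch` and its `sd` argument are hypothetical names and this is not the package's implementation, which also handles latent ordinal and nominal data.

```r
# Hypothetical helper: posterior membership probabilities for a
# diagonal-covariance Gaussian mixture (continuous variables only).
# Y is N x J, mu and sd are J x G, pi.vec has length G.
e_step_sketch <- function(Y, mu, sd, pi.vec) {
  N <- nrow(Y)
  G <- length(pi.vec)
  log_w <- matrix(NA_real_, N, G)
  for (g in seq_len(G)) {
    # Standardise each column with the parameters of component g
    z <- sweep(sweep(Y, 2, mu[, g]), 2, sd[, g], "/")
    # Log density under a diagonal covariance = sum of univariate log densities
    log_dens <- rowSums(dnorm(z, log = TRUE)) - sum(log(sd[, g]))
    log_w[, g] <- log(pi.vec[g]) + log_dens
  }
  # Normalise on the log scale to avoid numerical underflow
  m <- apply(log_w, 1, max)
  tau <- exp(log_w - m)
  tau / rowSums(tau)
}

Y  <- rbind(c(0, 0), c(10, 10))
mu <- cbind(c(0, 0), c(10, 10))
sd <- matrix(1, 2, 2)
tau <- e_step_sketch(Y, mu, sd, pi.vec = c(0.5, 0.5))
```

Each row of `tau` sums to one and gives the probability that the observation belongs to each component.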
M-step of the (MC)EM algorithm
Description
Internal function.
Usage
M.step(tau, N, sumTauEz, J, OrdIndx, D, G, Y, CnsIndx, sumTauS, model, a,
nom.ind.Z)
Arguments
tau |
a |
N |
number of observations. |
sumTauEz |
the sum across all observations of observed and expected latent continuous values multiplied by the posterior probability of belonging to each cluster. |
J |
the number of variables. |
OrdIndx |
the sum of the number of continuous and ordinal (including binary) variables. |
D |
dimension of the latent data. |
G |
the number of mixture components. |
Y |
an N x J data matrix. |
CnsIndx |
the number of continuous variables. |
sumTauS |
the sum across all observations of the outer product of observed and expected latent continuous values multiplied by the posterior probability of belonging to each cluster. |
model |
the covariance model fitted to the data. |
a |
a |
nom.ind.Z |
the latent dimensions corresponding to each nominal variable. |
Value
Output required for clustMD function.
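For intuition, an M-step for the continuous-only case reduces to weighted averages. This is a hedged sketch: `m_step_sketch` is a hypothetical helper, and in the package the weighted sums involve expected latent values (the role played by sumTauEz and sumTauS) rather than the raw data alone.

```r
# Hypothetical helper: update mixing weights and means from the
# posterior membership probabilities tau (N x G) and data Y (N x J).
m_step_sketch <- function(tau, Y) {
  Ng <- colSums(tau)                 # effective number of observations per cluster
  pi.vec <- Ng / nrow(Y)             # mixing weights
  mu <- sweep(t(Y) %*% tau, 2, Ng, "/")   # J x G matrix of weighted means
  list(pi.vec = pi.vec, mu = mu)
}

Y   <- rbind(c(1, 2), c(3, 4), c(5, 6))
tau <- rbind(c(1, 0), c(1, 0), c(0, 1))
est <- m_step_sketch(tau, Y)
```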
Approximates the observed log likelihood.
Description
Approximates the observed log likelihood.
Usage
ObsLogLikelihood(N, CnsIndx, G, Y, mu, Sigma, pi.vec, patt.indx, zlimits, J,
OrdIndx, probs.nom, model, perc.cut, K)
Arguments
N |
the number of observations. |
CnsIndx |
the number of continuous variables. |
G |
the number of mixture components. |
Y |
an N x J data matrix. |
mu |
a D x G matrix of means. |
Sigma |
a D x D x G array of covariance parameters. |
pi.vec |
the mixing weights. |
patt.indx |
a list of length equal to the number of observed response patterns. Each entry of the list details the observations for which that response pattern was observed. |
zlimits |
the truncation points for the latent data. |
J |
the number of variables. |
OrdIndx |
the sum of the number of continuous and ordinal (including binary) variables. |
probs.nom |
an array containing the response probabilities for each nominal variable for each cluster |
model |
the covariance model fitted to the data. |
perc.cut |
threshold parameters. |
K |
the number of levels for each variable. |
Value
Output required for clustMD function.
Model Based Clustering for Mixed Data
Description
A function that fits the clustMD model to a data set consisting of any combination of continuous, binary, ordinal and nominal variables.
Usage
clustMD(X, G, CnsIndx, OrdIndx, Nnorms, MaxIter, model, store.params = FALSE,
scale = FALSE, startCL = "hc_mclust", autoStop = FALSE, ma.band = 50,
stop.tol = NA)
Arguments
X |
a data matrix where the variables are ordered so that the continuous variables come first, the binary (coded 1 and 2) and ordinal variables (coded 1, 2, ...) come second and the nominal variables (coded 1, 2, ...) are in last position. |
G |
the number of mixture components to be fitted. |
CnsIndx |
the number of continuous variables in the data set. |
OrdIndx |
the sum of the number of continuous, binary and ordinal variables in the data set. |
Nnorms |
the number of Monte Carlo samples to be used for the intractable E-step in the presence of nominal data. Irrelevant if there are no nominal variables. |
MaxIter |
the maximum number of iterations for which the (MC)EM algorithm should run. |
model |
a string indicating which clustMD model is to be fitted. This
may be one of: |
store.params |
a logical argument indicating if the parameter estimates at each iteration should be saved and returned by the clustMD function. |
scale |
a logical argument indicating if the continuous variables should be standardised. |
startCL |
a string indicating which clustering method should be used to initialise the (MC)EM algorithm. This may be one of "kmeans" (K means clustering), "hclust" (hierarchical clustering), "mclust" (finite mixture of Gaussian distributions), "hc_mclust" (model-based hierarchical clustering) or "random" (random cluster allocation). |
autoStop |
a logical argument indicating whether the (MC)EM algorithm should use a stopping criterion to decide if convergence has been reached. Otherwise the algorithm will run for MaxIter iterations. If only continuous variables are present the algorithm will use Aitken's acceleration criterion with tolerance stop.tol. If categorical variables are present, the stopping criterion is based on a moving average of the approximated log likelihood values, taken over ma.band iterations and assessed against stop.tol. |
ma.band |
the number of iterations to be included in the moving average calculation for the stopping criterion. |
stop.tol |
the tolerance of the (MC)EM stopping criterion. |
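The Aitken acceleration criterion mentioned for autoStop can be sketched as follows. This uses one common formulation (estimate the asymptotic log likelihood from the last three values and stop when it is within the tolerance of the current value); `aitken_converged` is a hypothetical helper and the exact form used inside the package is an assumption.

```r
# Hypothetical helper: Aitken-style convergence check on a vector of
# log likelihood values, one per iteration.
aitken_converged <- function(ll, tol = 1e-4) {
  n <- length(ll)
  if (n < 3) return(FALSE)            # need three values to estimate the rate
  d_new <- ll[n] - ll[n - 1]
  d_old <- ll[n - 1] - ll[n - 2]
  if (d_old == 0) return(d_new == 0)  # log likelihood is already flat
  a <- d_new / d_old                  # estimated linear convergence rate
  if (a >= 1) return(FALSE)           # increments are not shrinking
  ll_inf <- ll[n - 1] + d_new / (1 - a)   # estimated limiting value
  abs(ll_inf - ll[n]) < tol
}

ll <- -100 - 0.5^(1:20)               # toy sequence converging to -100
aitken_converged(ll[1:5])
aitken_converged(ll)
```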
Value
An object of class clustMD is returned. The output components are as follows:
model |
The covariance model fitted to the data. |
G |
The number of clusters fitted to the data. |
Y |
The observed data matrix. |
cl |
The cluster to which each observation belongs. |
tau |
A |
means |
A |
A |
A |
Lambda |
A |
Sigma |
A |
BIChat |
The estimated Bayesian information criterion for the model fitted. |
ICLhat |
The estimated integrated classification likelihood criterion for the model fitted. |
paramlist |
If store.params is |
Varnames |
A character vector of names corresponding to the
columns of |
Varnames_sht |
A truncated version of |
likelihood.store |
A vector containing the estimated log likelihood at each iteration. |
References
McParland, D. and Gormley, I.C. (2016). Model based clustering for mixed data: clustMD. Advances in Data Analysis and Classification, 10 (2):155-169.
Examples
data(Byar)
# Transform skewed variables
Byar$Size.of.primary.tumour <- sqrt(Byar$Size.of.primary.tumour)
Byar$Serum.prostatic.acid.phosphatase <- log(Byar$Serum.prostatic.acid.phosphatase)
# Order variables (Continuous, ordinal, nominal)
Y <- as.matrix(Byar[, c(1, 2, 5, 6, 8, 9, 10, 11, 3, 4, 12, 7)])
# Start categorical variables at 1 rather than 0
Y[, 9:12] <- Y[, 9:12] + 1
# Standardise continuous variables
Y[, 1:8] <- scale(Y[, 1:8])
# Merge categories of EKG variable for efficiency
Yekg <- rep(NA, nrow(Y))
Yekg[Y[,12]==1] <- 1
Yekg[(Y[,12]==2)|(Y[,12]==3)|(Y[,12]==4)] <- 2
Yekg[(Y[,12]==5)|(Y[,12]==6)|(Y[,12]==7)] <- 3
Y[, 12] <- Yekg
## Not run:
res <- clustMD(X = Y, G = 3, CnsIndx = 8, OrdIndx = 11, Nnorms = 20000,
               MaxIter = 500, model = "EVI", store.params = FALSE, scale = TRUE,
               startCL = "kmeans", autoStop = TRUE, ma.band = 30, stop.tol = 0.0001)
## End(Not run)
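The column ordering required for X, and the way CnsIndx and OrdIndx follow from it, can be seen on a small toy data frame. All variable names below are made up for illustration.

```r
# Toy data with columns in arbitrary order (all names are hypothetical)
dat <- data.frame(
  smoker = c(0, 1, 0, 1),          # binary, coded 0/1
  height = c(170, 182, 165, 177),  # continuous
  colour = c(0, 2, 1, 2),          # nominal, coded 0, 1, 2
  grade  = c(0, 1, 2, 1),          # ordinal, coded 0, 1, 2
  weight = c(65, 90, 58, 80)       # continuous
)
cont <- c("height", "weight")
ord  <- c("smoker", "grade")       # binary and ordinal variables together
nom  <- "colour"

# Continuous first, then binary/ordinal, then nominal
X <- as.matrix(dat[, c(cont, ord, nom)])
# Shift categorical codes so that they start at 1 rather than 0
X[, c(ord, nom)] <- X[, c(ord, nom)] + 1

CnsIndx <- length(cont)                # number of continuous variables
OrdIndx <- length(cont) + length(ord)  # continuous + binary + ordinal
```

With this layout, columns `1:CnsIndx` are continuous, columns `(CnsIndx + 1):OrdIndx` are binary or ordinal, and the remaining columns are nominal.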
Model Based Clustering for Mixed Data
Description
A function that fits the clustMD model to a data set consisting of any combination of continuous, binary, ordinal and nominal variables. This function is a wrapper for clustMD that takes arguments as a list.
Usage
clustMDlist(arglist)
Arguments
arglist |
a list of input arguments for clustMD. |
Value
A clustMD object. See clustMD.
References
McParland, D. and Gormley, I.C. (2016). Model based clustering for mixed data: clustMD. Advances in Data Analysis and Classification, 10 (2):155-169.
Examples
data(Byar)
# Transform skewed variables
Byar$Size.of.primary.tumour <- sqrt(Byar$Size.of.primary.tumour)
Byar$Serum.prostatic.acid.phosphatase <-
log(Byar$Serum.prostatic.acid.phosphatase)
# Order variables (Continuous, ordinal, nominal)
Y <- as.matrix(Byar[, c(1, 2, 5, 6, 8, 9, 10, 11, 3, 4, 12, 7)])
# Start categorical variables at 1 rather than 0
Y[, 9:12] <- Y[, 9:12] + 1
# Standardise continuous variables
Y[, 1:8] <- scale(Y[, 1:8])
# Merge categories of EKG variable for efficiency
Yekg <- rep(NA, nrow(Y))
Yekg[Y[,12]==1] <- 1
Yekg[(Y[,12]==2)|(Y[,12]==3)|(Y[,12]==4)] <- 2
Yekg[(Y[,12]==5)|(Y[,12]==6)|(Y[,12]==7)] <- 3
Y[, 12] <- Yekg
argList <- list(X = Y, G = 3, CnsIndx = 8, OrdIndx = 11, Nnorms = 20000,
                MaxIter = 500, model = "EVI", store.params = FALSE, scale = TRUE,
                startCL = "kmeans", autoStop = FALSE, ma.band = 50, stop.tol = NA)
## Not run:
res <- clustMDlist(argList)
## End(Not run)
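A wrapper that accepts its arguments as a list is typically a thin layer over `do.call`. The sketch below shows that pattern with a stand-in function; `fit_stub` and `fit_stub_list` are hypothetical and do not reproduce clustMD or clustMDlist, whose source may differ.

```r
# Stand-in for a fitting function; NOT the real clustMD()
fit_stub <- function(X, G, model = "EVI") {
  list(G = G, model = model, n = nrow(X))
}

# List-taking wrapper: named list elements are matched to arguments by name
fit_stub_list <- function(arglist) do.call(fit_stub, arglist)

args <- list(X = matrix(0, 5, 2), G = 3, model = "EII")
res <- fit_stub_list(args)
```

Packing the arguments into a list is convenient when the same call has to be dispatched many times, for example by `lapply` or one of the functions in the parallel package.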
Run multiple clustMD models in parallel
Description
This function allows the user to run multiple clustMD models in parallel. The inputs are similar to clustMD() except G is now a vector containing the numbers of components the user would like to fit and models is a vector of strings indicating the covariance models the user would like to fit for each element of G. The user can specify the number of cores to be used or let the function detect the number available.
Usage
clustMDparallel(X, CnsIndx, OrdIndx, G, models, Nnorms, MaxIter, store.params,
scale, startCL = "hc_mclust", Ncores = NULL, autoStop = FALSE,
ma.band = 50, stop.tol = NA)
Arguments
X |
a data matrix where the variables are ordered so that the continuous variables come first, the binary (coded 1 and 2) and ordinal variables (coded 1, 2,...) come second and the nominal variables (coded 1, 2,...) are in last position. |
CnsIndx |
the number of continuous variables in the data set. |
OrdIndx |
the sum of the number of continuous, binary and ordinal variables in the data set. |
G |
a vector containing the numbers of mixture components to be fitted. |
models |
a vector of strings indicating which clustMD models are to be
fitted. This may be one of: |
Nnorms |
the number of Monte Carlo samples to be used for the intractable E-step in the presence of nominal data. |
MaxIter |
the maximum number of iterations for which the (MC)EM algorithm should run. |
store.params |
a logical variable indicating if the parameter estimates at each iteration should be saved and returned by the clustMD function. |
scale |
a logical variable indicating if the continuous variables should be standardised. |
startCL |
a string indicating which clustering method should be used to initialise the (MC)EM algorithm. This may be one of "kmeans" (K means clustering), "hclust" (hierarchical clustering), "mclust" (finite mixture of Gaussian distributions), "hc_mclust" (model-based hierarchical clustering) or "random" (random cluster allocation). |
Ncores |
the number of cores the user would like to use. Must be less than or equal to the number of cores available. |
autoStop |
a logical argument indicating whether the (MC)EM algorithm should use a stopping criterion to decide if convergence has been reached. Otherwise the algorithm will run for MaxIter iterations. If only continuous variables are present the algorithm will use Aitken's acceleration criterion with tolerance stop.tol. If categorical variables are present, the stopping criterion is based on a moving average of the approximated log likelihood values. Let $t$ denote the current iteration; the moving average covers the ma.band most recent iterations up to $t$ and is assessed against the tolerance stop.tol. |
ma.band |
the number of iterations to be included in the moving average stopping criterion. |
stop.tol |
the tolerance of the (MC)EM stopping criterion. |
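A moving-average stopping rule of the kind described for autoStop might look like the sketch below. The comparison of two adjacent windows of length ma.band is an assumption made for illustration, as is the helper name `ma_converged`; the package's exact rule is not reproduced here.

```r
# Hypothetical helper: compare the mean of the most recent ma.band log
# likelihood values with the mean of the ma.band values before them.
ma_converged <- function(ll, ma.band = 50, stop.tol = 1e-4) {
  n <- length(ll)
  if (n < 2 * ma.band) return(FALSE)   # not enough history yet
  recent   <- mean(ll[(n - ma.band + 1):n])
  previous <- mean(ll[(n - 2 * ma.band + 1):(n - ma.band)])
  abs(recent - previous) < stop.tol
}

# Toy trace: the log likelihood climbs for 50 iterations, then stays flat
ll <- c(seq(-200, -100, length.out = 50), rep(-100, 100))
ma_converged(ll[1:60], ma.band = 10)
ma_converged(ll, ma.band = 10)
```

Averaging smooths out the Monte Carlo noise in the approximated log likelihood, which is why a pointwise criterion such as Aitken's is less suitable when categorical variables are present.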
Value
An object of class clustMDparallel is returned. The output components are as follows:
BICarray |
A matrix indicating the estimated BIC values for each of the models fitted. |
results |
A list containing the output for each of the models fitted. Each entry of this list is a clustMD object. |
References
McParland, D. and Gormley, I.C. (2016). Model based clustering for mixed data: clustMD. Advances in Data Analysis and Classification, 10 (2):155-169.
Examples
data(Byar)
# Transform skewed variables
Byar$Size.of.primary.tumour <- sqrt(Byar$Size.of.primary.tumour)
Byar$Serum.prostatic.acid.phosphatase <-
log(Byar$Serum.prostatic.acid.phosphatase)
# Order variables (Continuous, ordinal, nominal)
Y <- as.matrix(Byar[, c(1, 2, 5, 6, 8, 9, 10, 11, 3, 4, 12, 7)])
# Start categorical variables at 1 rather than 0
Y[, 9:12] <- Y[, 9:12] + 1
# Standardise continuous variables
Y[, 1:8] <- scale(Y[, 1:8])
# Merge categories of EKG variable for efficiency
Yekg <- rep(NA, nrow(Y))
Yekg[Y[,12]==1] <- 1
Yekg[(Y[,12]==2)|(Y[,12]==3)|(Y[,12]==4)] <- 2
Yekg[(Y[,12]==5)|(Y[,12]==6)|(Y[,12]==7)] <- 3
Y[, 12] <- Yekg
## Not run:
res <- clustMDparallel(X = Y, G = 1:3, CnsIndx = 8, OrdIndx = 11, Nnorms = 20000,
                       MaxIter = 500, models = c("EVI", "EII", "VII"),
                       store.params = FALSE, scale = TRUE, startCL = "kmeans",
                       autoStop = TRUE, ma.band = 30, stop.tol = 0.0001)
res$BICarray
## End(Not run)
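The general pattern of fitting every combination of G and covariance model in parallel, then collecting one score per combination in a matrix, can be sketched with the parallel package. `fit_stub` and its `score` are hypothetical placeholders for a real model fit and its estimated BIC; this is not how clustMDparallel is implemented.

```r
library(parallel)

# Stand-in for a model fit; a real run would call the fitting routine here
fit_stub <- function(G, model) {
  list(G = G, model = model, score = -(G * 10 + nchar(model)))
}

G <- 1:3
models <- c("EVI", "EII")
# One row per (G, model) combination; G varies fastest
grid <- expand.grid(G = G, model = models, stringsAsFactors = FALSE)

# mc.cores = 1 keeps the sketch portable; raise it where forking is supported
fits <- mclapply(seq_len(nrow(grid)),
                 function(i) fit_stub(grid$G[i], grid$model[i]),
                 mc.cores = 1)

# One row per value of G, one column per covariance model
scores <- matrix(sapply(fits, `[[`, "score"),
                 nrow = length(G), ncol = length(models),
                 dimnames = list(as.character(G), models))
```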
Parallel coordinates plot adapted for clustMD output
Description
Produces a parallel coordinates plot as parcoord in the MASS library with some minor adjustments.
Usage
clustMDparcoord(x, col = 1, xlabels = NULL, lty = 1, var.label = FALSE,
xlab = "", ylab = "", ...)
Arguments
x |
a matrix or data frame whose columns represent variables. Missing values are allowed. |
col |
a vector of colours, recycled as necessary for each observation. |
xlabels |
a character vector of variable names for the x axis. |
lty |
a vector of line types, recycled as necessary for each observation. |
var.label |
if TRUE, each variable's axis is labelled with maximum and minimum values. |
xlab |
label for the X axis. |
ylab |
label for the Y axis. |
... |
further graphics parameters which are passed to |
Value
A parallel coordinates plot is drawn with one line for each cluster.
References
Wegman, E. J. (1990) Hyperdimensional data analysis using parallel coordinates. Journal of the American Statistical Association 85, 664-675.
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
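The core of a parallel coordinates plot is to rescale every variable to [0, 1] and draw one line per row across the variables. A minimal base-graphics sketch follows; `parcoord_sketch` is a hypothetical helper written for illustration and omits the adjustments made by clustMDparcoord.

```r
# Hypothetical helper: rescale columns to [0, 1] and draw one line per row
parcoord_sketch <- function(x, col = 1, lty = 1) {
  x <- as.matrix(x)
  rng <- apply(x, 2, range, na.rm = TRUE)   # row 1 = minima, row 2 = maxima
  scaled <- sweep(sweep(x, 2, rng[1, ]), 2, rng[2, ] - rng[1, ], "/")
  labs <- if (is.null(colnames(scaled))) seq_len(ncol(scaled)) else colnames(scaled)
  matplot(seq_len(ncol(scaled)), t(scaled), type = "l",
          col = col, lty = lty, axes = FALSE, xlab = "", ylab = "")
  axis(1, at = seq_len(ncol(scaled)), labels = labs)
  invisible(scaled)
}

pdf(NULL)  # null device, so no file is written
means <- rbind(c(1, 10, 5), c(3, 20, 0))    # toy cluster means, one row per cluster
colnames(means) <- c("a", "b", "c")
scaled <- parcoord_sketch(means, col = 1:2)
invisible(dev.off())
```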
Return the mean and covariance matrix of a truncated multivariate normal distribution
Description
This function returns the mean and covariance matrix of a truncated multivariate normal distribution. It takes as inputs a vector of lower thresholds and another of upper thresholds along with the mean and covariance matrix of the untruncated distribution. This function follows the method proposed by Kan & Robotti (2016).
Usage
dtmvnom(a, b, mu, S)
Arguments
a |
a vector of lower thresholds. |
b |
a vector of upper thresholds. |
mu |
the mean of the untruncated distribution. |
S |
the covariance matrix of the untruncated distribution. |
Value
Returns a list of two elements. The first element, tmean, is the mean of the truncated multivariate normal distribution. The second element, tvar, is the covariance matrix of the truncated distribution.
References
Kan, R., & Robotti, C. (2016). On Moments of Folded and Truncated Multivariate Normal Distributions. Available at SSRN.
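In one dimension the truncated normal moments have a well-known closed form, which is useful as a sanity check on the idea. The sketch below covers only that univariate special case; `trunc_norm_moments` is a hypothetical helper and does not implement the multivariate method of Kan & Robotti.

```r
# Hypothetical helper: mean and variance of N(mu, sigma^2) truncated to [a, b]
trunc_norm_moments <- function(a, b, mu = 0, sigma = 1) {
  alpha <- (a - mu) / sigma
  beta  <- (b - mu) / sigma
  Z <- pnorm(beta) - pnorm(alpha)          # probability mass inside [a, b]
  # x * dnorm(x) tends to 0 as x -> +/-Inf; handle that limit explicitly
  xphi <- function(x) ifelse(is.finite(x), x * dnorm(x), 0)
  shift <- (dnorm(alpha) - dnorm(beta)) / Z
  tmean <- mu + sigma * shift
  tvar  <- sigma^2 * (1 + (xphi(alpha) - xphi(beta)) / Z - shift^2)
  list(tmean = tmean, tvar = tvar)
}

# Standard normal truncated to the positive half line (half-normal)
m <- trunc_norm_moments(a = 0, b = Inf)
```

For the half-normal case the results should agree with the known values sqrt(2/pi) for the mean and 1 - 2/pi for the variance.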
Extracts relevant output from clustMDparallel object
Description
This function takes a clustMDparallel object, a number of clusters and a covariance model as inputs. It then returns the output corresponding to that model. If the particular model is not contained in the clustMDparallel object then the function returns an error.
Usage
getOutput_clustMDparallel(resParallel, nClus, covModel)
Arguments
resParallel |
a clustMDparallel object. |
nClus |
the number of clusters in the desired output. |
covModel |
the covariance model of the desired output. |
Value
A clustMD object containing the output for the relevant model.
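The lookup described above amounts to searching a list of fitted models for a matching number of clusters and covariance model, and failing if none is found. The sketch below assumes each list entry carries `G` and `model` components; `get_output_sketch` and the toy `results` list are hypothetical.

```r
# Hypothetical helper: find the fit with the requested G and covariance model
get_output_sketch <- function(results, nClus, covModel) {
  hit <- Filter(function(r) r$G == nClus && r$model == covModel, results)
  if (length(hit) == 0) stop("requested model was not fitted")
  hit[[1]]
}

# Toy stand-ins for fitted models
results <- list(list(G = 2, model = "EVI", BIChat = -10),
                list(G = 3, model = "EVI", BIChat = -12))
out <- get_output_sketch(results, nClus = 3, covModel = "EVI")
```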
Calculate the mode of a sample
Description
Calculate the mode of a sample
Usage
modal.value(x)
Arguments
x |
a vector containing the sample values. |
Value
The mode of the sample. In the case of a tie, the minimum is returned.
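The tie-breaking rule stated above can be illustrated with a short base R sketch for numeric samples; `modal_sketch` is a hypothetical name, not the package function.

```r
# Hypothetical helper: most frequent value, smallest value on ties
modal_sketch <- function(x) {
  counts <- table(x)
  tied <- as.numeric(names(counts)[counts == max(counts)])
  min(tied)
}

modal_sketch(c(2, 2, 5))        # 2 occurs most often
modal_sketch(c(3, 1, 3, 1, 2))  # 1 and 3 tie, so the minimum is returned
```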
Calculates the number of free parameters for the clustMD model.
Description
Internal function.
Usage
npars_clustMD(model, D, G, J, CnsIndx, OrdIndx, K)
Arguments
model |
the covariance model fitted to the data. |
D |
the dimension of the latent data. |
G |
the number of mixture components. |
J |
the number of variables. |
CnsIndx |
the number of continuous variables. |
OrdIndx |
the sum of the number of continuous and ordinal (including binary) variables. |
K |
a vector indicating the number of levels of each categorical variable. |
Value
Output required for clustMD function.
Check if response patterns are equal
Description
Checks whether response patterns are equal or not and returns TRUE or FALSE respectively.
Usage
patt.equal(x, patt)
Arguments
x |
a numeric vector. |
patt |
a vector to compare with x. |
Value
Returns TRUE if x and patt are exactly the same and FALSE otherwise.
Note
Used internally in clustMD function.
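A comparison of this kind is what allows observations to be grouped by response pattern, as in the patt.indx argument described elsewhere in this manual. The following is a hedged sketch with hypothetical names (`patt_equal_sketch`, `patt_indx`); it is not the package code.

```r
# Hypothetical helper: TRUE only if both vectors match element by element
patt_equal_sketch <- function(x, patt) {
  length(x) == length(patt) && all(x == patt)
}

# Group the rows of a toy response matrix by their pattern
Y <- rbind(c(1, 2), c(2, 1), c(1, 2))
patterns <- unique(Y)
patt_indx <- lapply(seq_len(nrow(patterns)), function(p) {
  which(apply(Y, 1, patt_equal_sketch, patt = patterns[p, ]))
})
```

Here `patt_indx` has one entry per distinct pattern, each listing the rows of `Y` that share it.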
Calculates the threshold parameters for ordinal variables.
Description
Calculates the threshold parameters for ordinal variables.
Usage
perc.cutoffs(CnsIndx, OrdIndx, Y, N)
Arguments
CnsIndx |
the number of continuous variables. |
OrdIndx |
the sum of the number of continuous and ordinal (including binary) variables. |
Y |
an N x J data matrix. |
N |
number of observations. |
Value
Output required for clustMD function.
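One plausible way to obtain thresholds for an ordinal variable is to map the cumulative sample proportions of its levels through the standard normal quantile function. The sketch below illustrates that construction; whether perc.cutoffs does exactly this is an assumption, and `cutoffs_sketch` is a hypothetical helper.

```r
# Hypothetical helper: thresholds on a standard normal latent scale
# derived from the cumulative proportions of the observed levels.
cutoffs_sketch <- function(y) {
  props <- cumsum(table(y)) / length(y)   # cumulative proportion per level
  c(-Inf, qnorm(unname(props)))           # last threshold is +Inf
}

cuts <- cutoffs_sketch(c(1, 1, 2, 3))
```

A variable with K levels yields K + 1 thresholds, so that level k corresponds to the latent interval between thresholds k and k + 1.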
Plotting method for objects of class clustMD
Description
Plots a parallel coordinates plot and dot plot of the estimated cluster means, a barplot of the variances by cluster for diagonal covariance models or a heatmap of the covariance matrix for non-diagonal covariance structures, and a histogram of the clustering uncertainties for each observation.
Usage
## S3 method for class 'clustMD'
plot(x, ...)
Arguments
x |
a clustMD object. |
... |
further arguments passed to or from other methods. |
Value
Prints graphical summaries of the fitted model as detailed above.
References
McParland, D. and Gormley, I.C. (2016). Model based clustering for mixed data: clustMD. Advances in Data Analysis and Classification, 10 (2):155-169.
Summary plots for a clustMDparallel object
Description
Produces a line plot of the estimated BIC values corresponding to each covariance model against the number of clusters fitted. For the optimal model according to this criterion, a parallel coordinates plot of the cluster means is produced along with a barchart or heatmap of the covariance matrices for each cluster and a histogram of the clustering uncertainties.
Usage
## S3 method for class 'clustMDparallel'
plot(x, ...)
Arguments
x |
a clustMDparallel object. |
... |
further arguments passed to or from other methods. |
Value
Produces a number of plots as detailed above.
Print basic details of clustMD object.
Description
Prints a short summary of a clustMD object to screen. Details the number of clusters fitted as well as the covariance model and the estimated BIC.
Usage
## S3 method for class 'clustMD'
print(x, ...)
Arguments
x |
a clustMD object. |
... |
further arguments passed to or from other methods. |
Value
Prints summary details, as described above, to screen.
Print basic details of clustMDparallel object
Description
Prints basic details of a clustMDparallel object. Outputs the different numbers of clusters and the different covariance structures fitted to the data. It also states which model was optimal according to the estimated BIC criterion.
Usage
## S3 method for class 'clustMDparallel'
print(x, ...)
Arguments
x |
a clustMDparallel object. |
... |
further arguments passed to or from other methods. |
Value
Prints details described above to screen.
Helper internal function for dtmvnom()
Description
Internal function.
Usage
qfun(a, b, S)
Arguments
a |
a vector of lower thresholds. |
b |
a vector of upper thresholds. |
S |
the covariance matrix of the untruncated distribution. |
Value
Output required for dtmvnom function.
References
Kan, R., & Robotti, C. (2016). On Moments of Folded and Truncated Multivariate Normal Distributions. Available at SSRN.
Stable computation of the log of a sum
Description
Function takes a numeric vector and returns the log of the sum of the elements of that vector. Calculations are done on the log scale for stability.
Usage
stable.probs(s)
Arguments
s |
a numeric vector. |
Value
The log of the sum of the elements of s.
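A standard way to compute the log of a sum on the log scale is the log-sum-exp trick. The sketch below assumes the input vector holds log-scale values, which is an interpretation of the description rather than something stated in it; `log_sum_exp` is a hypothetical helper.

```r
# Hypothetical helper: log(sum(exp(s))) computed without overflow or underflow
log_sum_exp <- function(s) {
  m <- max(s)                 # factor out the largest term
  m + log(sum(exp(s - m)))
}

log_sum_exp(log(c(1, 2, 3)))  # equals log(6)
log_sum_exp(c(-1000, -1000))  # naive log(sum(exp(s))) would give -Inf
```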
Summarise clustMD object
Description
Prints a summary of a clustMD object to screen. Details the number of clusters fitted as well as the covariance model and the estimated BIC. Also prints a table detailing the number of observations in each cluster and a matrix of the cluster means.
Usage
## S3 method for class 'clustMD'
summary(object, ...)
Arguments
object |
a clustMD object. |
... |
further arguments passed to or from other methods. |
Value
Prints a summary of the clustMD object to screen, as detailed above.
Prints a summary of a clustMDparallel object to screen.
Description
Prints the different numbers of clusters and covariance models fitted and indicates the optimal model according to the estimated BIC criterion. The estimated BIC for the optimal model is printed to screen along with a table of the cluster membership and the matrix of cluster means for this optimal model.
Usage
## S3 method for class 'clustMDparallel'
summary(object, ...)
Arguments
object |
a clustMDparallel object. |
... |
further arguments passed to or from other methods. |
Value
Prints a summary of the clustMDparallel object to screen, as detailed above.
Calculate the outer product of a vector with itself
Description
This function calculates the outer product of a vector with itself.
Usage
vec.outer(x)
Arguments
x |
a numeric vector. |
Value
Returns the outer product of the vector x with itself.
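For reference, base R offers several equivalent ways to form the outer product of a vector with itself. This is shown for illustration only and is not the package's code.

```r
x <- c(1, 2, 3)

op1 <- x %*% t(x)     # explicit matrix product of a column and a row
op2 <- tcrossprod(x)  # same result, computed as x %*% t(x) internally
op3 <- outer(x, x)    # elementwise products x[i] * x[j]
```

All three return a symmetric 3 x 3 matrix whose (i, j) entry is `x[i] * x[j]`.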
Calculates the first and second moments of the latent data
Description
Internal function.
Usage
z.moments(D, G, N, CnsIndx, OrdIndx, zlimits, mu, Sigma, Y, J, K, norms,
nom.ind.Z, patt.indx)
Arguments
D |
dimension of the latent data. |
G |
number of mixture components. |
N |
number of observations. |
CnsIndx |
the number of continuous variables. |
OrdIndx |
the sum of the number of continuous and ordinal (including binary) variables. |
zlimits |
the truncation points for the latent data. |
mu |
a D x G matrix of means. |
Sigma |
a D x D x G array of covariance parameters. |
Y |
an N x J data matrix. |
J |
the number of variables. |
K |
the number of levels for each variable. |
norms |
a matrix of standard normal deviates. |
nom.ind.Z |
the latent dimensions corresponding to each nominal variable. |
patt.indx |
a list of length equal to the number of observed response patterns. Each entry of the list details the observations for which that response pattern was observed. |
Value
Output required for clustMD function.
Calculates the first and second moments of the latent data for diagonal models
Description
Internal function.
Usage
z.moments_diag(D, G, N, CnsIndx, OrdIndx, zlimits, mu, Sigma, Y, J, K, norms,
nom.ind.Z)
Arguments
D |
dimension of the latent data. |
G |
number of mixture components. |
N |
number of observations. |
CnsIndx |
the number of continuous variables. |
OrdIndx |
the sum of the number of continuous and ordinal (including binary) variables. |
zlimits |
the truncation points for the latent data. |
mu |
a D x G matrix of means. |
Sigma |
a D x D x G array of covariance parameters. |
Y |
an N x J data matrix. |
J |
the number of variables. |
K |
the number of levels for each variable. |
norms |
a matrix of standard normal deviates. |
nom.ind.Z |
the latent dimensions corresponding to each nominal variable. |
Value
Output required for clustMD function.
Transforms Monte Carlo simulated data into categorical data. Calculates empirical moments of latent data given categorical responses.
Description
Internal function.
Usage
z.nom.diag(Z)
Arguments
Z |
a matrix of Monte Carlo simulated data. |
Value
Output required for clustMD function.
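To illustrate how simulated latent data can be turned into categorical responses, the sketch below assumes a common latent representation for a nominal variable with K levels: K - 1 latent dimensions, level 1 when every dimension is negative, and otherwise the level after the dimension holding the maximum. That mapping and the helper name `z_to_category` are assumptions for illustration, not a description of the z.nom.diag source.

```r
# Hypothetical helper: map rows of a latent matrix Z (K - 1 columns)
# to categories 1, ..., K under the assumed rule described above.
z_to_category <- function(Z) {
  top <- max.col(Z, ties.method = "first")     # column holding the row maximum
  zmax <- Z[cbind(seq_len(nrow(Z)), top)]      # value of that maximum
  ifelse(zmax < 0, 1L, top + 1L)
}

set.seed(1)
Z <- matrix(rnorm(2000), ncol = 2)   # K - 1 = 2 latent dimensions, so K = 3
y <- z_to_category(Z)

# Empirical mean of the latent data within each category (one row per level)
emp_means <- rowsum(Z, y) / as.vector(table(y))
```

Conditional moments estimated this way from Monte Carlo draws stand in for expectations that have no closed form.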