| Title: | Covariate Dependent Graph Estimation | 
| Version: | 1.0.1 | 
| Date: | 2022-09-16 | 
| Language: | en-US | 
| BugReports: | https://github.com/JacobHelwig/covdepGE/issues | 
| URL: | https://github.com/JacobHelwig/covdepGE | 
| Description: | A covariate-dependent approach to Gaussian graphical modeling as described in Dasgupta et al. (2022). Employs a novel weighted pseudo-likelihood approach to model the conditional dependence structure of data as a continuous function of an extraneous covariate. The main function, covdepGE::covdepGE(), estimates a graphical representation of the conditional dependence structure via a block mean-field variational approximation, while several auxiliary functions (inclusionCurve(), matViz(), and plot.covdepGE()) are included for visualizing the resulting estimates. | 
| License: | GPL (≥ 3) | 
| Encoding: | UTF-8 | 
| RoxygenNote: | 7.2.1 | 
| LinkingTo: | Rcpp, RcppArmadillo | 
| Imports: | doParallel, foreach, ggplot2, glmnet, latex2exp, MASS, parallel, Rcpp, reshape2, stats | 
| Suggests: | testthat (≥ 3.0.0), covr, vdiffr | 
| Config/testthat/edition: | 3 | 
| NeedsCompilation: | yes | 
| Packaged: | 2022-09-16 15:25:55 UTC; jacob.a.helwig | 
| Author: | Jacob Helwig [cre, aut], Sutanoy Dasgupta [aut], Peng Zhao [aut], Bani Mallick [aut], Debdeep Pati [aut] | 
| Maintainer: | Jacob Helwig <jacob.a.helwig@tamu.edu> | 
| Repository: | CRAN | 
| Date/Publication: | 2022-09-16 15:56:08 UTC | 
covdepGE: Covariate Dependent Graph Estimation
Description
A covariate-dependent approach to Gaussian graphical modeling as described in Dasgupta et al. (2022). Employs a novel weighted pseudo-likelihood approach to model the conditional dependence structure of data as a continuous function of an extraneous covariate. The main function, covdepGE::covdepGE(), estimates a graphical representation of the conditional dependence structure via a block mean-field variational approximation, while several auxiliary functions (inclusionCurve(), matViz(), and plot.covdepGE()) are included for visualizing the resulting estimates.
Author(s)
Maintainer: Jacob Helwig jacob.a.helwig@tamu.edu
Authors:
Sutanoy Dasgupta sutanoy@stat.tamu.edu
Peng Zhao pzhao@stat.tamu.edu
Bani Mallick bmallick@stat.tamu.edu
Debdeep Pati debdeep@stat.tamu.edu
References
(1) Sutanoy Dasgupta, Peng Zhao, Prasenjit Ghosh, Debdeep Pati, and Bani Mallick. An approximate Bayesian approach to covariate-dependent graphical modeling. pages 1–59, 2022.
See Also
Useful links:
Report bugs at https://github.com/JacobHelwig/covdepGE/issues
Covariate Dependent Graph Estimation
Description
Model the conditional dependence structure of X as a function
of Z as described in (1)
Usage
covdepGE(
  X,
  Z = NULL,
  hp_method = "hybrid",
  ssq = NULL,
  sbsq = NULL,
  pip = NULL,
  nssq = 5,
  nsbsq = 5,
  npip = 5,
  ssq_mult = 1.5,
  ssq_lower = 1e-05,
  snr_upper = 25,
  sbsq_lower = 1e-05,
  pip_lower = 1e-05,
  pip_upper = NULL,
  tau = NULL,
  norm = 2,
  center_X = TRUE,
  scale_Z = TRUE,
  alpha_tol = 1e-05,
  max_iter_grid = 10,
  max_iter = 100,
  edge_threshold = 0.5,
  sym_method = "mean",
  parallel = FALSE,
  num_workers = NULL,
  prog_bar = TRUE
)
Arguments
X | 
 
  | 
Z | 
 
 Z <- rep(0, nrow(X)) If   | 
hp_method | 
 
 
 
  | 
ssq | 
 
 ssq <- seq(ssq_lower, ssq_upper, length.out = nssq) 
  | 
sbsq | 
 
 sbsq <- seq(sbsq_lower, sbsq_upper, length.out = nsbsq) 
  | 
pip | 
 
 pip <- seq(pip_lower, pi_upper, length.out = npip) 
  | 
nssq | 
 positive integer; number of points to generate for   | 
nsbsq | 
 positive integer; number of points to generate for   | 
npip | 
 positive integer; number of points to generate for   | 
ssq_mult | 
 positive numeric; if  ssq_upper <- ssq_mult * stats::var(X_j) Then,   | 
ssq_lower | 
 positive numeric; if   | 
snr_upper | 
 positive numeric; upper bound on the signal-to-noise ratio.
If  s2_sum <- sum(apply(X, 2, stats::var)) sbsq_upper <- snr_upper / (pip_upper * s2_sum) Then,   | 
sbsq_lower | 
 positive numeric; if   | 
pip_lower | 
 numeric in   | 
pip_upper | 
 
 lasso <- glmnet::cv.glmnet(X, X_j) non0 <- sum(glmnet::coef.glmnet(lasso, s = "lambda.1se")[-1] != 0) non0 <- min(max(non0, 1), p - 1) pip_upper <- non0 / p 
  | 
tau | 
 
  | 
norm | 
 numeric in   | 
center_X | 
 logical; if   | 
scale_Z | 
 logical; if   | 
alpha_tol | 
 positive numeric; end CAVI when the Frobenius norm of the
change in the alpha matrix is within   | 
max_iter_grid | 
 positive integer; if tolerance criteria has not been
met by   | 
max_iter | 
 positive integer; if tolerance criteria has not been met by
  | 
edge_threshold | 
 numeric in   | 
sym_method | 
 
  | 
parallel | 
 logical; if  doParallel::registerDoParallel(num_workers) 
  | 
num_workers | 
 
 num_workers <- floor(parallel::detectCores() / 2) 
  | 
prog_bar | 
 logical; if   | 
Value
Returns object of class covdepGE with the following values:
graphs | 
 list with the following values: 
  | 
variational_params | 
 list with the following values: 
  | 
hyperparameters | 
 list of  
  | 
model_details | 
 list with the following values: 
  | 
weights | 
 list with the following values: 
  | 
References
(1) Sutanoy Dasgupta, Peng Zhao, Prasenjit Ghosh, Debdeep Pati, and Bani Mallick. An approximate Bayesian approach to covariate-dependent graphical modeling. pages 1–59, 2022.
(2) Sutanoy Dasgupta, Debdeep Pati, and Anuj Srivastava. A Two-Step Geometric Framework For Density Modeling. Statistica Sinica, 30(4):2155–2177, 2020.
Examples
## Not run: 
library(ggplot2)
# get the data
set.seed(12)
data <- generateData()
X <- data$X
Z <- data$Z
interval <- data$interval
prec <- data$true_precision
# get overall and within interval sample sizes
n <- nrow(X)
n1 <- sum(interval == 1)
n2 <- sum(interval == 2)
n3 <- sum(interval == 3)
# visualize the distribution of the extraneous covariate
ggplot(data.frame(Z = Z, interval = as.factor(interval))) +
  geom_histogram(aes(Z, fill = interval), color = "black", bins = n %/% 5)
# visualize the true precision matrices in each of the intervals
# interval 1
matViz(prec[[1]], incl_val = TRUE) +
  ggtitle(paste0("True precision matrix, interval 1, observations 1,...,", n1))
# interval 2 (varies continuously with Z)
cat("\nInterval 2, observations ", n1 + 1, ",...,", n1 + n2, sep = "")
int2_mats <- prec[interval == 2]
int2_inds <- c(5, n2 %/% 2, n2 - 5)
lapply(int2_inds, function(j) matViz(int2_mats[[j]], incl_val = TRUE) +
         ggtitle(paste("True precision matrix, interval 2, observation", j + n1)))
# interval 3
matViz(prec[[length(prec)]], incl_val = TRUE) +
  ggtitle(paste0("True precision matrix, interval 3, observations ",
                 n1 + n2 + 1, ",...,", n1 + n2 + n3))
# fit the model and visualize the estimated graphs
(out <- covdepGE(X, Z))
plot(out)
# visualize the posterior inclusion probabilities for variables (1, 3) and (1, 2)
inclusionCurve(out, 1, 2)
inclusionCurve(out, 1, 3)
## End(Not run)
Generate Covariate-Dependent Data
Description
Generate a 1-dimensional extraneous covariate
and p-dimensional Gaussian data with a precision matrix that varies as
a continuous function of the extraneous covariate. This data is distributed
similar to that used in the simulation study from (1)
Usage
generateData(p = 5, n1 = 60, n2 = 60, n3 = 60, Z = NULL, true_precision = NULL)
Arguments
p | 
 positive integer; number of variables in the data matrix.   | 
n1 | 
 positive integer; number of observations in the first interval.
  | 
n2 | 
 positive integer; number of observations in the second interval.
  | 
n3 | 
 positive integer; number of observations in the third interval.
  | 
Z | 
 
  | 
true_precision | 
 
  | 
Value
Returns list with the following values:
X | 
 a   | 
Z | 
 a   | 
true_precision | 
 list of   | 
interval | 
 vector of length   | 
Extraneous Covariate
If Z = NULL, then the generation of Z is as follows:
The first n1 observations have z_i from from a uniform
distribution on the interval (-3, -1) (the first interval).
Observations n1 + 1 to n1 + n2 have z_i from from a uniform
distribution on the interval (-1, 1) (the second interval).
Observations n1 + n2 + 1 to n1 + n2 + n3 have z_i from a
uniform distribution on the interval (1, 3) (the third interval).
Precision Matrices
If true_precision = NULL, then the generation of the true precision
matrices is as follows:
All precision matrices have 2 on the diagonal and 1 in the
(2, 3)/ (3, 2) positions.
Observations in the first interval have a 1 in the
(1, 2) / (1, 2) positions, while observations in the third interval
have a 1 in the (1, 3)/ (3, 1) positions.
Observations in the second interval have 2 entries that vary as a
linear function of their extraneous covariate. Let
\beta = 1/2. Then, the (1, 2)/(2, 1) positions for
the i-th observation in the second interval are
\beta\cdot(1 - z_i), while the (1, 3)/ (3, 1)
entries are \beta\cdot(1 + z_i).
Thus, as z_i approaches -1 from the right, the associated
precision matrix becomes more similar to the matrix for observations in the
first interval. Similarly, as z_i approaches 1 from the left,
the matrix becomes more similar to the matrix for observations in the third
interval.
Examples
## Not run: 
library(ggplot2)
# get the data
set.seed(12)
data <- generateData()
X <- data$X
Z <- data$Z
interval <- data$interval
prec <- data$true_precision
# get overall and within interval sample sizes
n <- nrow(X)
n1 <- sum(interval == 1)
n2 <- sum(interval == 2)
n3 <- sum(interval == 3)
# visualize the distribution of the extraneous covariate
ggplot(data.frame(Z = Z, interval = as.factor(interval))) +
  geom_histogram(aes(Z, fill = interval), color = "black", bins = n %/% 5)
# visualize the true precision matrices in each of the intervals
# interval 1
matViz(prec[[1]], incl_val = TRUE) +
  ggtitle(paste0("True precision matrix, interval 1, observations 1,...,", n1))
# interval 2 (varies continuously with Z)
cat("\nInterval 2, observations ", n1 + 1, ",...,", n1 + n2, sep = "")
int2_mats <- prec[interval == 2]
int2_inds <- c(5, n2 %/% 2, n2 - 5)
lapply(int2_inds, function(j) matViz(int2_mats[[j]], incl_val = TRUE) +
         ggtitle(paste("True precision matrix, interval 2, observation", j + n1)))
# interval 3
matViz(prec[[length(prec)]], incl_val = TRUE) +
  ggtitle(paste0("True precision matrix, interval 3, observations ",
                 n1 + n2 + 1, ",...,", n1 + n2 + n3))
# fit the model and visualize the estimated graphs
(out <- covdepGE(X, Z))
plot(out)
# visualize the posterior inclusion probabilities for variables (1, 3) and (1, 2)
inclusionCurve(out, 1, 2)
inclusionCurve(out, 1, 3)
## End(Not run)
Plot PIP as a Function of Index
Description
Plot the posterior inclusion probability of an edge between two variables as a function of observation index
Usage
inclusionCurve(
  out,
  col_idx1,
  col_idx2,
  line_type = "solid",
  line_size = 0.5,
  line_color = "black",
  point_shape = 21,
  point_size = 1.5,
  point_color = "#500000",
  point_fill = "white"
)
Arguments
out | 
 object of class   | 
col_idx1 | 
 integer in   | 
col_idx2 | 
 integer in   | 
line_type | 
 linetype;   | 
line_size | 
 positive numeric; thickness of the interpolating line.
  | 
line_color | 
 color; color of interpolating line.   | 
point_shape | 
 shape; shape of the points denoting observation-specific
inclusion probabilities;   | 
point_size | 
 positive numeric; size of probability points.   | 
point_color | 
 color; color of probability points.   | 
point_fill | 
 color; fill of probability points. Only applies to select
shapes.   | 
Value
Returns ggplot2 visualization of inclusion probability curve
Examples
## Not run: 
library(ggplot2)
# get the data
set.seed(12)
data <- generateData()
X <- data$X
Z <- data$Z
interval <- data$interval
prec <- data$true_precision
# get overall and within interval sample sizes
n <- nrow(X)
n1 <- sum(interval == 1)
n2 <- sum(interval == 2)
n3 <- sum(interval == 3)
# visualize the distribution of the extraneous covariate
ggplot(data.frame(Z = Z, interval = as.factor(interval))) +
  geom_histogram(aes(Z, fill = interval), color = "black", bins = n %/% 5)
# visualize the true precision matrices in each of the intervals
# interval 1
matViz(prec[[1]], incl_val = TRUE) +
  ggtitle(paste0("True precision matrix, interval 1, observations 1,...,", n1))
# interval 2 (varies continuously with Z)
cat("\nInterval 2, observations ", n1 + 1, ",...,", n1 + n2, sep = "")
int2_mats <- prec[interval == 2]
int2_inds <- c(5, n2 %/% 2, n2 - 5)
lapply(int2_inds, function(j) matViz(int2_mats[[j]], incl_val = TRUE) +
         ggtitle(paste("True precision matrix, interval 2, observation", j + n1)))
# interval 3
matViz(prec[[length(prec)]], incl_val = TRUE) +
  ggtitle(paste0("True precision matrix, interval 3, observations ",
                 n1 + n2 + 1, ",...,", n1 + n2 + n3))
# fit the model and visualize the estimated graphs
(out <- covdepGE(X, Z))
plot(out)
# visualize the posterior inclusion probabilities for variables (1, 3) and (1, 2)
inclusionCurve(out, 1, 2)
inclusionCurve(out, 1, 3)
## End(Not run)
Visualize a matrix
Description
Create a visualization of a matrix
Usage
matViz(
  x,
  color1 = "white",
  color2 = "#500000",
  grid_color = "black",
  incl_val = FALSE,
  prec = 2,
  font_size = 3,
  font_color1 = "black",
  font_color2 = "white",
  font_thres = mean(x)
)
Arguments
x | 
 matrix; matrix to be visualized  | 
color1 | 
 color; color for low entries.   | 
color2 | 
 color; color for high entries.   | 
grid_color | 
 color; color of grid lines.   | 
incl_val | 
 logical; if   | 
prec | 
 positive integer; number of decimal places to round entries to if
  | 
font_size | 
 positive numeric; size of font if   | 
font_color1 | 
 color; color of font for low entries if   | 
font_color2 | 
 color; color of font for high entries if   | 
font_thres | 
 numeric; values less than   | 
Value
Returns ggplot2 visualization of matrix
Examples
## Not run: 
library(ggplot2)
# get the data
set.seed(12)
data <- generateData()
X <- data$X
Z <- data$Z
interval <- data$interval
prec <- data$true_precision
# get overall and within interval sample sizes
n <- nrow(X)
n1 <- sum(interval == 1)
n2 <- sum(interval == 2)
n3 <- sum(interval == 3)
# visualize the distribution of the extraneous covariate
ggplot(data.frame(Z = Z, interval = as.factor(interval))) +
  geom_histogram(aes(Z, fill = interval), color = "black", bins = n %/% 5)
# visualize the true precision matrices in each of the intervals
# interval 1
matViz(prec[[1]], incl_val = TRUE) +
  ggtitle(paste0("True precision matrix, interval 1, observations 1,...,", n1))
# interval 2 (varies continuously with Z)
cat("\nInterval 2, observations ", n1 + 1, ",...,", n1 + n2, sep = "")
int2_mats <- prec[interval == 2]
int2_inds <- c(5, n2 %/% 2, n2 - 5)
lapply(int2_inds, function(j) matViz(int2_mats[[j]], incl_val = TRUE) +
         ggtitle(paste("True precision matrix, interval 2, observation", j + n1)))
# interval 3
matViz(prec[[length(prec)]], incl_val = TRUE) +
  ggtitle(paste0("True precision matrix, interval 3, observations ",
                 n1 + n2 + 1, ",...,", n1 + n2 + n3))
# fit the model and visualize the estimated graphs
(out <- covdepGE(X, Z))
plot(out)
# visualize the posterior inclusion probabilities for variables (1, 3) and (1, 2)
inclusionCurve(out, 1, 2)
inclusionCurve(out, 1, 3)
## End(Not run)
Plot the Graphs Estimated by covdepGE
Description
Create a list of the unique graphs estimated by covdepGE
Usage
## S3 method for class 'covdepGE'
plot(x, graph_colors = NULL, title_sum = TRUE, ...)
Arguments
x | 
 object of class   | 
graph_colors | 
 
  | 
title_sum | 
 logical; if   | 
... | 
 additional arguments will be ignored  | 
Value
Returns list of ggplot2 visualizations of unique graphs estimated
by covdepGE
Examples
## Not run: 
library(ggplot2)
# get the data
set.seed(12)
data <- generateData()
X <- data$X
Z <- data$Z
interval <- data$interval
prec <- data$true_precision
# get overall and within interval sample sizes
n <- nrow(X)
n1 <- sum(interval == 1)
n2 <- sum(interval == 2)
n3 <- sum(interval == 3)
# visualize the distribution of the extraneous covariate
ggplot(data.frame(Z = Z, interval = as.factor(interval))) +
  geom_histogram(aes(Z, fill = interval), color = "black", bins = n %/% 5)
# visualize the true precision matrices in each of the intervals
# interval 1
matViz(prec[[1]], incl_val = TRUE) +
  ggtitle(paste0("True precision matrix, interval 1, observations 1,...,", n1))
# interval 2 (varies continuously with Z)
cat("\nInterval 2, observations ", n1 + 1, ",...,", n1 + n2, sep = "")
int2_mats <- prec[interval == 2]
int2_inds <- c(5, n2 %/% 2, n2 - 5)
lapply(int2_inds, function(j) matViz(int2_mats[[j]], incl_val = TRUE) +
         ggtitle(paste("True precision matrix, interval 2, observation", j + n1)))
# interval 3
matViz(prec[[length(prec)]], incl_val = TRUE) +
  ggtitle(paste0("True precision matrix, interval 3, observations ",
                 n1 + n2 + 1, ",...,", n1 + n2 + n3))
# fit the model and visualize the estimated graphs
(out <- covdepGE(X, Z))
plot(out)
# visualize the posterior inclusion probabilities for variables (1, 3) and (1, 2)
inclusionCurve(out, 1, 2)
inclusionCurve(out, 1, 3)
## End(Not run)