Title: | Multi-Omic Differentially Expressed Gene-Gene Pairs |
Version: | 1.0.0 |
Maintainer: | Elisabetta Sciacca <e.sciacca@qmul.ac.uk> |
Description: | Performs multi-omic differential network analysis by revealing differential interactions between molecular entities (genes, proteins, transcription factors, or other biomolecules) across the omic datasets provided. For each omic dataset, a differential network is constructed where links represent statistically significant differential interactions between entities. These networks are then integrated into a comprehensive visualization using distinct colors to distinguish interactions from different omic layers. This unified display allows interactive exploration of cross-omic patterns, such as differential interactions present at both transcript and protein levels. For each link, users can access differential statistical significance metrics (p values or adjusted p values, calculated via robust or traditional linear regression with interaction term) and differential regression plots. The methods implemented in this package are described in Sciacca et al. (2023) <doi:10.1093/bioinformatics/btad192>. |
License: | GPL-3 |
Encoding: | UTF-8 |
LazyData: | true |
LazyDataCompression: | gzip |
RoxygenNote: | 7.3.2 |
Language: | en-gb |
URL: | https://github.com/elisabettasciacca/multiDEGGs/ |
BugReports: | https://github.com/elisabettasciacca/multiDEGGs/issues |
Suggests: | qvalue, testthat (≥ 3.0.0) |
Imports: | DT, grDevices, graphics, knitr, MASS, magrittr, methods, parallel, pbapply, pbmcapply, rmarkdown, sfsmisc, shiny, shinydashboard, stats, utils, visNetwork |
Depends: | R (≥ 4.4.0) |
VignetteBuilder: | knitr |
Config/testthat/edition: | 3 |
NeedsCompilation: | no |
Packaged: | 2025-06-03 17:59:16 UTC; elisabetta |
Author: | Elisabetta Sciacca |
Repository: | CRAN |
Date/Publication: | 2025-06-05 11:10:02 UTC |
Interactive visualisation of differential networks
Description
Explore differential networks and interactively select regression and box plots
Usage
View_diffNetworks(deggs_object, legend.arrow.width = 0.35, stepY_legend = 55)
Arguments
deggs_object |
an object of class |
legend.arrow.width |
width of the arrow used in the network legend. Default is 0.35. As the number of assayData matrices increases, this parameter must be increased accordingly to avoid graphical errors in the legend. |
stepY_legend |
vertical space between legend arrows. It is used together
with |
Value
a shiny interface showing networks with selectable nodes and links
Examples
data("synthetic_metadata")
data("synthetic_rnaseqData")
data("synthetic_proteomicData")
data("synthetic_OlinkData")
assayData_list <- list("RNAseq" = synthetic_rnaseqData,
"Proteomics" = synthetic_proteomicData,
"Olink" = synthetic_OlinkData)
deggs_object <- get_diffNetworks(assayData = assayData_list,
metadata = synthetic_metadata,
category_variable = "response",
regression_method = "lm",
verbose = FALSE,
show_progressBar = FALSE,
cores = 1)
# the function below runs a shiny app, so it can't be run during R CMD check
if(interactive()){
View_diffNetworks(deggs_object)
}
Calculate the p values for specific category network samples
Description
Calculate the p values for specific category network samples
Usage
calc_pvalues_network(
assayData,
metadata,
padj_method,
categories_length,
regression_method = "lm",
category_network
)
Arguments
assayData |
a matrix or data.frame (or list of matrices or data.frames for multi-omic analysis) containing normalised assay data. Sample IDs must be in columns and probe IDs (genes, proteins...) in rows. For multi omic analysis, it is highly recommended to use a named list of data. If unnamed, sequential names (assayData1, assayData2, etc.) will be assigned to identify each matrix or data.frame. |
metadata |
a named vector, matrix, or data.frame containing sample
annotations or categories. If matrix or data.frame, each row should
correspond to a sample, with columns representing different sample
characteristics (e.g., treatment group, condition, time point). The colname
of the sample characteristic to be used for differential analysis must be
specified in |
padj_method |
a character string indicating the p values correction
method for multiple test adjustment. It can be either one of the methods
provided by the |
categories_length |
integer number indicating the number of categories |
regression_method |
whether to use robust linear modelling to calculate link p values. Options are 'lm' (default) or 'rlm'. The lm implementation is faster and lighter. |
category_network |
network table for a specific category |
Value
a list of p values
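As a rough illustration of what a link p value represents, the sketch below fits an ordinary linear model with an interaction term on simulated data and extracts the p value of that term. The variable names, the simulated values and the use of anova() are assumptions made for illustration; this is not the package's internal code.

```r
# Minimal sketch on simulated data (not the package's internal implementation)
set.seed(1)
n <- 40
category <- factor(rep(c("Responder", "Non-responder"), each = n / 2))
gene_A <- rnorm(n)
# gene_B follows gene_A with a slope that differs between the two categories
slope <- ifelse(category == "Responder", 1.5, -0.5)
gene_B <- slope * gene_A + rnorm(n, sd = 0.5)

# Linear model with interaction term: gene_B ~ gene_A * category
fit <- lm(gene_B ~ gene_A * category)

# p value of the interaction term, i.e. evidence that the gene_A/gene_B
# relationship differs across categories
anova_table <- anova(fit)
p_interaction <- anova_table["gene_A:category", "Pr(>F)"]
p_interaction
```

A small p value here indicates a differential interaction, which is what a link in a differential network is meant to represent.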
Compute interaction p values for a single percentile value
Description
Compute interaction p values for a single percentile value
Usage
calc_pvalues_percentile(
assayData,
metadata,
categories_length,
category_median_list,
padj_method,
percentile,
contrasts,
regression_method,
edges,
sig_edges_count
)
Arguments
assayData |
a matrix or data.frame (or list of matrices or data.frames for multi-omic analysis) containing normalised assay data. Sample IDs must be in columns and probe IDs (genes, proteins...) in rows. For multi omic analysis, it is highly recommended to use a named list of data. If unnamed, sequential names (assayData1, assayData2, etc.) will be assigned to identify each matrix or data.frame. |
metadata |
a named vector, matrix, or data.frame containing sample
annotations or categories. If matrix or data.frame, each row should
correspond to a sample, with columns representing different sample
characteristics (e.g., treatment group, condition, time point). The colname
of the sample characteristic to be used for differential analysis must be
specified in |
categories_length |
integer number indicating the number of categories |
category_median_list |
list of category data.frames |
padj_method |
a character string indicating the p values correction
method for multiple test adjustment. It can be either one of the methods
provided by the |
percentile |
a float number indicating the percentile to use. |
contrasts |
data.frame containing the categories contrasts in rows |
regression_method |
whether to use robust linear modelling to calculate link p values. Options are 'lm' (default) or 'rlm'. The lm implementation is faster and lighter. |
edges |
network of biological interactions in the form of a table of class data.frame with two columns: "from" and "to". |
sig_edges_count |
number of significant edges (p < 0.05) |
Value
a list of numeric values: the significant p values for a single percentile
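The adjustment and counting step can be pictured with the sketch below. It assumes that padj_method is passed to stats::p.adjust and uses a made-up edge table with made-up p values; the actual internals may differ.

```r
# Hypothetical edge table and raw interaction p values (illustrative only)
edges <- data.frame(from = c("A", "B", "C", "D", "E"),
                    to   = c("B", "C", "D", "E", "A"),
                    stringsAsFactors = FALSE)
p_raw <- c(0.001, 0.004, 0.02, 0.20, 0.60)

# Multiple-test adjustment with a method accepted by stats::p.adjust
edges$p_adj <- p.adjust(p_raw, method = "bonferroni")

# Number of significant edges (adjusted p < 0.05)
sig_edges_count <- sum(edges$p_adj < 0.05)
sig_edges_count
```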
Generate multi-omic differential networks
Description
Generate a multi-layer differential network with interaction p values
Usage
get_diffNetworks(
assayData,
metadata,
category_variable = NULL,
regression_method = "lm",
category_subset = NULL,
network = NULL,
percentile_vector = seq(0.35, 0.98, by = 0.05),
padj_method = "bonferroni",
show_progressBar = TRUE,
verbose = TRUE,
cores = parallel::detectCores()/2
)
Arguments
assayData |
a matrix or data.frame (or list of matrices or data.frames for multi-omic analysis) containing normalised assay data. Sample IDs must be in columns and probe IDs (genes, proteins...) in rows. For multi omic analysis, it is highly recommended to use a named list of data. If unnamed, sequential names (assayData1, assayData2, etc.) will be assigned to identify each matrix or data.frame. |
metadata |
a named vector, matrix, or data.frame containing sample
annotations or categories. If matrix or data.frame, each row should
correspond to a sample, with columns representing different sample
characteristics (e.g., treatment group, condition, time point). The colname
of the sample characteristic to be used for differential analysis must be
specified in |
category_variable |
when metadata is a matrix or data.frame this is the
column name of |
regression_method |
whether to use robust linear modelling to calculate link p values. Options are 'lm' (default) or 'rlm'. The lm implementation is faster and lighter. |
category_subset |
optional character vector indicating a subset of
categories from the category variable. If not specified, all categories in
|
network |
network of biological interactions provided by the user. The
network must be provided in the form of a table of class data.frame with only
two columns named "from" and "to".
If NULL (default) a network of 10,537 molecular interactions obtained from
KEGG, mirTARbase, miRecords and transmiR will be used.
This has been obtained via the |
percentile_vector |
a numeric vector specifying the percentiles to be
used in the percolation analysis. By default, it is defined as
|
padj_method |
a character string indicating the p values correction
method for multiple test adjustment. It can be either one of the methods
provided by the |
show_progressBar |
logical. Whether to display a progress bar during execution. Default is TRUE. |
verbose |
logical. Whether to print detailed output messages during processing. Default is TRUE. |
cores |
number of cores to use for parallelisation. |
Value
a deggs
object containing differential networks incorporating
p values or adjusted p values for each link.
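For the network argument, a user-supplied table might be prepared as in the sketch below. The gene symbols are placeholders and the shape check is only a suggestion, not something the package is known to require in this exact form.

```r
# Hypothetical user-supplied network: a data.frame with exactly two columns,
# "from" and "to" (gene symbols are placeholders)
custom_network <- data.frame(
  from = c("MTOR", "AKT2", "TNF"),
  to   = c("AKT2", "FOXO3", "IL6"),
  stringsAsFactors = FALSE
)

# Basic shape check before passing the table as the `network` argument
stopifnot(is.data.frame(custom_network),
          identical(colnames(custom_network), c("from", "to")))
```

The resulting table could then be supplied as network = custom_network in the call to get_diffNetworks.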
Examples
# Single omic analysis:
data("synthetic_metadata")
data("synthetic_rnaseqData")
deggs_object <- get_diffNetworks(assayData = synthetic_rnaseqData,
metadata = synthetic_metadata,
category_variable = "response",
regression_method = "lm",
padj_method = "bonferroni",
verbose = FALSE,
show_progressBar = FALSE,
cores = 1)
# multi-omic analysis:
data("synthetic_metadata")
data("synthetic_rnaseqData")
data("synthetic_proteomicData")
data("synthetic_OlinkData")
assayData_list <- list("RNAseq" = synthetic_rnaseqData,
"Proteomics" = synthetic_proteomicData,
"Olink" = synthetic_OlinkData)
deggs_object <- get_diffNetworks(assayData = assayData_list,
metadata = synthetic_metadata,
category_variable = "response",
regression_method = "lm",
padj_method = "bonferroni",
verbose = FALSE,
show_progressBar = FALSE,
cores = 1)
# to use only certain categories for comparison:
# let's randomly add another level of response to the example metadata
indices <- sample(1:nrow(synthetic_metadata), 20, replace = FALSE)
synthetic_metadata$response[indices] <- "Moderate response"
deggs_object <- get_diffNetworks(assayData = assayData_list,
metadata = synthetic_metadata,
category_variable = "response",
category_subset = c("Responder",
"Non-responder"),
regression_method = "lm",
verbose = FALSE,
show_progressBar = FALSE,
cores = 1)
# to be more generous on the targets to be excluded, and lower the expression
# level threshold to the 25th percentile (or lower):
deggs_object <- get_diffNetworks(assayData = assayData_list,
metadata = synthetic_metadata,
category_variable = "response",
category_subset = c("Responder",
"Non-responder"),
regression_method = "lm",
percentile_vector = seq(0.25, 0.98, by = 0.05),
verbose = FALSE,
show_progressBar = FALSE,
cores = 1)
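To see what the percentile grids in the examples above contain, the sketch below expands both sequences and, as an assumed illustration of an expression-level threshold, computes the corresponding quantiles of a simulated expression vector. How the package applies these thresholds internally is not shown here.

```r
# Default grid: starts at 0.35 and increases in steps of 0.05 up to 0.98
percentile_vector <- seq(0.35, 0.98, by = 0.05)
length(percentile_vector)

# Grid with a lower starting point, as in the last example above
wider_vector <- seq(0.25, 0.98, by = 0.05)
length(wider_vector)

# Assumed illustration: expression values at each percentile of a simulated gene
set.seed(1)
expression <- rnorm(100, mean = 8, sd = 2)
thresholds <- quantile(expression, probs = percentile_vector)
thresholds
```

Because the step is 0.05, neither grid reaches 0.98 exactly; both stop at 0.95.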
Generate differential networks for single omic analysis
Description
Generate differential networks for single omic analysis
Usage
get_diffNetworks_singleOmic(
assayData,
assayDataName,
metadata,
regression_method,
network,
percentile_vector,
padj_method,
show_progressBar,
verbose,
cores
)
Arguments
assayData |
a matrix or data.frame (or list of matrices or data.frames for multi-omic analysis) containing normalised assay data. Sample IDs must be in columns and probe IDs (genes, proteins...) in rows. For multi omic analysis, it is highly recommended to use a named list of data. If unnamed, sequential names (assayData1, assayData2, etc.) will be assigned to identify each matrix or data.frame. |
assayDataName |
name of the assayData, to identify which omic it is. |
metadata |
a named vector, matrix, or data.frame containing sample
annotations or categories. If matrix or data.frame, each row should
correspond to a sample, with columns representing different sample
characteristics (e.g., treatment group, condition, time point). The colname
of the sample characteristic to be used for differential analysis must be
specified in |
regression_method |
whether to use robust linear modelling to calculate link p values. Options are 'lm' (default) or 'rlm'. The lm implementation is faster and lighter. |
network |
network of biological interactions provided by the user. The
network must be provided in the form of a table of class data.frame with only
two columns named "from" and "to".
If NULL (default) a network of 10,537 molecular interactions obtained from
KEGG, mirTARbase, miRecords and transmiR will be used.
This has been obtained via the |
percentile_vector |
a numeric vector specifying the percentiles to be
used in the percolation analysis. By default, it is defined as
|
padj_method |
a character string indicating the p values correction
method for multiple test adjustment. It can be either one of the methods
provided by the |
show_progressBar |
logical. Whether to display a progress bar during execution. Default is TRUE. |
verbose |
logical. Whether to print detailed output messages during processing. Default is TRUE. |
cores |
number of cores to use for parallelisation. |
Value
a list of differential networks, one per category
Get a table of all significant interactions across categories
Description
Get a table of all significant interactions across categories
Usage
get_multiOmics_diffNetworks(deggs_object, sig_threshold = 0.05)
Arguments
deggs_object |
an object of class |
sig_threshold |
threshold for significance. Default 0.05. |
Value
a list of multilayer networks (as edge tables), one per category.
Examples
data("synthetic_metadata")
data("synthetic_rnaseqData")
data("synthetic_proteomicData")
data("synthetic_OlinkData")
assayData_list <- list("RNAseq" = synthetic_rnaseqData,
"Proteomics" = synthetic_proteomicData,
"Olink" = synthetic_OlinkData)
deggs_object <- get_diffNetworks(assayData = assayData_list,
metadata = synthetic_metadata,
category_variable = "response",
verbose = FALSE,
show_progressBar = FALSE,
cores = 2)
get_multiOmics_diffNetworks(deggs_object, sig_threshold = 0.05)
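Conceptually, a multilayer edge table can be thought of as the significant links of each omic layer stacked together and labelled by layer. The sketch below shows that idea on made-up tables; the column names (from, to, p.value, layer) are assumptions and may not match the package's actual output.

```r
# Hypothetical per-omic edge tables with p values (illustrative only)
layers <- list(
  RNAseq     = data.frame(from = c("A", "B"), to = c("B", "C"),
                          p.value = c(0.01, 0.30), stringsAsFactors = FALSE),
  Proteomics = data.frame(from = c("A", "C"), to = c("B", "D"),
                          p.value = c(0.03, 0.04), stringsAsFactors = FALSE)
)
sig_threshold <- 0.05

# Keep significant links in each layer and tag them with the layer name
sig_layers <- lapply(names(layers), function(layer_name) {
  tab <- layers[[layer_name]]
  tab <- tab[tab$p.value < sig_threshold, , drop = FALSE]
  tab$layer <- rep(layer_name, nrow(tab))
  tab
})
multilayer <- do.call(rbind, sig_layers)

# Links that are significant in more than one omic layer
link_id <- paste(multilayer$from, multilayer$to, sep = "-")
shared_links <- unique(link_id[duplicated(link_id)])
shared_links
```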
Get a table of all the significant interactions across categories
Description
Get a table of all the significant interactions across categories
Usage
get_sig_deggs(deggs_object, assayDataName = 1, sig_threshold = 0.05)
Arguments
deggs_object |
an object of class |
assayDataName |
name of the assayData of interest. If an unnamed list of
data was given to |
sig_threshold |
threshold for significance. Default 0.05. |
Value
a data.frame
listing all the significant differential interactions
found across categories for that particular omic data.
This list can also be used to substitute or integrate feature selection in
machine learning models for the prediction of the categories (see vignette).
Examples
data("synthetic_metadata")
data("synthetic_rnaseqData")
deggs_object <- get_diffNetworks(assayData = synthetic_rnaseqData,
metadata = synthetic_metadata,
category_variable = "response",
verbose = FALSE,
show_progressBar = FALSE,
cores = 2)
get_sig_deggs(deggs_object, sig_threshold = 0.05)
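One way the table of significant pairs might feed a machine learning workflow is to restrict the assay data to the targets involved in those pairs. The sketch below assumes the table has from and to columns and uses a simulated expression matrix; see the vignette for the approach actually recommended by the package.

```r
# Hypothetical table of significant pairs (column names are assumptions)
sig_pairs <- data.frame(from = c("G1", "G2"), to = c("G3", "G1"),
                        stringsAsFactors = FALSE)

# Simulated expression matrix: genes in rows, samples in columns
set.seed(1)
expr <- matrix(rnorm(5 * 6), nrow = 5,
               dimnames = list(paste0("G", 1:5), paste0("S", 1:6)))

# Keep only the genes that take part in at least one significant pair
selected_genes <- unique(c(sig_pairs$from, sig_pairs$to))

# Samples in rows and selected genes in columns, the layout most
# modelling functions expect
features <- t(expr[selected_genes, , drop = FALSE])
dim(features)
```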
Internal function for colors
Description
This function returns a color palette with the number of colors specified by n
Usage
my_palette(n)
Arguments
n |
number of colors needed |
Value
a vector with colors
Boxplots of single nodes (genes, proteins, etc.)
Description
This function is for internal use by View_diffNetworks
Usage
node_boxplot(gene, assayDataName = 1, deggs_object)
Arguments
gene |
gene name (must be in |
assayDataName |
name of the assayData of interest. If an unnamed list of
data was given to |
deggs_object |
an object of class |
Value
the boxplot
Plot differential regressions for a link
Description
Plot differential regressions for any target-target pair in an omic dataset
Usage
plot_regressions(
deggs_object,
assayDataName = 1,
gene_A,
gene_B,
title = NULL,
legend_position = "topright"
)
Arguments
deggs_object |
an object of class |
assayDataName |
name of the assayData of interest. If an unnamed list of
data was given to |
gene_A |
character. Name of the first target (gene, protein, metabolite, etc.) |
gene_B |
character. Name of the second target (gene, protein, metabolite, etc.) |
title |
plot title. If NULL (default), the name of the assayData will be used. Use empty character "" for no title. |
legend_position |
position of the legend in the plot. It can be
specified by keyword or in any parameter accepted by |
Value
base graphics plot showing differential regressions across categories. The p value of the interaction term of gene A ~ gene B * category is reported on top.
Examples
data("synthetic_metadata")
data("synthetic_rnaseqData")
data("synthetic_proteomicData")
data("synthetic_OlinkData")
assayData_list <- list("RNAseq" = synthetic_rnaseqData,
"Proteomics" = synthetic_proteomicData,
"Olink" = synthetic_OlinkData)
deggs_object <- get_diffNetworks(assayData = assayData_list,
metadata = synthetic_metadata,
category_variable = "response",
regression_method = "lm",
padj_method = "bonferroni",
verbose = FALSE,
show_progressBar = FALSE,
cores = 1)
plot_regressions(deggs_object,
assayDataName = "RNAseq",
gene_A = "MTOR",
gene_B = "AKT2",
legend_position = "bottomright")
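The kind of figure described above can be approximated in base graphics as follows. This is a sketch on simulated data with arbitrary colors, not a reproduction of plot_regressions: it draws one regression line per category and reports the interaction p value in the title.

```r
# Simulated data: the gene_A/gene_B slope differs between categories
set.seed(1)
n <- 40
category <- factor(rep(c("Responder", "Non-responder"), each = n / 2))
gene_A <- rnorm(n)
gene_B <- ifelse(category == "Responder", 1.5, -0.5) * gene_A + rnorm(n, sd = 0.5)
cols <- c("steelblue", "firebrick")  # one color per category level

# Interaction p value from gene_B ~ gene_A * category
fit <- lm(gene_B ~ gene_A * category)
p_int <- anova(fit)["gene_A:category", "Pr(>F)"]

if (!interactive()) pdf(NULL)  # in batch mode, avoid writing Rplots.pdf
plot(gene_A, gene_B, col = cols[as.integer(category)], pch = 16,
     main = paste("interaction p =", signif(p_int, 3)))
# One regression line per category
for (i in seq_along(levels(category))) {
  keep <- category == levels(category)[i]
  abline(lm(gene_B[keep] ~ gene_A[keep]), col = cols[i], lwd = 2)
}
legend("bottomright", legend = levels(category), col = cols, pch = 16)
if (!interactive()) invisible(dev.off())
```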
Synthetic RNA-seq count data
Description
Synthetic RNA-seq data after log2 normalisation
Format
A data frame with xx rows (proteins) and xx columns (patient IDs).
Synthetic clinical data
Description
A dataset containing sample clinical data for 100 patients with 40% response rate
Format
A data frame with 100 rows and 4 columns (IDs are in rownames):
- patient_id
IDs matching the IDs used in the colnames of the assay data matrix/matrices.
- age
A column to simulate age of patients. Not used.
- gender
A column to simulate gender of patients. Not used.
- response
The response outcome, to be used for differential analysis
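Since the metadata IDs have to match the column names of the assay data, a quick alignment check before running the analysis can be useful. The sketch below uses small made-up objects; the object names and values are hypothetical.

```r
# Hypothetical metadata: one row per sample, IDs in rownames
metadata <- data.frame(
  patient_id = paste0("P", 1:6),
  response   = rep(c("Responder", "Non-responder"), times = 3),
  row.names  = paste0("P", 1:6),
  stringsAsFactors = FALSE
)

# Hypothetical assay matrix: probes in rows, samples in columns
# (columns deliberately in a different order from the metadata rows)
set.seed(1)
assay <- matrix(rnorm(3 * 6), nrow = 3,
                dimnames = list(c("G1", "G2", "G3"), paste0("P", 6:1)))

# Samples present in both objects, in metadata order
common_ids <- intersect(rownames(metadata), colnames(assay))
assay_aligned <- assay[, common_ids, drop = FALSE]
identical(colnames(assay_aligned), rownames(metadata))
```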
Synthetic RNA-seq count data
Description
Synthetic RNA-seq data after log2 normalisation
Format
A data frame with xx rows (proteins) and xx columns (patient IDs).
Synthetic RNA-seq count data
Description
Synthetic RNA-seq data after log2 normalisation
Format
A data frame with xx rows (genes) and xx columns (patient IDs, matching the metadata rownames).
Tidying up of metadata. Samples belonging to undesired categories (if specified) will be removed, as well as categories with fewer than five samples and NAs.
Description
Tidying up of metadata. Samples belonging to undesired categories (if specified) will be removed, as well as categories with fewer than five samples and NAs.
Usage
tidy_metadata(
category_subset = NULL,
metadata,
category_variable = NULL,
verbose = FALSE
)
Arguments
category_subset |
optional character vector indicating which categories
are used for comparison. If not specified, all categories in
|
metadata |
a named vector, matrix, or data.frame containing sample
annotations or categories. If matrix or data.frame, each row should
correspond to a sample, with columns representing different sample
characteristics (e.g., treatment group, condition, time point). The colname
of the sample characteristic to be used for differential analysis must be
specified in |
category_variable |
column name in |
verbose |
Logical. Whether to print detailed output messages during processing. Default is FALSE. |
Value
a tidy named factor vector of sample annotations.
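The tidying steps described above can be sketched in base R as follows. The helper name tidy_metadata_sketch and its min_samples argument are hypothetical, and the code only mirrors the documented behaviour; it is not the package's implementation.

```r
# Hypothetical helper mirroring the documented steps (not the package's code)
tidy_metadata_sketch <- function(metadata, category_variable,
                                 category_subset = NULL, min_samples = 5) {
  categories <- as.character(metadata[[category_variable]])
  names(categories) <- rownames(metadata)
  categories <- categories[!is.na(categories)]        # remove NAs
  if (!is.null(category_subset)) {                    # keep requested categories
    categories <- categories[categories %in% category_subset]
  }
  counts <- table(categories)
  keep <- names(counts)[counts >= min_samples]        # drop small categories
  factor(categories[categories %in% keep])            # named factor vector
}

# Made-up metadata: 6 responders, 7 non-responders, 3 moderate, 1 missing
metadata <- data.frame(
  response = c(rep("Responder", 6), rep("Non-responder", 7),
               rep("Moderate response", 3), NA),
  row.names = paste0("P", 1:17),
  stringsAsFactors = FALSE
)
tidy <- tidy_metadata_sketch(metadata, "response")
table(tidy)
```

In this toy input, the category with three samples and the missing value are dropped, leaving 13 named samples in two categories.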