Help for package MiMIR

Title:

Metabolomics-Based Models for Imputing Risk

Version:

1.5

Description:

Provides an intuitive framework for ad-hoc statistical analysis of 1H-NMR metabolomics by Nightingale Health. It allows users to easily explore new metabolomics measurements assayed by Nightingale Health and compare their distributions with those of a large consortium (BBMRI-nl); to project previously published metabolic scores [<doi:10.1016/j.ebiom.2021.103764>, <doi:10.1161/CIRCGEN.119.002610>, <doi:10.1038/s41467-019-11311-9>, <doi:10.7554/eLife.63033>, <doi:10.1161/CIRCULATIONAHA.114.013116>, <doi:10.1007/s00125-019-05001-w>]; and to calibrate the metabolic surrogate values to a desired dataset.

License:

GPL-3

Encoding:

UTF-8

RoxygenNote:

7.1.2

Depends:

R (≥ 4.1.0)

Imports:

caret, DT, foreach, ggplot2, heatmaply, matrixStats, plotly, pROC, purrr, shiny, shinycssloaders, shinyFiles, shinydashboard, shinyjs, shinyWidgets, stats, survival, survminer, dplyr, fs

LazyData:

true

Suggests:

testthat (≥ 3.0.0), ggfortify, knitr, rmarkdown

Config/testthat/edition:

NeedsCompilation:

Packaged:

2024-02-01 08:31:19 UTC; danielebizzarri

Author:

Daniele Bizzarri [aut, cre], Marcel Reinders [aut, ths], Marian Beekman [aut], Pieternella Eline Slagboom [aut, ths], Erik van den Akker [aut, ths]

Maintainer:

Daniele Bizzarri <d.bizzarri@lumc.nl>

Repository:

CRAN

Date/Publication:

2024-02-01 08:50:02 UTC

T2D-score Betas

Description

The coefficients used to compute the T2Diabetes score by Ahola-Olli et al.

Usage

data("Ahola_Olli_betas")

Format

An object of class data.frame with 7 rows and 3 columns.

Details

Data frame containing the abbreviations of the metabolites, the metabolite names, and the coefficients used to compute the T2Diabetes score.

References

Ahola-Olli,A.V. et al. (2019) Circulating metabolites and the risk of type 2 diabetes: a prospective study of 11,896 young adults from four Finnish cohorts. Diabetologia, 62, 2298-2309, doi:10.1007/s00125-019-05001-w

Examples

data("Ahola_Olli_betas")

BBMRI_hist

Description

Distributions of the Nightingale Health metabolic features in BBMRI-nl

Usage

data("BBMRI_hist")

Format

An object of class list of length 57.

Details

List containing the histograms of the metabolomics-features in BBMRI-nl.

Examples

data("BBMRI_hist")

BBMRI_hist_plot

Description

Function to plot the ~60 metabolites used for the metabolomics-based scores and compare them to their distributions in BBMRI-nl

Usage

BBMRI_hist_plot(
  dat,
  x_name,
  color = MiMIR::c21,
  scaled = FALSE,
  datatype = "metabolite",
  main = "Comparison with the metabolites measures in BBMRI"
)

Arguments

dat

data.frame or matrix with the metabolites

x_name

string with the name of the selected variable

color

colors selected for all the variables

scaled

logical to z-scale the variables

datatype

a character vector indicating what data type is being plotted

main

title of the plot

Details

This function plots the distribution of a metabolic feature in the uploaded dataset, compared to its distribution in BBMRI-nl. The features available are those used by the metabolic scores.

Value

plotly image with the histogram of the selected variable compared to the distributions in BBMRI-nl

References

The selection of metabolic features available is the one selected by the papers:

Deelen,J. et al. (2019) A metabolic profile of all-cause mortality risk identified in an observational study of 44,168 individuals. Nature Communications, 10, 1-8, doi:10.1038/s41467-019-11311-9

Ahola-Olli,A.V. et al. (2019) Circulating metabolites and the risk of type 2 diabetes: a prospective study of 11,896 young adults from four Finnish cohorts. Diabetologia, 62, 2298-2309, doi:10.1007/s00125-019-05001-w

Wurtz,P. et al. (2015) Metabolite profiling and cardiovascular event risk: a prospective study of 3 population-based cohorts. Circulation, 131, 774-785, doi:10.1161/CIRCULATIONAHA.114.013116

Bizzarri,D. et al. (2022) 1H-NMR metabolomics-based surrogates to impute common clinical risk factors and endpoints. EBioMedicine, 75, 103764, doi:10.1016/j.ebiom.2021.103764

van den Akker Erik B. et al. (2020) Metabolic Age Based on the BBMRI-NL 1H-NMR Metabolomics Repository as Biomarker of Age-related Disease. Circulation: Genomic and Precision Medicine, 13, 541-547, doi:10.1161/CIRCGEN.119.002610

Examples

library(plotly)
library(MiMIR)

#load the metabolites dataset
metabolic_measures <- synthetic_metabolic_dataset

BBMRI_hist_plot(metabolic_measures, x_name="alb", scaled=TRUE)

BBMRI_hist_scaled

Description

Z-scaled distributions of the Nightingale Health metabolic features in BBMRI-nl

Usage

data("BBMRI_hist_scaled")

Format

An object of class list of length 57.

Details

List containing the histograms of the scaled metabolomics-features in BBMRI-nl.

Examples

data("BBMRI_hist_scaled")

BMI_LDL_eGFR

Description

Function created to calculate: 1) BMI using height and weight; 2) LDL cholesterol using HDL cholesterol, triglycerides and total cholesterol (totchol); 3) eGFR using creatinine levels, sex and age.

Usage

BMI_LDL_eGFR(phenotypes, metabo_measures)

Arguments

phenotypes

data.frame containing height and weight, HDL cholesterol, triglycerides, totchol, sex and age

metabo_measures

numeric data-frame with Nightingale metabolomics quantifications containing creatinine levels (crea)

Value

phenotypes data.frame with the addition of BMI, LDL cholesterol and eGFR
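As a rough illustration of the first two derived variables, the sketch below uses the textbook BMI definition and the Friedewald formula cited in the References. The helper names `bmi` and `ldl_friedewald` are hypothetical, and the units (kg, m, mmol/L) are assumptions; the units and column handling inside `BMI_LDL_eGFR` may differ, and eGFR is left out because the exact equation is not stated here.

```r
# Hypothetical helpers, not part of MiMIR; units are assumptions (kg, m, mmol/L).
bmi <- function(weight_kg, height_m) weight_kg / height_m^2

# Friedewald et al. (1972): LDL = total - HDL - TG/5 when lipids are in mg/dL,
# which corresponds to TG/2.2 when all lipids are expressed in mmol/L.
ldl_friedewald <- function(totchol, hdlchol, triglycerides) {
  totchol - hdlchol - triglycerides / 2.2
}

bmi(70, 1.75)
ldl_friedewald(totchol = 5.2, hdlchol = 1.3, triglycerides = 1.1)
```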

References

This function is constructed to calculate BMI, LDL cholesterol and eGFR as in the following papers:

BMI: Flint AJ, Rexrode KM, Hu FB, Glynn RJ, Caspard H, Manson JE et al. Body mass index, waist circumference, and risk of coronary heart disease: a prospective study among men and women. Obes Res Clin Pract 2010; 4: e171-e181, doi:10.1016/j.orcp.2010.01.001

LDL-cholesterol: Friedewald WT, Levy RI, Fredrickson DS. Estimation of the Concentration of Low-Density Lipoprotein Cholesterol in Plasma, Without Use of the Preparative Ultracentrifuge. Clin Chem 1972; 18: 499-502, doi:10.1093/clinchem/18.6.499

eGFR: Carrero Juan Jesus, Andersson Franko Mikael, Obergfell Achim, Gabrielsen Anders, Jernberg Tomas. hsCRP Level and the Risk of Death or Recurrent Cardiovascular Events in Patients With Myocardial Infarction: a Healthcare-Based Study. J Am Heart Assoc 2019; 8: e012638, doi:10.1161/JAHA.119.012638

Examples

library(MiMIR)

#load the dataset
metabolic_measures <- synthetic_metabolic_dataset
phenotypes <- synthetic_phenotypic_dataset
#Calculate BMI, LDL cholesterol and eGFR
phenotypes<-BMI_LDL_eGFR(phenotypes, metabolic_measures)

CVD-score betas

Description

The coefficients used to compute the CVD score by Wurtz et al.

Usage

data("CVD_score_betas")

Format

An object of class data.frame with 12 rows and 3 columns.

Details

Data frame containing the abbreviations of the metabolites, the metabolite names, and the coefficients used to compute the CVD score.

References

Wurtz,P. et al. (2015) Metabolite profiling and cardiovascular event risk: a prospective study of 3 population-based cohorts. Circulation, 131, 774-785, doi:10.1161/CIRCULATIONAHA.114.013116

Examples

data("CVD_score_betas")

LOBOV_accuracies

Description

Function created to visualize the accuracies in the current dataset compared to the accuracies in the Leave One Biobank Out Validation in Bizzarri et al.

Usage

LOBOV_accuracies(surrogates, bin_phenotypes, bin_pheno_available, acc_LOBOV)

Arguments

surrogates

numeric data.frame containing the surrogate values by Bizzarri et al.

bin_phenotypes

numeric data.frame with the binarized phenotypes output of binarize_all_pheno

bin_pheno_available

vector of strings with the available phenotypes

acc_LOBOV

accuracy of LOBOV calculated in Bizzarri et al.

Details

Comparison of the AUCs of the surrogates in the updated dataset and the results of the Leave One Biobank Out Validation made in BBMRI-nl.

Value

Boxplot with the accuracies of the LOBOV
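The accuracies compared by this function are AUCs. As a minimal sketch of what an AUC measures, the base R snippet below computes it from ranks (the Mann-Whitney formulation). The helper `auc_rank` is hypothetical and is not how MiMIR computes it; the package imports pROC for ROC analysis.

```r
# Hypothetical helper: AUC as the probability that a random positive sample
# receives a higher score than a random negative sample.
auc_rank <- function(actual, score) {
  r  <- rank(score)            # average ranks are used for ties
  n1 <- sum(actual == 1)       # number of positives
  n0 <- sum(actual == 0)       # number of negatives
  (sum(r[actual == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

auc_rank(actual = c(0, 0, 1, 1), score = c(0.1, 0.4, 0.35, 0.8))
```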

References

This function was made to visualize the binarized variables calculated following the rules indicated in the article: Bizzarri,D. et al. (2022) 1H-NMR metabolomics-based surrogates to impute common clinical risk factors and endpoints. EBioMedicine, 75, 103764, doi:10.1016/j.ebiom.2021.103764

Examples

require(pROC)
require(plotly)
require(MiMIR)
require(foreach)
require(ggplot2)

#load the dataset
m <- synthetic_metabolic_dataset
p<- synthetic_phenotypic_dataset

#Calculating the binarized surrogates
b_p<-binarize_all_pheno(p)
#Apply the surrogate models
sur<-calculate_surrogate_scores(m, p, MiMIR::PARAM_surrogates, bin_names=colnames(b_p))
p_avail<-colnames(b_p)[c(1:5)]
LOBOV_accuracies(sur$surrogates, b_p, p_avail, MiMIR::acc_LOBOV)

MOLEPI_LCBC_header

Description

helper function to create a header with the links to MOLEPI, LCBC, LUMC and BBMRI-nl

Usage

MOLEPI_LCBC_header()

Value

header for Rshiny app

MetaboWAS

Description

Function to perform a metabolome-wide association study (MetaboWAS)

Usage

MetaboWAS(met, pheno, test_variable, covariates, img = TRUE, adj_method = "BH")

Arguments

met

numeric data.frame with the metabolomics features

pheno

data.frame containing the phenotype of interest

test_variable

string vector with the name of the phenotype of interest

covariates

string vector with the name of the variables to be added as a covariate

img

logical indicating if the function should plot a Manhattan plot

adj_method

multiple testing correction method

Details

This function fits a linear regression model individually for each variable in the first data.frame, testing its association with the test variable while adjusting for the selected covariates (potential confounders). False Discovery Rate (FDR) correction is applied to account for multiple testing. The user can select the test variable and the covariates from the variables in the phenotypic input. The results of the associations are reported in a Manhattan plot.

The p-value of each association is corrected using the Benjamini-Hochberg method. Finally, plotly is used to draw a Manhattan plot, which reports on the x-axis the metabolites quantified by Nightingale Health, divided in groups, and on the y-axis the -log (adjusted p-value).
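The procedure described above can be sketched in base R on simulated data. This is only an illustration under stated assumptions: the metabolite is used as the outcome of each model, the y-axis uses a base-10 logarithm, and the data and column names (`m1`, `m2`, `m3`) are made up. The internals of `MetaboWAS` may differ.

```r
set.seed(1)
n <- 200
# Simulated phenotypes and metabolites (hypothetical data)
pheno <- data.frame(age = rnorm(n, 60, 10), sex = rbinom(n, 1, 0.5))
met <- data.frame(
  m1 = 0.05 * pheno$age + rnorm(n),  # simulated to be associated with age
  m2 = rnorm(n),
  m3 = rnorm(n)
)

# One linear model per metabolite: metabolite ~ test variable + covariates
res <- do.call(rbind, lapply(names(met), function(m) {
  fit <- lm(met[[m]] ~ age + sex, data = pheno)
  co  <- summary(fit)$coefficients["age", ]
  data.frame(metabolite = m, estimate = co[["Estimate"]],
             tstat = co[["t value"]], pval = co[["Pr(>|t|)"]])
}))

# Benjamini-Hochberg correction and the quantity shown on the y-axis
res$pval_adj <- p.adjust(res$pval, method = "BH")
res$minus_log10_padj <- -log10(res$pval_adj)
res
```

Passing a different `method` to `p.adjust` mirrors what the `adj_method` argument is documented to control.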

Value

res= the results of the MetaboWAS, manhplot= the Manhattan plot made with plotly, N_hits= the number of significant hits

References

This method is also described and used in: Bizzarri,D. et al. (2022) 1H-NMR metabolomics-based surrogates to impute common clinical risk factors and endpoints. EBioMedicine, 75, 103764, doi:10.1016/j.ebiom.2021.103764

Examples

require(MiMIR)
require(plotly)
require(ggplot2)

#load the dataset
metabolic_measures <- synthetic_metabolic_dataset
phenotypes <- synthetic_phenotypic_dataset

#Computing a MetaboWAS for age corrected by sex
MetaboWAS(met=metabolic_measures, pheno=phenotypes, test_variable="age", covariates= "sex")

NA_message

Description

helper function to create a plot indicating a problem with a plotly image

Usage

NA_message(main = "Metabolites are missing, please check your upload!")

Arguments

main

Message to plot

Value

plot

PARAMETERS MetaboAge

Description

The coefficients used to compute the MetaboAge by van den Akker et al.

Usage

data("PARAM_metaboAge")

Format

An object of class list of length 8.

Details

List containing all the information to pre-process and compute the MetaboAge.

References

van den Akker Erik B. et al. (2020) Metabolic Age Based on the BBMRI-NL 1H-NMR Metabolomics Repository as Biomarker of Age-related Disease. Circulation: Genomic and Precision Medicine, 13, 541-547, doi:10.1161/CIRCGEN.119.002610

Examples

data("PARAM_metaboAge")

PARAMETERS surrogates

Description

The coefficients used to compute the metabolomics-based surrogate clinical variables by Bizzarri et al.

Usage

data("PARAM_surrogates")

Format

An object of class list of length 6.

Details

List containing all the information to pre-process and compute the surrogate clinical variables.

References

Bizzarri,D. et al. (2022) 1H-NMR metabolomics-based surrogates to impute common clinical risk factors and endpoints. EBioMedicine, 75, 103764, doi:10.1016/j.ebiom.2021.103764

Examples

data("PARAM_surrogates")

QCprep

Description

Helper function to pre-process the Nightingale Health metabolomics data-set before applying the MetaboAge score by van den Akker et al.

Usage

QCprep(mat, PARAM_metaboAge, quiet = TRUE, Nmax_zero = 1, Nmax_miss = 1)

Arguments

mat

numeric data-frame NH-metabolomics matrix.

PARAM_metaboAge

list containing all the parameters to compute the metaboAge (metabolic features list,BBMRI-nl means and SDs of the metabolic features, and coefficients)

quiet

logical to suppress the messages in the console

Nmax_zero

numeric value indicating the maximum number of zeros allowed per sample (Number suggested=1)

Nmax_miss

numeric value indicating the maximum number of missing values allowed per sample (Number suggested=1)

Value

Nightingale-metabolomics data-frame after pre-processing (checked for zeros, missing values, samples>5SD from the BBMRI-mean, imputing the missing values and z-scaled)
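The checks listed above can be pictured with the following base R sketch. The function `qc_sketch` is hypothetical, the order of the steps is an assumption, and so is the choice to impute missing values with 0 after scaling (that is, with the reference mean); `QCprep` itself may implement these steps differently.

```r
# Hypothetical sketch of the pre-processing steps; not MiMIR's implementation.
qc_sketch <- function(mat, MEAN, SD, Nmax_zero = 1, Nmax_miss = 1, max_sd = 5) {
  # 1) drop samples with too many zeros or missing values
  n_zero <- rowSums(mat == 0, na.rm = TRUE)
  n_miss <- rowSums(is.na(mat))
  mat <- mat[n_zero <= Nmax_zero & n_miss <= Nmax_miss, , drop = FALSE]
  # 2) z-scale each metabolite with the reference means and SDs
  z <- sweep(sweep(mat, 2, MEAN, "-"), 2, SD, "/")
  # 3) drop samples with any value more than max_sd SDs from the reference mean
  outlier <- rowSums(abs(z) > max_sd, na.rm = TRUE) > 0
  z <- z[!outlier, , drop = FALSE]
  # 4) impute the remaining missing values with 0 (assumption: the reference mean)
  z[is.na(z)] <- 0
  z
}

mat <- rbind(
  s1 = c(a = 1.0, b = 2.0, c = 3.0),
  s2 = c(a = NA,  b = 2.5, c = 2.5),  # one missing value: kept and imputed
  s3 = c(a = 0,   b = 0,   c = 3.0),  # two zeros: removed
  s4 = c(a = 1.2, b = 50,  c = 3.1)   # extreme value: removed
)
out <- qc_sketch(mat, MEAN = c(a = 1, b = 2, c = 3), SD = c(a = 0.5, b = 0.5, c = 0.5))
out
```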

References

This function is constructed to be able to follow the pre-processing steps described in: van den Akker Erik B. et al. (2020) Metabolic Age Based on the BBMRI-NL 1H-NMR Metabolomics Repository as Biomarker of Age-related Disease. Circulation: Genomic and Precision Medicine, 13, 541-547, doi:10.1161/CIRCGEN.119.002610

Examples

library(MiMIR)

#load the Nightingale metabolomics dataset
metabolic_measures <- synthetic_metabolic_dataset

#Pre-process the metabolic features
prepped_met<-QCprep(as.matrix(metabolic_measures[,metabolites_subsets$MET63]), PARAM_metaboAge)

QCprep_surrogates

Description

Helper function to pre-process the Nightingale Health metabolomics data-set before applying metabolomics-based surrogates by Bizzarri et al.

Usage

QCprep_surrogates(
  mat,
  PARAM_surrogates,
  Nmax_miss = 1,
  Nmax_zero = 1,
  quiet = FALSE
)

Arguments

mat

numeric data-frame Nightingale metabolomics matrix.

PARAM_surrogates

is a list holding the parameters to compute the surrogates

Nmax_miss

numeric value indicating the maximum number of missing values allowed per sample (Number suggested=1)

Nmax_zero

numeric value indicating the maximum number of zeros allowed per sample (Number suggested=1)

quiet

logical to suppress the messages in the console

Details

Bizzarri et al. built multivariate models, using 56 metabolic features quantified by Nightingale, to predict the 19 binary characteristics of an individual. The binary variables are: sex, diabetes status, metabolic syndrome status, lipid medication usage, blood pressure lowering medication, current smoking, alcohol consumption, high age, middle age, low age, high hsCRP, high triglycerides, high LDL cholesterol, high total cholesterol, low HDL cholesterol, low eGFR, low white blood cells, low hemoglobin levels.

Value

Nightingale-metabolomics data-frame after pre-processing (checked for zeros, missing values, samples>5SD from the BBMRI-mean, imputing the missing values and z-scaled)

Examples

library(MiMIR)

#load the Nightingale metabolomics dataset
metabolic_measures <- synthetic_metabolic_dataset
#Pre-process the metabolic features
prepped_met<-QCprep_surrogates(as.matrix(metabolic_measures), MiMIR::PARAM_surrogates)

acc_LOBOV

Description

Accuracy of the Leave One Biobank Out Validation of the surrogate metabolic models performed in BBMRI-nl

Usage

data("acc_LOBOV")

Format

An object of class list of length 20.

Details

Data frame containing the accuracy obtained during the Leave One Biobank Out Validation of the surrogate metabolic models in BBMRI-nl.

References

The method is described in: Bizzarri,D. et al. (2022) 1H-NMR metabolomics-based surrogates to impute common clinical risk factors and endpoints. EBioMedicine, 75, 103764, doi:10.1016/j.ebiom.2021.103764

Examples

data("acc_LOBOV")

activateButtn

Description

helper function to activate buttons based on 2 checks

Usage

activateButtn(check1, check2, button)

Value

button activation

apply.fit

Description

Function to compute the MetaboAge score made by van den Akker et al. on Nightingale metabolomics data-set.

Usage

apply.fit(mat, FIT)

Arguments

mat

numeric data-frame with Nightingale-metabolomics

FIT

The betas of the linear regression composing the MetaboAge by van den Akker et al.

Details

Multivariate model indicating the biological age of an individual, based on 56 metabolic features. It was trained using a linear regression in BBMRI-nl, a Consortium of 28 cohorts comprising ~25,000 individuals.

Value

data-frame containing the value of the MetaboAge by van den Akker et al.
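Applying a fitted linear model to pre-processed metabolites amounts to an intercept plus a weighted sum of the features. The sketch below shows that idea with made-up coefficients; the helper `apply_linear`, the layout of `FIT` (intercept first, then one named beta per metabolite) and the values are all assumptions, not the contents of `PARAM_metaboAge$FIT_COEF`.

```r
# Hypothetical coefficients: an intercept followed by one beta per metabolite
FIT <- c("(Intercept)" = 50, alb = 2, crea = -1.5)

# z-scaled metabolite matrix with samples in rows (made-up values)
mat <- rbind(s1 = c(alb = 0.5, crea = 1), s2 = c(alb = -1, crea = 0))

apply_linear <- function(mat, FIT) {
  betas <- FIT[-1]
  # match columns to coefficients by name, then add the intercept
  drop(FIT[[1]] + mat[, names(betas), drop = FALSE] %*% betas)
}

apply_linear(mat, FIT)
```

Matching columns by name rather than by position guards against a metabolite matrix whose column order differs from that of the coefficients.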

References

This function is constructed to be able to apply the metaboAge as described in: van den Akker Erik B. et al. (2020) Metabolic Age Based on the BBMRI-NL 1H-NMR Metabolomics Repository as Biomarker of Age-related Disease. Circulation: Genomic and Precision Medicine, 13, 541-547, doi:10.1161/CIRCGEN.119.002610

Examples

library(MiMIR)

#load the Nightingale metabolomics dataset
metabolic_measures <- synthetic_metabolic_dataset
#Pre-process the metabolic features
prepped_met<-QCprep(as.matrix(metabolic_measures[,metabolites_subsets$MET63]), PARAM_metaboAge)
#Apply the metaboAge
metaboAge<-apply.fit(prepped_met, FIT=PARAM_metaboAge$FIT_COEF)

apply.fit_surro

Description

Function that applies one of the surrogate models to the NH-metabolomics concentrations

Usage

apply.fit_surro(mat, FIT, post = TRUE)

Arguments

mat

numeric data-frame with Nightingale-metabolomics

FIT

The betas of the logistic regressions composing the surrogates by Bizzarri et al.

post

logical to obtain posterior probabilities

Value

numeric data.frame with the metabolomics-based surrogates by Bizzarri et al.
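For a logistic model, the role of `post` can be illustrated as the difference between returning the linear predictor and returning its logistic transform. This is a sketch under assumptions: `apply_logistic` is a hypothetical helper, the coefficients are made up, and reading "posterior probabilities" as the inverse-logit of the linear predictor is an interpretation rather than something stated in this help page.

```r
# Hypothetical helper, not MiMIR's implementation
apply_logistic <- function(mat, FIT, post = TRUE) {
  betas <- FIT[-1]
  lin <- drop(FIT[[1]] + mat[, names(betas), drop = FALSE] %*% betas)
  # post = TRUE: map the linear predictor to a probability in (0, 1)
  if (post) plogis(lin) else lin
}

FIT <- c("(Intercept)" = -1, alb = 2, crea = 0.5)  # made-up betas
mat <- rbind(s1 = c(alb = 0.5, crea = 0), s2 = c(alb = 0, crea = 0))

p   <- apply_logistic(mat, FIT)                # probabilities
lin <- apply_logistic(mat, FIT, post = FALSE)  # linear predictors
p
```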

Examples

## Not run: 
library(MiMIR)
library(foreach)  # provides the %do% operator used below

#load the Nightingale metabolomics dataset
metabolic_measures <- read.csv("Nightingale_file_path",header = TRUE, row.names = 1)
# Do the pre-processing steps to the metabolic measures
metabolic_measures<-QCprep_surrogates(as.matrix(metabolic_measures), MiMIR::PARAM_surrogates,
Nmax_miss=1,Nmax_zero=1)

#load the phenotypic dataset
phenotypes <- read.csv("phenotypes_file_path",header = TRUE, row.names = 1)
#Calculating the binarized phenotypes
bin_pheno<-binarize_all_pheno(phenotypes)

#Apply the surrogate models
surrogates<-foreach::foreach(i=MiMIR::phenotypes_names$out_surro, .combine="cbind") %do% {
pred<-apply.fit_surro(as.matrix(metabolic_measures), 
MiMIR::PARAM_surrogates$models_betas[i,])}


## End(Not run)

apply.scale

Description

Helper function created to scale the NH-metabolomics matrix samples

Usage

apply.scale(dat, MEAN, SD, quiet = TRUE)

Arguments

dat

numeric data-frame with Nightingale-metabolomics

MEAN

numeric vector indicating the mean of the metabolites present in dat

SD

numeric vector indicating the standard deviations of the metabolites present in dat

quiet

logical to suppress the messages in the console

Value

Matrix with the Nightingale-metabolomics dataset z-scaled using the given means and SDs

Examples

## Not run: 
library(MiMIR)

#load the Nightingale metabolomics dataset
metabolic_measures <- read.csv("Nightingale_file_path",header = TRUE, row.names = 1)
#Apply the scaling to the metabolic features
mat <- apply.scale(metabolic_measures, MEAN=PARAM_metaboAge$MEAN, SD=PARAM_metaboAge$SD)

## End(Not run)

binarize_all_pheno

Description

Helper function created to binarize the phenotypes used to calculate the metabolomics based surrogate made by Bizzarri et al.

Usage

binarize_all_pheno(data)

Arguments

data

phenotypes data.frame containing some of the following variables (with the same nomenclature): "sex","diabetes", "lipidmed", "blood_pressure_lowering_med", "current_smoking", "metabolic_syndrome", "alcohol_consumption", "age","BMI", "ln_hscrp","waist_circumference", "weight","height", "triglycerides", "ldl_chol", "hdlchol", "totchol", "eGFR","wbc","hgb"

Value

The phenotypic variables binarized following the thresholds used in the metabolomics surrogates made by Bizzarri et al.

References

This function was made to binarize the variables following the same rules indicated in the article: Bizzarri,D. et al. (2022) 1H-NMR metabolomics-based surrogates to impute common clinical risk factors and endpoints. EBioMedicine, 75, 103764, doi:10.1016/j.ebiom.2021.103764

Examples

library(MiMIR)

#load the phenotypes dataset
phenotypes <- synthetic_phenotypic_dataset
#Binarize the phenotypes
binarized_phenotypes<-binarize_all_pheno(phenotypes)

c21

Description

Colors attributed to each metabolomics-based model in MiMIR

Usage

data("c21")

Format

An object of class character of length 21.

Examples

data("c21")

calculate_surrogate_scores

Description

Function to compute the surrogate scores by Bizzarri et al. from the Nightingale metabolomics matrix

Usage

calculate_surrogate_scores(
  met,
  pheno,
  PARAM_surrogates,
  bin_names = c("sex", "diabetes"),
  Nmax_miss = 1,
  Nmax_zero = 1,
  post = TRUE,
  roc = FALSE,
  quiet = FALSE
)

Arguments

met

numeric data-frame with Nightingale-metabolomics

pheno

phenotypic data.frame including these clinical variables (with the same nomenclature): "sex","diabetes", "lipidmed", "blood_pressure_lowering_med", "current_smoking", "metabolic_syndrome", "alcohol_consumption", "age","BMI", "ln_hscrp","waist_circumference", "weight","height", "triglycerides", "ldl_chol", "hdlchol", "totchol", "eGFR","wbc","hgb"

PARAM_surrogates

list containing the parameters to compute the metabolomics-based surrogates

bin_names

vector of strings containing the names of the binary variables

Nmax_miss

numeric value indicating the maximum number of missing values allowed per sample (Number suggested=1)

Nmax_zero

numeric value indicating the maximum number of zeros allowed per sample (Number suggested=1)

post

logical to indicate if the function should calculate the posterior probabilities

roc

logical to plot ROC curves for the metabolomics surrogate (available only for the phenotypes included)

quiet

logical to suppress the messages in the console

Value

If pheno is not available: list with the surrogates and the Nightingale metabolomics matrix after QC. If pheno is available: list with the surrogates, ROC curves, phenotypes, binarized phenotypes and the Nightingale metabolomics matrix after QC.

Examples

require(MiMIR)
require(foreach)
require(pROC)

#load dataset
m <- synthetic_metabolic_dataset
p <- synthetic_phenotypic_dataset
#Apply the surrogates
sur<-calculate_surrogate_scores(met=m,pheno=p,MiMIR::PARAM_surrogates,bin_names=c("sex","diabetes"))

calib_data_frame

Description

helper function that creates a data.frame with the Platt Calibrations

Usage

calib_data_frame(calibrations, bin_phenotypes, bin_pheno_available)

Arguments

calibrations

list result of calibration_surro

bin_phenotypes

data.frame with binary phenotypes resulting from binarize_all_pheno

bin_pheno_available

string with the names of the binarized clinical variables available in the dataset

Value

data.frame with the calibrated surrogates

calibration_surro

Description

helper function that calculates the Platt Calibrations for all the surrogates

Usage

calibration_surro(
  bin_phenotypes,
  surrogates,
  bin_names,
  bin_pheno_available,
  pl = FALSE,
  nbins = 10
)

Arguments

bin_phenotypes

data.frame with binary phenotypes resulting from binarize_all_pheno

surrogates

data.frame with surrogates resulting from calculate_surrogate_scores

bin_names

string with the names of the binarized clinical variables

bin_pheno_available

string with the names of the binarized clinical variables available in the dataset

pl

TRUE/FALSE. If TRUE creates the calibration plots

nbins

number of bins for the plots

Value

list with the calibrated surrogates
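Platt calibration is commonly implemented as a logistic regression of the observed binary outcome on the uncalibrated score. The sketch below shows that generic technique on simulated data; the variable names and the simulation are assumptions, and `calibration_surro` may differ in its details (for example in how it handles missing phenotypes or builds the plots).

```r
set.seed(42)
n <- 500
score <- runif(n)                          # uncalibrated surrogate values (simulated)
y <- rbinom(n, 1, plogis(-3 + 5 * score))  # observed binary phenotype (simulated)

# Platt scaling: logistic regression of the outcome on the score
platt <- glm(y ~ score, family = binomial())
calibrated <- predict(platt, type = "response")

head(data.frame(score, calibrated))
```

After calibration, the mean calibrated probability matches the observed prevalence in the data used for fitting, which is what makes the values interpretable as probabilities in that dataset.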

comp.CVD_score

Description

Function to compute the CVD-score made by Wurtz et al. on Nightingale metabolomics data-set.

Usage

comp.CVD_score(met, phen, betas, quiet = FALSE)

Arguments

met

numeric data-frame with Nightingale-metabolomics

phen

data-frame containing phenotypic information of the samples (specifically: sex, systolic_blood_pressure, current_smoking, diabetes, blood_pressure_lowering_med, lipidmed, totchol, and hdlchol)

betas

The betas of the linear regression composing the CVD-score

quiet

logical to suppress the messages in the console

Value

data-frame containing the value of the CVD-score on the uploaded data-set

References

This function is constructed to be able to apply the CVD-score as described in: Wurtz,P. et al. (2015) Metabolite profiling and cardiovascular event risk: a prospective study of 3 population-based cohorts. Circulation, 131, 774-785, doi:10.1161/CIRCULATIONAHA.114.013116

Examples

library(MiMIR)

#load the dataset
met <- synthetic_metabolic_dataset
phen<-synthetic_phenotypic_dataset
#Compute the CVD score
CVDscore<-comp.CVD_score(met= met, phen=phen, betas=MiMIR::CVD_score_betas, quiet=TRUE)

comp.T2D_Ahola_Olli

Description

Function to compute the T2D score made by Ahola-Olli et al. on Nightingale metabolomics data-set.

Usage

comp.T2D_Ahola_Olli(met, phen, betas, quiet = FALSE)

Arguments

met

numeric data-frame with Nightingale-metabolomics

phen

data-frame containing phenotypic information of the samples (in particular: sex, age, BMI and the clinically measured glucose)

betas

The betas of the linear regression composing the T2D-score

quiet

logical to suppress the messages in the console

Details

This metabolomics-based score is associated with incident Type 2 Diabetes, made by Ahola-Olli et al. It is constructed using phe, l_vldl_ce_percentage and l_hdl_fc quantified by Nightingale Health, and some phenotypic information: sex, age, BMI, fasting glucose. It was trained using a stepwise logistic regression on 3 cohorts.

Value

data-frame containing the value of the T2D-score on the uploaded data-set

References

This function is constructed to be able to apply the T2D-score as described in: Ahola-Olli,A.V. et al. (2019) Circulating metabolites and the risk of type 2 diabetes: a prospective study of 11,896 young adults from four Finnish cohorts. Diabetologia, 62, 2298-2309, doi:10.1007/s00125-019-05001-w

Examples

library(MiMIR)

#load the dataset
met <- synthetic_metabolic_dataset
phen<-synthetic_phenotypic_dataset
#Compute the T2D score
T2Dscore<-comp.T2D_Ahola_Olli(met= met, phen=phen,betas=MiMIR::Ahola_Olli_betas, quiet=TRUE)

comp.mort_score

Description

Function to compute the mortality score made by Deelen et al. on Nightingale metabolomics data-set.

Usage

comp.mort_score(dat, betas = mort_betas, quiet = FALSE)

Arguments

dat

numeric data-frame with Nightingale-metabolomics

betas

data.frame containing the coefficients used for the regression of the mortality score

quiet

logical to suppress the messages in the console

Details

This multivariate model predicts all-cause mortality at 5 or 10 years better than clinical variables normally associated with mortality. It consists of 14 metabolic features quantified by Nightingale Health. It was originally trained using a stepwise Cox regression analysis in a meta-analysis of 12 cohorts comprising 44,168 individuals.

Value

data-frame containing the value of the mortality score on the uploaded data-set

References

This function is constructed to be able to apply the mortality score as described in: Deelen,J. et al. (2019) A metabolic profile of all-cause mortality risk identified in an observational study of 44,168 individuals. Nature Communications, 10, 1-8, doi:10.1038/s41467-019-11311-9

Examples

library(MiMIR)

#load the Nightingale metabolomics dataset
metabolic_measures <- synthetic_metabolic_dataset
#Compute the mortality score
mortScore<-comp.mort_score(metabolic_measures,quiet=TRUE)

comp_covid_score

Description

Function to compute the COVID severity score made by Nightingale Health UK Biobank Initiative et al. on Nightingale metabolomics data-set.

Usage

comp_covid_score(dat, betas = MiMIR::covid_betas, quiet = FALSE)

Arguments

dat

numeric data-frame with Nightingale-metabolomics

betas

data.frame containing the coefficients used for the regression of the COVID-score

quiet

logical to suppress the messages in the console

Details

Multivariate model predicting the risk of severe COVID-19 infection. It is based on 37 metabolic features and was trained using LASSO regression on 52,573 samples from the UK Biobank.

Value

data-frame containing the value of the COVID-score on the uploaded data-set

References

This function is constructed to be able to apply the COVID-score as described in: Nightingale Health UK Biobank Initiative et al. (2021) Metabolic biomarker profiling for identification of susceptibility to severe pneumonia and COVID-19 in the general population. eLife, 10, e63033, doi:10.7554/eLife.63033

Examples

library(MiMIR)

#load the Nightingale metabolomics dataset
metabolic_measures <- synthetic_metabolic_dataset

#Compute the COVID score
covidScore<-comp_covid_score(dat=metabolic_measures, quiet=TRUE)

cor_assoc

Description

Function to calculate the correlation between 2 matrices

Usage

cor_assoc(dat1, dat2, feat1, feat2, method = "pearson", quiet = FALSE)

Arguments

dat1

matrix 1

dat2

matrix 2

feat1

vector of strings with the names of the selected variables in dat1

feat2

vector of strings with the names of the selected variables in dat2

method

indicates which methods of the correlation to use

quiet

logical to suppress the messages in the console

Value

correlations of the selected variables in the 2 matrices
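A minimal base R sketch of correlating selected columns of two matrices is shown below, with a matrix of coefficients and a matching matrix of p-values. The data are simulated, and the structure of the result is an assumption: the object actually returned by `cor_assoc` (and unpacked by `get.s` and `get.p`) may be organised differently.

```r
set.seed(7)
# Simulated inputs (hypothetical data)
dat1 <- matrix(rnorm(300), ncol = 3, dimnames = list(NULL, c("a", "b", "c")))
dat2 <- cbind(x = dat1[, "a"] + rnorm(100, sd = 0.5), y = rnorm(100))

feat1 <- c("a", "b")
feat2 <- c("x", "y")

# correlation coefficients: rows follow feat1, columns follow feat2
s <- cor(dat1[, feat1], dat2[, feat2], method = "pearson",
         use = "pairwise.complete.obs")

# matching p-values from cor.test for every pair of variables
p <- sapply(feat2, function(f2) {
  sapply(feat1, function(f1) cor.test(dat1[, f1], dat2[, f2])$p.value)
})

s
p
```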

Examples

library(stats)

#load the dataset
m <- as.matrix(synthetic_metabolic_dataset)

#Compute the Pearson correlation between all the MET63 variables in the matrix m
cors<-cor_assoc(m, m, MiMIR::metabolites_subsets$MET63,MiMIR::metabolites_subsets$MET63)

COVID-score betas

Description

The coefficients used to compute the COVID score by Nightingale Health UK Biobank Initiative et al.

Usage

data("covid_betas")

Format

An object of class data.frame with 25 rows and 3 columns.

Details

Dataframe containing the abbreviation of the metabolites, the metabolites names and finally the Coefficients to compute the COVID score

References

Nightingale Health UK Biobank Initiative et al. (2021) Metabolic biomarker profiling for identification of susceptibility to severe pneumonia and COVID-19 in the general population. eLife, 10, e63033, doi:10.7554/eLife.63033

Examples

data("covid_betas")

Helper function to compute MetaboWASs

Description

This helper function is called when performing a metabolome-wide association analysis. It reports the results of linear regression models that study the association of a test variable with each metabolite individually, corrected for the indicated covariates.

Usage

do_metabowas(
  phen,
  dat,
  test_variable = "age",
  covariates = c("sex"),
  adj_method = "BH",
  quiet = TRUE
)

Arguments

phen

phenotypes data.frame

dat

metabolites data.frame

test_variable

the variable to be investigated

covariates

the covariates that you want to add

adj_method

correction method.

quiet

if FALSE it will report the number of people available

Value

results= the results of the MetaboWAS (estimate, tstatistics, pvalue, BH corrected pvalue)

find_BBMRI_names

Description

Function to translate Nightingale metabolomics alternative metabolite names to the ones used in BBMRI-nl

Usage

find_BBMRI_names(names)

Arguments

names

vector of strings with the metabolic features names to be translated

Value

data.frame with the uploaded metabolites names on the first column and the BBMRI names on the second column.

References

This is a function originally created for the package ggforestplot and modified ad hoc for our package (https://nightingalehealth.github.io/ggforestplot/articles/index.html).

Examples

library(MiMIR)
library(purrr)

#load the Nightingale metabolomics dataset
metabolic_measures <- synthetic_metabolic_dataset
#Find the metabolites names used in BBMRI-nl
nam<-find_BBMRI_names(colnames(metabolic_measures))

get.p

Description

helper function for extracting pvalues from cor.assoc results

Usage

get.p(res)

Arguments

res

Results of cor.assoc

Value

the matrix of the pvalues of the associations

get.s

Description

helper function for extracting statistics from cor.assoc results

Usage

get.s(res)

Arguments

res

Results of cor.assoc

Value

the matrix of associations

getECE

Description

helper function to calculate the ECE of calibrations

Usage

getECE(actual, predicted, n_bins = 10)

Arguments

actual

observed binary phenotype

predicted

predicted values

n_bins

the number of bins

Value

ECE value
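
As an illustration of what an expected calibration error (ECE) measures, the sketch below bins the predictions into equal-width bins and averages the gap between the observed event rate and the mean prediction per bin, weighted by bin size. It is not the getECE implementation; the function name ece_sketch and the equal-width binning are assumptions.

```r
# Illustrative sketch only: NOT the getECE() implementation.
ece_sketch <- function(actual, predicted, n_bins = 10) {
  # Assign each prediction to one of n_bins equal-width bins on [0, 1]
  bins <- cut(predicted, breaks = seq(0, 1, length.out = n_bins + 1),
              include.lowest = TRUE)
  # Gap between the observed event rate and the mean prediction in each bin
  gap <- abs(as.numeric(tapply(actual, bins, mean)) -
             as.numeric(tapply(predicted, bins, mean)))
  # Fraction of samples falling in each bin
  weight <- as.numeric(table(bins)) / length(predicted)
  # Weighted average of the per-bin gaps; empty bins are skipped
  sum(gap * weight, na.rm = TRUE)
}

set.seed(1)
p <- runif(1000)
y <- rbinom(1000, 1, p)              # outcomes drawn from p: well calibrated
ece_good <- ece_sketch(y, p)
ece_bad  <- ece_sketch(y, sqrt(p))   # distorted probabilities
print(c(good = ece_good, bad = ece_bad))
```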

helper function to calculate the MCE of the calibrations

Description

helper function to calculate the MCE of the calibrations

Usage

getMCE(actual, predicted, n_bins = 10)

Arguments

actual

real values of the variables

predicted

predicted values by one of the surrogates

n_bins

the number of bins

Value

MCE value
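
The maximum calibration error (MCE) can be illustrated in the same way as the largest per-bin gap between the observed event rate and the mean prediction. The sketch below is not the getMCE implementation; the function name mce_sketch and the equal-width binning are assumptions.

```r
# Illustrative sketch only: NOT the getMCE() implementation.
mce_sketch <- function(actual, predicted, n_bins = 10) {
  # Assign each prediction to one of n_bins equal-width bins on [0, 1]
  bins <- cut(predicted, breaks = seq(0, 1, length.out = n_bins + 1),
              include.lowest = TRUE)
  # Gap between the observed event rate and the mean prediction in each bin
  gap <- abs(as.numeric(tapply(actual, bins, mean)) -
             as.numeric(tapply(predicted, bins, mean)))
  # Largest gap over the non-empty bins
  max(gap, na.rm = TRUE)
}

actual <- c(0, 0, 1, 1)
predicted <- c(0.1, 0.2, 0.8, 0.9)
mce <- mce_sketch(actual, predicted, n_bins = 2)
print(mce)
```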

getvol

Description

helper function to retrieve the volumes for the download from the Rshiny application

Usage

getvol()

Value

table

hist_plots

Description

Function to plot the histograms for all the variables in dat

Usage

hist_plots(
  dat,
  x_name,
  color = MiMIR::c21,
  scaled = FALSE,
  datatype = "metabolic score",
  main = "Predictors Distributions"
)

Arguments

dat

data.frame or matrix with the variables to plot

x_name

string with the names of the selected variables in dat

color

colors selected for all the variables

scaled

logical to z-scale the variables

datatype

a character vector indicating what data type is being plotted

main

title of the plot

Value

plotly image with the histograms of the selected variables

Examples

require(MiMIR)
require(plotly)
require(matrixStats)
#load the metabolites dataset
m <- synthetic_metabolic_dataset

#Apply the surrogate models without plotting the ROC curves
surrogates<-calculate_surrogate_scores(m, PARAM_surrogates=MiMIR::PARAM_surrogates, roc=FALSE)
#Plot the histogram of the surrogate sex values scaled 
hist_plots(surrogates$surrogates, x_name="s_sex", scaled=TRUE)

hist_plots_mortality

Description

Function to plot the histogram of the mortality score separated for different age ranges as a plotly image

Usage

hist_plots_mortality(mort_score, phenotypes)

Arguments

mort_score

data.frame containing the mortality score

phenotypes

data.frame containing age

Value

plotly image with the histogram of the mortality score separated in 3 age ranges

Examples

library(MiMIR)
library(plotly)
#load the dataset
metabolic_measures <- synthetic_metabolic_dataset
phenotypes <- synthetic_phenotypic_dataset

#Compute the mortality score
mortScore<-comp.mort_score(metabolic_measures,quiet=TRUE)
#Plot the mortality score histogram at different ages
hist_plots_mortality(mortScore, phenotypes)

impute_miss

Description

Helper function that imputes the missing values in the NH-metabolomics matrix to zero

Usage

impute_miss(x)

Arguments

x

numeric data-frame with Nightingale-metabolomics

Details

Function that subsets the samples of the NH-metabolomics matrix to those for which the log-transformed concentrations of the metabolites included in MetaboAge are not more than 5 SD away from their mean

Value

matrix of the Nightingale-metabolomics dataset with missing values imputed to zero

Examples

## Not run: 
library(MiMIR)

#load the Nightingale metabolomics dataset
metabolic_measures <- read.csv("Nightingale_file_path",header = TRUE, row.names = 1)
#Imputing missing values
mat <- impute_miss(metabolic_measures)

## End(Not run)

is.sym

Description

Helper function to check whether a supplied matrix is symmetric

Usage

is.sym(res)

Arguments

res

Results of cor.assoc

Value

TRUE/FALSE

kapmeier_scores

Description

Function that creates a Kaplan-Meier plot comparing the first and last tertile of a metabolic score

Usage

kapmeier_scores(predictors, pheno, score, Eventname = "Event")

Arguments

predictors

The data.frame containing the predictors

pheno

The data.frame containing the phenotypes

score

a character string indicating which predictor to use

Eventname

a character string with the name of the event to print on the plot

Value

plotly with a Kaplan Meier comparing first and last tertile of a metabolic score

Examples

require(MiMIR)
require(plotly)
require(survminer)
require(ggfortify)
require(ggplot2)

#load the dataset
metabolic_measures <- synthetic_metabolic_dataset
phenotypes <- synthetic_phenotypic_dataset

#Compute the mortality score
mortScore<-comp.mort_score(metabolic_measures,quiet=TRUE)

#Plot a Kaplan Meier
kapmeier_scores(predictors=mortScore, pheno=phenotypes, score="mortScore")
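
The tertile comparison can also be sketched without the package, using the survival package on simulated data. This is not the kapmeier_scores implementation; the simulated score, follow-up times, censoring rule and variable names are assumptions made for illustration.

```r
# Illustrative sketch only: NOT the kapmeier_scores() implementation.
library(survival)

set.seed(7)
n <- 300
score <- rnorm(n)
# Simulated follow-up: a higher score gives a higher hazard
time <- rexp(n, rate = 0.1 * exp(0.8 * score))
event <- as.integer(time <= 20)   # administrative censoring at time 20
time <- pmin(time, 20)

# Split the score into tertiles and keep the first and last one
cuts <- quantile(score, probs = c(1/3, 2/3))
tertile <- cut(score, breaks = c(-Inf, cuts, Inf), labels = c("T1", "T2", "T3"))
keep <- tertile %in% c("T1", "T3")
d <- data.frame(time = time[keep], event = event[keep],
                tertile = droplevels(tertile[keep]))

# Kaplan-Meier curves per tertile and a log-rank test of their difference
km <- survfit(Surv(time, event) ~ tertile, data = d)
lr <- survdiff(Surv(time, event) ~ tertile, data = d)
print(km)
```

The fitted object km can be passed to plot() or to a plotting package of choice to draw the two curves.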

withSpinner

Description

helper function to show that the dataset is loading

Usage

loading()

Value

loader

loading_spin

Description

helper function to create a loading spinner for the calibration

Usage

loading_spin(plot)

Value

loading spinner

metabolomics feature nomenclatures

Description

Translator of the names of the metabolomics-features to the ones used in BBMRI-nl

Usage

data("metabo_names_translator")

Format

An object of class data.frame with 228 rows and 9 columns.

References

This is a list originally created for the package ggforestplot and modified ad-hoc for our package (https://nightingalehealth.github.io/ggforestplot/articles/index.html).

Examples

data("metabo_names_translator")

metabolomics feature subsets

Description

List containing all the subset of the metabolomics-based features used for our models

Usage

data("metabolites_subsets")

Format

An object of class list of length 8.

Examples

data("metabolites_subsets")

model_coeff_heat

Description

Function to plot the scaled coefficients of the metabolic scores

Usage

model_coeff_heat(
  mort_betas,
  metaboAge_betas,
  surrogates_betas,
  Ahola_Olli_betas,
  CVD_score_betas,
  COVID_score_betas
)

Arguments

mort_betas

dataframe with the coefficients of the mortality score

metaboAge_betas

dataframe with the coefficients of the metaboAge

surrogates_betas

dataframe with the coefficients of the surrogates

Ahola_Olli_betas

dataframe with the coefficients of the T2D score

CVD_score_betas

dataframe with the coefficients of the CVD score

COVID_score_betas

dataframe with the coefficients of the COVID score

Value

heatmaply plot with the scaled coefficients of the metabolic scores

Mortality score betas

Description

The coefficients used to compute the mortality score by Deelen et al.

Usage

data("mort_betas")

Format

An object of class data.frame with 14 rows and 3 columns.

Details

Dataframe containing the abbreviation of the metabolites, the metabolites names and finally the Coefficients to compute the mortality score

References

Deelen,J. et al. (2019) A metabolic profile of all-cause mortality risk identified in an observational study of 44,168 individuals. Nature Communications, 10, 1-8, doi:10.1038/s41467-019-11311-9

Examples

data("mort_betas")

multi_hist

Description

Function to plot the histograms for all the variables in dat

Usage

multi_hist(dat, color = MiMIR::c21, scaled = FALSE)

Arguments

dat

data.frame or matrix with the variables to plot

color

colors selected for all the variables

scaled

logical to z-scale the variables

Value

plotly image with the histograms for all the variables in dat

Examples


library(plotly)
library(MiMIR)

#load the dataset
metabolic_measures <- synthetic_metabolic_dataset

multi_hist(metabolic_measures[,MiMIR::metabolites_subsets$MET14], scaled=TRUE)

pheno_barplots

Description

Function created to binarize the phenotypes used to calculate the metabolomics-based surrogates made by Bizzarri et al.

Usage

pheno_barplots(bin_phenotypes)

Arguments

bin_phenotypes

Value

The phenotypic variables binarized following the thresholds in the metabolomics surrogates made by Bizzarri et al.

Examples

require(MiMIR)
require(foreach)

#load the phenotypes dataset
phenotypes <- synthetic_phenotypic_dataset

#Binarize the phenotypes
binarized_phenotypes<-binarize_all_pheno(phenotypes)
#Plot the variables
pheno_barplots(binarized_phenotypes)

phenotypic features names

Description

List containing all the subsets of phenotypic variables used in the app

Usage

data("phenotypes_names")

Format

An object of class list of length 5.

Examples

data("phenotypes_names")

Function that plots the Platt Calibrations using plotly

Description

Function that plots the Platt Calibrations using plotly

Usage

plattCalib_evaluation(
  r,
  p,
  p.orig,
  name,
  nbins = 10,
  annot_x = c(1, 1),
  annot_y = c(0.1, 0.3)
)

Arguments

r

binary real data

p

predicted probabilities

p.orig

the uncalibrated posterior probabilities

name

character string indicating the name of the calibrated variable

nbins

number of bins to create the plots

annot_x

numeric vector indicating the x-axis positions at which the ECE and MCE values will be plotted

annot_y

numeric vector indicating the y-axis positions at which the ECE and MCE values will be plotted

Value

list with Reliability diagram and histogram with calibrations and original predictions

plattCalibration

Description

Function that calculates the Platt Calibrations

Usage

plattCalibration(r.calib, p.calib, nbins = 10, pl = FALSE)

Arguments

r.calib

observed binary phenotype

p.calib

predicted probabilities

nbins

number of bins to create the plots

pl

logical indicating if the function should plot the Reliability diagram and histogram of the calibrations

Details

Many popular machine learning algorithms produce inaccurate predicted probabilities, especially when applied to a dataset different from the training set. Platt (1999) proposed an adjustment in which the original probabilities are used as the predictor in a single-variable logistic regression to produce more accurate adjusted predicted probabilities. The function also helps to evaluate the calibration by plotting reliability diagrams and the distributions of the calibrated and non-calibrated probabilities. A reliability diagram plots the mean predicted value within a certain range of posterior probabilities against the fraction of accurately predicted values. Finally, we also report accuracy measures for the calibrations: the ECE (expected calibration error), the MCE (maximum calibration error) and the Log-Loss of the probabilities before and after calibration.

Value

list with samples, responses, calibrations, ECE, MCE and, if pl=TRUE, the calibration plots

References

This is a function originally created for the package eRic, under the name prCalibrate, and modified ad hoc for our purposes (Github)

J. C. Platt, 'Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods', in Advances in Large Margin Classifiers, 1999, pp. 61-74.

Examples

library(stats)
library(plotly)

#load the dataset
met <- synthetic_metabolic_dataset
phen <- synthetic_phenotypic_dataset

#Calculating the binarized surrogates
b_phen<-binarize_all_pheno(phen)
#Apply the surrogate models and plot the ROC curves
surr<-calculate_surrogate_scores(met, phen,MiMIR::PARAM_surrogates, bin_names=colnames(b_phen))
#Calibration of the surrogate sex
real_data<-as.numeric(b_phen$sex)
pred_data<-surr$surrogates[,"s_sex"]
plattCalibration(r.calib=real_data, p.calib=pred_data, nbins = 10, pl=TRUE)
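
The adjustment described in Details can be sketched in base R with a single-variable logistic regression. This is not the plattCalibration implementation: the simulated outcome and scores, and the helper log_loss, are assumptions made for illustration, and the calibration model is fitted and evaluated on the same simulated data for brevity.

```r
# Illustrative sketch only: NOT the plattCalibration() implementation.
set.seed(3)
n <- 500
r.calib <- rbinom(n, 1, 0.4)                        # observed binary phenotype
score <- rnorm(n, mean = ifelse(r.calib == 1, 1, -1))
p.calib <- plogis(3 * score)                        # simulated uncalibrated probabilities

# Platt scaling: logistic regression of the outcome on the original probabilities
fit <- glm(r.calib ~ p.calib, family = binomial)
p.platt <- fitted(fit)                              # calibrated probabilities

# Log-loss helper (hypothetical name); probabilities are clipped to avoid log(0)
log_loss <- function(y, p) {
  p <- pmin(pmax(p, 1e-15), 1 - 1e-15)
  -mean(y * log(p) + (1 - y) * log(1 - p))
}
print(c(before = log_loss(r.calib, p.calib), after = log_loss(r.calib, p.platt)))
```

Because the logistic regression includes an intercept, the mean of the calibrated probabilities matches the observed event rate in the data used for fitting.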

plot_corply

Description

Function plotting the correlation between 2 datasets (dat1 x dat2) on the basis of (partial) correlations

Usage

plot_corply(
  res,
  main = NULL,
  zlim = NULL,
  reorder.x = FALSE,
  reorder.y = reorder.x,
  resort_on_p = FALSE,
  abs = FALSE,
  cor.abs = FALSE,
  reorder_dend = FALSE
)

Arguments

res

associations obtained with cor.assoc

main

title of the plot

zlim

max association to plot

reorder.x

logical indicating if the function should reorder the x axis based on clustering

reorder.y

logical indicating if the function should reorder the y axis based on clustering

resort_on_p

logical indicating if the function should reorder x and y axis based on the pvalues of the associations

abs

logical indicating if the function should reorder based on the absolute values

cor.abs

logical indicating if the function should reorder the plot based on the absolute values

reorder_dend

logical indicating if the function should reorder the plot based on the dendrogram

Value

heatmap with the results of cor.assoc

Examples

library(stats)

#load the dataset
m <- as.matrix(synthetic_metabolic_dataset)

#Compute the pearson correlation of all the variables in the data.frame metabolic_measures
cors<-cor_assoc(m, m, MiMIR::metabolites_subsets$MET63,MiMIR::metabolites_subsets$MET63)
#Plot the correlations
plot_corply(cors, main="Correlations metabolites")

plot_na_heatmap

Description

Function plotting information about missing & zero values on the indicated matrix.

Usage

plot_na_heatmap(dat)

Arguments

dat

The matrix or data.frame

Details

This heatmap indicates the available values in grey and the missing or zero values in white. Two bar plots are shown on the sides: one with the number of missing or zero values per row and another with the number of missing or zero values per column.

Value

Plot with a central heatmap and two histograms on the sides

Examples

library(graphics)
library(MiMIR)

#load the metabolites dataset
metabolic_measures <- synthetic_metabolic_dataset
#Plot the missing values in the metabolomics matrix
plot_na_heatmap(metabolic_measures)
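
The quantities summarized by the heatmap and its side plots can be computed directly in base R. The sketch below is not the plot_na_heatmap implementation; the toy data.frame and the names missing_or_zero, per_row and per_col are illustrative assumptions.

```r
# Illustrative sketch only: NOT the plot_na_heatmap() implementation.
dat <- data.frame(a = c(1, NA, 3, 0),
                  b = c(2, 5, NA, NA),
                  c = c(1, 1, 1, 1))

# TRUE where a value is missing or equal to zero
missing_or_zero <- is.na(dat) | dat == 0

# Counts per sample (row) and per variable (column)
per_row <- rowSums(missing_or_zero)
per_col <- colSums(missing_or_zero)

print(per_row)
print(per_col)
```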

plotly_NA_message

Description

helper function to create a plotly indicating a problem with a plotly image

Usage

plotly_NA_message(main = "Phenotype not available!")

Arguments

main

Message to plot

Value

plotly image

predictions_surrogates

Description

Helper function that applies a surrogate model and plots a ROC curve of its accuracy

Usage

predictions_surrogates(FIT, data, title_img = FALSE, plot = TRUE)

Arguments

FIT

numeric vector with betas of the logistic regressions composing the surrogates by Bizzarri et al.

data

numeric data-frame with Nightingale-metabolomics and the binarized phenotype to predict

title_img

string with title of the image

plot

logical to obtain the ROC curve

Value

If plot==TRUE, the surrogate predictions and the ROC curve. If plot==FALSE, only the surrogate predictions

Examples

## Not run: 
library(MiMIR)

#load the Nightingale metabolomics dataset
metabolic_measures <- read.csv("Nightingale_file_path",header = TRUE, row.names = 1)
# Do the pre-processing steps to the metabolic measures
metabolic_measures<-QCprep_surrogates(as.matrix(metabolic_measures), Nmax_miss=1,Nmax_zero=1)

#load the phenotypic dataset
phenotypes <- read.csv("phenotypes_file_path",header = TRUE, row.names = 1)
#Calculating the binarized surrogates
bin_pheno<-binarize_all_pheno(phenotypes)

#Apply a surrogate model and plot the ROC curve
data<-data.frame(out=factor(phenotypes_names$bin_names[,1]), metabolic_measures)
colnames(data)[1]<-"out"
pred<-predictions_surrogates(PARAM_surrogates$models_betas["s_sex",], data=data, title_img="s_sex")


## End(Not run)
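
Applying the betas of a logistic regression to new data and scoring the result with an AUC can be sketched in base R as follows. This is not the predictions_surrogates implementation: the coefficients, the simulated features and the rank-based AUC computation are assumptions made for illustration.

```r
# Illustrative sketch only: NOT the predictions_surrogates() implementation.
set.seed(11)
n <- 300
X <- matrix(rnorm(n * 2), ncol = 2, dimnames = list(NULL, c("met1", "met2")))
betas <- c("(Intercept)" = -0.5, met1 = 1.2, met2 = -0.8)  # hypothetical coefficients

# Linear predictor followed by the inverse logit gives the predicted probabilities
pred <- as.numeric(plogis(cbind(1, X) %*% betas))

# Simulated binary phenotype generated from the same model
out <- rbinom(n, 1, pred)

# AUC from the rank-sum (Mann-Whitney) statistic
r <- rank(pred)
n1 <- sum(out == 1)
n0 <- sum(out == 0)
auc <- (sum(r[out == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
print(auc)
```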

prep_data_COVID_score

Description

Helper function to pre-process the Nightingale Health metabolomics data-set before applying the COVID score.

Usage

prep_data_COVID_score(
  dat,
  featID = c("gp", "dha", "crea", "mufa", "apob_apoa1", "tyr", "ile", "sfa_fa", "glc",
    "lac", "faw6_faw3", "phe", "serum_c", "faw6_fa", "ala", "pufa", "glycine", "his",
    "pufa_fa", "val", "leu", "alb", "faw3", "ldl_c", "serum_tg"),
  quiet = FALSE
)

Arguments

dat

numeric data-frame with Nightingale-metabolomics

featID

vector of strings with the names of metabolic features included in the COVID-score

quiet

logical to suppress the messages in the console

Value

The Nightingale-metabolomics data-frame after pre-processing (checked for zeros, z-scaled and log-transformed) according to what has been done by the authors of the original papers.

References

This function is constructed to be able to follow the pre-processing steps described in: Nightingale Health UK Biobank Initiative et al. (2021) Metabolic biomarker profiling for identification of susceptibility to severe pneumonia and COVID-19 in the general population. eLife, 10, e63033, doi:10.7554/eLife.63033

Examples

require(MiMIR)
require(matrixStats)

#load the Nightingale metabolomics dataset
metabolic_measures <- synthetic_metabolic_dataset
#Prepare the metabolic features for the COVID score
prepped_met <- prep_data_COVID_score(dat=metabolic_measures)

prep_met_for_scores

Description

Helper function to pre-process the Nightingale Health metabolomics data-set before applying the mortality, Type-2-diabetes and CVD scores.

Usage

prep_met_for_scores(dat, featID, plusone = FALSE, quiet = FALSE)

Arguments

dat

numeric data-frame with Nightingale-metabolomics

featID

vector of strings with the names of metabolic features included in the score selected

plusone

logical to determine if a value of 1.0 should be added to all metabolic features (TRUE) or only to the ones featuring zeros before log-transforming (FALSE)

quiet

logical to suppress the messages in the console

Value

The Nightingale-metabolomics data-frame after pre-processing (checked for zeros, z-scaled and log-transformed) according to what has been done by the authors of the original papers.

References

This function is constructed to be able to follow the pre-processing steps described in: Deelen,J. et al. (2019) A metabolic profile of all-cause mortality risk identified in an observational study of 44,168 individuals. Nature Communications, 10, 1-8, doi:10.1038/s41467-019-11311-9.

Wurtz,P. et al. (2015) Metabolite profiling and cardiovascular event risk: a prospective study of 3 population-based cohorts. Circulation, 131, 774-785, doi:10.1161/CIRCULATIONAHA.114.013116

Examples

library(MiMIR)

#load the Nightingale metabolomics dataset
metabolic_measures <- synthetic_metabolic_dataset
#Prepare the metabolic features for the mortality score
prepped_met <- prep_met_for_scores(metabolic_measures,featID=MiMIR::mort_betas$Abbreviation)
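
The role of the plusone argument can be illustrated with a small base R sketch. It is not the prep_met_for_scores implementation: the function name prep_sketch, the toy data, and the order of the steps (log-transform first, then z-scale) are assumptions.

```r
# Illustrative sketch only: NOT the prep_met_for_scores() implementation.
prep_sketch <- function(dat, featID, plusone = FALSE) {
  x <- as.matrix(dat[, featID, drop = FALSE])
  if (plusone) {
    # Add 1.0 to all the metabolic features
    x <- x + 1
  } else {
    # Add 1.0 only to the features that contain zeros
    has_zero <- colSums(x == 0, na.rm = TRUE) > 0
    x[, has_zero] <- x[, has_zero] + 1
  }
  # Assumed order: log-transform, then z-scale each feature
  scale(log(x))
}

dat <- data.frame(alb = c(40, 42, 38, 41),
                  lac = c(0, 1.2, 0.9, 1.5),
                  glc = c(5, 6, 4.5, 5.5))
prepped <- prep_sketch(dat, featID = c("alb", "lac"))
print(prepped)
```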

rendertable

Description

helper function to create a table for an Rshiny app

Usage

rendertable(data)

Arguments

data

the dataset to show

Value

table

report.dim

Description

Helper function to report on the console the dimension of the NH metabolomics matrix

Usage

report.dim(x, header, trailing = "0")

Arguments

x

numeric data-frame with Nightingale-metabolomics

header

string describing the sub-sampling of the NH-metabolomics matrix

trailing

number of digits to show

Value

The report of the NH-metabolomics matrix dimension

Examples

## Not run: 
library(MiMIR)

#load the Nightingale metabolomics dataset
metabolic_measures <- read.csv("Nightingale_file_path",header = TRUE, row.names = 1)
#Report the dimension of the metabolic features matrix
cat(report.dim(metabolic_measures, header=paste0("Pruning samples on 5SD")))

## End(Not run)

resort.on.p

Description

helper function for clustering cor.assoc results reordering based on the pvalues

Usage

resort.on.p(res, abs = FALSE)

Arguments

res

Results of cor.assoc

abs

TRUE/FALSE. If TRUE it will cluster the absolute values

Value

Returns the clustered order of the associations based on the pvalues

resort.on.s

Description

helper function for clustering cor.assoc results reordering based on the associations

Usage

resort.on.s(res, abs = FALSE)

Arguments

res

Results of cor.assoc

abs

TRUE/FALSE. If TRUE it will cluster the absolute values

Value

Returns the clustered order of the associations

roc_surro

Description

Function that creates a ROC curve of the selected metabolic surrogates as a plotly image

Usage

roc_surro(surrogates, bin_phenotypes, x_name)

Arguments

surrogates

numeric data.frame of metabolomics-based surrogate values by Bizzarri et al.

bin_phenotypes

logical data.frame of binarized phenotypes

x_name

vector of strings with the names of the selected binary phenotypes for the roc

Value

plotly image with the ROC curves for one or more selected variables

Examples

require(pROC)
require(plotly)
require(foreach)
require(MiMIR)

#load the dataset
met <- synthetic_metabolic_dataset
phen<- synthetic_phenotypic_dataset

#Calculating the binarized surrogates
b_phen<-binarize_all_pheno(phen)
#Apply the surrogate models and plot the ROC curves
surr<-calculate_surrogate_scores(met, phen, MiMIR::PARAM_surrogates, colnames(b_phen))
#Plot the ROC curves
roc_surro(surr$surrogates, b_phen, "sex")

roc_surro_subplots

Description

Function that plots the ROCs of the surrogates of all the available surrogate models as plotly sub-plots

Usage

roc_surro_subplots(surrogates, bin_phenotypes)

Arguments

surrogates

numeric data.frame containing the surrogate values by Bizzarri et al.

bin_phenotypes

numeric data.frame with the binarized phenotypes output of binarize_all_pheno

Value

plotly image with all the ROCs for all the available clinical variables

Examples

library(pROC)
library(plotly)
library(MiMIR)

#load the dataset
met <- synthetic_metabolic_dataset
phen<- synthetic_phenotypic_dataset

#Calculating the binarized surrogates
b_phen<-binarize_all_pheno(phen)
#Apply the surrogate models and plot the ROC curves
surr<-calculate_surrogate_scores(met, phen, MiMIR::PARAM_surrogates, colnames(b_phen))

roc_surro_subplots(surr$surrogates, b_phen)

scatterplot_predictions

Description

Function to visualize a scatter-plot comparing two variables

Usage

scatterplot_predictions(x, p, title, xname = "x", yname = "predicted x")

Arguments

x

numeric vector

p

second numeric vector

title

string vector with the title

xname

string vector with the name of the variable on the x axis

yname

string vector with the name of the variable on the y axis

Value

plotly image with the scatterplot

Examples

library(plotly)
#load the dataset
metabolic_measures <- synthetic_metabolic_dataset
phenotypes <- synthetic_phenotypic_dataset

#Pre-process the metabolic features
prepped_met<-QCprep(as.matrix(metabolic_measures), MiMIR::PARAM_metaboAge)
#Apply the metaboAge
metaboAge<-apply.fit(prepped_met, FIT=PARAM_metaboAge$FIT_COEF)

age<-data.frame(phenotypes$age)
rownames(age)<-rownames(phenotypes)
scatterplot_predictions(age, metaboAge, title="Chronological Age vs MetaboAge")

startMiMIR

Description

Start the application MiMIR.

Usage

startApp(launch.browser = TRUE)

Arguments

launch.browser

TRUE/FALSE

Details

This function starts the R-Shiny tool called MiMIR (Metabolomics-based Models for Imputing Risk), a graphical user interface that provides an intuitive framework for ad-hoc statistical analysis of Nightingale Health's 1H-NMR metabolomics data and allows for the projection and calibration of 24 pre-trained metabolomics-based models, without any pre-required programming knowledge.

Value

Opens the application. If launch.browser=TRUE, it opens in the default web browser

References

Deelen,J. et al. (2019) A metabolic profile of all-cause mortality risk identified in an observational study of 44,168 individuals. Nature Communications, 10, 1-8, doi:10.1038/s41467-019-11311-9

Ahola-Olli,A.V. et al. (2019) Circulating metabolites and the risk of type 2 diabetes: a prospective study of 11,896 young adults from four Finnish cohorts. Diabetologia, 62, 2298-2309, doi:10.1007/s00125-019-05001-w

Wurtz,P. et al. (2015) Metabolite profiling and cardiovascular event risk: a prospective study of 3 population-based cohorts. Circulation, 131, 774-785, doi:10.1161/CIRCULATIONAHA.114.013116

Bizzarri,D. et al. (2022) 1H-NMR metabolomics-based surrogates to impute common clinical risk factors and endpoints. EBioMedicine, 75, 103764, doi:10.1016/j.ebiom.2021.103764

van den Akker Erik B. et al. (2020) Metabolic Age Based on the BBMRI-NL 1H-NMR Metabolomics Repository as Biomarker of Age-related Disease. Circulation: Genomic and Precision Medicine, 13, 541-547, doi:10.1161/CIRCGEN.119.002610

subset_metabolites_overlap

Description

Helper function that subsets the NH-metabolomics matrix features to the selection in metabolites needed for the metabolic score

Usage

subset_metabolites_overlap(x, metabos, quiet = FALSE)

Arguments

x

numeric data-frame with Nightingale-metabolomics

metabos

vector of strings containing the names of the metabolic features to be selected

quiet

logical to suppress the messages in the console

Value

matrix with the selected Nightingale-metabolomics features

Examples

## Not run: 
library(MiMIR)

#load the Nightingale metabolomics dataset
metabolic_measures <- read.csv("Nightingale_file_path",header = TRUE, row.names = 1)
#Select the metabolic features
mat <- subset_metabolites_overlap(x=metabolic_measures,metabos=PARAM_metaboAge$MET)

## End(Not run)

subset_samples_miss

Description

Helper function that subsets the NH-metabolomics matrix to the samples with less than Nmax missing values

Usage

subset_samples_miss(x, Nmax = 1, quiet = FALSE)

Arguments

x

numeric data-frame with Nightingale-metabolomics

Nmax

integer indicating the max number of missing values allowed per sample (N suggested= 1)

quiet

logical to suppress the messages in the console

Value

matrix with the samples with limited amount of missing values in the Nightingale-metabolomics dataset

Examples

## Not run: 
library(MiMIR)

#load the Nightingale metabolomics dataset
metabolic_measures <- read.csv("Nightingale_file_path",header = TRUE, row.names = 1)
#Select the samples with only 1 missing value
mat <- subset_samples_miss(x=metabolic_measures, Nmax=1)


## End(Not run)

subset_samples_sd

Description

Helper function that subsets the NH-metabolomics matrix to the samples with limited numbers of outliers

Usage

subset_samples_sd(x, MEAN, SD, quiet = FALSE)

Arguments

x

numeric data-frame with Nightingale-metabolomics

MEAN

numeric vector indicating the mean of the metabolites in x

SD

numeric vector indicating the standard deviations of the metabolites in x

quiet

logical to suppress the messages in the console

Value

matrix with the samples with limited amount of outliers in the Nightingale-metabolomics dataset

Examples

## Not run: 
library(MiMIR)

#load the Nightingale metabolomics dataset
metabolic_measures <- read.csv("Nightingale_file_path",header = TRUE, row.names = 1)
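#Select the samples with low outliers, using the means and SDs of the metabolites in x
MEAN <- colMeans(metabolic_measures, na.rm=TRUE)
SD <- apply(metabolic_measures, 2, sd, na.rm=TRUE)
mat <- subset_samples_sd(x=metabolic_measures, MEAN=MEAN, SD=SD)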

## End(Not run)

subset_samples_sd_surrogates

Description

Helper function that subsets the NH-metabolomics matrix to the samples with limited numbers of outliers

Usage

subset_samples_sd_surrogates(x, MEAN, SD, N = 5, quiet = FALSE)

Arguments

x

numeric data-frame with Nightingale-metabolomics

MEAN

numeric vector indicating the mean of the metabolites in x

SD

numeric vector indicating the standard deviations of the metabolites in x

N

numeric vector indicating the amount of standard deviations away from the mean after which we consider an outlier (N suggested=5)

quiet

logical to suppress the messages in the console

Value

matrix with the samples with limited amount of outliers in the Nightingale-metabolomics dataset

Examples

## Not run: 
library(MiMIR)

#load the Nightingale metabolomics dataset
metabolic_measures <- read.csv("Nightingale_file_path",header = TRUE, row.names = 1)
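#Select the samples with low outliers, using the means and SDs of the metabolites in x
MEAN <- colMeans(metabolic_measures, na.rm=TRUE)
SD <- apply(metabolic_measures, 2, sd, na.rm=TRUE)
mat <- subset_samples_sd_surrogates(x=metabolic_measures, MEAN=MEAN, SD=SD, N=5)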

## End(Not run)
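
The outlier rule described by the N argument can be sketched in base R. This is not the subset_samples_sd_surrogates implementation: the function name sd_subset_sketch, the simulated matrix, and the choice to drop a sample as soon as one metabolite lies more than N standard deviations from its mean are assumptions.

```r
# Illustrative sketch only: NOT the subset_samples_sd_surrogates() implementation.
sd_subset_sketch <- function(x, MEAN, SD, N = 5) {
  x <- as.matrix(x)
  # z-scores of each metabolite relative to the supplied means and SDs
  z <- sweep(sweep(x, 2, MEAN, "-"), 2, SD, "/")
  # Assumption: keep only the samples without any value beyond N SDs
  keep <- rowSums(abs(z) > N, na.rm = TRUE) == 0
  x[keep, , drop = FALSE]
}

set.seed(5)
x <- matrix(rnorm(200), ncol = 4,
            dimnames = list(paste0("s", 1:50), paste0("met", 1:4)))
x["s10", "met2"] <- 25   # inject one extreme value

mat <- sd_subset_sketch(x, MEAN = rep(0, 4), SD = rep(1, 4), N = 5)
print(dim(mat))
```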

subset_samples_zero

Description

Helper function that subsets the NH-metabolomics matrix to the samples with less than Nmax zeros

Usage

subset_samples_zero(x, Nmax = 1, quiet = FALSE)

Arguments

x

numeric data-frame with Nightingale-metabolomics

Nmax

integer indicating the max number of zeros allowed per sample (N suggested= 1)

quiet

logical to suppress the messages in the console

Value

matrix with the samples with limited amount of zeros in the Nightingale-metabolomics dataset

Examples

## Not run: 
library(MiMIR)

#load the Nightingale metabolomics dataset
metabolic_measures <- read.csv("Nightingale_file_path",header = TRUE, row.names = 1)
#Select samples with only 1 zero
mat <- subset_samples_zero(x=metabolic_measures, Nmax=1)


## End(Not run)
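
Subsetting samples on their number of missing values or zeros can be sketched in base R as follows. This is not the implementation of subset_samples_miss or subset_samples_zero: the function name subset_by_count, the toy matrix, and the choice to keep samples with at most Nmax such values are assumptions.

```r
# Illustrative sketch only: NOT the subset_samples_miss()/subset_samples_zero() implementation.
subset_by_count <- function(x, Nmax = 1, what = c("missing", "zero")) {
  what <- match.arg(what)
  x <- as.matrix(x)
  # Count the missing values or the zeros per sample (row)
  counts <- if (what == "missing") {
    rowSums(is.na(x))
  } else {
    rowSums(x == 0, na.rm = TRUE)
  }
  # Assumption: keep the samples with at most Nmax such values
  x[counts <= Nmax, , drop = FALSE]
}

x <- rbind(s1 = c(1, 2, 3),
           s2 = c(NA, NA, 3),
           s3 = c(0, 0, 0),
           s4 = c(NA, 2, 0))

kept_miss <- subset_by_count(x, Nmax = 1, what = "missing")  # s2 has 2 missing values
kept_zero <- subset_by_count(x, Nmax = 1, what = "zero")     # s3 has 3 zeros
print(rownames(kept_miss))
print(rownames(kept_zero))
```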

synthetic metabolomics dataset

Description

Data.frame containing a synthetic version of the Nightingale metabolomics dataset, created with the package synthpop from the LLS_PAROFF dataset.

Usage

data("synthetic_metabolic_dataset")

Format

An object of class data.frame with 500 rows and 229 columns.

References

M. Schoenmaker et al., 'Evidence of genetic enrichment for exceptional survival using a family approach: the Leiden Longevity Study', Eur. J. Hum. Genet., vol. 14, no. 1, Art. no. 1, Jan. 2006, doi:10.1038/sj.ejhg.5201508

B. Nowok, G. M. Raab, and C. Dibben, 'synthpop: Bespoke Creation of Synthetic Data in R', J. Stat. Softw., vol. 74, no. 1, Art. no. 1, Oct. 2016, doi:10.18637/jss.v074.i11

Examples

data("synthetic_metabolic_dataset")

synthetic phenotypic dataset

Description

Data.frame containing a synthetic phenotypic dataset created with the package synthpop from the LLS_PAROFF dataset.

Usage

data("synthetic_phenotypic_dataset")

Format

An object of class data.frame with 500 rows and 24 columns.

Examples

data("synthetic_phenotypic_dataset")

ttest_scores

Description

Function that creates a boxplot with a continuous variable split using the binary variable

Usage

ttest_scores(dat, pred, pheno)

Arguments

dat

The data.frame containing the 2 variables

pred

character indicating the y variable

pheno

character indicating the binary variable

Value

plotly boxplot with the continuous variable split using the binary variable

Examples

library(MiMIR)
library(plotly)

#load the dataset
metabolic_measures <- synthetic_metabolic_dataset
phenotypes <- synthetic_phenotypic_dataset

#Compute the mortality score
mortScore<-comp.mort_score(metabolic_measures,quiet=TRUE)
dat<-data.frame(predictor=mortScore, pheno=phenotypes$sex)
colnames(dat)<-c("mortScore","sex")
ttest_scores(dat = dat, pred= "mortScore", pheno="sex")

ttest_surrogates

Description

Function that calculates a t-test and a plotly image of the selected surrogates

Usage

ttest_surrogates(surrogates, bin_phenotypes)

Arguments

surrogates

numeric data.frame containing the surrogate values by Bizzarri et al.

bin_phenotypes

numeric data.frame with the binarized phenotypes output of binarize_all_pheno

Details

Barplot and t-test indicating whether the surrogate variables separate the samples according to the real values of the binary clinical variables.

Value

plotly image with the t-test results for all the available clinical variables

Examples

require(pROC)
require(plotly)
require(MiMIR)
require(foreach)

#load the dataset
m <- synthetic_metabolic_dataset
p <- synthetic_phenotypic_dataset

#Calculating the binarized surrogates
b_p<-binarize_all_pheno(p)
#Apply the surrogate models and plot the ROC curves
surr<-calculate_surrogate_scores(met=m, pheno=p, MiMIR::PARAM_surrogates, bin_names=colnames(b_p))
ttest_surrogates(surr$surrogates, b_p)
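
The comparison behind such a plot can be sketched in base R as a two-sample t-test of a surrogate value between the two groups of the observed binary variable. This is not the ttest_surrogates implementation; the simulated surrogate and the column names of d are assumptions made for illustration.

```r
# Illustrative sketch only: NOT the ttest_surrogates() implementation.
set.seed(9)
n <- 200
observed <- rbinom(n, 1, 0.5)                      # observed binary variable
surrogate <- plogis(2 * observed - 1 + rnorm(n))   # simulated surrogate value in (0, 1)
d <- data.frame(surrogate = surrogate, observed = factor(observed))

# Welch two-sample t-test of the surrogate between the two observed groups
tt <- t.test(surrogate ~ observed, data = d)
print(tt$estimate)
print(tt$p.value)
```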