Type: Package
Title: Microbiome Analysis Tools
Version: 0.1.3
Description: We provide functions for identifying the core community phylogeny in any microbiome, drawing phylogenetic Venn diagrams, calculating the core Faith’s PD for a set of communities, and calculating the core UniFrac distance between two sets of communities. All functions rely on construction of a core community phylogeny, which is a phylogeny where branches are defined based on their presence in multiple samples from a single type of habitat. Our package provides two options for constructing the core community phylogeny, a tip-based approach, where the core community phylogeny is identified based on incidence of leaf nodes and a branch-based approach, where the core community phylogeny is identified based on incidence of individual branches. We suggest use of the microViz package.
License: GPL-2
Encoding: UTF-8
biocViews: Software, phyloseq
Imports: ape, phyloseq, phytools, ggplot2, dplyr, utils, tibble, castor, vegan, data.table
RoxygenNote: 7.3.2
NeedsCompilation: no
Packaged: 2025-07-21 22:31:04 UTC; sbewick
Author: Sharon Bewick [aut, cre], Benjamin Camper [aut], National Science Foundation Division of Integrative Organismal Systems (award #2104605) [fnd]
Maintainer: Sharon Bewick <sbewick@clemson.edu>
Repository: CRAN
Date/Publication: 2025-07-21 22:50:02 UTC

Branch-based Core Community Phylogeny Using Thresholds

Description

Called internally. Identifies all edges and core edges of the core community phylogeny using a branch-based approach with core thresholds.

Usage

basic_branch(xv, nt, cf, abt1, abt2, abt3,rt)

Arguments

xv

(Required) The phyloseq object passed from main functions and containing microbial community data.

nt

(Required) The microbial phylogeny passed from main functions.

cf

(Required) The fraction of samples that a microbial taxon must be found in to be considered part of the 'core' microbiome.

abt1

(Required) The threshold for mean abundance across samples.

abt2

(Required) The threshold for the maximum abundance in any sample.

abt3

(Required) The threshold for the minimum abundance across sample.

rt

Whether to include the root of the phylogeny. The default is TRUE, meaning that the root is necessarily included in all phylogenies. This requires that the input tree be rooted.

Details

basic_branch is used internally in the holobiont package to identify core edges of a microbial phylogeny for the branch-based approach using thresholds.

Value

This function returns a list of all edges and core edges.

Examples

#Test with enterotype dataset
library(phyloseq)
library(ape)
library(phytools)
data(enterotype)

set.seed(1)

#Generate an example tree and label it with the names of the microbial taxa
enterotype_tree<-rtree(length(taxa_names(enterotype)))
enterotype_tree$tip.label<-taxa_names(enterotype)

#Create a phyloseq object with a tree
example_phyloseq<-phyloseq(otu_table(enterotype),phy_tree(as.phylo(enterotype_tree)))

basic_branch(example_phyloseq, phy_tree(example_phyloseq), 0.5, 0, 0, 0,TRUE)


Non-phylogenetic Core Community Phylogeny Using Thresholds

Description

Called internally. Identifies all edges and core edges of the core community phylogeny using a non-phylogenetic approach with core thresholds.

Usage

basic_np(xv, cf, abt1, abt2, abt3)

Arguments

xv

(Required) The phyloseq object passed from main functions and containing microbial community data.

cf

(Required) The fraction of samples that a microbial taxon must be found in to be considered part of the 'core' microbiome.

abt1

(Required) The threshold for mean abundance across all samples.

abt2

(Required) The threshold for maximum abundance in any sample.

abt3

(Required) The threshold for the minimum abundance across all samples.

Details

basic_np is used internally in the holobiont package to identify all taxa and all core taxa for the non-phylogenetic approach using thresholds.

Value

This function returns a list of all taxa and all core taxa.

Examples

#Test with enterotype dataset
library(phyloseq)
library(ape)
library(phytools)
data(enterotype)

set.seed(1)

#Generate an example tree and label it with the names of the microbial taxa
enterotype_tree<-rtree(length(taxa_names(enterotype)))
enterotype_tree$tip.label<-taxa_names(enterotype)

#Create a phyloseq object with a tree
example_phyloseq<-phyloseq(otu_table(enterotype),phy_tree(as.phylo(enterotype_tree)))

basic_np(example_phyloseq, 0.5, 0, 0, 0)


Tip-based Core Community Phylogeny Using Thresholds

Description

Called internally. Identifies all edges and core edges of the core community phylogeny using a tip-based approach with core thresholds.

Usage

basic_tip(xv, nt, cf, abt1, abt2, abt3,rt)

Arguments

xv

(Required) The phyloseq object passed from main functions and containing microbial community data.

nt

(Required) The microbial phylogeny passed from main functions.

cf

(Required) The fraction of samples that a microbial taxon must be found in to be considered part of the 'core' microbiome.

abt1

(Required) The threshold for mean abundance across all samples.

abt2

(Required) The threshold for maximum abundance in any sample.

abt3

(Required) The threshold for the minimum abundance across all samples.

rt

Whether to include the root of the phylogeny. The default is TRUE, meaning that the root is necessarily included in all phylogenies. This requires that the input tree be rooted.

Details

basic_tip is used internally in the holobiont package to identify core edges of a microbial phylogeny for the tip-based approach using thresholds.

Value

This function returns a list of all edges and core edges, as well as all taxa and core taxa.

Examples

#Test with enterotype dataset
library(phyloseq)
library(ape)
library(phytools)
data(enterotype)

set.seed(1)

#Generate an example tree and label it with the names of the microbial taxa
enterotype_tree<-rtree(length(taxa_names(enterotype)))
enterotype_tree$tip.label<-taxa_names(enterotype)

#Create a phyloseq object with a tree
example_phyloseq<-phyloseq(otu_table(enterotype),phy_tree(as.phylo(enterotype_tree)))

basic_tip(example_phyloseq, phy_tree(example_phyloseq), 0.5, 0, 0, 0,TRUE)



Edges of the Tree Included in the Core Community Phylogeny

Description

Finds the edges of a phylogenetic tree that are part of the core microbiome based on either the tip-based or the branch-based core community phylogeny.

Usage

coreEdges(x, core_fraction = 0.5, ab_threshold1 = 0,
ab_threshold2 = 0, ab_threshold3 = 0, mode='branch',
selection='basic', max_tax = NULL, increase_cutoff = 2,
initial_branches = NULL, rooted=TRUE, remove_zeros=TRUE)

Arguments

x

(Required) Microbial community data. This must be in the form of a phyloseq object and must contain, at a minimum, an OTU abundance table and a phylogeny.

core_fraction

The fraction of samples that a microbial taxon must be found in to be considered part of the 'core' microbiome. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.5.

ab_threshold1

The threshold for mean relative abundance across all samples. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

ab_threshold2

The threshold for maximum relative abundance in any sample. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

ab_threshold3

The threshold for minimum relative abundance across all samples. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

mode

Whether to build a tip-based ('tip') or a branch-based ('branch') phylogeny. The default is 'branch'.

selection

Whether to use thresholds ('basic') or the Shade and Stopnisek method ('shade') to define the core community. The default is 'basic'.

max_tax

The maximum number of branches to add sequentially, as a percentage of the total branches when using the Shade and Stopnisek method. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

increase_cutoff

The threshold for the percent increase in beta diversity used to identify the taxon at which point adding more taxa yields diminishing returns in explanatory power. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

initial_branches

The number of branches to include prior to testing for increases in beta diversity. The default is to use all branches that are present in every sample and to begin testing branches that are missing from at least one sample. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

rooted

Whether to include the root of the phylogeny. The default is TRUE, meaning that the root is necessarily included in all phylogenies. This requires that the input tree be rooted.

remove_zeros

Whether or not to remove taxa that are no reads associated across any of the samples.

Details

coreEdges identifies the edges of the core community phylogeny based on either the tip-based or the branch-based core community phylogeny using either basic thresholds or a modification of the Shade and Stopnisek (2019) algorithm. For more details, see Bewick and Camper (2025).

Value

This function returns a list of the edges of the phylogenetic tree associated with the core microbiome.

References

Shade, Ashley, and Nejc Stopnisek. "Abundance-occupancy distributions to prioritize plant core microbiome membership." Current opinion in microbiology 49 (2019): 50-58.

Examples

#Test with enterotype dataset
library(phyloseq)
library(ape)
library(phytools)
data(enterotype)

set.seed(1)

#Generate an example tree and label it with the names of the microbial taxa
enterotype_tree<-rtree(length(taxa_names(enterotype)))
enterotype_tree$tip.label<-taxa_names(enterotype)
enterotype_tree$node.label<-as.character(1:1:enterotype_tree$Nnode)

#Create a phyloseq object with a tree
example_phyloseq<-phyloseq(otu_table(enterotype),phy_tree(as.phylo(enterotype_tree)))

coreEdges(example_phyloseq)


Faiths PD of the Core Microbiome

Description

Calculates Faith's phylogenetic diversity (PD) of a core microbiome based on either the tip-based or the branch-based core community phylogeny.

Usage

coreFaithsPD(x, core_fraction = 0.5, ab_threshold1 = 0,
ab_threshold2 = 0, ab_threshold3 = 0, mode='branch',
selection='basic', max_tax = NULL, increase_cutoff = 2,
initial_branches = NULL, rooted=TRUE)

Arguments

x

(Required) Microbial community data. This must be in the form of a phyloseq object and must contain, at a minimum, an OTU abundance table and a phylogeny.

core_fraction

The fraction of samples that a microbial taxon must be found in to be considered part of the 'core' microbiome. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.5.

ab_threshold1

The threshold for mean relative abundance across all samples. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

ab_threshold2

The threshold for maximum relative abundance in any sample. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

ab_threshold3

The threshold for minimum relative abundance across all samples. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

mode

Whether to build a tip-based ('tip') or a branch-based ('branch') phylogeny. The default is 'branch'.

selection

Whether to use thresholds ('basic') or the Shade and Stopnisek method ('shade') to define the core community. The default is 'basic'.

max_tax

The maximum number of branches to add sequentially, as a percentage of the total branches when using the Shade and Stopnisek method. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

increase_cutoff

The threshold for the percent increase in beta diversity used to identify the taxon at which point adding more taxa yields diminishing returns in explanatory power. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

initial_branches

The number of branches to include prior to testing for increases in beta diversity. The default is to use all branches that are present in every sample and to begin testing branches that are missing from at least one sample. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

rooted

Whether to include the root of the phylogeny. The default is TRUE, meaning that the root is necessarily included in all phylogenies. This requires that the input tree be rooted.

Details

coreFaithsPD calculates Faith's PD (Faith, 1992) based on either the tip-based or the branch-based core community phylogeny using either basic thresholds or a modification of the Shade and Stopnisek (2019) algorithm. In both cases, Faith's PD is calculated as the sum of the branch lengths in the core community phylogeny. For more details, see Bewick and Camper (2025).

Value

This function returns the numeric value of Faith's PD for the core microbiome.

References

Shade, Ashley, and Nejc Stopnisek. "Abundance-occupancy distributions to prioritize plant core microbiome membership." Current opinion in microbiology 49 (2019): 50-58.

Examples

#Test with enterotype dataset
library(phyloseq)
library(ape)
library(phytools)
data(enterotype)

set.seed(1)

#Generate an example tree and label it with the names of the microbial taxa
enterotype_tree<-rtree(length(taxa_names(enterotype)))
enterotype_tree$tip.label<-taxa_names(enterotype)
enterotype_tree$node.label<-as.character(1:1:enterotype_tree$Nnode)

#Create a phyloseq object with a tree
example_phyloseq<-phyloseq(otu_table(enterotype),phy_tree(as.phylo(enterotype_tree)))

coreFaithsPD(example_phyloseq)


Jaccard Distance Between Core Microbiomes from Two Habitats

Description

Calculates the Jaccard distance between the core microbiomes from two different types of habitats based on either the tip-based or the branch-based core community phylogeny.

Usage

coreJaccard(x, grouping, core_fraction = 0.5, ab_threshold1 = 0,
ab_threshold2 = 0, ab_threshold3 = 0, selection='basic',
max_tax = NULL, increase_cutoff = 2)

Arguments

x

(Required) Microbial community data. This must be in the form of a phyloseq object and must contain, at a minimum, an OTU abundance table.

grouping

(Required) A vector specifying which samples belong to which habitat type.

core_fraction

The fraction of samples that a microbial taxon must be found in to be considered part of the 'core' microbiome. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.5.

ab_threshold1

The threshold for mean relative abundance across all samples. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

ab_threshold2

The threshold for maximum relative abundance in any sample. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

ab_threshold3

The threshold for minimum relative abundance across all samples. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

selection

Whether to use thresholds ('basic') or the Shade and Stopnisek method ('shade') to define the core community. The default is 'basic'.

max_tax

The maximum number of branches to add sequentially, as a percentage of the total branches when using the Shade and Stopnisek method. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

increase_cutoff

The threshold for the percent increase in beta diversity used to identify the taxon at which point adding more taxa yields diminishing returns in explanatory power. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

Details

coreJaccard calculates the Jaccard distance (Jaccard, 1901 A and B) between the core microbiomes from two different types of habitats using either basic thresholds or a modification of the Shade and Stopnisek (2019) algorithm. Briefly, the Jaccard distance is calculated as the number of unique taxa in the core communities from each of the two habitats, divided by the total number of taxa in the two habitats combined. For more details, see Bewick and Camper (2025).

Value

This function returns the numeric value of the Jaccard distance between two core microbiomes.

References

Bewick, S.A. and Benjamin T. Camper. "Phylogenetic Measures of the Core Microbiome" <doi:TBD>

Jaccard, P. (1901A) Étude comparative de la distribution florale dans une portion des Alpes et des Jura. Bulletin de la Société Vaudoise des Sciences Naturelles 37, 547-579.

Jaccard, P. (1901B) Distribution de la flore alpine dans le bassin des Dranses et dans quelques régions voisines. Bulletin de la Société Vaudoise des Sciences Naturelles 37, 241-272.

McMurdie, Paul J., and Susan Holmes. "phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data." PloS one 8.4 (2013): e61217.

McMurdie, Paul J., and Susan Holmes. "Phyloseq: a bioconductor package for handling and analysis of high-throughput phylogenetic sequence data." Biocomputing 2012. 2012. 235-246.

Shade, Ashley, and Nejc Stopnisek. "Abundance-occupancy distributions to prioritize plant core microbiome membership." Current opinion in microbiology 49 (2019): 50-58.

Examples

#Test with enterotype dataset
library(phyloseq)
library(ape)
library(phytools)
data(enterotype)

set.seed(1)

#Generate an example tree and label it with the names of the microbial taxa
enterotype_tree<-rtree(length(taxa_names(enterotype)))
enterotype_tree$tip.label<-taxa_names(enterotype)

#keep only those samples with gender identified
gendered<-which(!(is.na(sample_data(enterotype)$Gender)))
enterotypeMF<-prune_samples(sample_names(enterotype)[gendered],enterotype)

#Create a phyloseq object with a tree
example_phyloseq<-phyloseq(otu_table(enterotypeMF))

coreJaccard(example_phyloseq,grouping=sample_data(enterotypeMF)$Gender)


Phylogenetic Venn Diagram Comparing Core Microbiomes from Multiple Habitats

Description

Generates a Venn diagram comparing and contrasting the shared and unique phylogenetic branch lengths of core microbiomes from two or more (up to seven) different types of habitats based on either the tip-based or the branch-based core community phylogeny.

Usage

corePhyloVenn(x, grouping, core_fraction = 0.5, ab_threshold1 = 0,
ab_threshold2 = 0, ab_threshold3 = 0, mode='branch',  selection='basic', max_tax = NULL,
increase_cutoff = 2, initial_branches = NULL, rooted=TRUE,
ordered_groups=NULL,show_percentage=TRUE,decimal=2,
fill_color=c('red','orange','yellow','green','blue','purple','black'),
fill_alpha=0.5, stroke_color='black',stroke_alpha = 1, stroke_size = 1,
stroke_linetype = "solid", set_name_color = "black", set_name_size = 6,
text_color = "black",text_size = 4)

Arguments

x

(Required) Microbial community data. This must be in the form of a phyloseq object and must contain, at a minimum, an OTU abundance table and a phylogeny.

grouping

(Required) A vector specifying which samples belong to which habitat type.

core_fraction

The fraction of samples that a microbial taxon must be found in to be considered part of the 'core' microbiome. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.5.

ab_threshold1

The threshold for mean relative abundance across all samples. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

ab_threshold2

The threshold for maximum relative abundance in any sample. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

ab_threshold3

The threshold for minimum relative abundance across all samples. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

mode

Whether to build a tip-based ('tip') or a branch-based ('branch') phylogeny. The default is 'branch'.

selection

Whether to use thresholds ('basic') or the Shade and Stopnisek method ('shade') to define the core community. The default is 'basic'.

max_tax

The maximum number of branches to add sequentially, as a percentage of the total branches when using the Shade and Stopnisek method. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

increase_cutoff

The threshold for the percent increase in beta diversity used to identify the taxon at which point adding more taxa yields diminishing returns in explanatory power. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

initial_branches

The number of branches to include prior to testing for increases in beta diversity. The default is to use all branches that are present in every sample and to begin testing branches that are missing from at least one sample. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

rooted

Whether to include the root of the phylogeny. The default is TRUE, meaning that the root is necessarily included in all phylogenies. This requires that the input tree be rooted.

ordered_groups

When provided, specifies the order in which different habitats should be plotted. This order is matched to the color order specified in fill_color (see below).

show_percentage

If TRUE the Venn diagram is shown as percentages of the total branch length in each compartment. If FALSE the Venn diagram is shown as absolute branch lengths.

decimal

The number of decimal points to report in the Venn diagram compartments (either for percentages or absolute branch lengths).

fill_color

A vector specifying the colors to use for each of the different habitats in the Venn diagram. The default is to use the six colors of the rainbow, followed by black.

fill_alpha

The transparency of the fill colors in the Venn diagram.

stroke_color

The color of the outlines in the Venn diagram.

stroke_alpha

The transparency of the outlines in the Venn diagram.

stroke_size

The width of the outlines in the Venn diagram.

stroke_linetype

The style of the outlines in the Venn diagram.

set_name_color

The color of the font used to label each habitat.

set_name_size

The size of the font used to label each habitat

text_color

The color of the font used to report the value in each compartment of the Venn diagram

text_size

The size of the font used to report the value in each compartment of the Venn diagram

Details

corePhyloVenn generates a Venn diagram showing either the percentage of the phylogenetic branch length or the absolute phylogenetic branch length shared between the core community phylogenies from all possible combinations of up to seven different habitat types. For more details, see Bewick and Camper (2025).

Value

This function returns a plot of the phylogenetic Venn diagram.

References

Bewick, S.A. and Benjamin T. Camper. "Phylogenetic Measures of the Core Microbiome" <doi:TBD>

McMurdie, Paul J., and Susan Holmes. "phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data." PloS one 8.4 (2013): e61217.

McMurdie, Paul J., and Susan Holmes. "Phyloseq: a bioconductor package for handling and analysis of high-throughput phylogenetic sequence data." Biocomputing 2012. 2012. 235-246.

Shade, Ashley, and Nejc Stopnisek. "Abundance-occupancy distributions to prioritize plant core microbiome membership." Current opinion in microbiology 49 (2019): 50-58.

Examples

#Test with enterotype dataset
library(phyloseq)
library(ape)
library(phytools)
library(ggplot2)
data(enterotype)

set.seed(1)

#Generate an example tree and label it with the names of the microbial taxa
enterotype_tree<-rtree(length(taxa_names(enterotype)))
enterotype_tree$tip.label<-taxa_names(enterotype)
enterotype_tree$node.label<-as.character(1:1:enterotype_tree$Nnode)

#keep only those samples with gender identified
gendered<-which(!(is.na(sample_data(enterotype)$Gender)))
enterotypeMF<-prune_samples(sample_names(enterotype)[gendered],enterotype)

#Create a phyloseq object with a tree
example_phyloseq<-phyloseq(otu_table(enterotypeMF),phy_tree(as.phylo(enterotype_tree)))

corePhyloVenn(example_phyloseq,grouping=sample_data(enterotypeMF)$Gender)


Richness of the Core Microbiome

Description

Calculates richness of a core microbiome using a non-phylogenetic approach.

Usage

coreRichness(x, core_fraction = 0.5, ab_threshold1 = 0, ab_threshold2 = 0,
ab_threshold3 = 0, selection='basic', max_tax = NULL, increase_cutoff = 2)

Arguments

x

(Required) Microbial community data. This must be in the form of a phyloseq object and must contain, at a minimum, an OTU abundance table and a phylogeny.

core_fraction

The fraction of samples that a microbial taxon must be found in to be considered part of the 'core' microbiome. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.5.

ab_threshold1

The threshold for mean relative abundance across all samples. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

ab_threshold2

The threshold for maximum relative abundance in any sample. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

ab_threshold3

The threshold for minimum relative abundance across all samples. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

selection

Whether to use thresholds ('basic') or the Shade and Stopnisek method ('shade') to define the core community. The default is 'basic'.

max_tax

The maximum number of branches to add sequentially, as a percentage of the total branches when using the Shade and Stopnisek method. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

increase_cutoff

The threshold for the percent increase in beta diversity used to identify the taxon at which point adding more taxa yields diminishing returns in explanatory power. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

Details

coreRichness calculates richness (a count of microbial taxa) using either basic thresholds or a modification of the Shade and Stopnisek (2019) algorithm.

Value

This function returns the numeric value of richness for the core microbiome.

References

Shade, Ashley, and Nejc Stopnisek. "Abundance-occupancy distributions to prioritize plant core microbiome membership." Current opinion in microbiology 49 (2019): 50-58.

Examples

#Test with enterotype dataset
library(phyloseq)
library(ape)
library(phytools)
data(enterotype)

set.seed(1)

#Generate an example tree and label it with the names of the microbial taxa
enterotype_tree<-rtree(length(taxa_names(enterotype)))
enterotype_tree$tip.label<-taxa_names(enterotype)

#Create a phyloseq object with a tree
example_phyloseq<-phyloseq(otu_table(enterotype),phy_tree(as.phylo(enterotype_tree)))

coreRichness(example_phyloseq)


Taxa Included in the Core Microbiome

Description

Finds the microbial taxa that are part of the core microbiome.

Usage

coreTaxa(x, core_fraction = 0.5, ab_threshold1 = 0, ab_threshold2 = 0,
ab_threshold3 = 0, selection='basic', max_tax = NULL,
increase_cutoff = 2,  remove_zeros=TRUE)

Arguments

x

(Required) Microbial community data. This must be in the form of a phyloseq object and must contain, at a minimum, an OTU abundance table.

core_fraction

The fraction of samples that a microbial taxon must be found in to be considered part of the 'core' microbiome. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.5.

ab_threshold1

The threshold for mean relative abundance across all samples. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

ab_threshold2

The threshold for maximum relative abundance in any sample. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

ab_threshold3

The threshold for minimum relative abundance across all samples. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

selection

Whether to use thresholds ('basic') or the Shade and Stopnisek method ('shade') to define the core community. The default is 'basic'.

max_tax

The maximum number of taxa to add sequentially, as a percentage of the total taxa when using the Shade and Stopnisek method. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

increase_cutoff

The threshold for the percent increase in beta diversity used to identify the taxon at which point adding more taxa yields diminishing returns in explanatory power. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

remove_zeros

Whether or not to remove taxa that are no reads associated across any of the samples.

Details

coreTaxa identifies the microbial taxa of the core microbiome based on either the tip-based or the branch-based core community phylogeny using either basic thresholds or a modification of the Shade and Stopnisek (2019) algorithm. For more details, see Bewick and Camper (2025).

Value

This function returns a list of the microbial taxa associated with the core microbiome.

References

Shade, Ashley, and Nejc Stopnisek. "Abundance-occupancy distributions to prioritize plant core microbiome membership." Current opinion in microbiology 49 (2019): 50-58.

Examples

#Test with enterotype dataset
library(phyloseq)
library(ape)
library(phytools)
data(enterotype)

set.seed(1)

#Generate an example tree and label it with the names of the microbial taxa
enterotype_tree<-rtree(length(taxa_names(enterotype)))
enterotype_tree$tip.label<-taxa_names(enterotype)
enterotype_tree$node.label<-as.character(1:1:enterotype_tree$Nnode)

#Create a phyloseq object with a tree
example_phyloseq<-phyloseq(otu_table(enterotype),phy_tree(as.phylo(enterotype_tree)))

coreTaxa(example_phyloseq)


The Core Community Phylogeny

Description

Identifies and plots the tip-based or the branch-based core community phylogeny based on the occurrence of abundance of different microbial lineages in a set of samples from a common habitat (e.g., type of host or environment).

Usage

coreTree(x, core_fraction = 0.5, ab_threshold1 = 0,
ab_threshold2 = 0,ab_threshold3 = 0, mode='branch',
selection = 'basic',max_tax=NULL, increase_cutoff = 2,
initial_branches = NULL, NCcol = 'black', Ccol='red',rooted=TRUE,
branch.width=4,label.tips=FALSE, scaled = FALSE, remove_zeros=TRUE,
plot.chronogram=FALSE)

Arguments

x

(Required) Microbial community data. This must be in the form of a phyloseq object and must contain, at a minimum, an OTU abundance table and a phylogeny.

core_fraction

The fraction of samples that a microbial taxon must be found in to be considered part of the 'core' microbiome. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.5.

ab_threshold1

The threshold for mean relative abundance across all samples. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

ab_threshold2

The threshold for maximum relative abundance in any sample. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

ab_threshold3

The threshold for minimum relative abundance across all samples. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

mode

Whether to build a tip-based ('tip') or a branch-based ('branch') phylogeny. The default is 'branch'.

selection

Whether to use thresholds ('basic') or the Shade and Stopnisek method ('shade') to define the core community. The default is 'basic'.

max_tax

The maximum number of branches to add sequentially, as a percentage of the total branches when using the Shade and Stopnisek method. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

increase_cutoff

The threshold for the percent increase in beta diversity used to identify the taxon at which point adding more taxa yields diminishing returns in explanatory power. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

initial_branches

The number of branches to include prior to testing for increases in beta diversity. The default is to use all branches that are present in every sample and to begin testing branches that are missing from at least one sample. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

rooted

Whether to include the root of the phylogeny. The default is TRUE, meaning that the root is necessarily included in all phylogenies. This requires that the input tree be rooted.

NCcol

The color to plot all branches of the phylogeny that are NOT part of the core community phylogeny. The default is black.

Ccol

The color to plot all branches of the phylogeny that are ARE part of the core community phylogeny. The default is red.

branch.width

The width to use when plotting the branches of the phylogeny. The default is 4.

label.tips

Whether or not to label the tips of the phylogeny with the microbial taxon names. The default is FALSE.

scaled

Whether or not to scale the branch lengths. The default is FALSE.

remove_zeros

Whether or not to remove taxa that are missing from all samples prior to drawing the phylogeny. The default is TRUE.

plot.chronogram

Whether to plot a phylogeny or a chronogram. The default is FALSE.

Details

coreTree identifies either the tip-based or the branch-based core community phylogeny. For the tip-based core community phylogeny, individual microbial taxa are retained based on being present in a threshold number of samples or at a threshold abundance. Once core microbial taxa have been identified, they are used to reconstruct the core community phylogeny. For the branch-based core community phylogeny, the phylogenetic tree for the entire dataset is examined, branch-by-branch, to determine which branches should be retained based on being present in a threshold number of samples or at a threshold abundance. If rooted = TRUE, branches are counted based on individual sample phylogenies that include the root node. Likewise, the tip-based tree is forced to include the root. If rooted = FALSE, branches are counted based on individual sample phylogenies that span all taxa present in the sample. Similarly, the tip-based phylogeny is the tree that spans all core taxa, and may not include the root. For more details, see Bewick and Camper (2025).

Value

This function plots the phylogeny for the entire dataset in black and colors the branches that are part of the core community phylogeny in red. These colors can be altered using the NCcol and Ccol variables.

References

Bewick, S.A. and Benjamin T. Camper. "Phylogenetic Measures of the Core Microbiome" <doi:TBD>

McMurdie, Paul J., and Susan Holmes. "phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data." PloS one 8.4 (2013): e61217.

McMurdie, Paul J., and Susan Holmes. "Phyloseq: a bioconductor package for handling and analysis of high-throughput phylogenetic sequence data." Biocomputing 2012. 2012. 235-246.

Examples

#Test with enterotype dataset
library(phyloseq)
library(ape)
library(phytools)
data(enterotype)

set.seed(1)

#Generate an example tree and label it with the names of the microbial taxa
enterotype_tree<-rtree(length(taxa_names(enterotype)))
enterotype_tree$tip.label<-taxa_names(enterotype)

#Create a phyloseq object with a tree
example_phyloseq<-phyloseq(otu_table(enterotype),phy_tree(as.phylo(enterotype_tree)))

coreTree(example_phyloseq,0.5)


UniFrac Distance Between Core Microbiomes from Two Habitats

Description

Calculates the UniFrac distance between the core microbiomes from two different types of habitats based on either the tip-based or the branch-based core community phylogeny.

Usage

coreUniFrac(x, grouping, core_fraction = 0.5, ab_threshold1 = 0,
ab_threshold2 = 0, ab_threshold3 = 0, mode='branch', selection='basic',
max_tax = NULL, increase_cutoff = 2, initial_branches = NULL, rooted=TRUE)

Arguments

x

(Required) Microbial community data. This must be in the form of a phyloseq object and must contain, at a minimum, an OTU abundance table and a phylogeny.

grouping

(Required) A vector specifying which samples belong to which habitat type.

core_fraction

The fraction of samples that a microbial taxon must be found in to be considered part of the 'core' microbiome. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.5.

ab_threshold1

The threshold for mean relative abundance across all samples. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

ab_threshold2

The threshold for maximum relative abundance in any sample. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

ab_threshold3

The threshold for minimum relative abundance across all samples. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

mode

Whether to build a tip-based ('tip') or a branch-based ('branch') phylogeny. The default is 'branch'.

selection

Whether to use thresholds ('basic') or the Shade and Stopnisek method ('shade') to define the core community. The default is 'basic'.

max_tax

The maximum number of branches to add sequentially, as a percentage of the total branches when using the Shade and Stopnisek method. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

increase_cutoff

The threshold for the percent increase in beta diversity used to identify the taxon at which point adding more taxa yields diminishing returns in explanatory power. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

initial_branches

The number of branches to include prior to testing for increases in beta diversity. The default is to use all branches that are present in every sample and to begin testing branches that are missing from at least one sample. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

rooted

Whether to include the root of the phylogeny. The default is TRUE, meaning that the root is necessarily included in all phylogenies. This requires that the input tree be rooted.

Details

coreUniFrac calculates the UniFrac distance (Lozupone and Knight, 2005) between the core microbiomes from two different types of habitats based on either their tip-based or their branch-based core community phylogenies and using either basic thresholds or a modification of the Shade and Stopnisek (2019) algorithm. In both cases, the UniFrac distance is calculated as the sum of the unique branch lengths in the core community phylogenies from each of the two habitats, divided by the total branch length of all branches in the core community phylogenies from the two habitats combined. For more details, see Bewick and Camper (2025).

Value

This function returns the numeric value of the UniFrac distance between two core microbiomes.

References

Bewick, S.A. and Benjamin T. Camper. "Phylogenetic Measures of the Core Microbiome" <doi:TBD>

Lozupone, Catherine, and Rob Knight. "UniFrac: a new phylogenetic method for comparing microbial communities." Applied and environmental microbiology 71.12 (2005): 8228-8235.

McMurdie, Paul J., and Susan Holmes. "phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data." PloS one 8.4 (2013): e61217.

McMurdie, Paul J., and Susan Holmes. "Phyloseq: a bioconductor package for handling and analysis of high-throughput phylogenetic sequence data." Biocomputing 2012. 2012. 235-246.

Shade, Ashley, and Nejc Stopnisek. "Abundance-occupancy distributions to prioritize plant core microbiome membership." Current opinion in microbiology 49 (2019): 50-58.

Examples

#Test with enterotype dataset
library(phyloseq)
library(ape)
library(phytools)
data(enterotype)

set.seed(1)

#Generate an example tree and label it with the names of the microbial taxa
enterotype_tree<-rtree(length(taxa_names(enterotype)))
enterotype_tree$tip.label<-taxa_names(enterotype)
enterotype_tree$node.label<-as.character(1:1:enterotype_tree$Nnode)

#keep only those samples with gender identified
gendered<-which(!(is.na(sample_data(enterotype)$Gender)))
enterotypeMF<-prune_samples(sample_names(enterotype)[gendered],enterotype)

#Create a phyloseq object with a tree
example_phyloseq<-phyloseq(otu_table(enterotypeMF),phy_tree(as.phylo(enterotype_tree)))

coreUniFrac(example_phyloseq,grouping=sample_data(enterotypeMF)$Gender)


Venn Diagram Comparing (Non-Phylogenetic) Core Microbiomes from Multiple Habitats

Description

Generates a Venn diagram comparing and contrasting the shared and unique taxa of core microbiomes from two or more (up to seven) different types of habitats using a non-phylogenetic approach.

Usage

coreVenn(x, grouping, core_fraction = 0.5, ab_threshold1 = 0,
ab_threshold2 = 0, ab_threshold3 = 0, selection='basic', max_tax = NULL,
increase_cutoff = 2,ordered_groups=NULL,show_percentage=TRUE,decimal=2,
fill_color=c('red','orange','yellow','green','blue','purple','black'),fill_alpha=0.5,
stroke_color='black',stroke_alpha = 1, stroke_size = 1,stroke_linetype = "solid",
set_name_color = "black", set_name_size = 6,text_color = "black",text_size = 4)

Arguments

x

(Required) Microbial community data. This must be in the form of a phyloseq object and must contain, at a minimum, an OTU abundance table and a phylogeny.

grouping

(Required) A vector specifying which samples belong to which habitat type.

core_fraction

The fraction of samples that a microbial taxon must be found in to be considered part of the 'core' microbiome. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.5.

ab_threshold1

The threshold for mean relative abundance across all samples. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

ab_threshold2

The threshold for maximum relative abundance in any sample. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

ab_threshold3

The threshold for minimum relative abundance across all samples. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

selection

Whether to use thresholds ('basic') or the Shade and Stopnisek method ('shade') to define the core community. The default is 'basic'.

max_tax

The maximum number of branches to add sequentially, as a percentage of the total branches when using the Shade and Stopnisek method. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

increase_cutoff

The threshold for the percent increase in beta diversity used to identify the taxon at which point adding more taxa yields diminishing returns in explanatory power. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

ordered_groups

When provided, specifies the order in which different habitats should be plotted. This order is matched to the color order specified in fill_color (see below).

show_percentage

If TRUE the Venn diagram is shown as percentages of the total branch length in each compartment. If FALSE the Venn diagram is shown as absolute branch lengths.

decimal

The number of decimal points to report in the Venn diagram compartments (either for percentages or absolute branch lengths).

fill_color

A vector specifying the colors to use for each of the different habitats in the Venn diagram. The default is to use the six colors of the rainbow, followed by black.

fill_alpha

The transparency of the fill colors in the Venn diagram.

stroke_color

The color of the outlines in the Venn diagram.

stroke_alpha

The transparency of the outlines in the Venn diagram.

stroke_size

The width of the outlines in the Venn diagram.

stroke_linetype

The style of the outlines in the Venn diagram.

set_name_color

The color of the font used to label each habitat.

set_name_size

The size of the font used to label each habitat

text_color

The color of the font used to report the value in each compartment of the Venn diagram

text_size

The size of the font used to report the value in each compartment of the Venn diagram

Details

coreVenn generates a Venn diagram showing either the percentage of taxa or the taxon count shared between the core communities from all possible combinations of up to seven different habitat types. For more details, see Bewick and Camper (2025).

Value

This function returns a plot of the Venn diagram.

References

Bewick, S.A. and Benjamin T. Camper. "Phylogenetic Measures of the Core Microbiome" <doi:TBD>

McMurdie, Paul J., and Susan Holmes. "phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data." PloS one 8.4 (2013): e61217.

McMurdie, Paul J., and Susan Holmes. "Phyloseq: a bioconductor package for handling and analysis of high-throughput phylogenetic sequence data." Biocomputing 2012. 2012. 235-246.

Shade, Ashley, and Nejc Stopnisek. "Abundance-occupancy distributions to prioritize plant core microbiome membership." Current opinion in microbiology 49 (2019): 50-58.

Examples

#Test with enterotype dataset
library(phyloseq)
library(ape)
library(phytools)
library(ggplot2)
data(enterotype)

set.seed(1)

#Generate an example tree and label it with the names of the microbial taxa
enterotype_tree<-rtree(length(taxa_names(enterotype)))
enterotype_tree$tip.label<-taxa_names(enterotype)
enterotype_tree$node.label<-as.character(1:1:enterotype_tree$Nnode)

#keep only those samples with gender identified
gendered<-which(!(is.na(sample_data(enterotype)$Gender)))
enterotypeMF<-prune_samples(sample_names(enterotype)[gendered],enterotype)

#Create a phyloseq object with a tree
example_phyloseq<-phyloseq(otu_table(enterotypeMF),phy_tree(as.phylo(enterotype_tree)))

coreVenn(example_phyloseq,grouping=sample_data(enterotypeMF)$Gender)


Phylogeny with Branches Colored Based on Venn Diagram

Description

Plots a phylogeny with branches colored based on the compartment that they are associated with in the Venn diagram generated by the corePhyloVenn function.

Usage

coreVennTree(x, grouping, core_fraction = 0.5, ab_threshold1 = 0,
ab_threshold2 = 0, ab_threshold3 = 0, mode='branch',
selection='basic', max_tax = NULL,increase_cutoff = 2, initial_branches = NULL,
rooted=TRUE, ordered_groups=NULL,branch_color=NULL,remove_zeros=TRUE,
plot.chronogram=FALSE,branch.width = 4, label.tips = FALSE, scaled = FALSE)

Arguments

x

(Required) Microbial community data. This must be in the form of a phyloseq object and must contain, at a minimum, an OTU abundance table and a phylogeny.

grouping

(Required) A vector specifying which samples belong to which habitat type.

core_fraction

The fraction of samples that a microbial taxon must be found in to be considered part of the 'core' microbiome. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.5.

ab_threshold1

The threshold for mean relative abundance across all samples. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

ab_threshold2

The threshold for maximum relative abundance in any sample. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

ab_threshold3

The threshold for minimum relative abundance across all samples. This variable is only used when selection = 'basic' and is ignored when selection = 'shade'. The default value is 0.

mode

Whether to build a tip-based ('tip') or a branch-based ('branch') phylogeny. The default is 'branch'.

selection

Whether to use thresholds ('basic') or the Shade and Stopnisek method ('shade') to define the core community. The default is 'basic'.

max_tax

The maximum number of branches to add sequentially, as a percentage of the total branches when using the Shade and Stopnisek method. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

increase_cutoff

The threshold for the percent increase in beta diversity used to identify the taxon at which point adding more taxa yields diminishing returns in explanatory power. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

initial_branches

The number of branches to include prior to testing for increases in beta diversity. The default is to use all branches that are present in every sample and to begin testing branches that are missing from at least one sample. This variable is only used when selection = 'shade' and is ignored when selection = 'basic'.

rooted

Whether to include the root of the phylogeny. The default is TRUE, meaning that the root is necessarily included in all phylogenies. This requires that the input tree be rooted.

ordered_groups

When provided, specifies the order in which different habitats should be plotted. This order is matched to the color order specified in fill_color (see below).

branch_color

A vector specifying what colors to use for branches associated with each of the different habitat combinations in the Venn diagram. This vector must be as long as the number of possible habitat combinations (number of compartments in the Venn diagram plus one for branches not included in the core of any habitats). When no colors are specified or a vector of the wrong length is specified, the default is to use a range of colors from blue to red.

branch.width

The width to use when plotting the branches of the phylogeny. The default is 4.

label.tips

Whether or not to label the tips of the phylogeny with the microbial taxon names. The default is FALSE.

scaled

Whether or not to scale the branch lengths. The default is FALSE.

remove_zeros

Whether or not to remove taxa that are missing from all samples prior to drawing the phylogeny. The default is TRUE.

plot.chronogram

Whether to plot a phylogeny or a chronogram. The default is FALSE.

Details

coreVennTree generates a phylogeny with branches colored according to the compartments of an associated Venn diagram as generated using the coreVenn function. For more details, see Bewick and Camper (2025).

Value

This function returns a color coded plot of the phylogeny. When a vector of colors is specified, the color key is printed out in the console.

References

Bewick, S.A. and Benjamin T. Camper. "Phylogenetic Measures of the Core Microbiome" <doi:TBD>

McMurdie, Paul J., and Susan Holmes. "phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data." PloS one 8.4 (2013): e61217.

McMurdie, Paul J., and Susan Holmes. "Phyloseq: a bioconductor package for handling and analysis of high-throughput phylogenetic sequence data." Biocomputing 2012. 2012. 235-246.

Examples

#Test with enterotype dataset
library(phyloseq)
library(ape)
library(phytools)
library(ggplot2)
data(enterotype)

set.seed(1)

#Generate an example tree and label it with the names of the microbial taxa
enterotype_tree<-rtree(length(taxa_names(enterotype)))
enterotype_tree$tip.label<-taxa_names(enterotype)

#keep only those samples with gender identified
gendered<-which(!(is.na(sample_data(enterotype)$Gender)))
enterotypeMF<-prune_samples(sample_names(enterotype)[gendered],enterotype)

#Create a phyloseq object with a tree
example_phyloseq<-phyloseq(otu_table(enterotypeMF),phy_tree(as.phylo(enterotype_tree)))

#Define the groups
bygender<-sample_data(enterotypeMF)$Gender

#Define the colors for group combinations
clist<-c('black','red','orange','yellow')

#Plot the tree
coreVennTree(example_phyloseq,grouping=bygender,0.5,branch_color=clist)


A modified version of functions from the ggvenn package for plotting phylogenetic venn diagrams.

Description

A modified version of functions from the ggvenn package for plotting phylogenetic venn diagrams.

Usage

ggvenn2(
  data,
  columns = NULL,
  show_elements = FALSE,
  show_percentage = TRUE,
  digits = 1,
  fill_color = c("blue", "yellow", "green", "red"),
  fill_alpha = 0.5,
  stroke_color = "black",
  stroke_alpha = 1,
  stroke_size = 1,
  stroke_linetype = "solid",
  set_name_color = "black",
  set_name_size = 6,
  text_color = "black",
  text_size = 4,
  label_sep = ",",
  count_column = NULL,
  show_outside = c("auto", "none", "always"),
  auto_scale = FALSE,
  comma_sep = FALSE
)

Arguments

data

A data.frame or a list as input data.

columns

A character vector use as index to select columns/elements.

show_elements

Show set elements instead of count/percentage.

show_percentage

Show percentage for each set.

digits

The desired number of digits after the decimal point

fill_color

Filling colors in circles.

fill_alpha

Transparency for filling circles.

stroke_color

Stroke color for drawing circles.

stroke_alpha

Transparency for drawing circles.

stroke_size

Stroke size for drawing circles.

stroke_linetype

Line type for drawing circles.

set_name_color

Text color for set names.

set_name_size

Text size for set names.

text_color

Text color for intersect contents.

text_size

Text size for intersect contents.

label_sep

Separator character for displaying elements.

count_column

Specify column for element repeat count.

show_outside

Show outside elements (not belongs to any set).

auto_scale

Allow automatically resizing circles according to element counts.

comma_sep

Whether to use comma as separator for displaying numbers.

Value

The ggplot object to print or save to file.

References

Yan, Linlin, and Maintainer Linlin Yan. "Package “ggvenn.”." (2021).

See Also

geom_venn, ggvenn

Examples


# use list as input
a <- list(`Set 1` = c(1, 3, 5, 7),
          `Set 2` = c(1, 5, 9),
          `Set 3` = c(1, 2, 8),
          `Set 4` = c(6, 7))
ggvenn2(a, c("Set 1", "Set 2"))
ggvenn2(a, c("Set 1", "Set 2", "Set 3"))
ggvenn2(a)


Branch-based Core Community Phylogeny Using the Shade & Stopnisek (2019) algorithm.

Description

Called internally. Identifies all edges and core edges of the core community phylogeny using a branch-based approach with a modified Shade and Stopnisek (2019) algorithm.

Usage

shade_branch(xv, nt, mxtx,fi,st)

Arguments

xv

(Required) The phyloseq object passed from main functions and containing microbial community data.

nt

(Required) The microbial phylogeny passed from main functions.

mxtx

(Required) The maximum number of branches to add sequentially, as a percentage of the total branches.

fi

(Required) The threshold for the percent increase in beta diversity used to identify the taxon at which point adding more taxa yields diminishing returns in explanatory power.

st

(Required) The number of branches to include prior to testing for increases in beta diversity. The default is to use all branches that are present in every sample and to begin testing for branches that are missing from at least one sample.

Details

shade_branch is used internally in the holobiont package to identify core edges of a microbial phylogeny for the branch-based approach using a modified Shade and Stopnisek (2019) algorithm.

Value

This function returns a list of all edges and core edges.

References

Shade, Ashley, and Nejc Stopnisek. "Abundance-occupancy distributions to prioritize plant core microbiome membership." Current opinion in microbiology 49 (2019): 50-58. Briefly, branches are ranked according to abundance first and then occupancy (i.e., occupancy is more important). Next, starting from st branches with the highest occupancy/abundance, branches are added sequentially, up to a maximum number of branches. Each time a new branch is added, the mean beta-diversity (UniFrac) between each pair of samples is calculated and is then averaged across all samples. Next, total beta-diversity (full phylogeny) between pairs samples is calculated, and again averaged across all samples. Finally, the percent increase in total beta-diversity is calculated for each new bracnh added. A threshold is then selected based on the lowest ranked branch that yields a minimum percent increase in beta-diversity.

Examples

#Test with enterotype dataset
library(phyloseq)
library(ape)
library(phytools)
data(enterotype)

set.seed(1)

enterotype<-prune_taxa(taxa_names(enterotype)[2:21],enterotype)
enterotype<-prune_samples(sample_names(enterotype)[2:21],enterotype)

#Generate an example tree and label it with the names of the microbial taxa
enterotype_tree<-rtree(length(taxa_names(enterotype)))
enterotype_tree$tip.label<-taxa_names(enterotype)
enterotype_tree$node.label<-as.character(1:1:enterotype_tree$Nnode)

#Create a phyloseq object with a tree
example_phyloseq<-phyloseq(otu_table(enterotype),phy_tree(as.phylo(enterotype_tree)))
newtree<-bind.tip(phy_tree(example_phyloseq),tip.label='outgroup',edge.length=0.0001,position=0)

shade_branch(example_phyloseq, newtree, NULL, 2, NULL)


Non-phylogenetic Core Community Using the Shade & Stopnisek (2019) algorithm.

Description

Called internally. Identifies all taxa and core taxa of the core community using a non-phylogenetic approach with a modified Shade and Stopnisek (2019) algorithm.

Usage

shade_np(xv, mxtx,fi)

Arguments

xv

(Required) The phyloseq object passed from main functions and containing microbial community data.

mxtx

(Required) The maximum number of branches to add sequentially, as a percentage of the total branches.

fi

(Required) The threshold for the percent increase in beta diversity used to identify the taxon at which point adding more taxa yields diminishing returns in explanatory power.

Details

shade_np is used internally in the holobiont package to identify core edges of a microbial phylogeny for the non-phylogenetic approach using a modified Shade and Stopnisek (2019) algorithm. Briefly, taxa are ranked according to abundance first and then occupancy (i.e., occupancy is more important). Next, taxa are added sequentially, up to a maximum number of taxa. Each time a new taxon is added, the mean beta-diversity (Jaccard) between each pair of samples is calculated and is then averaged across all samples. Next, total beta-diversity (all taxa) between pairs samples is calculated, and again averaged across all samples. Finally, the percent increase in total beta-diversity is calculated for each new taxon added. A threshold is then selected based on the lowest ranked taxon that yields a minimum percent increase in beta-diversity.

Value

This function returns a list of all taxa and core taxa.

References

Shade, Ashley, and Nejc Stopnisek. "Abundance-occupancy distributions to prioritize plant core microbiome membership." Current opinion in microbiology 49 (2019): 50-58.

Examples

#Test with enterotype dataset
library(phyloseq)
library(ape)
library(phytools)
data(enterotype)

set.seed(1)

enterotype<-prune_taxa(taxa_names(enterotype)[2:21],enterotype)
enterotype<-prune_samples(sample_names(enterotype)[2:21],enterotype)

#Generate an example tree and label it with the names of the microbial taxa
enterotype_tree<-rtree(length(taxa_names(enterotype)))
enterotype_tree$tip.label<-taxa_names(enterotype)
enterotype_tree$node.label<-as.character(1:1:enterotype_tree$Nnode)

#Create a phyloseq object with a tree
example_phyloseq<-phyloseq(otu_table(enterotype),phy_tree(as.phylo(enterotype_tree)))
newtree<-bind.tip(phy_tree(example_phyloseq),tip.label='outgroup',edge.length=0.0001,position=0)

shade_np(example_phyloseq, NULL, 2)


Tip-based Core Community Phylogeny Using the Shade & Stopnisek (2019) algorithm.

Description

Called internally. Identifies all edges and core edges of the core community phylogeny using a tip-based approach with a modified Shade and Stopnisek (2019) algorithm.

Usage

shade_tip(xv, nt, mxtx,fi)

Arguments

xv

(Required) The phyloseq object passed from main functions and containing microbial community data.

nt

(Required) The microbial phylogeny passed from main functions.

mxtx

(Required) The maximum number of branches to add sequentially, as a percentage of the total branches.

fi

(Required) The threshold for the percent increase in beta diversity used to identify the taxon at which point adding more taxa yields diminishing returns in explanatory power.

Details

shade_tip is used internally in the holobiont package to identify core edges of a microbial phylogeny for the tip-based approach using a modified Shade and Stopnisek (2019) algorithm. Briefly, taxa are ranked according to abundance first and then occupancy (i.e., occupancy is more important). Next, taxa are added sequentially, up to a maximum number of taxa. Each time a new taxon is added, the mean beta-diversity (UniFrac) between each pair of samples is calculated and is then averaged across all samples. Next, total beta-diversity (all taxa) between pairs samples is calculated, and again averaged across all samples. Finally, the percent increase in total beta-diversity is calculated for each new taxon added. A threshold is then selected based on the lowest ranked taxon that yields a minimum percent increase in beta-diversity.

Value

This function returns a list of all edges and core edges, as well s all taxa and all core taxa.

References

Shade, Ashley, and Nejc Stopnisek. "Abundance-occupancy distributions to prioritize plant core microbiome membership." Current opinion in microbiology 49 (2019): 50-58.

Examples

#Test with enterotype dataset
library(phyloseq)
library(ape)
library(phytools)
data(enterotype)

set.seed(1)

enterotype<-prune_taxa(taxa_names(enterotype)[2:21],enterotype)
enterotype<-prune_samples(sample_names(enterotype)[2:21],enterotype)

#Generate an example tree and label it with the names of the microbial taxa
enterotype_tree<-rtree(length(taxa_names(enterotype)))
enterotype_tree$tip.label<-taxa_names(enterotype)
enterotype_tree$node.label<-as.character(1:1:enterotype_tree$Nnode)

#Create a phyloseq object with a tree
example_phyloseq<-phyloseq(otu_table(enterotype),phy_tree(as.phylo(enterotype_tree)))
newtree<-bind.tip(phy_tree(example_phyloseq),tip.label='outgroup',edge.length=0.0001,position=0)

shade_tip(example_phyloseq, newtree, NULL, 2)

mirror server hosted at Truenetwork, Russian Federation.