philentropy — Information Theory and Distance Quantification with R

CRAN status rstudio mirror downloads

🧭 Similarity and Distance Quantification between Probability Functions

Describe and understand the world through data.

Data collection and data comparison are the foundations of scientific research.
Mathematics provides the abstract framework to describe patterns we observe in nature and Statistics provides the framework to quantify the uncertainty of these patterns.

In statistics, natural patterns are described in the form of probability distributions that either follow fixed patterns (parametric distributions) or more dynamic ones (non-parametric distributions).

The philentropy package implements fundamental distance and similarity measures to quantify distances between probability density functions as well as traditional information theory measures.
In this regard, it aims to provide a framework for comparing natural patterns in a statistical notation.

🧡 This project is born out of my passion for statistics and I hope it will be useful to those who share it with me.


⚙️ Installation

# install philentropy version 0.10.0 from CRAN
install.packages("philentropy")

Or get the latest developer version:

# install.packages("devtools")
library(devtools)
install_github("HajkD/philentropy", build_vignettes = TRUE, dependencies = TRUE)

🧾 Citation

HG Drost (2018).
Philentropy: Information Theory and Distance Quantification with R.
Journal of Open Source Software, 3(26), 765.
https://doi.org/10.21105/joss.00765

🪶 I am developing philentropy in my spare time and would be very grateful if you would consider citing the paper above if it was useful for your research. These citations help me continue maintaining and extending the package.


🧩 Quick Start

library(philentropy)

P <- c(0.1, 0.2, 0.7)
Q <- c(0.2, 0.2, 0.6)

distance(rbind(P, Q), method = "jensen-shannon")
#> jensen-shannon using unit 'log'.
#> jensen-shannon 
#>     0.01041993
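The divergence value depends on the logarithm unit. As a small sketch (using the `unit` argument of `distance()`), you can convert between nats and bits via the change-of-base rule:

```r
library(philentropy)

P <- c(0.1, 0.2, 0.7)
Q <- c(0.2, 0.2, 0.6)

# Jensen-Shannon divergence in nats (unit = "log") and in bits (unit = "log2")
jsd_nat <- distance(rbind(P, Q), method = "jensen-shannon", unit = "log")
jsd_bit <- distance(rbind(P, Q), method = "jensen-shannon", unit = "log2")

# change of base: bits = nats / ln(2)
all.equal(as.numeric(jsd_bit), as.numeric(jsd_nat) / log(2))
```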

💡 Tip: Got a large matrix (rows = samples, cols = features)?
Use distance(X, method="cosine", mute.message=TRUE) to compute the full pairwise matrix quickly and quietly.
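For example, a minimal sketch with a hypothetical random matrix `X` (rows rescaled to sum to 1, since many measures expect probability vectors):

```r
library(philentropy)

set.seed(2020)
# hypothetical data: 5 samples (rows) x 10 features (columns)
X <- matrix(runif(50), nrow = 5)
X <- X / rowSums(X)  # rescale each row to a probability vector

# full 5 x 5 pairwise matrix, computed without the per-call unit message
D <- distance(X, method = "cosine", mute.message = TRUE)
dim(D)
```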


📘 Tutorials


🧪 When should I use which distance?

| Goal | Recommended methods |
|------|---------------------|
| 🔁 Clustering / similarity | `cosine`, `correlation`, `euclidean` |
| 📊 Probability or compositional data | `jensen-shannon`, `hellinger`, `kullback-leibler` |
| 🧬 Sparse counts / binary | `canberra`, `jaccard`, `sorensen` |
| ⚖️ Scale-invariant | `manhattan`, `chebyshev` |

Run `getDistMethods()` to explore all 46 implemented measures.
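For instance, a small sketch comparing the same pair of vectors under several measures (using the `mute.message` argument, as in the tip above, to suppress the per-call unit message):

```r
library(philentropy)

x <- rbind(P = c(0.1, 0.2, 0.7), Q = c(0.2, 0.2, 0.6))

# list all implemented method names
methods <- getDistMethods()
head(methods)

# compare the same pair under a few different measures
sapply(c("euclidean", "manhattan", "cosine"),
       function(m) distance(x, method = m, mute.message = TRUE))
```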


🧮 Examples

library(philentropy)
philentropy::getDistMethods()
# define probability density functions P and Q
P <- 1:10/sum(1:10)
Q <- 20:29/sum(20:29)

x <- rbind(P, Q)
philentropy::distance(x, method = "jensen-shannon")
#> jensen-shannon using unit 'log'.
#> jensen-shannon 
#>     0.02628933

Alternatively, compute all available distances:

philentropy::dist.diversity(x, p = 2, unit = "log2")
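Since `dist.diversity()` returns a named vector with one value per distance measure, the result can be inspected directly; a small sketch:

```r
library(philentropy)

x <- rbind(P = 1:10 / sum(1:10), Q = 20:29 / sum(20:29))

# one value per implemented distance measure (Minkowski power p = 2, bit units)
d <- dist.diversity(x, p = 2, unit = "log2")

# measures under which P and Q appear closest
head(sort(d))
```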

🌟 Papers using philentropy (highlights)

Selected venues in which philentropy has been applied:

- Nature / Cell / Science
- Nature Methods / Nature Communications / Cell family
- Other disciplines (selected)

🎓 philentropy has been used in dozens of peer-reviewed publications to quantify distances, divergences, and similarities in complex biological and computational datasets.


🧠 Important Functions

Distance Measures

- `distance()` : compute distances/similarities between probability vectors with a chosen `method`
- `getDistMethods()` : list all implemented distance methods
- `dist.diversity()` : compute all available distances for a given input matrix
- `estimate.probability()` : estimate probability vectors from count vectors

Information Theory

- `H()` : Shannon entropy
- `JE()` : joint entropy
- `CE()` : conditional entropy
- `MI()` : mutual information
- `KL()` : Kullback-Leibler divergence
- `JSD()` : Jensen-Shannon divergence
- `gJSD()` : generalized Jensen-Shannon divergence
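A brief sketch of the information-theory side (assuming each function's documented defaults, e.g. `unit = "log2"`):

```r
library(philentropy)

P <- 1:10 / sum(1:10)
Q <- 20:29 / sum(20:29)

H(P)              # Shannon entropy of P (bits by default)
KL(rbind(P, Q))   # Kullback-Leibler divergence KL(P || Q)
JSD(rbind(P, Q))  # Jensen-Shannon divergence between P and Q
```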

🗞️ NEWS

Find the current status and version history in the
👉 NEWS section.


🧩 Appendix — full references
