Type: Package
Title: Multivariate t Distribution
Version: 1.1.2
Maintainer: Pierre Santagostini <pierre.santagostini@agrocampus-ouest.fr>
Description: Distance between multivariate t distributions, as presented by N. Bouhlel and D. Rousseau (2023) <doi:10.1109/LSP.2023.3324594>.
Depends: R (≥ 3.3.0)
Imports: rgl, MASS, data.table
License: GPL (≥ 3)
URL: https://forgemia.inra.fr/imhorphen/mstudentd
BugReports: https://forgemia.inra.fr/imhorphen/mstudentd/-/issues
Encoding: UTF-8
RoxygenNote: 7.3.2
Suggests: testthat (≥ 3.2.1)
Config/testthat/edition: 3
NeedsCompilation: no
Packaged: 2024-12-22 23:27:05 UTC; psantagosti
Author: Pierre Santagostini [aut, cre], Nizar Bouhlel [aut]
Repository: CRAN
Date/Publication: 2024-12-23 00:40:02 UTC

Tools for Multivariate t Distributions

Description

This package provides tools for multivariate t distributions (MTD):

Author(s)

Pierre Santagostini pierre.santagostini@agrocampus-ouest.fr, Nizar Bouhlel nizar.bouhlel@agrocampus-ouest.fr

References

S. Kotz and Saralees Nadarajah (2004), Multivariate t Distributions and Their Applications, Cambridge University Press.

N. Bouhlel and D. Rousseau (2023), Exact Rényi and Kullback-Leibler Divergences Between Multivariate t-Distributions, IEEE Signal Processing Letters. doi:10.1109/LSP.2023.3324594 #' @keywords internal

See Also

Useful links:


Contour Plot of the Bivariate t Density

Description

Draws the contour plot of the probability density of the multivariate t distribution with 2 variables with location parameter mu and scatter matrix Sigma.

Usage

contourmtd(nu, mu, Sigma,
                   xlim = c(mu[1] + c(-10, 10)*Sigma[1, 1]),
                   ylim = c(mu[2] + c(-10, 10)*Sigma[2, 2]),
                   zlim = NULL, npt = 30, nx = npt, ny = npt,
                   main = "Multivariate t density",
                   sub = NULL, nlevels = 10,
                   levels = pretty(zlim, nlevels), tol = 1e-6, ...)

Arguments

nu

numeric. The degrees of freedom.

mu

length 2 numeric vector.

Sigma

symmetric, positive-definite square matrix of order 2. The scatter matrix.

xlim, ylim

x-and y- limits.

zlim

z- limits. If NULL, it is the range of the values of the density on the x and y values within xlim and ylim.

npt

number of points for the discretisation.

nx, ny

number of points for the discretisation among the x- and y- axes.

main, sub

main and sub title, as for title.

nlevels, levels

arguments to be passed to the contour function.

tol

tolerance (relative to largest variance) for numerical lack of positive-definiteness in Sigma, for the estimation of the density. see dmtd.

...

additional arguments to plot.window, title, Axis and box, typically graphical parameters such as cex.axis.

Value

Returns invisibly the probability density function.

Author(s)

Pierre Santagostini, Nizar Bouhlel

References

S. Kotz and Saralees Nadarajah (2004), Multivariate t Distributions and Their Applications, Cambridge University Press.

See Also

dmtd: probability density of a multivariate t density

plotmtd: 3D plot of a bivariate t density.

Examples

nu <- 1
mu <- c(1, 4)
Sigma <- matrix(c(0.8, 0.2, 0.2, 0.2), nrow = 2)
contourmtd(nu, mu, Sigma)


Distance/Divergence between Centered Multivariate t Distributions

Description

Computes the distance or divergence (Renyi divergence, Bhattacharyya distance or Hellinger distance) between two random vectors distributed according to multivariate $t$ distributions (MTD) with zero mean vector.

Usage

diststudent(nu1, Sigma1, nu2, Sigma2,
                   dist = c("renyi", "bhattacharyya", "hellinger"),
                   bet = NULL, eps = 1e-06)

Arguments

nu1

numéric. The degrees of freedom of the first distribution.

Sigma1

symmetric, positive-definite matrix. The correlation matrix of the first distribution.

nu2

numéric. The degrees of freedom of the second distribution.

Sigma2

symmetric, positive-definite matrix. The correlation matrix of the second distribution.

dist

character. The distance or divergence used. One of "renyi" (default), "battacharyya" or "hellinger".

bet

numeric, positive and not equal to 1. Order of the Renyi divergence. Ignored if distance="bhattacharyya" or distance="hellinger".

eps

numeric. Precision for the computation of the partial derivative of the Lauricella D-hypergeometric function (see Details). Default: 1e-06.

Details

Given X_1, a random vector of R^p distributed according to the MTD with parameters (\nu_1, \mathbf{0}, \Sigma_1) and X_2, a random vector of R^p distributed according to the MTD with parameters (\nu_2, \mathbf{0}, \Sigma_2).

Let \delta_1 = \frac{\nu_1 + p}{2} \beta, \delta_2 = \frac{\nu_2 + p}{2} (1 - \beta) and \lambda_1, \dots, \lambda_p the eigenvalues of the square matrix \Sigma_1 \Sigma_2^{-1} sorted in increasing order:

\lambda_1 < \dots < \lambda_{p-1} < \lambda_p

The Renyi divergence between X_1 and X_2 is:

\begin{aligned} D_R^\beta(\mathbf{X}_1||\mathbf{X}_1) & = & \displaystyle{\frac{1}{\beta - 1} \bigg[ \beta \ln\left(\frac{\Gamma\left(\frac{\nu_1+p}{2}\right) \Gamma\left(\frac{\nu_2}{2}\right) \nu_2^{\frac{p}{2}}}{\Gamma\left(\frac{\nu_2+p}{2}\right) \Gamma\left(\frac{\nu_1}{2}\right) \nu_1^{\frac{p}{2}}}\right) + \ln\left(\frac{\Gamma\left(\frac{\nu_2+p}{2}\right)}{\Gamma\left(\frac{\nu_2}{2}\right)}\right) + \ln\left(\frac{\Gamma\left(\delta_1 + \delta_2 - \frac{p}{2}\right)}{\Gamma(\delta_1 + \delta_2)}\right) } \\ && \displaystyle{- \frac{\beta}{2} \sum_{i=1}^p{\ln\lambda_i} + \ln F_D \bigg]} \end{aligned}

with F_D given by:

where F_D^{(p)} is the Lauricella D-hypergeometric function defined for p variables:

\displaystyle{ F_D^{(p)}\left(a; b_1, ..., b_p; g; x_1, ..., x_p\right) = \sum\limits_{m_1 \geq 0} ... \sum\limits_{m_p \geq 0}{ \frac{ (a)_{m_1+...+m_p}(b_1)_{m_1} ... (b_p)_{m_p} }{ (g)_{m_1+...+m_p} } \frac{x_1^{m_1}}{m_1!} ... \frac{x_p^{m_p}}{m_p!} } }

Its computation uses the lauricella function.

The Bhattacharyya distance is given by:

D_B(\mathbf{X}_1||\mathbf{X}_2) = \frac{1}{2} D_R^{1/2}(\mathbf{X}_1||\mathbf{X}_2)

And the Hellinger distance is given by:

D_H(\mathbf{X}_1||\mathbf{X}_2) = 1 - \exp{\left(-\frac{1}{2} D_R^{1/2}(\mathbf{X}_1||\mathbf{X}_2)\right)}

Value

A numeric value: the Renyi divergence between the two distributions, with two attributes attr(, "epsilon") (precision of the result of the Lauricella D-hypergeometric function,see Details) and attr(, "k") (number of iterations).

Author(s)

Pierre Santagostini, Nizar Bouhlel

References

N. Bouhlel and D. Rousseau (2023), Exact Rényi and Kullback-Leibler Divergences Between Multivariate t-Distributions, IEEE Signal Processing Letters. doi:10.1109/LSP.2023.3324594

Examples

nu1 <- 2
Sigma1 <- matrix(c(2, 1.2, 0.4, 1.2, 2, 0.6, 0.4, 0.6, 2), nrow = 3)
nu2 <- 4
Sigma2 <- matrix(c(1, 0.3, 0.1, 0.3, 1, 0.4, 0.1, 0.4, 1), nrow = 3)

# Renyi divergence
diststudent(nu1, Sigma1, nu2, Sigma2, bet = 0.25)
diststudent(nu2, Sigma2, nu1, Sigma1, bet = 0.25)

# Bhattacharyya distance
diststudent(nu1, Sigma1, nu2, Sigma2, dist = "bhattacharyya")
diststudent(nu2, Sigma2, nu1, Sigma1, dist = "bhattacharyya")

# Hellinger distance
diststudent(nu1, Sigma1, nu2, Sigma2, dist = "hellinger")
diststudent(nu2, Sigma2, nu1, Sigma1, dist = "hellinger")


Density of a Multivariate t Distribution

Description

Density of the multivariate (p variables) t distribution (MTD) with degrees of freedom nu, mean vector mu and correlation matrix Sigma.

Usage

dmtd(x, nu, mu, Sigma, tol = 1e-6)

Arguments

x

length p numeric vector.

nu

numeric. The degrees of freedom.

mu

length p numeric vector. The mean vector.

Sigma

symmetric, positive-definite square matrix of order p. The correlation matrix.

tol

tolerance (relative to largest variance) for numerical lack of positive-definiteness in Sigma.

Details

The density function of a multivariate t distribution with p variables is given by:

\displaystyle{ f(\mathbf{x}|\nu, \boldsymbol{\mu}, \Sigma) = \frac{\Gamma\left( \frac{\nu+p}{2} \right) |\Sigma|^{-1/2}}{\Gamma\left( \frac{\nu}{2} \right) (\nu \pi)^{p/2}} \left( 1 + \frac{1}{\nu} (\mathbf{x}-\boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu}) \right)^{-\frac{\nu+p}{2}} }

When p=1 (univariate case) it becomes:

\displaystyle{ f(x|\nu, \mu, \sigma^2) = \frac{\Gamma\left( \frac{\nu+1}{2} \right)}{\Gamma\left( \frac{\nu}{2} \right) \sqrt{\nu \pi} \sigma} \left(1 + \frac{(x-\mu)^2}{\nu \sigma^2}\right)^{-\frac{\nu+1}{2}} }

Value

The value of the density.

Author(s)

Pierre Santagostini, Nizar Bouhlel

References

S. Kotz and Saralees Nadarajah (2004), Multivariate t Distributions and Their Applications, Cambridge University Press.

Examples

nu <- 1
mu <- c(0, 1, 4)
Sigma <- matrix(c(0.8, 0.3, 0.2, 0.3, 0.2, 0.1, 0.2, 0.1, 0.2), nrow = 3)
dmtd(c(0, 1, 4), nu, mu, Sigma)
dmtd(c(1, 2, 3), nu, mu, Sigma)

# Univariate
dmtd(1, 3, 0, 1)
dt(1, 3)


Kullback-Leibler Divergence between Centered Multivariate t Distributions

Description

Computes the Kullback-Leibler divergence between two random vectors distributed according to multivariate t distributions (MTD) with zero location vector.

Usage

kldstudent(nu1, Sigma1, nu2, Sigma2, eps = 1e-06)

Arguments

nu1

numeric. The degrees of freedom of the first distribution.

Sigma1

symmetric, positive-definite matrix. The scatter matrix of the first distribution.

nu2

numeric. The degrees of freedom of the second distribution.

Sigma2

symmetric, positive-definite matrix. The scatter matrix of the second distribution.

eps

numeric. Precision for the computation of the partial derivative of the Lauricella D-hypergeometric function (see Details). Default: 1e-06.

Details

Given X_1, a random vector of \mathbb{R}^p distributed according to the centered MTD with parameters (\nu_1, 0, \Sigma_1) and X_2, a random vector of \mathbb{R}^p distributed according to the MCD with parameters (\nu_2, 0, \Sigma_2).

Let \lambda_1, \dots, \lambda_p the eigenvalues of the square matrix \Sigma_1 \Sigma_2^{-1} sorted in increasing order:

\lambda_1 < \dots < \lambda_{p-1} < \lambda_p

The Kullback-Leibler divergence of X_1 from X_2 is given by:

\displaystyle{ D_{KL}(\mathbf{X}_1\|\mathbf{X}_2) = \ln\left(\frac{\Gamma\left(\frac{\nu_1+p}{2}\right) \Gamma\left(\frac{\nu_2}{2}\right) \nu_2^{\frac{p}{2}}}{\Gamma\left(\frac{\nu_2+p}{2}\right) \Gamma\left(\frac{\nu_1}{2}\right) \nu_1^{\frac{p}{2}}} \right) + \frac{\nu_2-\nu_1}{2} \left[\psi\left(\frac{\nu_1+p}{2} \right) - \psi\left(\frac{\nu_1}{2}\right)\right] - \frac{1}{2} \sum_{i=1}^p{\ln\lambda_i} - \frac{\nu_2+p}{2} \times D }

where \psi is the digamma function (see Special) and D is given by:

F_D^{(p)} is the Lauricella D-hypergeometric function defined for p variables:

\displaystyle{ F_D^{(p)}\left(a; b_1, ..., b_p; g; x_1, ..., x_p\right) = \sum\limits_{m_1 \geq 0} ... \sum\limits_{m_p \geq 0}{ \frac{ (a)_{m_1+...+m_p}(b_1)_{m_1} ... (b_p)_{m_p} }{ (g)_{m_1+...+m_p} } \frac{x_1^{m_1}}{m_1!} ... \frac{x_p^{m_p}}{m_p!} } }

Value

A numeric value: the Kullback-Leibler divergence between the two distributions, with two attributes attr(, "epsilon") (precision of the partial derivative of the Lauricella D-hypergeometric function,see Details) and attr(, "k") (number of iterations).

Author(s)

Pierre Santagostini, Nizar Bouhlel

References

N. Bouhlel and D. Rousseau (2023), Exact Rényi and Kullback-Leibler Divergences Between Multivariate t-Distributions, IEEE Signal Processing Letters. doi:10.1109/LSP.2023.3324594

Examples

nu1 <- 2
Sigma1 <- matrix(c(2, 1.2, 0.4, 1.2, 2, 0.6, 0.4, 0.6, 2), nrow = 3)
nu2 <- 4
Sigma2 <- matrix(c(1, 0.3, 0.1, 0.3, 1, 0.4, 0.1, 0.4, 1), nrow = 3)

kldstudent(nu1, Sigma1, nu2, Sigma2)
kldstudent(nu2, Sigma2, nu1, Sigma1)


Lauricella D-Hypergeometric Function

Description

Computes the Lauricella D-hypergeometric Function function.

Usage

lauricella(a, b, g, x, eps = 1e-06)

Arguments

a

numeric.

b

numeric vector.

g

numeric.

x

numeric vector. x must have the same length as b.

eps

numeric. Precision for the nested sums (default 1e-06).

Details

If n is the length of the b and x vectors, the Lauricella D-hypergeometric Function function is given by:

\displaystyle{F_D^{(n)}\left(a, b_1, ..., b_n, g; x_1, ..., x_n\right) = \sum_{m_1 \geq 0} ... \sum_{m_n \geq 0}{ \frac{ (a)_{m_1+...+m_n}(b_1)_{m_1} ... (b_n)_{m_n} }{ (g)_{m_1+...+m_n} } \frac{x_1^{m_1}}{m_1!} ... \frac{x_n^{m_n}}{m_n!} } }

where (x)_p is the Pochhammer symbol (see pochhammer).

If |x_i| < 1, i = 1, \dots, n, this sum converges. Otherwise there is an error.

The eps argument gives the required precision for its computation. It is the attr(, "epsilon") attribute of the returned value.

Sometimes, the convergence is too slow and the required precision cannot be reached. If this happens, the attr(, "epsilon") attribute is the precision that was really reached.

Value

A numeric value: the value of the Lauricella function, with two attributes attr(, "epsilon") (precision of the result) and attr(, "k") (number of iterations).

Author(s)

Pierre Santagostini, Nizar Bouhlel

References

N. Bouhlel and D. Rousseau, Exact Rényi and Kullback-Leibler Divergences Between Multivariate t-Distributions. IEEE Signal Processing Letters Processing Letters, vol. 26 no. 7, July 2019. doi:10.1109/LSP.2019.2915000


Logarithm of the Pochhammer Symbol

Description

Computes the logarithm of the Pochhammer symbol.

Usage

lnpochhammer(x, n)

Arguments

x

numeric.

n

positive integer.

Details

The Pochhammer symbol is given by:

\displaystyle{ (x)_n = \frac{\Gamma(x+n)}{\Gamma(x)} = x (x+1) ... (x+n-1) }

So, if n > 0:

\displaystyle{ log\left((x)_n\right) = log(x) + log(x+1) + ... + log(x+n-1) }

If n = 0, \displaystyle{ log\left((x)_n\right) = log(1) = 0}

Value

Numeric value. The logarithm of the Pochhammer symbol.

Author(s)

Pierre Santagostini, Nizar Bouhlel

See Also

pochhammer()

Examples

lnpochhammer(2, 0)
lnpochhammer(2, 1)
lnpochhammer(2, 3)


Plot of the Bivariate t Density

Description

Plots the probability density of the multivariate t distribution with 2 variables with location parameter mu and scatter matrix Sigma.

Usage

plotmtd(nu, mu, Sigma, xlim = c(mu[1] + c(-10, 10)*Sigma[1, 1]),
                ylim = c(mu[2] + c(-10, 10)*Sigma[2, 2]), n = 101,
                xvals = NULL, yvals = NULL, xlab = "x", ylab = "y",
                zlab = "f(x,y)", col = "gray", tol = 1e-6, ...)

Arguments

nu

numeric. The degrees of freedom.

mu

length 2 numeric vector. The mean vector.

Sigma

symmetric, positive-definite square matrix of order 2. The correlation matrix.

xlim, ylim

x-and y- limits.

n

A one or two element vector giving the number of steps in the x and y grid, passed to plot3d.function.

xvals, yvals

The values at which to evaluate x and y. If used, xlim and/or ylim are ignored.

xlab, ylab, zlab

The axis labels.

col

The color to use for the plot. See plot3d.function.

tol

tolerance (relative to largest variance) for numerical lack of positive-definiteness in Sigma, for the estimation of the density. see dmtd.

...

Additional arguments to pass to plot3d.function.

Value

Returns invisibly the probability density function.

Author(s)

Pierre Santagostini, Nizar Bouhlel

References

S. Kotz and Saralees Nadarajah (2004), Multivariate t Distributions and Their Applications, Cambridge University Press.

See Also

dmtd: probability density of a multivariate t density

contourmtd: contour plot of a bivariate t density.

plot3d.function: plot a function of two variables.

Examples

nu <- 1
mu <- c(1, 4)
Sigma <- matrix(c(0.8, 0.2, 0.2, 0.2), nrow = 2)
plotmtd(nu, mu, Sigma)


Pochhammer Symbol

Description

Computes the Pochhammer symbol.

Usage

pochhammer(x, n)

Arguments

x

numeric.

n

positive integer.

Details

The Pochhammer symbol is given by:

\displaystyle{ (x)_n = \frac{\Gamma(x+n)}{\Gamma(x)} = x (x+1) ... (x+n-1) }

Value

Numeric value. The value of the Pochhammer symbol.

Author(s)

Pierre Santagostini, Nizar Bouhlel

Examples

pochhammer(2, 0)
pochhammer(2, 1)
pochhammer(2, 3)


Simulate from a Multivariate t Distribution

Description

Produces one or more samples from the multivariate (p variables) t distribution (MTD) with degrees of freedom nu, mean vector mu and correlation matrix Sigma.

Usage

rmtd(n, nu, mu, Sigma, tol = 1e-6)

Arguments

n

integer. Number of observations.

nu

numeric. The degrees of freedom.

mu

length p numeric vector. The mean vector

Sigma

symmetric, positive-definite square matrix of order p. The correlation matrix.

tol

tolerance for numerical lack of positive-definiteness in Sigma (for mvrnorm, see Details).

Details

A sample from a MTD with parameters \nu, \boldsymbol{\mu} and \Sigma can be generated using:

\displaystyle{X = \mu + \frac{Y}{\sqrt{\frac{u}{\nu}}}}

where Y is a random vector distributed among a centered Gaussian density with covariance matrix \Sigma (generated using mvrnorm) and u is distributed among a Chi-squared distribution with \nu degrees of freedom.

Value

A matrix with p columns and n rows.

Author(s)

Pierre Santagostini, Nizar Bouhlel

References

S. Kotz and Saralees Nadarajah (2004), Multivariate t Distributions and Their Applications, Cambridge University Press.

See Also

dmtd: probability density of a MTD.

Examples

nu <- 3
mu <- c(0, 1, 4)
Sigma <- matrix(c(1, 0.6, 0.2, 0.6, 1, 0.3, 0.2, 0.3, 1), nrow = 3)
x <- rmtd(10000, nu, mu, Sigma)
head(x)
dim(x)
mu; colMeans(x)
nu/(nu-2)*Sigma; var(x)

mirror server hosted at Truenetwork, Russian Federation.