Title: | Information Criteria for Generalized Linear Regression |
Version: | 0.1.0 |
Maintainer: | Fatih Saglam <fatih.saglam@omu.edu.tr> |
Description: | Calculate various information criteria in literature for "lm" and "glm" objects. |
License: | MIT + file LICENSE |
Imports: | stats, Matrix |
Suggests: | MASS |
Encoding: | UTF-8 |
LazyData: | false |
RoxygenNote: | 7.1.1 |
NeedsCompilation: | no |
Packaged: | 2021-11-11 07:31:46 UTC; Fatih |
Author: | Fatih Saglam |
Repository: | CRAN |
Date/Publication: | 2021-11-11 19:10:08 UTC |
Akaike Information Criterion
Description
Calculates Akaike Information Criterion (AIC) and its variants for "lm" and "glm" objects.
Usage
AIC(model)
AIC4(model)
Arguments
model |
a "lm" or "glm" object |
Details
AIC (Akaike, 1973) is calculated as
-2LL(theta) + 2k
and AIC4 (Bozdogan, 1994) as
-2LL(theta) + 4k
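The AIC formula above can be reproduced by hand with base R. This is a minimal sketch, assuming that k is the number of estimated parameters reported by the "df" attribute of logLik(); the package may count parameters differently. The simulated data and the names fit, ll and k are illustrative only.

```r
set.seed(1)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50)
fit <- lm(y ~ x)

ll <- as.numeric(logLik(fit))   # maximized log-likelihood LL(theta)
k <- attr(logLik(fit), "df")    # assumption: k is the df reported by logLik()

# AIC = -2LL(theta) + 2k
aic_manual <- -2 * ll + 2 * k
```

Under this assumption the result agrees with stats::AIC(fit).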
Value
AIC or AIC4 measurement of the model
References
Akaike, H. (1973). Maximum likelihood identification of Gaussian autoregressive moving average models. Biometrika, 60(2), 255-265.
Bozdogan, H. (1994). Mixture-model cluster analysis using model selection criteria and a new informational measure of complexity. In Proceedings of the First US/Japan Conference on the Frontiers of Statistical Modeling: An Informational Approach (pp. 69-113). Dordrecht: Springer.
Examples
x1 <- rnorm(100, 3, 2)
x2 <- rnorm(100, 5, 3)
x3 <- rnorm(100, 67, 5)
err <- rnorm(100, 0, 4)
## round so we can use it for Poisson regression
y <- round(3 + 2*x1 - 5*x2 + 8*x3 + err)
m1 <- lm(y~x1 + x2 + x3)
m2 <- glm(y~x1 + x2 + x3, family = "gaussian")
m3 <- glm(y~x1 + x2 + x3, family = "poisson")
AIC(m1)
AIC(m2)
AIC(m3)
AIC4(m1)
AIC4(m2)
AIC4(m3)
Bayesian Information Criterion
Description
Calculates Bayesian Information Criterion (BIC) and its variants (BICadj, BICQ) for "lm" and "glm" objects.
Usage
BIC(model)
BICadj(model)
BICQ(model, q = 0.25)
Arguments
model |
a "lm" or "glm" object |
q |
adjustment parameter for BICQ. Default is 0.25. |
Details
BIC (Schwarz, 1978) is calculated as
-2LL(theta) + klog(n)
Adjusted BIC (Dziak et al., 2020) as
-2LL(theta) + klog(n/2pi)
and BICQ (Xu, 2010) as
-2LL(theta) + klog(n) - 2klog(q/(1-q))
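As a rough illustration of the BIC and BICQ formulas above, the sketch below computes both by hand with base R. It assumes that k is the "df" attribute of logLik() and that n is nobs(); bicq_manual is a hypothetical helper, not a function of this package.

```r
set.seed(1)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50)
fit <- lm(y ~ x)

ll <- as.numeric(logLik(fit))
k <- attr(logLik(fit), "df")   # assumption: k is the df reported by logLik()
n <- nobs(fit)

# BIC = -2LL(theta) + k log(n)
bic_manual <- -2 * ll + k * log(n)

# BICQ = BIC - 2k log(q / (1 - q)); q = 0.25 mirrors the default in Usage
bicq_manual <- function(q = 0.25) bic_manual - 2 * k * log(q / (1 - q))
```

With q = 0.5 the extra term vanishes and BICQ reduces to BIC; values of q below 0.5 make the penalty heavier.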
Value
BIC, BICadj or BICQ measurement of the model
References
Dziak, J. J., Coffman, D. L., Lanza, S. T., Li, R., & Jermiin, L. S. (2020). Sensitivity and specificity of information criteria. Briefings in bioinformatics, 21(2), 553-565.
Xu, C. (2010). Model Selection with Information Criteria.
Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2), 461-464. <doi:10.1214/aos/1176344136>
Examples
x1 <- rnorm(100, 3, 2)
x2 <- rnorm(100, 5, 3)
x3 <- rnorm(100, 67, 5)
err <- rnorm(100, 0, 4)
## round so we can use it for Poisson regression
y <- round(3 + 2*x1 - 5*x2 + 8*x3 + err)
m1 <- lm(y~x1 + x2 + x3)
m2 <- glm(y~x1 + x2 + x3, family = "gaussian")
m3 <- glm(y~x1 + x2 + x3, family = "poisson")
BIC(m1)
BIC(m2)
BIC(m3)
BICadj(m1)
BICadj(m2)
BICadj(m3)
Consistent Akaike's Information Criterion and Consistent Akaike's Information Criterion with Fisher Information
Description
Calculates Consistent Akaike's Information Criterion (CAIC) and Consistent Akaike's Information Criterion with Fisher Information (CAICF) for "lm" and "glm" objects.
Usage
CAIC(model)
CAICF(model)
Arguments
model |
a "lm" or "glm" object. |
Details
CAIC (Bozdogan, 1987) is calculated as
-2LL(theta) + k(log(n) + 1)
CAICF (Bozdogan, 1987) as
-2LL(theta) + 2k + k(log(n)) + log(|F|)
F is the Fisher information matrix.
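The CAIC formula above differs from BIC only by an extra k in the penalty. The sketch below checks this with base R, assuming that k is the "df" attribute of logLik() and n is nobs(). CAICF is left out because it depends on how the Fisher information matrix is built.

```r
set.seed(1)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50)
fit <- lm(y ~ x)

ll <- as.numeric(logLik(fit))
k <- attr(logLik(fit), "df")   # assumption: k is the df reported by logLik()
n <- nobs(fit)

# CAIC = -2LL(theta) + k(log(n) + 1)
caic_manual <- -2 * ll + k * (log(n) + 1)
```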
Value
CAIC or CAICF measurement of the model.
References
Bozdogan, H. (1987). Model selection and Akaike's information criterion (AIC): The general theory and its analytical extensions. Psychometrika, 52(3), 345-370.
Examples
x1 <- rnorm(100, 3, 2)
x2 <- rnorm(100, 5, 3)
x3 <- rnorm(100, 67, 5)
err <- rnorm(100, 0, 4)
## round so we can use it for Poisson regression
y <- round(3 + 2*x1 - 5*x2 + 8*x3 + err)
m1 <- lm(y~x1 + x2 + x3)
m2 <- glm(y~x1 + x2 + x3, family = "gaussian")
m3 <- glm(y~x1 + x2 + x3, family = "poisson")
CAIC(m1)
CAIC(m2)
CAIC(m3)
CAICF(m1)
CAICF(m2)
CAICF(m3)
Fisher Information Criterion
Description
Calculates Fisher Information Criterion (FIC) for "lm" and "glm" objects.
Usage
FIC(model)
Arguments
model |
a "lm" or "glm" object |
Details
FIC (Wei, 1992) is calculated as
-2LL(theta) + log(|X^T X|)
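A minimal sketch of the FIC formula above, assuming that X is the design matrix returned by model.matrix(), including the intercept column. The log-determinant is taken with determinant() rather than log(det()) to avoid overflow when the columns of X are large.

```r
set.seed(1)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50)
fit <- lm(y ~ x)

X <- model.matrix(fit)   # assumption: X includes the intercept column
# log(|X^T X|), computed on the log scale
log_det <- as.numeric(determinant(crossprod(X), logarithm = TRUE)$modulus)

# FIC = -2LL(theta) + log(|X^T X|)
fic_manual <- -2 * as.numeric(logLik(fit)) + log_det
```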
Value
FIC measurement of the model
References
Wei, C. Z. (1992). On predictive least squares principles. The Annals of Statistics, 20(1), 1-42.
Examples
x1 <- rnorm(100, 3, 2)
x2 <- rnorm(100, 5, 3)
x3 <- rnorm(100, 67, 5)
err <- rnorm(100, 0, 4)
## round so we can use it for Poisson regression
y <- round(3 + 2*x1 - 5*x2 + 8*x3 + err)
m1 <- lm(y~x1 + x2 + x3)
m2 <- glm(y~x1 + x2 + x3, family = "gaussian")
m3 <- glm(y~x1 + x2 + x3, family = "poisson")
FIC(m1)
FIC(m2)
FIC(m3)
Generalized Cross-Validation
Description
Calculates Generalized Cross-Validation (GCV) for "lm" and "glm" objects.
Usage
GCV(model)
Arguments
model |
a "lm" or "glm" object |
Details
GCV (Koc and Bozdogan, 2015) is calculated as
RSS/(n(1 - k/n))
RSS is the residual sum of squares.
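The GCV formula above can be sketched with base R as follows. Here k is assumed to be the number of regression coefficients only, which is a guess about how the package counts parameters.

```r
set.seed(1)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50)
fit <- lm(y ~ x)

rss <- sum(residuals(fit)^2)   # residual sum of squares
n <- nobs(fit)
k <- length(coef(fit))         # assumption: k counts regression coefficients only

# GCV = RSS / (n(1 - k/n)), which simplifies to RSS / (n - k)
gcv_manual <- rss / (n * (1 - k / n))
```

Under this assumption the value coincides with the usual residual variance estimate of a linear model, summary(fit)$sigma^2.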
Value
GCV measurement of the model
References
Koc, E. K., & Bozdogan, H. (2015). Model selection in multivariate adaptive regression splines (MARS) using information complexity as the fitness function. Machine Learning, 101(1), 35-58.
Examples
x1 <- rnorm(100, 3, 2)
x2 <- rnorm(100, 5, 3)
x3 <- rnorm(100, 67, 5)
err <- rnorm(100, 0, 4)
## round so we can use it for Poisson regression
y <- round(3 + 2*x1 - 5*x2 + 8*x3 + err)
m1 <- lm(y~x1 + x2 + x3)
m2 <- glm(y~x1 + x2 + x3, family = "gaussian")
m3 <- glm(y~x1 + x2 + x3, family = "poisson")
GCV(m1)
GCV(m2)
GCV(m3)
Haughton Bayesian information criterion
Description
Calculates Haughton Bayesian information criterion (HBIC) for "lm" and "glm" objects.
Usage
HBIC(model)
Arguments
model |
a "lm" or "glm" object |
Details
HBIC (Bollen et al., 2014) is calculated as
-2LL(theta) + klog(n/(2pi))
Value
HBIC measurement of the model
References
Bollen, K. A., Harden, J. J., Ray, S., & Zavisca, J. (2014). BIC and alternative Bayesian information criteria in the selection of structural equation models. Structural equation modeling: a multidisciplinary journal, 21(1), 1-19.
Examples
x1 <- rnorm(100, 3, 2)
x2 <- rnorm(100, 5, 3)
x3 <- rnorm(100, 67, 5)
err <- rnorm(100, 0, 4)
## round so we can use it for Poisson regression
y <- round(3 + 2*x1 - 5*x2 + 8*x3 + err)
m1 <- lm(y~x1 + x2 + x3)
m2 <- glm(y~x1 + x2 + x3, family = "gaussian")
m3 <- glm(y~x1 + x2 + x3, family = "poisson")
HBIC(m1)
HBIC(m2)
HBIC(m3)
Hannan-Quinn Information Criterion
Description
Calculates Hannan-Quinn Information Criterion (HQIC) for "lm" and "glm" objects.
Usage
HQIC(model)
Arguments
model |
a "lm" or "glm" object |
Details
HQIC (Hannan and Quinn, 1979) is calculated as
-2LL(theta) + 2klog(log(n))
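To see where the HQIC penalty sits relative to AIC and BIC, the sketch below computes the formula above by hand, assuming that k is the "df" attribute of logLik() and n is nobs().

```r
set.seed(1)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50)
fit <- lm(y ~ x)

ll <- as.numeric(logLik(fit))
k <- attr(logLik(fit), "df")   # assumption: k is the df reported by logLik()
n <- nobs(fit)

# HQIC = -2LL(theta) + 2k log(log(n))
hqic_manual <- -2 * ll + 2 * k * log(log(n))
```

For n = 50 the per-parameter penalty 2log(log(n)) lies between the AIC penalty (2) and the BIC penalty (log(n)).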
Value
HQIC measurement of the model
References
Hannan, E. J., & Quinn, B. G. (1979). The determination of the order of an autoregression. Journal of the Royal Statistical Society: Series B (Methodological), 41(2), 190-195.
Examples
x1 <- rnorm(100, 3, 2)
x2 <- rnorm(100, 5, 3)
x3 <- rnorm(100, 67, 5)
err <- rnorm(100, 0, 4)
## round so we can use it for Poisson regression
y <- round(3 + 2*x1 - 5*x2 + 8*x3 + err)
m1 <- lm(y~x1 + x2 + x3)
m2 <- glm(y~x1 + x2 + x3, family = "gaussian")
m3 <- glm(y~x1 + x2 + x3, family = "poisson")
HQIC(m1)
HQIC(m2)
HQIC(m3)
Information Matrix-Based Information Criterion
Description
Calculates Information Matrix-Based Information Criterion (IBIC) for "lm" and "glm" objects.
Usage
IBIC(model)
Arguments
model |
a "lm" or "glm" object |
Details
IBIC (Bollen et al., 2012) is calculated as
-2LL(theta) + klog(n/(2pi)) + log(|F|)
F is the Fisher information matrix.
While calculating the Fisher information matrix (F), we used the joint parameters (beta, sigma^2) of the models.
Value
IBIC measurement of the model
References
Bollen, K. A., Ray, S., Zavisca, J., & Harden, J. J. (2012). A comparison of Bayes factor approximation methods including two new methods. Sociological Methods & Research, 41(2), 294-324.
Examples
x1 <- rnorm(100, 3, 2)
x2 <- rnorm(100, 5, 3)
x3 <- rnorm(100, 67, 5)
err <- rnorm(100, 0, 4)
## round so we can use it for Poisson regression
y <- round(3 + 2*x1 - 5*x2 + 8*x3 + err)
m1 <- lm(y~x1 + x2 + x3)
m2 <- glm(y~x1 + x2 + x3, family = "gaussian")
m3 <- glm(y~x1 + x2 + x3, family = "poisson")
IBIC(m1)
IBIC(m2)
IBIC(m3)
Information Criteria
Description
Calculates Various Information Criteria for "lm" and "glm" objects.
Usage
IC(
model,
criteria = c("AIC", "BIC", "CAIC", "KIC", "HQIC", "FIC", "ICOMP_IFIM_C1",
"ICOMP_PEU_C1", "ICOMP_PEU_LN_C1", "CICOMP_C1"),
...
)
Arguments
model |
a "lm" or "glm" object or object list |
criteria |
a vector of criteria names. Can be set to respective numbers. Possible criteria names at the moment are: |
... |
additional parameters. Currently none. |
Details
Calculates various information criteria for "lm" and "glm" objects.
model can be a list. If it is a list, the function returns a matrix of the selected information criteria for all models.
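The list behaviour described above can be imitated with base R. The sketch below is not the package implementation: it uses only stats::AIC and stats::BIC, and the layout (one row per model, one column per criterion) is an assumption.

```r
set.seed(1)
x <- rnorm(50)
y <- rpois(50, lambda = exp(0.5 + 0.3 * x))

models <- list(lm = lm(y ~ x),
               glm_pois = glm(y ~ x, family = "poisson"))

# sapply() returns criteria in rows and models in columns; t() flips it
ic_matrix <- t(sapply(models, function(m) {
  c(AIC = stats::AIC(m), BIC = stats::BIC(m))
}))
```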
Value
Information criteria of the model(s) for selected criteria
Examples
x1 <- rnorm(100, 3, 2)
x2 <- rnorm(100, 5, 3)
x3 <- rnorm(100, 67, 5)
err <- rnorm(100, 0, 4)
## round so we can use it for Poisson regression
y <- round(3 + 2*x1 - 5*x2 + 8*x3 + err)
m1 <- lm(y~x1 + x2 + x3)
m2 <- glm(y~x1 + x2 + x3, family = "gaussian")
m3 <- glm(y~x1 + x2 + x3, family = "poisson")
IC(model = m1, criteria = 1:32)
IC(model = list(lm = m1,
glm = m2,
glm_pois = m3), criteria = 1:32)
Informational Complexity
Description
These functions calculate Informational Complexity (ICOMP) variants for "lm" and "glm" objects.
Usage
ICOMP(model, type = "IFIM", C = "C1")
ICOMP_IFIM_CF(model)
ICOMP_IFIM_C1(model)
ICOMP_IFIM_C1F(model)
ICOMP_IFIM_C1R(model)
ICOMP_PEU_CF(model)
ICOMP_PEU_C1(model)
ICOMP_PEU_C1F(model)
ICOMP_PEU_C1R(model)
ICOMP_PEU_LN_CF(model)
ICOMP_PEU_LN_C1(model)
ICOMP_PEU_LN_C1F(model)
ICOMP_PEU_LN_C1R(model)
CICOMP_CF(model)
CICOMP_C1(model)
CICOMP_C1F(model)
CICOMP_C1R(model)
Arguments
model |
a "lm" or "glm" object |
type |
type of ICOMP. Available types are "IFIM", "PEU", "PEU_LN" and "CICOMP". Default is "IFIM". |
C |
type of complexity. Available types are "CF", "C1", "C1F" and "C1R". Default is "C1". |
Details
ICOMP(IFIM) (Bozdogan, 2003) is calculated as
-2LL(theta) + 2C(F^{-1})
ICOMP(IFIM-peu) (Koc and Bozdogan, 2015) as
-2LL(theta) + k + 2C(F^{-1})
ICOMP(IFIM-peuln) (Bozdogan, 2010) as
-2LL(theta) + k + 2log(n)C(F^{-1})
and CICOMP (Pamukcu et al., 2015) as
-2LL(theta) + k(log(n) + 1) + 2C(F^{-1})
F is the Fisher information matrix. F^{-1} is the inverse Fisher information matrix.
C is the complexity measure. Four variants are available:
C_1 (Bozdogan, 2010) is
C_1(F^{-1}) = s/2*log(lambda_a / lambda_g)
C_F (Bozdogan, 2010) is
C_F(F^{-1}) = 1/s*sum_i^s(lambda_i - lambda_a)^2
C_1F (Bozdogan, 2010) is
C_1F(F^{-1}) = 1/(4lambda_a^2)*sum_i^s(lambda_i - lambda_a)^2
C_1R (Bozdogan, 2000) is
C_1R(F^{-1}) = 1/2*log(|R|)
Here, R is the correlation matrix of the model, lambda_1, ..., lambda_s are the eigenvalues of F, lambda_a and lambda_g are the arithmetic and geometric means of the eigenvalues of F, respectively, and s is the dimension of F.
While calculating the Fisher information matrix (F), we used the joint parameters (beta, sigma^2) of the models. In the C1R(.) function, we utilized the usual variance-covariance matrix Cov(beta) of the models. beta is the vector of regression coefficients.
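The C_1 measure defined above depends only on the eigenvalues of its matrix argument. The sketch below implements it as a hypothetical helper c1() and applies it to vcov(fit) as an illustrative stand-in; the package itself works with the Fisher information of the joint parameters (beta, sigma^2), which this example does not reproduce.

```r
# C_1(S) = s/2 * log(lambda_a / lambda_g) for a symmetric positive definite S
c1 <- function(S) {
  lambda <- eigen(S, symmetric = TRUE, only.values = TRUE)$values
  s <- length(lambda)
  lambda_a <- mean(lambda)             # arithmetic mean of the eigenvalues
  lambda_g <- exp(mean(log(lambda)))   # geometric mean of the eigenvalues
  s / 2 * log(lambda_a / lambda_g)
}

set.seed(1)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50)
fit <- lm(y ~ x)

# Illustrative input only: covariance matrix of the regression coefficients
c1_vcov <- c1(vcov(fit))
```

Because the arithmetic mean is never smaller than the geometric mean, C_1 is non-negative, and it is zero when all eigenvalues are equal.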
Value
Informational Complexity measurement of the model
References
Bozdogan, H. (2003). Intelligent statistical data mining with information complexity and genetic algorithms. In Statistical data mining and knowledge discovery (pp. 47-88). Chapman and Hall/CRC.
Koc, E. K., & Bozdogan, H. (2015). Model selection in multivariate adaptive regression splines (MARS) using information complexity as the fitness function. Machine Learning, 101(1), 35-58.
Bozdogan, H. (2010). A new class of information complexity (ICOMP) criteria with an application to customer profiling and segmentation. İstanbul Üniversitesi İşletme Fakültesi Dergisi, 39(2), 370-398.
Pamukçu, E., Bozdogan, H., & Çalık, S. (2015). A novel hybrid dimension reduction technique for undersized high dimensional gene expression data sets using information complexity criterion for cancer classification. Computational and mathematical methods in medicine, 2015.
Bozdogan, H. (2000). Akaike's information criterion and recent developments in information complexity. Journal of mathematical psychology, 44(1), 62-91.
Examples
x1 <- rnorm(100, 3, 2)
x2 <- rnorm(100, 5, 3)
x3 <- rnorm(100, 67, 5)
err <- rnorm(100, 0, 4)
## round so we can use it for Poisson regression
y <- round(3 + 2*x1 - 5*x2 + 8*x3 + err)
m1 <- lm(y~x1 + x2 + x3)
m2 <- glm(y~x1 + x2 + x3, family = "gaussian")
m3 <- glm(y~x1 + x2 + x3, family = "poisson")
ICOMP_IFIM_CF(m1)
ICOMP_IFIM_CF(m2)
ICOMP_IFIM_CF(m3)
CICOMP_C1(m1)
CICOMP_C1(m2)
CICOMP_C1(m3)
ICOMP(m1, type = "PEU", C = "C1R")
Joint Information Criterion
Description
Calculates Joint Information Criterion (JIC) for "lm" and "glm" objects.
Usage
JIC(model)
Arguments
model |
a "lm" or "glm" object |
Details
JIC (Rahman and King, 1999) is calculated as
-2LL(theta) + 1/2*(klog(n) - nlog(1-k/n))
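A minimal sketch of the JIC formula above, assuming that k is the "df" attribute of logLik() and n is nobs(). The formula requires k < n so that log(1 - k/n) is defined.

```r
set.seed(1)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50)
fit <- lm(y ~ x)

ll <- as.numeric(logLik(fit))
k <- attr(logLik(fit), "df")   # assumption: k is the df reported by logLik()
n <- nobs(fit)

# JIC = -2LL(theta) + 1/2 * (k log(n) - n log(1 - k/n))
jic_penalty <- 0.5 * (k * log(n) - n * log(1 - k / n))
jic_manual <- -2 * ll + jic_penalty
```

Since log(1 - k/n) is negative, both parts of the penalty are positive.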
Value
JIC measurement of the model
References
Rahman, M. S., & King, M. L. (1999). Improved model selection criterion. Communications in Statistics-Simulation and Computation, 28(1), 51-71.
Examples
x1 <- rnorm(100, 3, 2)
x2 <- rnorm(100, 5, 3)
x3 <- rnorm(100, 67, 5)
err <- rnorm(100, 0, 4)
## round so we can use it for Poisson regression
y <- round(3 + 2*x1 - 5*x2 + 8*x3 + err)
m1 <- lm(y~x1 + x2 + x3)
m2 <- glm(y~x1 + x2 + x3, family = "gaussian")
m3 <- glm(y~x1 + x2 + x3, family = "poisson")
JIC(m1)
JIC(m2)
JIC(m3)
Kullback–Leibler Information Criterion
Description
Calculates Kullback–Leibler Information Criterion (KIC) and its corrected form (KICC) for "lm" and "glm" objects.
Usage
KIC(model)
KICC(model)
Arguments
model |
a "lm" or "glm" object |
Details
KIC (Seghouane, 2006) is calculated as
-2LL(theta) + 3k
and KICC (Seghouane, 2006) is calculated as
-2LL(theta) + ((k + 1)(3n - k - 2))/(n - k - 2) + (k/(n-k))
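The KIC formula above replaces the 2k penalty of AIC with 3k. The sketch below reproduces KIC only, assuming that k is the "df" attribute of logLik(); KICC is omitted.

```r
set.seed(1)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50)
fit <- lm(y ~ x)

ll <- as.numeric(logLik(fit))
k <- attr(logLik(fit), "df")   # assumption: k is the df reported by logLik()

# KIC = -2LL(theta) + 3k
kic_manual <- -2 * ll + 3 * k
```

Under this assumption KIC exceeds stats::AIC(fit) by exactly k.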
Value
KIC or KICC measurement of the model
References
Seghouane, A. K. (2006). A note on overfitting properties of KIC and KICC. Signal Processing, 86(10), 3055-3060.
Examples
x1 <- rnorm(100, 3, 2)
x2 <- rnorm(100, 5, 3)
x3 <- rnorm(100, 67, 5)
err <- rnorm(100, 0, 4)
## round so we can use it for Poisson regression
y <- round(3 + 2*x1 - 5*x2 + 8*x3 + err)
m1 <- lm(y~x1 + x2 + x3)
m2 <- glm(y~x1 + x2 + x3, family = "gaussian")
m3 <- glm(y~x1 + x2 + x3, family = "poisson")
KIC(m1)
KIC(m2)
KIC(m3)
KICC(m1)
KICC(m2)
KICC(m3)
Scaled Unit Information Prior Bayesian Information Criterion
Description
Calculates Scaled Unit Information Prior Bayesian Information Criterion (SPBIC) for "lm" and "glm" objects.
Usage
SPBIC(model)
Arguments
model |
a "lm" or "glm" object |
Details
SPBIC (Bollen et al., 2012) is calculated as
-2LL(theta) + k(1 - log(k/(beta^T(Sigma)^{-1}beta)))
beta and Sigma are the vector and the covariance matrix of the regression coefficients, respectively.
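The SPBIC formula above can be sketched with base R, taking beta from coef() and Sigma from vcov(). Setting k to the number of regression coefficients is an assumption; the package may count parameters differently.

```r
set.seed(1)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50)
fit <- lm(y ~ x)

b <- coef(fit)       # beta: regression coefficients
Sigma <- vcov(fit)   # Sigma: covariance matrix of the coefficients
k <- length(b)       # assumption: k counts regression coefficients only

# Quadratic form beta^T Sigma^{-1} beta, solved without forming the inverse
quad <- as.numeric(t(b) %*% solve(Sigma, b))

# SPBIC = -2LL(theta) + k(1 - log(k / quad))
spbic_manual <- -2 * as.numeric(logLik(fit)) + k * (1 - log(k / quad))
```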
Value
SPBIC measurement of the model
References
Bollen, K. A., Ray, S., Zavisca, J., & Harden, J. J. (2012). A comparison of Bayes factor approximation methods including two new methods. Sociological Methods & Research, 41(2), 294-324.
Examples
x1 <- rnorm(100, 3, 2)
x2 <- rnorm(100, 5, 3)
x3 <- rnorm(100, 67, 5)
err <- rnorm(100, 0, 4)
## round so we can use it for Poisson regression
y <- round(3 + 2*x1 - 5*x2 + 8*x3 + err)
m1 <- lm(y~x1 + x2 + x3)
m2 <- glm(y~x1 + x2 + x3, family = "gaussian")
m3 <- glm(y~x1 + x2 + x3, family = "poisson")
SPBIC(m1)
SPBIC(m2)
SPBIC(m3)
Reverse Fisher Matrix
Description
This function allows you to calculate Fisher Information Matrix using "lm" and "glm" objects.
Usage
reverse_fisher(model)
Arguments
model |
a "lm" or "glm" object |
Details
Calculates Fisher Information Matrix using "lm" and "glm" objects. It uses
Value
the reverse Fisher information matrix of the model