Type: Package
Title: Lagrangian Multiplier Smoothing Splines for Smooth Function Estimation
Version: 0.2.0
Description: Implements Lagrangian multiplier smoothing splines for flexible nonparametric regression and function estimation. Provides tools for fitting, prediction, and inference using a constrained optimization approach to enforce smoothness. Supports generalized linear models, Weibull accelerated failure time (AFT) models, quadratic programming problems, and customizable arbitrary correlation structures. Options for fitting in parallel are provided. The method builds upon the framework described by Ezhov et al. (2018) <doi:10.1515/jag-2017-0029> using Lagrangian multipliers to fit cubic splines. For more information on correlation structure estimation, see Searle et al. (2009) <ISBN:978-0470009598>. For quadratic programming and constrained optimization in general, see Nocedal & Wright (2006) <doi:10.1007/978-0-387-40065-5>. For a comprehensive background on smoothing splines, see Wahba (1990) <doi:10.1137/1.9781611970128> and Wood (2006) <ISBN:978-1584884743> "Generalized Additive Models: An Introduction with R".
License: MIT + file LICENSE
Language: en-US
Depends: R (≥ 3.5.0)
Imports: Rcpp (≥ 1.0.7), RcppArmadillo, FNN, RColorBrewer, plotly, quadprog, methods, stats
LinkingTo: Rcpp, RcppArmadillo
Suggests: testthat (≥ 3.0.0), spelling, knitr, rmarkdown, parallel, survival, graphics
URL: https://github.com/matthewlouisdavisBioStat/lgspline
BugReports: https://github.com/matthewlouisdavisBioStat/lgspline/issues
Encoding: UTF-8
RoxygenNote: 7.3.2
Config/testthat/edition: 3
NeedsCompilation: yes
Packaged: 2025-04-24 02:16:05 UTC; defgi
Author: Matthew Davis
Maintainer: Matthew Davis <matthewlouisdavis@gmail.com>
Repository: CRAN
Date/Publication: 2025-04-24 11:40:13 UTC
Lagrangian Multiplier Smoothing Splines
Description
Implements Lagrangian multiplier smoothing splines for flexible nonparametric regression and function estimation. Provides tools for fitting, prediction, and inference using a constrained optimization approach to enforce smoothness. Supports generalized linear models, Weibull accelerated failure time (AFT) models, quadratic programming problems, and customizable arbitrary correlation structures. Options for fitting in parallel are provided. The method builds upon the framework described by Ezhov et al. (2018) <doi:10.1515/jag-2017-0029> using Lagrangian multipliers to fit cubic splines. For more information on correlation structure estimation, see Searle et al. (2009) <ISBN:978-0470009598>. For quadratic programming and constrained optimization in general, see Nocedal & Wright (2006) <doi:10.1007/978-0-387-40065-5>. For a comprehensive background on smoothing splines, see Wahba (1990) <doi:10.1137/1.9781611970128> and Wood (2006) <ISBN:978-1584884743> "Generalized Additive Models: An Introduction with R".
Author(s)
Maintainer: Matthew Davis <matthewlouisdavis@gmail.com> (ORCID)
See Also
Useful links:
https://github.com/matthewlouisdavisBioStat/lgspline
Report bugs at https://github.com/matthewlouisdavisBioStat/lgspline/issues
Efficient Matrix Multiplication Operator
Description
Operator wrapper around C++ efficient_matrix_mult() for matrix multiplication syntax. This is an internal function meant to improve on base R's %*% operator for certain large matrix operations, at the cost of a potential slight slowdown for smaller problems.
Usage
x %**% y
Arguments
x: Left matrix
y: Right matrix
Value
Matrix product of x and y
Examples
M1 <- matrix(1:4, 2, 2)
M2 <- matrix(5:8, 2, 2)
M1 %**% M2
Efficient Matrix Multiplication for \textbf{A}^{T}\textbf{G}\textbf{A}
Description
Efficient Matrix Multiplication for \textbf{A}^{T}\textbf{G}\textbf{A}
Usage
AGAmult_wrapper(
G,
A,
K,
nc,
nca,
parallel,
cl,
chunk_size,
num_chunks,
rem_chunks
)
Arguments
G: List of G matrices
A: Constraint matrix
K: Number of partitions minus 1
nc: Number of columns per partition
nca: Number of constraint columns
parallel: Use parallel processing
cl: Cluster object
chunk_size: Chunk size for parallel processing
num_chunks: Number of chunks
rem_chunks: Remaining chunks
Details
Computes \textbf{A}^{T}\textbf{G}\textbf{A} efficiently in parallel chunks using AGAmult_chunk().
Value
Matrix product \textbf{A}^{T}\textbf{G}\textbf{A}
Lagrangian Multiplier Smoothing Splines: Mathematical Details
Description
This document provides the mathematical and implementation details for Lagrangian Multiplier Smoothing Splines.
Statistical Problem Formulation
Consider a dataset with observed predictors \mathbf{T} (an N \times q matrix) and corresponding response values \mathbf{y} (an N \times 1 vector). We assume the relationship follows a generalized linear model with an unknown smooth function f:
y_i \sim \mathcal{D}(g^{-1}(f(\mathbf{t}_i)), \sigma^2)
where \mathcal{D} is a distribution, g is a link function, \mathbf{t}_i is the ith row of \mathbf{T}, and \sigma^2 is a dispersion parameter.
The objective is to estimate the unknown function f that balances:
- Goodness of fit: how well the model fits the observed data
- Smoothness: avoiding excessive fluctuations in the estimated function
- Interpretability: understanding the relationship between predictors and response
Lagrangian Multiplier Smoothing Splines address this by:
- Partitioning the predictor space into K+1 regions
- Fitting local polynomial models within each partition
- Explicitly enforcing smoothness where partitions meet using Lagrangian multipliers
- Penalizing the integrated squared second derivative of the estimated function
Unlike other smoothing spline formulations, this technique ensures that no post-fitting algebraic rearrangement or disentanglement of a spline basis is needed to obtain interpretable models. The relationship between predictor and response is explicit, and the basis expansions for each partition are homogeneous.
Overview
Lagrangian Multiplier Smoothing Splines fit piecewise polynomial regression models to partitioned data, with smoothness at partition boundaries enforced through Lagrangian multipliers. The approach is penalized by the integrated squared second derivative of the estimated function.
Unlike traditional smoothing splines that implicitly derive piecewise polynomials through optimization, or regression splines using specialized bases (e.g., B-splines), this method:
- Explicitly represents polynomial basis functions in a natural form
- Uses Lagrangian multipliers to enforce smoothness constraints
- Maintains interpretability of coefficient estimates
The fitted model is directly interpretable without the need for reparameterization after fitting. This implementation accommodates non-spline terms and interactions, GLMs, correlation structures, and inequality constraints, in addition to linear regression assuming a Gaussian response. Extensive customization is offered for users to adapt lgspline to their own modeling frameworks.
Core notation:
- \mathbf{y}_{(N \times 1)}: Response vector
- \mathbf{T}_{(N \times q)}: Matrix of predictors
- \mathbf{X}_{(N \times P)}: Block-diagonal matrix of polynomial expansions
- \boldsymbol{\Lambda}_{(P \times P)}: Penalty matrix
- \tilde{\boldsymbol{\beta}}_{(P \times 1)}: Constrained coefficient estimates
- \mathbf{G}_{(P \times P)}: (\mathbf{X}^T\mathbf{W}\mathbf{X} + \boldsymbol{\Lambda})^{-1}
- \mathbf{A}_{(P \times r)}: Constraint matrix ensuring smoothness
- \mathbf{U}_{(P \times P)}: \mathbf{I} - \mathbf{G}\mathbf{A}(\mathbf{A}^T\mathbf{G}\mathbf{A})^{-1}\mathbf{A}^T
Model Formulation and Estimation
Model Structure
The method decomposes the predictor space into K+1 partitions and fits polynomial regression models within each partition, constraining the fitted function to be smooth at partition boundaries.
For a single predictor, the function within each partition k is represented as:
f_k(t) = \beta_k^{(0)} + \beta_k^{(1)}t + \beta_k^{(2)}t^2 + \beta_k^{(3)}t^3 + ...
More generally, for each partition k, the model takes the form:
f_k(\mathbf{t}) = \mathbf{x}^T\boldsymbol{\beta}_k
where \mathbf{x} contains polynomial basis functions (intercept, linear, quadratic, cubic terms, and their interactions) and \boldsymbol{\beta}_k are the corresponding coefficients.
Smoothness constraints enforce that the function value, first and second derivatives match at adjacent partition boundaries:
f_k(t_k) = f_{k+1}(t_k)
f'_k(t_k) = f'_{k+1}(t_k)
f''_k(t_k) = f''_{k+1}(t_k)
These constraints are expressed as linear equations in the \mathbf{A} matrix such that \mathbf{A}^T\boldsymbol{\beta} = \mathbf{0} implies the smoothness conditions are satisfied.
Estimation Procedure
The estimation procedure follows these key steps:
1. Unconstrained estimation:
\hat{\boldsymbol{\beta}} = \mathbf{G}\mathbf{X}^T\mathbf{y}
where \mathbf{G} = (\mathbf{X}^T\mathbf{W}\mathbf{X} + \boldsymbol{\Lambda})^{-1} for weight matrix \mathbf{W} and penalty matrix \boldsymbol{\Lambda}.
2. Apply smoothness constraints:
\tilde{\boldsymbol{\beta}} = \hat{\boldsymbol{\beta}} - \mathbf{G}\mathbf{A}(\mathbf{A}^T\mathbf{G}\mathbf{A})^{-1}\mathbf{A}^T\hat{\boldsymbol{\beta}}
This can be computed efficiently as:
\tilde{\boldsymbol{\beta}} = \mathbf{U}\hat{\boldsymbol{\beta}}
where \mathbf{U} = \mathbf{I} - \mathbf{G}\mathbf{A}(\mathbf{A}^T\mathbf{G}\mathbf{A})^{-1}\mathbf{A}^T.
3. For GLMs, iterative refinement:
- Update \mathbf{G} based on current estimates
- Recompute \tilde{\boldsymbol{\beta}} using the new \mathbf{G}
- Continue until convergence
4. For inequality constraints (shape or range constraints):
- Sequential quadratic programming using solve.QP
- Initial values from the equality-constrained estimates
5. Variance estimation:
\tilde{\sigma}^2 = \frac{\|\mathbf{y} - \mathbf{X}\tilde{\boldsymbol{\beta}}\|^2}{N - \text{trace}(\mathbf{X}\mathbf{U}\mathbf{G}\mathbf{X}^T)}
The variance-covariance matrix of \tilde{\boldsymbol{\beta}} is estimated as:
\text{Var}(\tilde{\boldsymbol{\beta}}) = \tilde{\sigma}^2\mathbf{U}\mathbf{G}
This approach offers computational efficiency through:
- Parallel computation for each partition
- Inversion of only small matrices (one per partition)
- Efficient calculation of the \text{trace}(\mathbf{X}\mathbf{U}\mathbf{G}\mathbf{X}^T) term
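Steps 1-2 can be transcribed directly into base R for the Gaussian, identity-link case; the toy function below is a minimal sketch of the algebra above (not the package's optimized, partitioned implementation), with A taken to be the constraint matrix:
## Toy Gaussian fit: unconstrained beta_hat, then constrained beta_tilde
constrained_fit <- function(X, y, Lambda, A) {
  G          <- solve(crossprod(X) + Lambda)  # (X'X + Lambda)^{-1}, W = I
  beta_hat   <- G %*% crossprod(X, y)         # unconstrained estimate
  U          <- diag(nrow(G)) -
                G %*% A %*% solve(t(A) %*% G %*% A) %*% t(A)
  beta_tilde <- U %*% beta_hat                # constrained estimate
  list(beta = beta_tilde,
       edf  = sum(diag(X %*% U %*% G %*% t(X))))  # trace(XUGX')
}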
Knot Selection and Partitioning
Univariate Case
For a single predictor, knots are placed at evenly spaced quantiles of the predictor variable:
- The predictor range is divided into K+1 regions using K interior knots
- Each observation is assigned to a partition based on these knot boundaries
- Custom knots can be specified through the custom_knots parameter
Multivariate Case
For multiple predictors, a k-means clustering approach is used:
- K+1 cluster centers are identified using k-means on standardized predictors
- Neighboring centers are determined using a midpoint criterion: centers i and j are neighbors if the midpoint between them is closer to both i and j than to any other center (see the sketch after this list)
- Observations are assigned to the nearest cluster center
The default number of knots (K) is determined adaptively based on:
- Sample size (N)
- Number of basis terms per partition (p)
- Number of predictors (q)
- Model complexity (GLM family)
Custom Model Specification
The package provides interfaces for extending to custom distributions and estimation procedures.
Family Structure
A list containing GLM components:
- family: Character name of distribution
- link: Character name of link function
- linkfun: Function transforming response to linear predictor scale
- linkinv: Function transforming linear predictor to response scale
- custom_dev.resids: Optional function for GCV optimization criterion
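As an illustration, a hand-rolled family list for a log-link Gaussian model might look like the following sketch (component names as documented above; the custom_dev.resids signature is a guess for illustration):
## Sketch of a custom family list (log-link Gaussian)
custom_family <- list(
  family  = "gaussian",
  link    = "log",
  linkfun = function(mu)  log(mu),   # response -> linear predictor scale
  linkinv = function(eta) exp(eta),  # linear predictor -> response scale
  custom_dev.resids = function(y, mu, ...) (y - mu)^2  # optional, for GCV
)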
Unconstrained Fitting
Core function for unconstrained coefficient estimation per partition:
function(X, y, LambdaHalf, Lambda, keep_weighted_Lambda, family, tol,
         K, parallel, cl, chunk_size, num_chunks, rem_chunks,
         order_indices, weights, status, ...) {
  # Returns p-length coefficient vector
}
Dispersion Estimation
Computes scale/dispersion parameter:
function(mu, y, order_indices, family, observation_weights, ...) {
  # Returns scalar dispersion estimate
  # Defaults to 1 (no dispersion)
}
GLM Weights
Computes observation weights:
function(mu, y, order_indices, family, dispersion, observation_weights, ...) {
  # Returns weights vector
  # Defaults to family variance with optional weights
}
Schur Corrections
Adjusts covariance matrix for parameter uncertainty:
function(X, y, B, dispersion, order_list, K, family, observation_weights, ...) {
  # Required for valid standard errors when dispersion
  # affects coefficient estimation
}
Constraint Systems
Linear Equality Constraints
Linear constraints enforce exact relationships between coefficients, implemented through the Lagrangian multiplier approach and integrated with smoothness constraints.
Constraints are specified as \mathbf{h}^T\boldsymbol{\beta} = \mathbf{h}^T\boldsymbol{\beta}_0 via:
lgspline(y ~ spl(x),
         constraint_vectors = h,     # Matrix of constraint vectors
         constraint_values = beta0)  # Target values
Common applications include:
- No-intercept models (via no_intercept = TRUE)
- Offset terms with coefficients constrained to 1
- Hypothesis testing with nested models
Inequality Constraints
Inequality constraints enforce bounds or directional relationships on the fitted function through sequential quadratic programming.
Built-in constraints include:
- Range constraints: qp_range_lower and qp_range_upper
- Monotonicity: qp_monotonic_increase or qp_monotonic_decrease
- Derivative constraints: qp_positive_derivative or qp_negative_derivative
Custom inequality constraints can be defined through:
- qp_Amat_fxn: Function returning constraint matrix
- qp_bvec_fxn: Function returning constraint vector
- qp_meq_fxn: Function returning number of equality constraints
Correlation Structure Estimation
Correlation patterns in clustered data are modeled using a matrix whitening approach based on restricted maximum likelihood (REML). The correlation structure is parameterized through the square root inverse matrix \mathbf{V}^{-1/2}.
Available Correlation Structures
- Exchangeable: Constant correlation within clusters
- Spatial Exponential: Exponential decay with distance
- AR(1): Correlation based on rank differences
- Gaussian/Squared Exponential: Smooth decay with squared distance
- Spherical: Polynomial decay with exact cutoff
- Matern: Flexible correlation with adjustable smoothness
- Gamma-Cosine: Combined gamma decay with oscillation
- Gaussian-Cosine: Combined Gaussian decay with oscillation
The REML objective function combines correlation structure, parameter estimates, and smoothness constraints:
\frac{1}{N}\big(-\log|\mathbf{V}^{-1/2}| + \frac{N}{2}\log(\sigma^2) + \frac{1}{2\sigma^2}(\mathbf{y} - \mathbf{X}\tilde{\boldsymbol{\beta}})^{T}\mathbf{V}^{-1}(\mathbf{y} - \mathbf{X}\tilde{\boldsymbol{\beta}}) + \frac{1}{2}\log|\sigma^2\mathbf{U}\mathbf{G}|\big)
Analytical gradients are provided for efficient optimization of correlation parameters.
Custom Correlation Functions
Custom correlation structures can be specified through:
- VhalfInv_fxn: Creates \mathbf{V}^{-1/2}
- Vhalf_fxn: Creates \mathbf{V}^{1/2} (for non-Gaussian responses)
- REML_grad: Provides analytical gradient
- VhalfInv_logdet: Efficient determinant computation
- custom_VhalfInv_loss: Alternative optimization objective
Penalty Construction and Optimization
Smoothing Spline Penalty
The penalty matrix \boldsymbol{\Lambda} is constructed as the integrated squared second derivative of the estimated function over the support of the predictors:
\int_a^b \|f''(t)\|^2 dt = \int_a^b \|\mathbf{x}''^\top \boldsymbol{\beta}_k\|^2 dt = \boldsymbol{\beta}_k^{T}\left\{\int_a^b \mathbf{x}''\mathbf{x}''^{\top} dt\right\} \boldsymbol{\beta}_k = \boldsymbol{\beta}_k^{T}\boldsymbol{\Lambda}_s\boldsymbol{\beta}_k
For a one-dimensional cubic function, the second derivative basis is \mathbf{x}'' = (0, 0, 2, 6t)^{T} and the penalty matrix has a closed-form expression:
\boldsymbol{\Lambda}_s = \begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 4(b-a) & 6(b^2-a^2) \\
0 & 0 & 6(b^2-a^2) & 12(b^3-a^3)
\end{pmatrix}
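This closed form is easy to verify numerically; a minimal sketch:
## Closed-form smoothing penalty for one cubic basis (1, t, t^2, t^3)
## over [a, b], from x'' = (0, 0, 2, 6t)
Lambda_s <- function(a, b) {
  M <- matrix(0, 4, 4)
  M[3, 3] <- 4 * (b - a)
  M[3, 4] <- M[4, 3] <- 6 * (b^2 - a^2)
  M[4, 4] <- 12 * (b^3 - a^3)
  M
}
Lambda_s(0, 1)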
The full penalty matrix combines the smoothing spline penalty with a ridge penalty on non-spline terms and optional predictor-specific or partition-specific penalties:
\boldsymbol{\Lambda} = \lambda_s (\boldsymbol{\Lambda}_s + \lambda_r\boldsymbol{\Lambda}_r + \sum_{l=1}^L \lambda_l \boldsymbol{\Lambda}_l)
where:
- \lambda_s is the global smoothing parameter (wiggle_penalty)
- \boldsymbol{\Lambda}_s is the smoothing spline penalty
- \boldsymbol{\Lambda}_r is a ridge penalty on lower-order terms (flat_ridge_penalty)
- \lambda_l are optional predictor/partition-specific penalties
This penalty structure defines a meaningful metric for function smoothness, pulling estimates toward linear functions rather than simply shrinking coefficients toward zero.
Penalty Optimization
A penalized variant of Generalized Cross Validation (GCV) is used to find optimal penalties:
\text{GCV} = \frac{\sum_{i=1}^N r_i^2}{N \big(1-\frac{1}{N}\text{trace}(\mathbf{X}\mathbf{U}\mathbf{G}\mathbf{X}^T) \big)^2} + \frac{mp}{2} \sum_{l=1}^{L}(\log (1+\lambda_{l})-1)^2
where:
- r_i = y_i - \tilde{y}_i are residuals (or a custom alternative for GLMs)
- \text{trace}(\mathbf{X}\mathbf{U}\mathbf{G}\mathbf{X}^T) represents effective degrees of freedom
- mp is the "meta-penalty" term that regularizes predictor/partition-specific penalties
For GLMs, a pseudo-count approach is used to ensure valid link transformations, or custom_dev.resids can be provided to replace the sum-of-squared errors.
Optimization employs:
- Grid search for initial values
- Damped BFGS with analytical gradients
- Automated restarts and error handling
- Inflation factor ((N+2)/(N-2))^2 for final penalties
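A literal transcription of the GCV criterion above (omitting the GLM pseudo-count machinery) is short; here edf stands for \text{trace}(\mathbf{X}\mathbf{U}\mathbf{G}\mathbf{X}^T), and lambda_l, mp are as above:
## Penalized GCV as defined above (illustrative helper, not package code)
penalized_gcv <- function(resid, edf, lambda_l = numeric(0), mp = 0) {
  N <- length(resid)
  sum(resid^2) / (N * (1 - edf / N)^2) +
    (mp / 2) * sum((log(1 + lambda_l) - 1)^2)
}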
References
Buse, A., & Lim, L. (1977). Cubic Splines as a Special Case of Restricted Least Squares. Journal of the American Statistical Association, 72, 64-68.
Craven, P., & Wahba, G. (1978). Smoothing noisy data with spline functions. Numerische Mathematik, 31, 377-403.
Eilers, P. H. C., & Marx, B. D. (1996). Flexible smoothing with B-splines and penalties. Statistical Science, 11, 89-121.
Ezhov, N., Neitzel, F., & Petrovic, S. (2018). Spline approximation, Part 1: Basic methodology. Journal of Applied Geodesy, 12, 139-155.
Kisi, O., Heddam, S., Parmar, K. S., Petroselli, A., Külls, C., & Zounemat-Kermani, M. (2025). Integration of Gaussian process regression and K means clustering for enhanced short term rainfall runoff modeling. Scientific Reports, 15, 7444.
Nocedal, J., & Wright, S. J. (2006). Numerical Optimization (2nd ed.). Springer.
Reinsch, C. H. (1967). Smoothing by spline functions. Numerische Mathematik, 10, 177-183.
Searle, S. R., Casella, G., & McCulloch, C. E. (2009). Variance Components. Wiley Series in Probability and Statistics. Wiley.
Silverman, B. W. (1984). Spline Smoothing: The Equivalent Variable Kernel Method. The Annals of Statistics, 12, 898-916.
Wahba, G. (1990). Spline Models for Observational Data. Society for Industrial and Applied Mathematics.
Wood, S. N. (2006). Generalized Additive Models: An Introduction with R. Chapman & Hall/CRC, Boca Raton.
Efficient Matrix Multiplication of G and A Matrices
Description
Efficient Matrix Multiplication of G and A Matrices
Usage
GAmult_wrapper(
G,
A,
K,
nc,
nca,
parallel,
cl,
chunk_size,
num_chunks,
rem_chunks
)
Arguments
G: List of G matrices
A: Constraint matrix
K: Number of partitions minus 1
nc: Number of columns per partition
nca: Number of constraint columns
parallel: Use parallel processing
cl: Cluster object
chunk_size: Size of parallel chunks
num_chunks: Number of chunks
rem_chunks: Remaining chunks
Details
Computes \textbf{G}\textbf{A}. Processes in parallel chunks if enabled. Avoids unnecessary operations.
Value
Matrix product \textbf{G}\textbf{A}
Finite-difference Gradient Computer
Description
Computes a finite-difference approximation of the gradient of function fn at the input arguments x
Usage
approx_grad(x, fn, eps = sqrt(.Machine$double.eps))
Arguments
x: Numeric vector of function arguments
fn: Function returning list(objective, gradient)
eps: Numeric scalar, finite difference tolerance
Details
Available for use within efficient_bfgs if needed externally; internally, however, this function is effectively unused, since stats::optim() is used instead whenever VhalfInv_grad is not supplied.
Value
Numeric vector of finite-difference approximated gradient
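For example, against an objective with a known gradient (assuming, per the argument description above, that approx_grad differences the objective component of fn):
fn <- function(x) list(sum(x^2), 2 * x)  # list(objective, gradient)
approx_grad(c(1, 2, 3), fn)              # approximately c(2, 4, 6)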
Matrix Inversion using Armadillo
Description
Computes the inverse of a matrix using Armadillo's inversion method
Usage
armaInv(x)
Arguments
x: Input matrix to be inverted
Value
Inverted matrix
Extract model coefficients
Description
Extracts polynomial coefficients for each partition from a fitted lgspline model.
Usage
## S3 method for class 'lgspline'
coef(object, ...)
Arguments
object: A fitted lgspline model object containing coefficient vectors.
...: Not used.
Details
For each partition, coefficients represent a polynomial expansion of the predictor(s) by column index, for example:
- intercept: Constant term
- _v_: Linear term
- _v_^2: Quadratic term
- _v_^3: Cubic term
- _v_x_w_: Interaction between v and w
If column/variable names are present, indices will be replaced with column/variable names.
Coefficients can be accessed either as separate vectors per partition or combined into a single matrix using Reduce('cbind', coef(model_fit)).
Value
A list where each element corresponds to a partition and contains a single-column matrix of coefficient values for that partition. Row names indicate the term type. Returns NULL if coefficients are not found in the object.
- partition1, partition2, ...: Matrices containing coefficients for each partition.
Examples
## Simulate some data and fit using default settings
set.seed(1234)
t <- runif(1000, -10, 10)
y <- 2*sin(t) + -0.06*t^2 + rnorm(length(t))
model_fit <- lgspline(t, y)
## Extract coefficients
coefficients <- coef(model_fit)
## Print coefficients for first partition
print(coefficients[[1]])
## Compare coefficients across all partitions
print(Reduce('cbind', coefficients))
Collapse Matrix List into a Single Block-Diagonal Matrix
Description
Transforms a list of matrices into a single block-diagonal matrix. This is especially useful for quadratic programming problems, where operating on the blocks separately may not be feasible.
Usage
collapse_block_diagonal(matlist)
Arguments
matlist: List of input matrices
Value
Block-diagonal matrix combining input matrices
Compute Eigenvalues and Related Matrices for G
Description
Compute Eigenvalues and Related Matrices for G
Usage
compute_G_eigen(
X_gram,
Lambda,
K,
parallel,
cl,
chunk_size,
num_chunks,
rem_chunks,
family,
unique_penalty_per_partition,
L_partition_list,
keep_G = TRUE,
shur_corrections
)
Arguments
X_gram: Gram matrix list
Lambda: Penalty matrix
K: Number of partitions minus 1
parallel: Use parallel processing
cl: Cluster object
chunk_size: Chunk size for parallel processing
num_chunks: Number of chunks
rem_chunks: Remaining chunks
family: GLM family
unique_penalty_per_partition: Use partition penalties
L_partition_list: Partition penalty list
keep_G: Return full G matrix (default TRUE)
shur_corrections: List of Schur complement corrections
Details
Computes \textbf{G}, \textbf{G}^{1/2}, and \textbf{G}^{-1/2} matrices via eigendecomposition of \textbf{X}^{T}\textbf{X} + \boldsymbol{\Lambda}_\text{effective} + \textbf{S}. Handles partition-specific penalties and parallel processing. For non-identity link functions, also returns \textbf{G}^{-1/2}.
Value
List containing combinations of:
- G: Full \textbf{G} matrix (if keep_G=TRUE)
- Ghalf: \textbf{G}^{1/2} matrix
- GhalfInv: \textbf{G}^{-1/2} matrix (for non-identity links)
Compute Component \textbf{G}^{1/2}\textbf{A}(\textbf{A}^{T}\textbf{G}\textbf{A})^{-1}\textbf{A}^{T}\textbf{G}\textbf{X}^{T}\textbf{y}
Description
Compute Component \textbf{G}^{1/2}\textbf{A}(\textbf{A}^{T}\textbf{G}\textbf{A})^{-1}\textbf{A}^{T}\textbf{G}\textbf{X}^{T}\textbf{y}
Usage
compute_GhalfXy_temp_wrapper(
G,
Ghalf,
A,
AGAInv,
Xy,
nc,
K,
parallel,
cl,
chunk_size,
num_chunks,
rem_chunks
)
Arguments
G: List of \textbf{G} matrices
Ghalf: List of \textbf{G}^{1/2} matrices
A: Constraint matrix
AGAInv: Inverse of \textbf{A}^{T}\textbf{G}\textbf{A}
Xy: List of \textbf{X}^{T}\textbf{y} products
nc: Number of columns
K: Number of partitions minus 1
parallel: Use parallel processing
cl: Cluster object
chunk_size: Size of parallel chunks
num_chunks: Number of chunks
rem_chunks: Remaining chunks
Details
Computes \textbf{G}^{1/2}\textbf{A}(\textbf{A}^{T}\textbf{G}\textbf{A})^{-1}\textbf{A}^{T}\textbf{G}\textbf{X}^{T}\textbf{y} efficiently in parallel chunks.
Note: the description differs slightly from the C++ helper functions called (e.g., 'compute_AGXy', 'compute_result_blocks'); this computes a component needed for the least-squares transformation involving \textbf{G}^{1/2}.
Returns both the result and the intermediate (\textbf{A}^{T}\textbf{G}\textbf{A})^{-1}\textbf{A}^{T}\textbf{G}\textbf{X}^{T}\textbf{y} product for reuse.
Value
List containing:
- Result vector
- AGAInvAGXy: intermediate product (\textbf{A}^{T}\textbf{G}\textbf{A})^{-1}\textbf{A}^{T}\textbf{G}\textbf{X}^{T}\textbf{y}
Construct Smoothing Spline Penalty Matrix
Description
Builds penalty matrix combining smoothing spline and ridge penalties with optional predictor/partition-specific components. Handles custom penalties and scaling.
Usage
compute_Lambda(
custom_penalty_mat,
L1,
wiggle_penalty,
flat_ridge_penalty,
K,
nc,
unique_penalty_per_predictor,
unique_penalty_per_partition,
penalty_vec,
colnm_expansions,
just_Lambda = TRUE
)
Arguments
custom_penalty_mat: Matrix; optional custom ridge penalty structure
L1: Matrix; integrated squared second derivative penalty
wiggle_penalty, flat_ridge_penalty: Numeric; smoothing and ridge penalty parameters
K: Integer; number of interior knots
nc: Integer; number of basis columns per partition
unique_penalty_per_predictor, unique_penalty_per_partition: Logical; enable predictor/partition-specific penalties
penalty_vec: Named numeric; custom penalty values for predictors/partitions
colnm_expansions: Character; column names for linking penalties to predictors
just_Lambda: Logical; return only combined penalty matrix (default TRUE)
Value
List containing:
- Lambda: Combined nc \times nc penalty matrix (\boldsymbol{\Lambda})
- L1: Smoothing spline penalty matrix (\textbf{L}_1)
- L2: Ridge penalty matrix (\textbf{L}_2)
- L_predictor_list: List of predictor-specific penalty matrices
- L_partition_list: List of partition-specific penalty matrices
If just_Lambda=TRUE and no partition penalties, returns only the Lambda matrix \boldsymbol{\Lambda}.
Compute Derivative of Penalty Matrix G with Respect to Lambda
Description
Calculates the derivative of the penalty matrix \textbf{G} with respect to the smoothing parameter lambda (\lambda), supporting both global and partition-specific penalties. This is related to the derivative of the diagonal weight matrix 1/(1+\textbf{x}^{T}\textbf{U}\textbf{G}\textbf{x}) w.r.t. the penalty.
Usage
compute_dG_dlambda(
G,
L,
K,
lambda,
unique_penalty_per_partition,
L_partition_list,
parallel,
cl,
chunk_size,
num_chunks,
rem_chunks
)
Arguments
G: A list of penalty matrices
L: The base penalty matrix
K: Number of partitions minus 1
lambda: Smoothing parameter value
unique_penalty_per_partition: Logical indicating partition-specific penalties
L_partition_list: Optional list of partition-specific penalty matrices
parallel: Logical to enable parallel processing
cl: Cluster object for parallel computation
chunk_size: Size of chunks for parallel processing
num_chunks: Number of chunks
rem_chunks: Remainder chunks
Value
A list of derivative matrices d\textbf{G}/d\lambda for each partition
Compute Derivative of \textbf{U}\textbf{G}\textbf{X}^{T}\textbf{y} with Respect to Lambda
Description
Compute Derivative of \textbf{U}\textbf{G}\textbf{X}^{T}\textbf{y} with Respect to Lambda
Usage
compute_dG_u_dlambda_xy(
AGAInv_AGXy,
AGAInv,
G,
A,
dG_dlambda,
nc,
nca,
K,
Xy,
Ghalf,
dGhalf,
GhalfXy_temp,
parallel,
cl,
chunk_size,
num_chunks,
rem_chunks
)
Arguments
AGAInv_AGXy: Product of (\textbf{A}^{T}\textbf{G}\textbf{A})^{-1} and \textbf{A}^{T}\textbf{G}\textbf{X}^{T}\textbf{y}
AGAInv: Inverse of \textbf{A}^{T}\textbf{G}\textbf{A}
G: List of \textbf{G} matrices
A: Constraint matrix
dG_dlambda: List of d\textbf{G}/d\lambda matrices
nc: Number of columns
nca: Number of constraint columns
K: Number of partitions minus 1
Xy: List of \textbf{X}^{T}\textbf{y} products
Ghalf: List of \textbf{G}^{1/2} matrices
dGhalf: List of d\textbf{G}^{1/2}/d\lambda matrices
GhalfXy_temp: Temporary storage for \textbf{G}^{1/2}\textbf{X}^{T}\textbf{y} components
parallel: Use parallel processing
cl: Cluster object
chunk_size: Size of parallel chunks
num_chunks: Number of chunks
rem_chunks: Remaining chunks
Details
Computes d(\textbf{U}\textbf{G}\textbf{X}^{T}\textbf{y})/d\lambda. Uses an efficient implementation avoiding large matrix construction. For large problems (K \ge 10, nc > 4), uses chunked parallel processing. For smaller problems, uses a simpler least squares approach based on \textbf{G}^{1/2}.
Value
Vector of derivatives
Compute Matrix Square Root Derivative
Description
Calculates d\textbf{G}^{1/2}/d\lambda matrices for each partition using eigendecomposition. Follows a similar approach to compute_G_eigen() but for matrix derivatives.
Usage
compute_dGhalf(
dG_dlambda,
nc,
K,
parallel,
cl,
chunk_size,
num_chunks,
rem_chunks
)
Arguments
dG_dlambda: List of d\textbf{G}/d\lambda matrices
nc: Integer; number of columns per partition
K: Integer; number of interior knots
parallel, cl, chunk_size, num_chunks, rem_chunks: Parallel computation parameters
Value
List of nc \times nc matrices containing d\textbf{G}_k^{1/2}/d\lambda for each partition k
Compute Derivative of Penalty Matrix G with Respect to Lambda (Wrapper)
Description
Wrapper function for computing the derivative of the weight matrix w.r.t. lambda (\lambda). This involves computing terms related to the derivative of 1/(1+\textbf{x}^{T}\textbf{U}\textbf{G}\textbf{x}).
Usage
compute_dW_dlambda_wrapper(
G,
A,
GXX,
Ghalf,
dG_dlambda,
dGhalf_dlambda,
AGAInv,
nc,
K,
parallel,
cl,
chunk_size,
num_chunks,
rem_chunks
)
Arguments
G: A list of penalty matrices
A: Constraint matrix
GXX: List of \textbf{G}\textbf{X}^{T}\textbf{X} products
Ghalf: List of \textbf{G}^{1/2} matrices
dG_dlambda: List of d\textbf{G}/d\lambda matrices
dGhalf_dlambda: List of d\textbf{G}^{1/2}/d\lambda matrices
AGAInv: Inverse of \textbf{A}^{T}\textbf{G}\textbf{A}
nc: Number of columns
K: Number of partitions minus 1
parallel: Logical to enable parallel processing
cl: Cluster object for parallel computation
chunk_size: Size of chunks for parallel processing
num_chunks: Number of chunks
rem_chunks: Remainder chunks
Value
Scalar value representing the trace derivative component.
Compute Gram Matrix for Block Diagonal Structure
Description
Compute Gram Matrix for Block Diagonal Structure
Usage
compute_gram_block_diagonal(
list_in,
parallel,
cl,
chunk_size,
num_chunks,
rem_chunks
)
Arguments
list_in: List of matrices
parallel: Use parallel processing
cl: Cluster object
chunk_size: Chunk size for parallel processing
num_chunks: Number of chunks
rem_chunks: Remaining chunks
Details
For a list of matrices, computes the Gram matrix of each element of the list.
Value
List of Gram matrices (\textbf{X}^{T}\textbf{X}) for each block
Calculate Trace of Matrix Product \text{trace}(\textbf{X}\textbf{U}\textbf{G}\textbf{X}^{T})
Description
Calculate Trace of Matrix Product \text{trace}(\textbf{X}\textbf{U}\textbf{G}\textbf{X}^{T})
Usage
compute_trace_UGXX_wrapper(
G,
A,
GXX,
AGAInv,
nc,
nca,
K,
parallel,
cl,
chunk_size,
num_chunks,
rem_chunks
)
Arguments
G: List of G matrices
A: Constraint matrix
GXX: List of \textbf{G}\textbf{X}^{T}\textbf{X} products
AGAInv: Inverse of \textbf{A}^{T}\textbf{G}\textbf{A}
nc: Number of columns
nca: Number of constraint columns
K: Number of partitions minus 1
parallel: Use parallel processing
cl: Cluster object
chunk_size: Size of parallel chunks
num_chunks: Number of chunks
rem_chunks: Remaining chunks
Details
Computes \text{trace}(\textbf{X}\textbf{U}\textbf{G}\textbf{X}^{T}) where \textbf{U} = \textbf{I} - \textbf{G}\textbf{A}(\textbf{A}^{T}\textbf{G}\textbf{A})^{-1}\textbf{A}^{T}. Handles parallel computation by splitting into chunks.
Value
Trace value
Create Block Diagonal Matrix
Description
Create Block Diagonal Matrix
Usage
create_block_diagonal(matrix_list)
Arguments
matrix_list: List of matrices to arrange diagonally
Details
Takes in a list of matrices, and returns a block-diagonal matrix with each element of the list as one block. All off-diagonal elements are 0. Matrices must have compatible dimensions.
Value
Block diagonal matrix with input matrices on diagonal
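A quick illustration of the documented behavior:
m1 <- diag(2)
m2 <- matrix(1, 2, 2)
create_block_diagonal(list(m1, m2))  # 4 x 4, m1 and m2 on the diagonal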
Create One-Hot Encoded Matrix
Description
Converts a categorical vector into a one-hot encoded matrix where each unique value becomes a binary column.
Usage
create_onehot(x)
Arguments
x: A vector containing categorical values (factors, character, etc.)
Details
The function creates dummy variables for each unique value in the input vector using model.matrix() with dummy-intercept coding. Column names are cleaned by removing the 'x' prefix added by model.matrix().
Value
A data frame containing the one-hot encoded binary columns with cleaned column names
Examples
## lgspline will not accept this format of "catvar", because inputting data
## this way can cause difficult-to-diagnose issues in formula parsing:
## all variables must be numeric
df <- data.frame(numvar = rnorm(100),
                 catvar = rep(LETTERS[1:4], 25))
print(head(df))
## Instead, replace with dummy-intercept coding by
# 1) applying one-hot encoding
# 2) dropping the first column
# 3) appending to our data
dummy_intercept_coding <- create_onehot(df$catvar)[,-1]
df$catvar <- NULL
df <- cbind(df, dummy_intercept_coding)
print(head(df))
Damped Newton-Raphson Parameter Optimization
Description
Performs iterative parameter estimation with adaptive step-size dampening. Internal function for fitting unconstrained GLM models using a damped Newton-Raphson optimization technique.
Usage
damped_newton_r(
parameters,
loglikelihood,
gradient,
neghessian,
tol = 1e-07,
max_cnt = 64,
max_dmp_steps = 16
)
Arguments
parameters: Initial parameter vector to be optimized
loglikelihood: Function computing log-likelihood for current parameters
gradient: Function computing parameter gradients
neghessian: Function computing negative Hessian matrix
tol: Numeric convergence tolerance (default 1e-7)
max_cnt: Maximum number of optimization iterations (default 64)
max_dmp_steps: Maximum damping step attempts (default 16)
Details
Implements a robust damped Newton-Raphson optimization algorithm.
Value
Optimized parameter estimates after convergence or reaching iteration limit
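In outline, the damping logic halves the Newton step until the log-likelihood no longer decreases; the following is a generic sketch of that scheme (not the internal implementation):
## Generic damped Newton-Raphson: halve the step up to max_dmp_steps times
damped_nr_sketch <- function(par, loglik, grad, neghess,
                             tol = 1e-7, max_cnt = 64, max_dmp_steps = 16) {
  for (i in seq_len(max_cnt)) {
    step <- solve(neghess(par), grad(par))  # full Newton direction
    ll0  <- loglik(par)
    damp <- 1
    cand <- par + step
    for (j in seq_len(max_dmp_steps)) {
      cand <- par + damp * step
      if (loglik(cand) >= ll0) break  # accept once no longer decreasing
      damp <- damp / 2
    }
    if (sum(abs(damp * step)) < tol) return(cand)
    par <- cand
  }
  par
}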
See Also
- nr_iterate for parameter update computation
BFGS Implementation for REML Parameter Estimation
Description
BFGS optimizer designed for REML optimization of correlation parameters. Combines function evaluation and gradient computation into single call to avoid redundant model refitting.
Usage
efficient_bfgs(par, fn, control = list())
Arguments
par: Numeric vector of initial parameter values.
fn: Function returning list(objective, gradient). Must return both objective value and gradient vector matching length(par).
control: List of control parameters.
Details
Implements BFGS, used internally by lgspline() for optimizing correlation parameters via REML when the argument for computing the gradient, VhalfInv_grad, is not NULL. This is more efficient than a standard BFGS call, since gradient and loss can be computed simultaneously, avoiding re-computing shared components in separate "fn" and "gr" calls.
Value
List containing:
- par: Parameter vector minimizing objective
- value: Minimum objective value
- counts: Number of iterations
- convergence: TRUE if converged within maxit
- message: Description of termination status
- vcov: Final approximation of inverse-Hessian, useful for inference
Examples
## Minimize Rosenbrock function
fn <- function(x) {
# Objective
f <- 100*(x[2] - x[1]^2)^2 + (1-x[1])^2
# Gradient
g <- c(-400*x[1]*(x[2] - x[1]^2) - 2*(1-x[1]),
200*(x[2] - x[1]^2))
list(f, g)
}
(res <- efficient_bfgs(c(0.5, 2.5), fn))
## Compare to
(res0 <- stats::optim(c(0.5, 2.5), function(x)fn(x)[[1]], hessian = TRUE))
solve(res0$hessian)
Efficient Matrix Multiplication
Description
Performs matrix multiplication using RcppArmadillo
Usage
efficient_matrix_mult(A, B)
Arguments
A: First input matrix
B: Second input matrix
Value
Matrix product of A and B
Generate Grid Indices Without expand.grid()
Description
Generate Grid Indices Without expand.grid()
Usage
expgrid(vec_list, indices)
Arguments
vec_list: List of vectors to combine
indices: Indices of combinations to return
Details
Returns selected combinations from the Cartesian product of vec_list without constructing the full expand.grid() result, for memory efficiency.
Value
Data frame of selected combinations
Find Extremum of Fitted Lagrangian Multiplier Smoothing Spline
Description
Finds global extrema of a fitted lgspline model using deterministic or stochastic optimization strategies. Supports custom objective functions for advanced applications like Bayesian optimization acquisition functions.
Usage
find_extremum(
object,
vars = NULL,
quick_heuristic = TRUE,
initial = NULL,
B_predict = NULL,
minimize = FALSE,
stochastic = FALSE,
stochastic_draw = function(mu, sigma, ...) {
N <- length(mu)
rnorm(N, mu,
sigma)
},
sigmasq_predict = object$sigmasq_tilde,
custom_objective_function = NULL,
custom_objective_derivative = NULL,
...
)
Arguments
object: A fitted lgspline model object containing partition information and fitted values
vars: Vector; numeric indices (or character variable names) of predictors to optimize over. If NULL (default), all predictors are optimized.
quick_heuristic: Logical; whether to search only the top-performing partition. When TRUE (default), optimizes within the best partition. When FALSE, initiates searches from all partition local maxima.
initial: Numeric vector; optional initial values for optimization. Useful for fixing binary predictors or providing starting points. Default NULL
B_predict: Matrix; optional custom coefficient list for prediction. Useful for posterior draws in Bayesian optimization. Default NULL
minimize: Logical; whether to find the minimum instead of the maximum. Default FALSE
stochastic: Logical; whether to add noise for stochastic optimization. Enables better exploration of the function space. Default FALSE
stochastic_draw: Function; generates random noise/modifies predictions for stochastic optimization, analogous to posterior_predictive_draw. Takes arguments mu, sigma, and ... (per the Usage above). Defaults to rnorm(length(mu), mu, sigma).
sigmasq_predict: Numeric; variance parameter for stochastic optimization. Controls the magnitude of random perturbations. Defaults to object$sigmasq_tilde
custom_objective_function: Function; optional custom objective function for optimization (see the Examples, where it takes mu, sigma, y_best, and ...). Default NULL
custom_objective_derivative: Function; optional gradient function for a custom optimization objective (see the Examples, where it takes mu, sigma, y_best, d_mu, and ...). Default NULL
...: Additional arguments passed to internal optimization routines.
Details
This method finds extrema (maxima or minima) of the fitted function or composite functions of the fit. The optimization process can be customized through several approaches:
- Partition-based search: either focuses on the top-performing partition (quick_heuristic = TRUE) or searches across all partition local maxima
- Stochastic optimization: adds random noise during optimization for better exploration
- Custom objectives: supports user-defined objective functions and gradients for specialized optimization tasks like Bayesian optimization
Value
A list containing the following components:
- t: Numeric vector of input values at the extremum.
- y: Numeric value of the objective function at the extremum.
See Also
lgspline for fitting the model, generate_posterior for generating posterior draws
Examples
## Basic usage with simulated data
set.seed(1234)
t <- runif(1000, -10, 10)
y <- 2*sin(t) + -0.06*t^2 + rnorm(length(t))
model_fit <- lgspline(t, y)
plot(model_fit)
## Find global maximum and minimum
max_point <- find_extremum(model_fit)
min_point <- find_extremum(model_fit, minimize = TRUE)
abline(v = max_point$t, col = 'blue') # Add maximum point
abline(v = min_point$t, col = 'red') # Add minimum point
## Advanced usage: custom objective functions
# expected improvement acquisition function
ei_custom_objective_function = function(mu, sigma, y_best, ...) {
d <- y_best - mu
d * pnorm(d/sigma) + sigma * dnorm(d/sigma)
}
# derivative of ei
ei_custom_objective_derivative = function(mu, sigma, y_best, d_mu, ...) {
d <- y_best - mu
z <- d/sigma
d_z <- -d_mu/sigma
pnorm(z)*d_mu - d*dnorm(z)*d_z + sigma*z*dnorm(z)*d_z
}
## Single iteration of Bayesian optimization
post_draw <- generate_posterior(model_fit)
acq <- find_extremum(model_fit,
stochastic = TRUE, # Enable stochastic exploration
B_predict = post_draw$post_draw_coefficients,
sigmasq_predict = post_draw$post_draw_sigmasq,
custom_objective_function = ei_custom_objective_function,
custom_objective_derivative = ei_custom_objective_derivative)
abline(v = acq$t, col = 'green') # Add acquisition point
Find Neighboring Cluster Partitions Using Midpoint Distance Criterion
Description
Identifies neighboring partitions by evaluating whether the midpoint between cluster centers is closer to those centers than to any other center.
Usage
find_neighbors(centers, parallel, cl, neighbor_tolerance)
Arguments
centers: Matrix; rows are cluster center coordinates
parallel: Logical; use parallel processing
cl: Cluster object for parallel execution
neighbor_tolerance: Numeric; scaling factor for distance comparisons
Value
List where element i contains indices of centers neighboring center i
Generate Posterior Samples from Fitted Lagrangian Multiplier Smoothing Spline
Description
Draws samples from the posterior distribution of model parameters and optionally generates posterior predictive samples. Uses Laplace approximation for non-Gaussian responses.
Usage
generate_posterior(
object,
new_sigmasq_tilde = object$sigmasq_tilde,
new_predictors = object$X[[1]],
theta_1 = 0,
theta_2 = 0,
posterior_predictive_draw = function(N, mean, sqrt_dispersion, ...) {
rnorm(N,
mean, sqrt_dispersion)
},
draw_dispersion = TRUE,
include_posterior_predictive = FALSE,
num_draws = 1,
...
)
Arguments
object: A fitted lgspline model object containing model parameters and fit statistics
new_sigmasq_tilde: Numeric; dispersion parameter for sampling. Controls variance of posterior draws. Default object$sigmasq_tilde
new_predictors: Matrix; new data matrix for posterior predictive sampling. Should match structure of original predictors. Default = predictors as input to lgspline
theta_1: Numeric; shape parameter for prior gamma distribution of inverse-dispersion. Default 0 implies uniform prior
theta_2: Numeric; rate parameter for prior gamma distribution of inverse-dispersion. Default 0 implies uniform prior
posterior_predictive_draw: Function; random number generator for posterior predictive samples. Takes arguments N, mean, sqrt_dispersion, and ... (per the Usage above).
draw_dispersion: Logical; whether to sample the dispersion parameter from its posterior distribution. When FALSE, uses point estimate. Default TRUE
include_posterior_predictive: Logical; whether to generate posterior predictive samples for new observations. Default FALSE
num_draws: Integer; number of posterior draws to generate. Default 1
...: Additional arguments passed to internal sampling routines.
Details
Implements posterior sampling using the following approach:
- Coefficient posterior: assumes sqrt(N)B ~ N(Btilde, sigma^2 UG)
- Dispersion parameter: sampled from an inverse-gamma distribution with user-specified prior parameters (theta_1, theta_2) and model-based sufficient statistics
- Posterior predictive: generated using a custom sampling function, defaulting to Gaussian for standard normal responses
For the dispersion parameter, the sampling process follows, for a fitted lgspline object "model_fit" (where unbias_dispersion is coerced to 1 if TRUE, 0 if FALSE):
shape <- theta_1 + 0.5 * (model_fit$N - model_fit$unbias_dispersion * model_fit$trace_XUGX)
rate <- theta_2 + 0.5 * (model_fit$N - model_fit$unbias_dispersion * model_fit$trace_XUGX) * new_sigmasq_tilde
post_draw_sigmasq <- 1/rgamma(1, shape, rate)
Users can modify sufficient statistics by adjusting theta_1 and theta_2 relative to the default model-based values.
Value
A list containing the following components:
- post_draw_coefficients: List of length num_draws containing posterior coefficient samples.
- post_draw_sigmasq: List of length num_draws containing posterior dispersion parameter samples (or repeated point estimate if draw_dispersion = FALSE).
- post_pred_draw: List of length num_draws containing posterior predictive samples (only if include_posterior_predictive = TRUE).
See Also
lgspline for model fitting, wald_univariate for Wald-type inference
Examples
## Generate example data
t <- runif(1000, -10, 10)
true_y <- 2*sin(t) + -0.06*t^2
y <- rnorm(length(true_y), true_y, 1)
## Fit model (using unstandardized expansions for consistent inference)
model_fit <- lgspline(t, y,
K = 7,
standardize_expansions_for_fitting = FALSE)
## Compare Wald (= t-intervals here) to Monte Carlo credible intervals
# Get Wald intervals
wald <- wald_univariate(model_fit,
cv = qt(0.975, df = model_fit$trace_XUGX))
wald_bounds <- cbind(wald[["interval_lb"]], wald[["interval_ub"]])
## Generate posterior samples (uniform prior)
post_draws <- generate_posterior(model_fit,
theta_1 = -1,
theta_2 = 0,
num_draws = 2000)
## Convert to matrix and compute credible intervals
post_mat <- Reduce('cbind',
lapply(post_draws$post_draw_coefficients,
function(x) Reduce("rbind", x)))
post_bounds <- t(apply(post_mat, 1, quantile, c(0.025, 0.975)))
## Compare intervals
print(round(cbind(wald_bounds, post_bounds), 4))
Compute Integrated Squared Second Derivative Penalty Matrix for Smoothing Splines
Description
Generates a penalty matrix representing the integrated squared second derivatives for smoothing spline basis functions, which controls the smoothness of the fitted curve.
Usage
get_2ndDerivPenalty(
colnm_expansions,
C,
power1_cols,
power2_cols,
power3_cols,
power4_cols,
interaction_single_cols,
interaction_quad_cols,
triplet_cols,
nc,
select_cols = NULL
)
Arguments
colnm_expansions: Character vector of column names for basis expansions
C: Numeric matrix of basis expansions
power1_cols: Indices of linear term columns
power2_cols: Indices of quadratic term columns
power3_cols: Indices of cubic term columns
power4_cols: Indices of quartic term columns
interaction_single_cols: Indices of single interaction columns
interaction_quad_cols: Indices of quadratic interaction columns
triplet_cols: Indices of triplet interaction columns
nc: Number of cubic expansions
select_cols: Optional vector of column indices to penalize (default: all linear terms)
Details
This function computes the analytic form of the traditional integrated squared second derivative penalty, evaluated over the bounds of the input data. If f(x) = \textbf{X}\boldsymbol{\beta}, then the penalty is based on \int \{ f''(x) \}^2 dx = \boldsymbol{\beta}^{T}(\int \textbf{X}''^{T}\textbf{X}'' dx)\boldsymbol{\beta}. This function computes the matrix \textbf{P} = \int \textbf{X}''^{T}\textbf{X}'' dx. When scaled by a non-negative scalar (wiggle penalty, predictor penalties and/or partition penalties), this becomes the smoothing spline penalty.
Value
A symmetric p \times p penalty matrix \textbf{P} representing integrated squared second derivatives for basis expansions in a single partition of the smoothing spline.
Wrapper for Smoothing Spline Penalty Computation
Description
Computes the smoothing spline penalty matrix with optional parallel processing. Calls get_2ndDerivPenalty after processing spline vs. non-spline terms and preparing for parallel computation if desired.
Usage
get_2ndDerivPenalty_wrapper(
K,
colnm_expansions,
C,
power1_cols,
power2_cols,
power3_cols,
power4_cols,
interaction_single_cols,
interaction_quad_cols,
triplet_cols,
nonspline_cols,
nc,
parallel,
cl
)
Arguments
K: Number of partitions
colnm_expansions: Column names of basis expansions
C: Basis expansion matrix of two rows, the first of all maximums and the second of all minimums, for all variables of interest
power1_cols: Linear term columns
power2_cols: Quadratic term columns
power3_cols: Cubic term columns
power4_cols: Quartic term columns
interaction_single_cols: Single interaction columns
interaction_quad_cols: Quadratic interaction columns
triplet_cols: Triplet interaction columns
nonspline_cols: Predictors not treated as spline effects
nc: Number of cubic expansions
parallel: Logical to enable parallel processing
cl: Cluster object for parallel computation
Value
A p \times p penalty matrix for smoothing spline regularization containing the elementwise sum of the integrated squared second derivative of the fitted function with respect to predictors of interest.
This function is exported for reference purposes - use at your own risk!
Get Constrained GLM Coefficient Estimates
Description
Core estimation function for Lagrangian multiplier smoothing splines. Computes coefficient estimates under smoothness constraints and penalties for GLMs, handling four distinct computational paths depending on model structure.
Usage
get_B(
X,
X_gram,
Lambda,
keep_weighted_Lambda,
unique_penalty_per_partition,
L_partition_list,
A,
Xy,
y,
K,
nc,
nca,
Ghalf,
GhalfInv,
parallel_eigen,
parallel_aga,
parallel_matmult,
parallel_unconstrained,
cl,
chunk_size,
num_chunks,
rem_chunks,
family,
unconstrained_fit_fxn,
iterate,
qp_score_function,
quadprog,
qp_Amat,
qp_bvec,
qp_meq,
prevB = NULL,
prevUnconB = NULL,
iter_count = 0,
prev_diff = Inf,
tol,
constraint_value_vectors,
order_list,
glm_weight_function,
shur_correction_function,
need_dispersion_for_estimation,
dispersion_function,
observation_weights,
homogenous_weights,
return_G_getB,
blockfit,
just_linear_without_interactions,
Vhalf,
VhalfInv,
...
)
Arguments
X: List of design matrices by partition
X_gram: List of Gram matrices by partition
Lambda: Combined penalty matrix (smoothing spline + ridge)
keep_weighted_Lambda: Logical; retain GLM weights in smoothing spline penalty
unique_penalty_per_partition: Logical; whether to use partition-specific penalties
L_partition_list: List of partition-specific penalty matrices
A: Matrix of smoothness constraints (continuity, differentiability)
Xy: List of X^T y products by partition
y: List of responses by partition
K: Integer; number of knots (partitions = K+1)
nc: Integer; number of coefficients per partition
nca: Integer; number of columns in constraint matrix
Ghalf: List of G^(1/2) matrices by partition
GhalfInv: List of G^(-1/2) matrices by partition
parallel_eigen, parallel_aga, parallel_matmult, parallel_unconstrained: Logical flags for parallel computation components
cl: Cluster object for parallel processing
chunk_size, num_chunks, rem_chunks: Parameters for chunking in parallel processing
family: GLM family object (includes link function and variance functions)
unconstrained_fit_fxn: Function for obtaining unconstrained estimates
iterate: Logical; whether to use iterative optimization for non-linear links
qp_score_function: Function; see description in the lgspline documentation
quadprog: Logical; whether to use quadratic programming for inequality constraints
qp_Amat, qp_bvec, qp_meq: Quadratic programming constraint parameters
prevB: List of previous coefficient estimates for warm starts
prevUnconB: List of previous unconstrained coefficient estimates
iter_count, prev_diff: Iteration tracking for convergence
tol: Numeric; convergence tolerance
constraint_value_vectors: List of additional constraint values
order_list: List of observation orderings by partition
glm_weight_function: Function for computing GLM weights
shur_correction_function: Function for uncertainty corrections in information matrix
need_dispersion_for_estimation: Logical; whether dispersion needs estimation
dispersion_function: Function for estimating dispersion parameter
observation_weights: Optional observation weights by partition
homogenous_weights: Logical; whether weights are constant
return_G_getB: Logical; whether to return G matrices with coefficient estimates
blockfit: Logical; whether to use block-fitting approach for special structure
just_linear_without_interactions: Vector of columns for non-spline effects
Vhalf, VhalfInv: Square root and inverse square root correlation matrices for GEE fitting
...: Additional arguments passed to fitting functions
Details
This function implements the method of Lagrangian multipliers for fitting constrained generalized linear models with smoothing splines. The function follows one of four main computational paths depending on the model structure:
1. Pure GEE (No Blockfitting): When Vhalf and VhalfInv are provided without blockfitting, the function uses generalized estimating equations to handle correlated data. The design matrix is arranged in block-diagonal form, and the estimation explicitly accounts for the correlation structure provided. Uses sequential quadratic programming (SQP) for optimization.
2. Blockfitting (With or Without Correlation): When blockfit=TRUE and linear-only terms are specified, the function separates spline effects from linear-only terms. The design matrix is restructured with spline terms in block-diagonal form and linear terms stacked together. This structure is particularly efficient when there are many non-spline effects. The function can handle both correlated (GEE) and uncorrelated data in this path.
3. Canonical Gaussian, No Correlation: For Gaussian family with identity link and no correlation structure, calculations are greatly simplified. No unconstrained fit function is needed; estimation uses direct matrix operations. This path takes advantage of the closed-form solution available for linear models with Gaussian errors.
4. GLMs, No Correlation: For non-Gaussian GLMs without correlation structure, the function requires an unconstrained_fit_fxn to obtain initial estimates. It may use iterative fitting for non-canonical links. This path first computes unconstrained estimates for each partition, then applies constraints using the method of Lagrangian multipliers.
All paths use efficient matrix decompositions and avoid explicitly constructing full matrices when possible. For non-canonical links, iterative fitting is employed to converge to the optimal solution.
Value
For 'return_G_getB = FALSE': A list of coefficient vectors by partition.
For 'return_G_getB = TRUE': A list with elements:
- B: List of coefficient vectors by partition
- G_list: List containing G matrices (covariance matrices), Ghalf (square root), and GhalfInv (inverse square root)
Efficiently Construct U Matrix
Description
Efficiently Construct U Matrix
Usage
get_U(G, A, K, nc, nca)
Arguments
G: List of G matrices
A: Constraint matrix
K: Number of partitions minus 1
nc: Number of columns per partition
nca: Number of constraint columns
Details
Computes \textbf{U} = \textbf{I} - \textbf{G}\textbf{A}(\textbf{A}^{T}\textbf{G}\textbf{A})^{-1}\textbf{A}^{T} efficiently, avoiding unnecessary multiplication of blocks of \textbf{G} with all-0 elements.
Value
\textbf{U} matrix for constraints
Get Centers for Partitioning
Description
Get Centers for Partitioning
Usage
get_centers(data, K, cluster_args, cluster_on_indicators)
Arguments
data: Matrix of predictor data
K: Number of partitions minus 1
cluster_args: List with custom centers and kmeans args
cluster_on_indicators: Include binary predictors in clustering
Details
Returns partition centers via:
1. Custom supplied centers, if provided as a valid K \times q matrix
2. kmeans clustering on all non-spline variables if cluster_on_indicators=TRUE
3. kmeans clustering excluding binary variables if cluster_on_indicators=FALSE
Value
Matrix of cluster centers
Generate Interaction Variable Patterns
Description
Generates all possible interaction patterns for 2 or 3 variables. This is used in part for identifying which interactions and expansions to exclude (provided to "exclude_these_expansions" argument of lgspline) based on formulas provided.
Usage
get_interaction_patterns(vars)
Arguments
vars: Character vector of variable names
Value
Character vector of interaction pattern strings
Generate Design Matrix with Polynomial and Interaction Terms
Description
Internal function for creating a design matrix containing polynomial expansions and interaction terms for predictor variables. Supports customizable term generation including polynomial degrees up to quartic terms, interaction types, and selective term exclusion.
Column names take on the form "_v_" for linear terms, "_v_^d" for polynomial powers up to d = 4, and "_v_x_w_" for interactions between variables v and w, where v and w are column indices of the input predictor matrix.
The custom_basis_fxn argument, if supplied, requires the same arguments as this function, in the same order, minus the eponymous argument "custom_basis_fxn".
Usage
get_polynomial_expansions(
predictors,
numerics,
just_linear_with_interactions,
just_linear_without_interactions,
exclude_interactions_for = NULL,
include_quadratic_terms = TRUE,
include_cubic_terms = TRUE,
include_quartic_terms = FALSE,
include_2way_interactions = TRUE,
include_3way_interactions = TRUE,
include_quadratic_interactions = FALSE,
exclude_these_expansions = NULL,
custom_basis_fxn = NULL,
...
)
Arguments
predictors |
Numeric matrix of predictor variables |
numerics |
Integer vector; column indices for variables to expand as polynomials |
just_linear_with_interactions |
Integer vector; column indices for variables to keep linear but allow interactions |
just_linear_without_interactions |
Integer vector; column indices for variables to keep linear without interactions |
exclude_interactions_for |
Integer vector; column indices to exclude from all interactions |
include_quadratic_terms |
Logical; whether to include squared terms (default TRUE) |
include_cubic_terms |
Logical; whether to include cubic terms (default TRUE) |
include_quartic_terms |
Logical; whether to include 4th degree terms (default FALSE) |
include_2way_interactions |
Logical; whether to include two-way interactions (default TRUE) |
include_3way_interactions |
Logical; whether to include three-way interactions (default TRUE) |
include_quadratic_interactions |
Logical; whether to include interactions with squared terms (default FALSE) |
exclude_these_expansions |
Character vector; names of specific terms to exclude from final matrix |
custom_basis_fxn |
Function; optional custom basis expansion function that accepts all arguments listed here except itself |
... |
Additional arguments passed to custom_basis_fxn |
Value
Matrix with columns for intercept, polynomial terms, and specified interactions
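Examples
## Minimal sketch: expand two numeric predictors and inspect the
## naming convention described above (arguments follow Usage; the
## exact columns depend on the include_* defaults)
X <- cbind(runif(25), runif(25))
expansions <- get_polynomial_expansions(X,
                                        numerics = c(1, 2),
                                        just_linear_with_interactions = NULL,
                                        just_linear_without_interactions = NULL)
colnames(expansions) # expect names like "_1_", "_1_^2", "_1_x_2_"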
Compute Gram Matrix
Description
Calculates X^T * X for the input matrix X
Usage
gramMatrix(X)
Arguments
X |
Input matrix |
Value
Gram matrix (X^T * X)
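Examples
## gramMatrix should agree with base R's crossprod up to numerical precision
X <- matrix(rnorm(20), 10, 2)
max(abs(gramMatrix(X) - crossprod(X)))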
Matrix Inversion with Fallback Methods
Description
Attempts matrix inversion using multiple methods, falling back to more robust approaches if standard inversion fails.
Usage
invert(mat, include_warnings = FALSE)
Arguments
mat |
Square matrix to invert |
include_warnings |
Logical; whether to issue warnings when fallback methods are used. Default FALSE. |
Details
Tries methods in order:
1. Direct inversion using armaInv()
2. Generalized inverse using eigendecomposition
3. Returns identity matrix with warning if both fail
For eigendecomposition, uses a small ridge penalty (1e-16) for stability and zeroes eigenvalues below machine precision.
Value
Inverted matrix or identity matrix if all methods fail
Examples
## Well-conditioned matrix
A <- matrix(c(4,2,2,4), 2, 2)
invert(A) %**% A
## Singular matrix falls back to Moore-Penrose generalized inverse
B <- matrix(c(1,1,1,1), 2, 2)
invert(B) %**% B
Test if Vector is Binary
Description
Test if Vector is Binary
Usage
is_binary(x)
Arguments
x |
Vector to test |
Value
Logical indicating if x has exactly 2 unique values
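Examples
is_binary(c(0, 1, 1, 0)) # TRUE, exactly 2 unique values
is_binary(1:3)           # FALSE, 3 unique values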
Expand Matrix into Partition Lists Based on Knot Boundaries
Description
Takes an input N \times p matrix of polynomial expansions and outputs a list of length K+1, isolating the rows of the input corresponding to each assigned partition.
Usage
knot_expand_list(partition_codes, partition_bounds, nr, mat, K)
Arguments
partition_codes |
Numeric vector; values determining partition assignment for each row |
partition_bounds |
Numeric vector; ordered knot locations defining partition boundaries |
nr |
Integer; number of rows in input matrix |
mat |
Numeric matrix; data to be partitioned |
K |
Integer; number of interior knots (resulting in K+1 partitions) |
Value
List of length K+1, each element containing the submatrix for that partition
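Examples
## Sketch under stated assumptions: split 12 rows into K+1 = 3
## partitions using stand-in partition codes and two ordered knot
## bounds (tertiles here); inputs follow the Usage signature
mat <- matrix(rnorm(24), 12, 2)
partition_codes <- mat[, 1]
partition_bounds <- quantile(partition_codes, c(1/3, 2/3))
expanded <- knot_expand_list(partition_codes, partition_bounds,
                             nr = 12, mat = mat, K = 2)
length(expanded) # K + 1 = 3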
Compute Leave-One-Out Cross-Validated Predictions for Gaussian Response/Identity Link under Constraint
Description
Computes the leave-one-out cross-validated predictions from a model fit, assuming Gaussian-distributed response with identity link.
Usage
leave_one_out(model_fit)
Arguments
model_fit |
A fitted Lagrangian smoothing spline model |
Value
A vector of leave-one-out cross-validated predictions
Examples
## Basic usage with Gaussian response, computing PRESS
set.seed(1234)
t <- rnorm(50)
y <- sin(t) + rnorm(50, 0, .25)
fit <- lgspline(t, y)
loo <- leave_one_out(fit)
press <- mean((y-loo)^2)
plot(loo, y,
main = "Leave-One-Out Cross-Validation Prediction vs. Observed Response",
xlab = 'Prediction', ylab = 'Response')
abline(0, 1)
Fit Lagrangian Multiplier Smoothing Splines
Description
A comprehensive software package for fitting a variant of smoothing splines as a constrained optimization problem, avoiding the need to algebraically disentangle a spline basis after fitting, and allowing for interpretable interactions and non-spline effects to be included.
lgspline
fits piecewise polynomial regression splines constrained to be smooth where
they meet, penalized by the integrated squared second derivative of the estimated function with respect to predictors.
The method of Lagrangian multipliers is used to enforce the following:
Equivalent fitted values at knots
Equivalent first derivatives at knots, with respect to predictors
Equivalent second derivatives at knots, with respect to predictors
The coefficients are penalized by an analytical form of the traditional cubic smoothing spline penalty, as well as tunable modifications that allow for unique penalization of multiple predictors and partitions.
This package supports model fitting for multiple spline and non-spline effects, GLM families, Weibull accelerated failure time (AFT) models, correlation structures, quadratic programming constraints, and extensive customization for user-defined models.
In addition, parallel processing capabilities and comprehensive tools for visualization and for both frequentist and Bayesian inference are provided.
Usage
lgspline(predictors = NULL, y = NULL, formula = NULL, response = NULL,
standardize_response = TRUE, standardize_predictors_for_knots = TRUE,
standardize_expansions_for_fitting = TRUE, family = gaussian(),
glm_weight_function = function(mu, y, order_indices, family, dispersion,
observation_weights, ...) {
if(any(!is.null(observation_weights))){
family$variance(mu) * observation_weights
} else {
family$variance(mu)
}
}, shur_correction_function = function(X, y, B, dispersion, order_list, K,
family, observation_weights, ...) {
lapply(1:(K+1), function(k) 0)
}, need_dispersion_for_estimation = FALSE,
dispersion_function = function(mu, y, order_indices, family,
observation_weights, ...) { 1 },
K = NULL, custom_knots = NULL, cluster_on_indicators = FALSE,
make_partition_list = NULL, previously_tuned_penalties = NULL,
smoothing_spline_penalty = NULL, opt = TRUE, use_custom_bfgs = TRUE,
delta = NULL, tol = 10*sqrt(.Machine$double.eps),
invsoftplus_initial_wiggle = c(-25, 20, -15, -10, -5),
invsoftplus_initial_flat = c(-14, -7), wiggle_penalty = 2e-07,
flat_ridge_penalty = 0.5, unique_penalty_per_partition = TRUE,
unique_penalty_per_predictor = TRUE, meta_penalty = 1e-08,
predictor_penalties = NULL, partition_penalties = NULL,
include_quadratic_terms = TRUE, include_cubic_terms = TRUE,
include_quartic_terms = NULL, include_2way_interactions = TRUE,
include_3way_interactions = TRUE, include_quadratic_interactions = FALSE,
offset = c(), just_linear_with_interactions = NULL,
just_linear_without_interactions = NULL, exclude_interactions_for = NULL,
exclude_these_expansions = NULL, custom_basis_fxn = NULL,
include_constrain_fitted = TRUE, include_constrain_first_deriv = TRUE,
include_constrain_second_deriv = TRUE,
include_constrain_interactions = TRUE, cl = NULL, chunk_size = NULL,
parallel_eigen = TRUE, parallel_trace = FALSE, parallel_aga = FALSE,
parallel_matmult = FALSE, parallel_unconstrained = TRUE,
parallel_find_neighbors = FALSE, parallel_penalty = FALSE,
parallel_make_constraint = FALSE,
unconstrained_fit_fxn = unconstrained_fit_default,
keep_weighted_Lambda = FALSE, iterate_tune = TRUE,
iterate_final_fit = TRUE, blockfit = FALSE,
qp_score_function = function(X, y, mu, order_list, dispersion,
VhalfInv, observation_weights, ...) {
if(!is.null(observation_weights)) {
crossprod(X, cbind((y - mu)*observation_weights))
} else {
crossprod(X, cbind(y - mu))
}
}, qp_observations = NULL, qp_Amat = NULL, qp_bvec = NULL, qp_meq = 0,
qp_positive_derivative = FALSE, qp_negative_derivative = FALSE,
qp_monotonic_increase = FALSE, qp_monotonic_decrease = FALSE,
qp_range_upper = NULL, qp_range_lower = NULL, qp_Amat_fxn = NULL,
qp_bvec_fxn = NULL, qp_meq_fxn = NULL, constraint_values = cbind(),
constraint_vectors = cbind(), return_G = TRUE, return_Ghalf = TRUE,
return_U = TRUE, estimate_dispersion = TRUE, unbias_dispersion = NULL,
return_varcovmat = TRUE, custom_penalty_mat = NULL,
cluster_args = c(custom_centers = NA, nstart = 10),
dummy_dividor = 1.2345672152894e-22,
dummy_adder = 2.234567210529e-18, verbose = FALSE,
verbose_tune = FALSE, expansions_only = FALSE,
observation_weights = NULL, do_not_cluster_on_these = c(),
neighbor_tolerance = 1 + 1e-08, null_constraint = NULL,
critical_value = qnorm(1 - 0.05/2), data = NULL, weights = NULL,
no_intercept = FALSE, correlation_id = NULL, spacetime = NULL,
correlation_structure = NULL, VhalfInv = NULL, Vhalf = NULL,
VhalfInv_fxn = NULL, Vhalf_fxn = NULL, VhalfInv_par_init = c(),
REML_grad = NULL, custom_VhalfInv_loss = NULL, VhalfInv_logdet = NULL,
include_warnings = TRUE, ...)
Arguments
predictors |
Default: NULL. Numeric matrix or data frame of predictor variables. Supports direct matrix input or formula interface when used with 'data' argument. Must contain numeric predictors, with categorical variables pre-converted to numeric indicators. |
y |
Default: NULL. Numeric response variable vector representing the target/outcome/dependent variable etc. to be modeled. |
formula |
Default: NULL. Optional statistical formula for model specification, serving as an alternative to direct matrix input. Supports standard R formula syntax with special 'spl()' function for defining spline terms. |
response |
Default: NULL. Alternative name for response variable, providing compatibility with different naming conventions. Takes precedence only if 'y' is not supplied. |
standardize_response |
Default: TRUE. Logical indicator controlling whether the response variable should be centered by mean and scaled by standard deviation before model fitting. When TRUE, tends to improve numerical stability. Only offered for identity link functions (when family$link == 'identity'). |
standardize_predictors_for_knots |
Default: TRUE. Logical flag determining whether predictor variables should be standardized before knot placement. Ensures consistent knot selection across different predictor scales, and that no one predictor dominates knot placement. For all expansions (x), standardization corresponds to dividing by the difference between the 69th and 31st percentiles: x / (quantile(x, 0.69) - quantile(x, 0.31)). |
standardize_expansions_for_fitting |
Default: TRUE. Logical switch to standardize polynomial basis expansions during model fitting. Provides computational stability during penalty tuning without affecting statistical inference, as design matrices, variance-covariance matrices, and coefficient estimates are systematically backtransformed after fitting to account for the standardization. If TRUE, |
family |
Default: gaussian(). Generalized linear model (GLM) distribution family specifying the error distribution and link function for model fitting. Defaults to Gaussian distribution with identity link. Supports custom family specifications, including user-defined link functions and optional custom tuning loss criteria. Minimally requires 1) family name (family) 2) link name (link) 3) linkfun (link function) 4) linkinv (link function inverse) 5) variance (mean variance relationship function, |
glm_weight_function |
Default: function that returns family$variance(mu) * observation_weights if weights exist, family$variance(mu) otherwise. Codes the mean-variance relationship of a GLM or GLM-like model, the diagonal |
shur_correction_function |
Default: function that returns list of zeros. Advanced function for computing Schur complements |
need_dispersion_for_estimation |
Default: FALSE. Logical indicator specifying whether a dispersion parameter is required for coefficient estimation. This is not needed for canonical regular exponential family models, but is often needed otherwise (such as fitting Weibull AFT models). |
dispersion_function |
Default: function that returns 1. Custom function for estimating the dispersion parameter. Unless |
K |
Default: NULL. Integer specifying the number of knot locations for spline partitions. This can intuitively be considered the total number of partitions - 1. |
custom_knots |
Default: NULL. Optional matrix providing user-specified knot locations in 1-D. |
cluster_on_indicators |
Default: FALSE. Logical flag determining whether indicator variables should be used for clustering knot locations. |
make_partition_list |
Default: NULL. Optional list allowing direct specification of custom partition assignments. The intent is that the make_partition_list returned by one model can be supplied here to keep the same knot locations for another. |
previously_tuned_penalties |
Default: NULL. Optional list of pre-computed penalty components from previous model fits. |
smoothing_spline_penalty |
Default: NULL. Optional custom smoothing spline penalty matrix for fine-tuned complexity control. |
opt |
Default: TRUE. Logical switch controlling whether model penalties should be automatically optimized via generalized cross-validation. Turn this off if previously_tuned_penalties are supplied AND desired, otherwise, the previously_tuned_penalties will be ignored. |
use_custom_bfgs |
Default: TRUE. Logical indicator selecting between a native implementation of damped-BFGS optimization method with analytical gradients or base R's BFGS implementation with finite-difference approximation of gradients. |
delta |
Default: NULL. Numeric pseudocount used for stabilizing optimization in non-identity link function scenarios. |
tol |
Default: 10*sqrt(.Machine$double.eps). Numeric convergence tolerance controlling the precision of optimization algorithms. |
invsoftplus_initial_wiggle |
Default: c(-25, 20, -15, -10, -5). Numeric vector of initial grid points for wiggle penalty optimization, specified on the inverse-softplus ( |
invsoftplus_initial_flat |
Default: c(-14, -7). Numeric vector of initial grid points for ridge penalty optimization, specified on the inverse-softplus ( |
wiggle_penalty |
Default: 2e-7. Numeric penalty controlling the integrated squared second derivative, governing function smoothness. Applied to both smoothing spline penalty (alone) and is multiplied by |
flat_ridge_penalty |
Default: 0.5. Numeric flat ridge penalty for additional regularization on only intercepts and linear terms (won't affect interactions or quadratic/cubic/quartic terms by default). If |
unique_penalty_per_partition |
Default: TRUE. Logical flag allowing the magnitude of the smoothing spline penalty to differ across partition. |
unique_penalty_per_predictor |
Default: TRUE. Logical flag allowing the magnitude of the smoothing spline penalty to differ between predictors. |
meta_penalty |
Default: 1e-8. Numeric "meta-penalty" applied to predictor and partition penalties during tuning. The minimization of GCV is modified to be a penalized minimization problem, with penalty |
predictor_penalties |
Default: NULL. Optional list of custom penalties specified per predictor. |
partition_penalties |
Default: NULL. Optional list of custom penalties specified per partition. |
include_quadratic_terms |
Default: TRUE. Logical switch to include squared predictor terms in basis expansions. |
include_cubic_terms |
Default: TRUE. Logical switch to include cubic predictor terms in basis expansions. |
include_quartic_terms |
Default: NULL. Logical switch to include quartic predictor terms in basis expansions. This is highly recommended for fitting models with multiple predictors to avoid over-specified constraints. When NULL (by default), will internally set to FALSE if only one predictor present, and TRUE otherwise. |
include_2way_interactions |
Default: TRUE. Logical switch to include linear two-way interactions between predictors. |
include_3way_interactions |
Default: TRUE. Logical switch to include three-way interactions between predictors. |
include_quadratic_interactions |
Default: FALSE. Logical switch to include linear-quadratic interaction terms. |
offset |
Default: Empty vector. When non-missing, this is a vector of column indices/names to include as offsets. |
just_linear_with_interactions |
Default: NULL. Integer vector specifying columns to retain linear terms with interactions. |
just_linear_without_interactions |
Default: NULL. Integer vector specifying columns to retain only linear terms without interactions. |
exclude_interactions_for |
Default: NULL. Integer vector indicating columns to exclude from all interaction terms. |
exclude_these_expansions |
Default: NULL. Character vector specifying basis expansions to be excluded from the model. These must be named columns of the data, or in the form "_1_", "_2_", "_1_x_2_", "_2_^2" etc. where "1" and "2" indicate column indices of predictor matrix input. |
custom_basis_fxn |
Default: NULL. Optional user-defined function for generating custom basis expansions. See |
include_constrain_fitted |
Default: TRUE. Logical switch to constrain fitted values at knot points. |
include_constrain_first_deriv |
Default: TRUE. Logical switch to constrain first derivatives at knot points. |
include_constrain_second_deriv |
Default: TRUE. Logical switch to constrain second derivatives at knot points. |
include_constrain_interactions |
Default: TRUE. Logical switch to constrain interaction terms at knot points. |
cl |
Default: NULL. Parallel processing cluster object for distributed computation (use |
chunk_size |
Default: NULL. Integer specifying custom fixed chunk size for parallel processing. |
parallel_eigen |
Default: TRUE. Logical flag to enable parallel processing for eigenvalue decomposition computations. |
parallel_trace |
Default: FALSE. Logical flag to enable parallel processing for trace computation. |
parallel_aga |
Default: FALSE. Logical flag to enable parallel processing for specific matrix operations involving G and A. |
parallel_matmult |
Default: FALSE. Logical flag to enable parallel processing for block-diagonal matrix multiplication. |
parallel_unconstrained |
Default: TRUE. Logical flag to enable parallel processing for unconstrained maximum likelihood estimation. |
parallel_find_neighbors |
Default: FALSE. Logical flag to enable parallel processing for neighbor identification (which partitions are neighbors). |
parallel_penalty |
Default: FALSE. Logical flag to enable parallel processing for penalty matrix construction. |
parallel_make_constraint |
Default: FALSE. Logical flag to enable parallel processing for constraint matrix generation. |
unconstrained_fit_fxn |
Default: |
keep_weighted_Lambda |
Default: FALSE. Logical flag to retain generalized linear model weights in penalty constraints using Tikhonov parameterization. It is advised to turn this to TRUE when fitting non-canonical GLMs. The default |
iterate_tune |
Default: TRUE. Logical switch to use iterative optimization during penalty tuning. If FALSE, |
iterate_final_fit |
Default: TRUE. Logical switch to use iterative optimization for final model fitting. If FALSE, |
blockfit |
Default: FALSE. Logical switch to abandon per-partition fitting for non-spline effects without interactions, collapse all matrices into block-diagonal single-matrix form, and fit agnostic to partition. This would be more efficient for many non-spline effects without interactions and relatively few spline effects or non-spline effects with interactions. Ignored if |
qp_score_function |
Default: |
qp_observations |
Default: NULL. Numeric vector of observations to apply constraints to for monotonic and range quadratic programming constraints. Useful for saving computational resources. |
qp_Amat |
Default: NULL. Constraint matrix for quadratic programming formulation. The |
qp_bvec |
Default: NULL. Constraint vector for quadratic programming formulation. The |
qp_meq |
Default: 0. Number of equality constraints in quadratic programming setup. The |
qp_positive_derivative , qp_monotonic_increase |
Default: FALSE. Logical flags to constrain the function to have positive first derivatives/be monotonically increasing using quadratic programming with respect to the order (ascending rows) of the input data set. |
qp_negative_derivative , qp_monotonic_decrease |
Default: FALSE. Logical flags to constrain the function to have negative first derivatives/be monotonically decreasing using quadratic programming with respect to the order (ascending rows) of the input data set. |
qp_range_upper |
Default: NULL. Numeric upper bound for constrained fitted values using quadratic programming. |
qp_range_lower |
Default: NULL. Numeric lower bound for constrained fitted values using quadratic programming. |
qp_Amat_fxn |
Default: NULL. Custom function for generating Amat matrix in quadratic programming. |
qp_bvec_fxn |
Default: NULL. Custom function for generating bvec vector in quadratic programming. |
qp_meq_fxn |
Default: NULL. Custom function for determining meq equality constraints in quadratic programming. |
constraint_values |
Default: |
constraint_vectors |
Default: |
return_G |
Default: TRUE. Logical switch to return the unscaled variance-covariance matrix without smoothing constraints ( |
return_Ghalf |
Default: TRUE. Logical switch to return the matrix square root of the unscaled variance-covariance matrix without smoothing constraints ( |
return_U |
Default: TRUE. Logical switch to return the constraint projection matrix |
estimate_dispersion |
Default: TRUE. Logical flag to estimate the dispersion parameter after fitting. |
unbias_dispersion |
Default NULL. Logical switch to multiply final dispersion estimates by N/(N - trace(XUGX^{T})). |
return_varcovmat |
Default: TRUE. Logical switch to return the variance-covariance matrix of the estimated coefficients. This is needed for performing Wald inference. |
custom_penalty_mat |
Default: NULL. Optional |
cluster_args |
Default: |
dummy_dividor |
Default: 1.2345672152894e-22. Small numeric constant to prevent division by zero in computational routines. |
dummy_adder |
Default: 2.234567210529e-18. Small numeric constant to prevent division by zero in computational routines. |
verbose |
Default: FALSE. Logical flag to print general progress messages during model fitting (does not include during tuning). |
verbose_tune |
Default: FALSE. Logical flag to print detailed progress messages during penalty tuning specifically. |
expansions_only |
Default: FALSE. Logical switch to return only basis expansions without full model fitting. Useful for setting up custom constraints and penalties. |
observation_weights |
Default: NULL. Numeric vector of observation-specific weights for generalized least squares estimation. |
do_not_cluster_on_these |
Default: c(). Vector specifying predictor columns to exclude from clustering procedures, in addition to the non-spline effects by default. |
neighbor_tolerance |
Default: 1 + 1e-8. Numeric tolerance for determining neighboring partitions using k-means clustering. Greater values mean more partitions are likely to be considered neighbors. Intended for internal use only (modify at your own risk!). |
null_constraint |
Default: NULL. Alternative parameterization of constraint values. |
critical_value |
Default: |
data |
Default: NULL. Optional data frame providing context for formula-based model specification. |
weights |
Default: NULL. Alternative name for observation weights, maintained for interface compatibility. |
no_intercept |
Default: FALSE. Logical flag to remove intercept, constraining it to 0. The function automatically constructs constraint_vectors and constraint_values to achieve this. Calling formulas with a "0+" in it like |
correlation_id , spacetime |
Default: NULL. N-length vector and N-row matrix of cluster (or subject, group etc.) ids and longitudinal/spatial variables respectively, whereby observations within each grouping of |
correlation_structure |
Default: NULL. Native implementations of popular variance-covariance structures. Offers options for "exchangeable", "spatial-exponential", "squared-exponential", "ar(1)", "spherical", "gaussian-cosine", "gamma-cosine", and "matern", along with their aliases. The eponymous correlation structure is fit along with coefficients and dispersion, with correlation estimated using a REML objective. See section "Correlation Structure Estimation" for more details. |
VhalfInv |
Default: NULL. Matrix representing a fixed, custom square-root-inverse covariance structure for the response variable of longitudinal and spatial modeling. Must be an |
Vhalf |
Default: NULL. Matrix representing a fixed, custom square-root covariance structure for the response variable of longitudinal and spatial modeling. Must be an |
VhalfInv_fxn |
Default: NULL. Function for parametric modeling of the covariance structure |
Vhalf_fxn |
Default: NULL. Function for efficient computation of |
VhalfInv_par_init |
Default: c(). Numeric vector of initial parameter values for |
REML_grad |
Default: NULL. Function for evaluating the gradient of the objective function (negative REML or custom loss) with respect to the parameters of |
custom_VhalfInv_loss |
Default: NULL. Alternative to negative REML for serving as the objective function for optimizing correlation parameters. Must take the same "par" argument as |
VhalfInv_logdet |
Default: NULL. Function for efficient computation of |
include_warnings |
Default: TRUE. Logical switch to control display of warning messages during model fitting. |
... |
Additional arguments passed to the unconstrained model fitting function. |
Details
A flexible and interpretable implementation of smoothing splines including:
Multiple predictors and interaction terms
Various GLM families and link functions
Correlation structures for longitudinal/clustered data
Shape constraints via quadratic programming
Parallel computation for large datasets
Comprehensive inference tools
Value
A list containing the following components:
- y
Original response vector.
- ytilde
Fitted/predicted values on the scale of the response.
- X
List of design matrices \textbf{X}_k for each partition k, containing basis expansions including intercept, linear, quadratic, cubic, and interaction terms as specified.
- A
Constraint matrix \textbf{A} encoding smoothness constraints at knot points and any user-specified linear constraints.
- B
List of fitted coefficients \boldsymbol{\beta}_k for each partition k on the original, unstandardized scale of the predictors and response.
- B_raw
List of fitted coefficients for each partition on the predictor-and-response standardized scale.
- K
Number of interior knots with one predictor (number of partitions minus 1 with > 1 predictor).
- p
Number of basis expansions of predictors per partition.
- q
Number of predictor variables.
- P
Total number of coefficients (p \times (K+1)).
- N
Number of observations.
- penalties
List containing optimized penalty matrices and components:
Lambda: Combined penalty matrix (\boldsymbol{\Lambda}), includes \textbf{L}_{\text{predictor\_list}} contributions but not \textbf{L}_{\text{partition\_list}}.
L1: Smoothing spline penalty matrix (\textbf{L}_1).
L2: Ridge penalty matrix (\textbf{L}_2).
L predictor list: Predictor-specific penalty matrices (\textbf{L}_{\text{predictor\_list}}).
L partition list: Partition-specific penalty matrices (\textbf{L}_{\text{partition\_list}}).
- knot_scale_transf
Function for transforming predictors to standardized scale used for knot placement.
- knot_scale_inv_transf
Function for transforming standardized predictors back to original scale.
- knots
Matrix of knot locations on the original unstandardized predictor scale for one predictor.
- partition_codes
Vector assigning observations to partitions.
- partition_bounds
Vector or matrix specifying the boundaries between partitions.
- knot_expand_function
Internal function for expanding data according to partition structure.
- predict
Function for generating predictions on new data.
- assign_partition
Function for assigning new observations to partitions.
- family
GLM family object specifying the error distribution and link function.
- estimate_dispersion
Logical indicating whether dispersion parameter was estimated.
- unbias_dispersion
Logical indicating whether dispersion estimates should be unbiased.
- backtransform_coefficients
Function for converting standardized coefficients to original scale.
- forwtransform_coefficients
Function for converting coefficients to standardized scale.
- mean_y, sd_y
Mean and standard deviation of response if standardized.
- og_order
Original ordering of observations before partitioning.
- order_list
List containing observation indices for each partition.
- constraint_values, constraint_vectors
Matrices specifying linear equality constraints if provided.
- make_partition_list
List containing partition information for > 1-D cases.
- expansion_scales
Vector of scaling factors used for standardizing basis expansions.
- take_derivative, take_interaction_2ndderivative
Functions for computing derivatives of basis expansions.
- get_all_derivatives_insample
Function for computing all derivatives on training data.
- numerics
Indices of numeric predictors used in basis expansions.
- power1_cols, power2_cols, power3_cols, power4_cols
Column indices for linear through quartic terms.
- quad_cols
Column indices for all quadratic terms (including interactions).
- interaction_single_cols, interaction_quad_cols
Column indices for linear-linear and linear-quadratic interactions.
- triplet_cols
Column indices for three-way interactions.
- nonspline_cols
Column indices for terms excluded from spline expansion.
- return_varcovmat
Logical indicating whether variance-covariance matrix was computed.
- raw_expansion_names
Names of basis expansion terms.
- std_X, unstd_X
Functions for standardizing/unstandardizing design matrices.
- parallel_cluster_supplied
Logical indicating whether a parallel cluster was supplied.
- weights
List of observation weights per partition.
- G
List of unscaled unconstrained variance-covariance matrices \textbf{G}_k per partition k if return_G=TRUE. Computed as (\textbf{X}_k^T\textbf{X}_k + \boldsymbol{\Lambda}_\text{eff})^{-1} for partition k.
- Ghalf
List of \textbf{G}_k^{1/2} matrices if return_Ghalf=TRUE.
- U
Constraint projection matrix \textbf{U} if return_U=TRUE. For K=0 and no constraints, returns identity. Otherwise, returns \textbf{U} = \textbf{I} - \textbf{G}\textbf{A}(\textbf{A}^T\textbf{G}\textbf{A})^{-1}\textbf{A}^T. Used for computing the variance-covariance matrix \sigma^2 \textbf{U}\textbf{G}.
- sigmasq_tilde
Estimated (or fixed) dispersion parameter \tilde{\sigma}^2.
- trace_XUGX
Effective degrees of freedom (\text{trace}(\textbf{X}\textbf{U}\textbf{G}\textbf{X}^{T})).
- varcovmat
Variance-covariance matrix of coefficient estimates if return_varcovmat=TRUE.
- VhalfInv
The \textbf{V}^{-1/2} matrix used for implementing correlation structures, if specified.
- VhalfInv_fxn, Vhalf_fxn, VhalfInv_logdet, REML_grad
Functions for generating \textbf{V}^{-1/2}, \textbf{V}^{1/2}, \log|\textbf{V}^{-1/2}|, and the gradient of REML, if provided.
- VhalfInv_params_estimates
Vector of estimated correlation parameters when using VhalfInv_fxn.
- VhalfInv_params_vcov
Approximate variance-covariance matrix of estimated correlation parameters from BFGS optimization.
- wald_univariate
Function for computing univariate Wald statistics and confidence intervals.
- critical_value
Critical value used for confidence interval construction.
- generate_posterior
Function for drawing from the posterior distribution of coefficients.
- find_extremum
Function for optimizing the fitted function.
- plot
Function for visualizing fitted curves.
- quadprog_list
List containing quadratic programming components if applicable.
When expansions_only=TRUE is used, a reduced list is returned containing only the following prior to any fitting or tuning:
- X
Design matrices \textbf{X}_k
- y
Response vectors \textbf{y}_k
- A
Constraint matrix \textbf{A}
- penalties
Penalty matrices
- order_list, og_order
Ordering information
- expansion_scales, colnm_expansions
Scaling and naming information
- K, knots
Knot information
- make_partition_list, partition_codes, partition_bounds
Partition information
- constraint_vectors, constraint_values
Constraint information
- quadprog_list
Quadratic programming components if applicable
The returned object has class "lgspline" and provides comprehensive tools for
model interpretation, inference, prediction, and visualization. All
coefficients and predictions can be transformed between standardized and
original scales using the provided transformation functions. The object includes
both frequentist and Bayesian inference capabilities through Wald statistics
and posterior sampling. Advanced customization options are available for
analyzing arbitrarily complex study designs.
See Details for descriptions of the model fitting process.
See Also
- solve.QP for quadratic programming optimization
- plot_ly for interactive plotting
- kmeans for k-means clustering
- optim for general purpose optimization routines
Examples
## ## ## ## Simple Examples ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ##
## Simulate some data, fit using default settings, and plot
set.seed(1234)
t <- runif(2500, -10, 10)
y <- 2*sin(t) + -0.06*t^2 + rnorm(length(t))
model_fit <- lgspline(t, y)
plot(t, y, main = 'Observed Data vs. Fitted Function, Colored by Partition')
plot(model_fit, add = TRUE)
## Repeat using logistic regression, with univariate inference shown
# and alternative function call
y <- rbinom(length(y), 1, 1/(1+exp(-std(y))))
df <- data.frame(t = t, y = y)
model_fit <- lgspline(y ~ spl(t),
df,
opt = FALSE, # no tuning penalties
family = quasibinomial())
plot(t, y, main = 'Observed Data vs Fitted Function with Formulas and Derivatives',
ylim = c(-0.5, 1.05), cex.main = 0.8)
plot(model_fit,
show_formulas = TRUE,
text_size_formula = 0.65,
legend_pos = 'bottomleft',
legend_args = list(y.intersp = 1.1),
add = TRUE)
## Notice how the coefficients match the formula, and expansions are
# homogeneous across partitions without reparameterization
print(summary(model_fit))
## Overlay first and second derivatives of fitted function respectively
derivs <- predict(model_fit,
new_predictors = sort(t),
take_first_derivatives = TRUE,
take_second_derivatives = TRUE)
points(sort(t), derivs$first_deriv, col = 'gold', type = 'l')
points(sort(t), derivs$second_deriv, col = 'goldenrod', type = 'l')
legend('bottomright',
col = c('gold','goldenrod'),
lty = 1,
legend = c('First Derivative', 'Second Derivative'))
## Simple 2D example - including a non-spline effect
z <- seq(-2, 2, length.out = length(y))
df <- data.frame(Predictor1 = t,
Predictor2 = z,
Response = sin(y)+0.1*z)
model_fit <- lgspline(Response ~ spl(Predictor1) + Predictor1*Predictor2,
df)
## Notice, while spline effects change over partitions,
# interactions and non-spline effects are constrained to remain the same
coefficients <- Reduce('cbind', coef(model_fit))
colnames(coefficients) <- paste0('Partition ', 1:(model_fit$K+1))
print(coefficients)
## One or two variables can be selected for plotting at a time
# even when >= 3 predictors are present
plot(model_fit,
custom_title = 'Marginal Relationship of Predictor 1 and Response',
vars = 'Predictor1',
custom_response_lab = 'Response',
show_formulas = TRUE,
legend_pos = 'bottomright',
digits = 4,
text_size_formula = 0.5)
## 3D plots are implemented as well, retaining analytical formulas
my_plot <- plot(model_fit,
show_formulas = TRUE,
custom_response_lab = 'Response')
my_plot
## ## ## ## More Detailed 1D Example ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ##
## 1D data generating functions
t <- seq(-9, 9, length.out = 1000)
slinky <- function(x) {
(50 * cos(x * 2) -2 * x^2 + (0.25 * x)^4 + 80)
}
coil <- function(x) {
(100 * cos(x * 2) +-1.5 * x^2 + (0.1 * x)^4 +
(0.05 * x^3) + (-0.01 * x^5) +
(0.00002 * x^6) -(0.000001 * x^7) + 100)
}
exponential_log <- function(x) {
unlist(c(sapply(x, function(xx) {
if (xx <= 1) {
100 * (exp(xx) - exp(1))
} else {
100 * (log(xx))
}
})))
}
scaled_abs_gamma <- function(x) {
2*sqrt(gamma(abs(x)))
}
## Composite function
fxn <- function(x)(slinky(x) +
coil(x) +
exponential_log(x) +
scaled_abs_gamma(x))
## Bind together with random noise
dat <- cbind(t, fxn(t) + rnorm(length(t), 0, 50))
colnames(dat) <- c('t', 'y')
x <- dat[,'t']
y <- dat[,'y']
## Fit Model, 4 equivalent ways are shown below
model_fit <- lgspline(t, y)
model_fit <- lgspline(y ~ spl(t), as.data.frame(dat))
model_fit <- lgspline(response = y, predictors = t)
model_fit <- lgspline(data = as.data.frame(dat), formula = y ~ .)
# This is not valid: lgspline(y ~ ., t)
# This is not valid: lgspline(y, data = as.data.frame(dat))
# Do not put operations in formulas, not valid: lgspline(y ~ log(t) + spl(t))
## Basic Functionality
predict(model_fit, new_predictors = rnorm(1)) # make prediction on new data
head(leave_one_out(model_fit)) # leave-one-out cross-validated predictions
coef(model_fit) # extract coefficients
summary(model_fit) # model information and Wald inference
generate_posterior(model_fit) # generate draws of parameters from posterior distribution
find_extremum(model_fit, minimize = TRUE) # find the minimum of the fitted function
## Incorporate range constraints, custom knots, keep penalization identical
# across partitions
model_fit <- lgspline(y ~ spl(t),
unique_penalty_per_partition = FALSE,
custom_knots = cbind(c(-2, -1, 0, 1, 2)),
data = data.frame(t = t, y = y),
qp_range_lower = -150,
qp_range_upper = 150)
## Plotting the constraints and knots
plot(model_fit,
custom_title = 'Fitted Function Constrained to Lie Between (-150, 150)',
cex.main = 0.75)
# knot locations
abline(v = model_fit$knots)
# lower bound from quadratic program
abline(h = -150, lty = 2)
# upper bound from quadratic program
abline(h = 150, lty = 2)
# observed data
points(t, y, cex = 0.24)
## Enforce monotonic increasing constraints on fitted values
# K = 4 => 5 partitions
t <- seq(-10, 10, length.out = 100)
y <- 5*sin(t) + t + 2*rnorm(length(t))
model_fit <- lgspline(t,
y,
K = 4,
qp_monotonic_increase = TRUE)
plot(t, y, main = 'Monotonic Increasing Function with Respect to Fitted Values')
plot(model_fit,
add = TRUE,
show_formulas = TRUE,
legend_pos = 'bottomright',
custom_predictor_lab = 't',
custom_response_lab = 'y')
## ## ## ## 2D Example using Volcano Dataset ## ## ## ## ## ## ## ## ## ## ## ## ## ## ##
## Prep
data('volcano')
volcano_long <-
Reduce('rbind', lapply(1:nrow(volcano), function(i){
t(sapply(1:ncol(volcano), function(j){
c(i, j, volcano[i,j])
}))
}))
colnames(volcano_long) <- c('Length', 'Width', 'Height')
## Fit, with 50 partitions
# When fitting with > 1 predictor and large K, including quartic terms
# is highly recommended, and/or dropping the second-derivative constraint.
# Otherwise, the constraints can impose all partitions to be equal, with one
# cubic function fit for all (there aren't enough degrees of freedom to fit
# unique cubic functions due to the massive number of constraints).
# Below, quartic terms are included and the constraint of second-derivative
# smoothness at knots is ignored.
model_fit <- lgspline(volcano_long[,c(1, 2)],
volcano_long[,3],
include_quadratic_interactions = TRUE,
K = 49,
opt = FALSE,
return_U = FALSE,
return_varcovmat = FALSE,
estimate_dispersion = TRUE,
return_Ghalf = FALSE,
return_G = FALSE,
include_constrain_second_deriv = FALSE,
unique_penalty_per_predictor = FALSE,
unique_penalty_per_partition = FALSE,
wiggle_penalty = 1e-5, # the fixed wiggle penalty
flat_ridge_penalty = 1e-4) # the ridge penalty
## Plotting on new data with interactive visual + formulas
new_input <- expand.grid(seq(min(volcano_long[,1]),
max(volcano_long[,1]),
length.out = 250),
seq(min(volcano_long[,2]),
max(volcano_long[,2]),
length.out = 250))
model_fit$plot(new_predictors = new_input,
show_formulas = TRUE,
custom_response_lab = "Height",
custom_title = 'Volcano 3-D Map',
digits = 2)
## ## ## ## Advanced Techniques using Trees Dataset ## ## ## ## ## ## ## ## ## ## ## ## ##
## Goal here is to introduce how lgspline works with non-canonical GLMs and
# demonstrate some custom features
data('trees')
## L1-regularization constraint function on standardized coefficients
# Bound all coefficients to be less than a certain value (l1_bound) in absolute
# magnitude such that | B^{(j)}_k | < lambda for all j = 1....p coefficients,
# and k = 1...K+1 partitions.
l1_constraint_matrix <- function(p, K) {
## Total number of coefficients
P <- p * (K + 1)
## Create diagonal matrices for L1 constraint
# First matrix: B > -bound
# Second matrix: -B > -bound
first_diag <- diag(P)
second_diag <- -diag(P)
## Combine matrices
l1_Amat <- cbind(first_diag, second_diag)
return(l1_Amat)
}
## Bounds absolute value of coefficients to be < l1_bound
l1_bound_vector <- function(qp_Amat,
scales,
l1_bound) {
## Combine matrices
l1_bvec <- rep(-l1_bound, ncol(qp_Amat)) * c(1, scales)
return(l1_bvec)
}
## Fit model, using predictor-response formulation, assuming
# Gamma-distributed response, and custom quadratic-programming constraints,
# with qp_score_function/glm_weight_function updated for non-canonical GLMs
# as well as quartic terms, keeping the effect of height constant across
# partitions, and 3 partitions in total. Hence, this is an advanced-usage
# case.
# You can modify this code for performing l1-regularization in general.
# For canonical GLMs, the default qp_score_function/glm_weight_function are
# correct and do not need to be changed
# (custom functionality is not needed for canonical GLMs).
model_fit <- lgspline(
Volume ~ spl(Girth) + Height*Girth,
data = with(trees, cbind(Girth, Height, Volume)),
family = Gamma(link = 'log'),
keep_weighted_Lambda = TRUE,
glm_weight_function = function(
mu,
y,
order_indices,
family,
dispersion,
observation_weights,
...){
rep(1/dispersion, length(y))
},
dispersion_function = function(
mu,
y,
order_indices,
family,
observation_weights,
...){
mean(
mu^2/((y-mu)^2)
)
}, # = biased estimate of 1/shape parameter
need_dispersion_for_estimation = TRUE,
unbias_dispersion = TRUE, # multiply dispersion by N/(N-trace(XUGX^{T}))
K = 2, # 3 partitions
opt = FALSE, # keep penalties fixed
unique_penalty_per_partition = FALSE,
unique_penalty_per_predictor = FALSE,
flat_ridge_penalty = 1e-64,
wiggle_penalty = 1e-64,
qp_score_function = function(X, y, mu, order_list, dispersion, VhalfInv,
observation_weights, ...){
t(X) %**% diag(c(1/mu * 1/dispersion)) %**% cbind(y - mu)
}, # updated score for gamma regression with log link
qp_Amat_fxn = function(N, p, K, X, colnm, scales, deriv_fxn, ...) {
l1_constraint_matrix(p, K)
},
qp_bvec_fxn = function(qp_Amat, N, p, K, X, colnm, scales, deriv_fxn, ...) {
l1_bound_vector(qp_Amat, scales, 25)
},
qp_meq_fxn = function(qp_Amat, N, p, K, X, colnm, scales, deriv_fxn, ...) 0
)
## Notice, interaction effect is constant across partitions as is the effect
# of Height alone
Reduce('cbind', coef(model_fit))
print(summary(model_fit))
## Plot results
plot(model_fit, custom_predictor_lab1 = 'Girth',
custom_predictor_lab2 = 'Height',
custom_response_lab = 'Volume',
custom_title = 'Girth and Height Predicting Volume of Trees',
show_formulas = TRUE)
## Verify magnitude of unstandardized coefficients does not exceed bound (25)
print(max(abs(unlist(model_fit$B))))
## Find height and girth where tree volume is closest to 42
# Uses custom objective that minimizes MSE discrepancy between predicted
# value and 42.
# The vanilla find_extremum function can be thought of as
# using "function(mu)mu" aka the identity function as the
# objective, where mu = "f(t)", our estimated function. The derivative is then
# d_mu = "f'(t)" with respect to predictors t.
# But with more creative objectives, and since we have machinery for
# f'(t) already available, we can compute gradients for (and optimize)
# arbitrary differentiable functions of our predictors too.
# For any objective, differentiate w.r.t. to mu, then multiply by d_mu to
# satisfy chain rule.
# Here, we have objective function: 0.5*(mu-42)^2
# and gradient : (mu-42)*d_mu
# and L-BFGS-B will be used to find the height and girth that most closely
# yields a prediction of 42 within the bounds of the observed data.
# The d_mu also takes into account link function transforms automatically
# for most common link functions, and will return warning + instructions
# on how to program the link-function derivatives otherwise.
## Custom acquisition functions for Bayesian optimization could be coded here.
find_extremum(
model_fit,
minimize = TRUE,
custom_objective_function = function(mu, sigma, ybest, ...){
0.5*(mu - 42)^2
},
custom_objective_derivative = function(mu, sigma, ybest, d_mu, ...){
(mu - 42) * d_mu
}
)
## ## ## ## How to Use Formulas in lgspline ## ## ## ## ## ## ## ## ## ## ## ## ## ## ##
## Demonstrates splines with multiple mixed predictors and interactions
## Generate data
n <- 2500
x <- rnorm(n)
y <- rnorm(n)
z <- sin(x)*mean(abs(y))/2
## Categorical predictors
cat1 <- rbinom(n, 1, 0.5)
cat2 <- rbinom(n, 1, 0.5)
cat3 <- rbinom(n, 1, 0.5)
## Response with mix of effects
response <- y + z + 0.1*(2*cat1 - 1)
## Continuous predictors re-named
continuous1 <- x
continuous2 <- z
## Combine data
dat <- data.frame(
response = response,
continuous1 = continuous1,
continuous2 = continuous2,
cat1 = cat1,
cat2 = cat2,
cat3 = cat3
)
## Example 1: Basic Model with Default Terms, No Intercept
# standardize_response = FALSE often needed when constraining intercepts to 0
fit1 <- lgspline(
formula = response ~ 0 + spl(continuous1, continuous2) +
cat1*cat2*continuous1 + cat3,
K = 2,
standardize_response = FALSE,
data = dat
)
## Examine coefficients included
rownames(fit1$B$partition1)
## Verify intercept term is near 0 up to some numeric tolerance
abs(fit1$B[[1]][1]) < 1e-8
## Example 2: Similar Model with Intercept, Other Terms Excluded
fit2 <- lgspline(
formula = response ~ spl(continuous1, continuous2) +
cat1*cat2*continuous1 + cat3,
K = 1,
standardize_response = FALSE,
include_cubic_terms = FALSE,
exclude_these_expansions = c( # Not all need to actually be present
'_batman_x_robin_',
'_3_x_4_', # no cat1 x cat2 interaction, coded using column indices
'continuous1xcontinuous2', # no continuous1 x continuous2 interaction
'thejoker'
),
data = dat
)
## Examine coefficients included
rownames(Reduce('cbind',coef(fit2)))
# Intercept will probably be present and non-0 now
abs(fit2$B[[1]][1]) < 1e-8
## ## ## ## Compare Inference to survreg for Weibull AFT Model Validation ##
# Only linear predictors, no knots, no penalties, using Weibull AFT Model
# The goal here is to ensure that for the special case of no spline effects
# and no knots, this implementation will be consistent with other model
# implementations.
# Also note that when using models (like Weibull AFT) where dispersion is
# being estimated and is required for estimating beta coefficients,
# we use a Schur complement correction function to adjust (or "correct") our
# variance-covariance matrix for both estimation and inference to account for
# uncertainty in estimating the dispersion.
# Typically the shur_correction_function would return a negative-definite
# matrix, as its output is elementwise added to the information matrix prior
# to inversion.
require(survival)
df <- data.frame(na.omit(
pbc[,c('time','trt','stage','hepato','bili','age','status')]
))
## Weibull AFT using lgspline, showing how some custom options can be used to
# fit more complicated models
model_fit <- lgspline(time ~ trt + stage + hepato + bili + age,
df,
family = weibull_family(),
need_dispersion_for_estimation = TRUE,
dispersion_function = weibull_dispersion_function,
glm_weight_function = weibull_glm_weight_function,
shur_correction_function = weibull_shur_correction,
unconstrained_fit_fxn = unconstrained_fit_weibull,
opt = FALSE,
wiggle_penalty = 0,
flat_ridge_penalty = 0,
K = 0,
status = pbc$status!=0)
print(summary(model_fit))
## Survreg results match closely on estimates and inference for coefficients
survreg_fit <- survreg(Surv(time, status!=0) ~ trt + stage + hepato + bili + age,
df)
print(summary(survreg_fit))
## sigmasq_tilde = scale^2 of survreg
print(c(sqrt(model_fit$sigmasq_tilde), survreg_fit$scale))
## ## ## ## Modelling Correlation Structures ## ## ## ## ## ## ## ## ## ## ## ## ## ## ##
## Setup
n_blocks <- 150 # Number of correlation_ids (subjects)
block_size <- 5 # Size of each correlation_id block (number of repeated measures per subj.)
N <- n_blocks * block_size # total sample size (balanced here)
rho_true <- 0.25 # True correlation
## Generate predictors and mean structure
t <- seq(-9, 9, length.out = N)
true_mean <- sin(t)
## Create block compound symmetric errors = I(1-p) + Jp
errors <- Reduce('rbind',
lapply(1:n_blocks,
function(i){
sigma <- diag(block_size) + rho_true *
(matrix(1, block_size, block_size) -
diag(block_size))
matsqrt(sigma) %*% rnorm(block_size)
}))
## Generate response with correlated errors
y <- true_mean + errors * 0.5
## Fit model with correlation structure
# include_warnings = FALSE is a good idea here, since many proposed
# correlations won't work
model_fit <- lgspline(t,
y,
K = 4,
correlation_id = rep(1:n_blocks, each = block_size),
correlation_structure = 'exchangeable',
include_warnings = FALSE
)
## Assess overall fit
plot(t, y, main = 'Sinusoidal Fit Under Correlation Structure')
plot(model_fit, add = TRUE, show_formulas = TRUE, custom_predictor_lab = 't')
## Compare estimated vs true correlation
rho_est <- tanh(model_fit$VhalfInv_params_estimates)
print(c("True correlation:" = rho_true,
"Estimated correlation:" = rho_est))
## Quantify uncertainty in correlation estimate with 95% confidence interval
se <- c(sqrt(diag(model_fit$VhalfInv_params_vcov))) / sqrt(model_fit$N)
ci <- tanh(model_fit$VhalfInv_params_estimates + c(-1.96, 1.96)*se)
print("95% CI for correlation:")
print(ci)
## Also check SD (should be close to 0.5)
print(sqrt(model_fit$sigmasq_tilde))
## Toeplitz Simulation Setup, with demonstration of custom functions
# and boilerplate. Toeplitz is not implemented by default, because it makes
# strong assumptions on the study design and missingness that are rarely met,
# with non-obvious workarounds.
# If a GLM was to-be-fit, you'd also submit a function "Vhalf_fxn" analogous
# to VhalfInv_fxn with same argument (par) and an output of an N x N matrix
# that yields the inverse of VhalfInv_fxn output.
n_blocks <- 150 # Number of correlation_ids
block_size <- 4 # Observations per correlation_id
N <- n_blocks * block_size # total sample size
rho_true <- 0.5 # True correlation within correlation_ids
true_intercept <- 2 # True intercept
true_slope <- 0.5 # True slope for covariate
## Create design matrix with meaningful predictors
Tmat <- matrix(0, N, 2)
Tmat[,1] <- 1 # Intercept
Tmat[,2] <- cos(rnorm(N)) # Continuous predictor
## True coefficients
beta <- c(true_intercept, true_slope)
## Create time and correlation_id variables
time_var <- rep(1:block_size, n_blocks)
correlation_id_var <- rep(1:n_blocks, each = block_size)
## Create block compound symmetric errors
errors <- Reduce('rbind',
lapply(1:n_blocks, function(i) {
sigma <- diag(block_size) + rho_true *
(matrix(1, block_size, block_size) -
diag(block_size))
matsqrt(sigma) %*% rnorm(block_size)
}))
## Generate response with correlated errors and covariate effect
y <- Tmat %*% beta + errors * 2
## Toeplitz correlation function
VhalfInv_fxn <- function(par) {
# Initialize correlation matrix
corr <- matrix(0, block_size, block_size)
# Construct Toeplitz matrix with correlation by lag
for(i in 1:block_size) {
for(j in 1:block_size) {
lag <- abs(time_var[i] - time_var[j])
if(lag == 0) {
corr[i,j] <- 1
} else if(lag <= length(par)) {
# Use tanh to bound correlations between -1 and 1
corr[i,j] <- tanh(par[lag])
}
}
}
## Matrix square root inverse
corr_inv_sqrt <- matinvsqrt(corr)
## Expand to full matrix using Kronecker product
kronecker(diag(n_blocks), corr_inv_sqrt)
}
## Determinant function (for efficiency)
# This avoids taking determinant of N by N matrix
VhalfInv_logdet <- function(par) {
# Initialize correlation matrix
corr <- matrix(0, block_size, block_size)
# Construct Toeplitz matrix
for(i in 1:block_size) {
for(j in 1:block_size) {
lag <- abs(time_var[i] - time_var[j])
if(lag == 0) {
corr[i,j] <- 1
} else if(lag <= length(par)) {
corr[i,j] <- tanh(par[lag])
}
}
}
# Compute log determinant
log_det_invsqrt_corr <- -0.5 * determinant(corr, logarithm=TRUE)$modulus[1]
return(n_blocks * log_det_invsqrt_corr)
}
## REML gradient function
REML_grad <- function(par, model_fit, ...) {
## Initialize gradient vector
n_par <- length(par)
gradient <- numeric(n_par)
## Get dimensions and organize data
nr <- nrow(model_fit$X[[1]])
## Process derivatives one parameter at a time
for(p in 1:n_par) {
## Initialize derivative matrix
dV <- matrix(0, nrow(model_fit$VhalfInv), ncol(model_fit$VhalfInv))
V <- matrix(0, nrow(model_fit$VhalfInv), ncol(model_fit$VhalfInv))
## Compute full correlation matrix and its derivative for parameter p
for(clust in unique(correlation_id_var)) {
inds <- which(correlation_id_var == clust)
block_size <- length(inds)
## Initialize block matrices
V_block <- matrix(0, block_size, block_size)
dV_block <- matrix(0, block_size, block_size)
## Construct Toeplitz matrix and its derivative
for(i in 1:block_size) {
for(j in 1:block_size) {
## Compute lag between observations
lag <- abs(time_var[i] - time_var[j])
## Diagonal is always 1
if(i == j) {
V_block[i,j] <- 1
dV_block[i,j] <- 0
} else {
## Correlation for off-diagonal depends on lag
if(lag <= length(par)) {
## Correlation via tanh parameterization
V_block[i,j] <- tanh(par[lag])
## Derivative for the relevant parameter
if(lag == p) {
## Chain rule for tanh: d/dx tanh(x) = 1 - tanh^2(x)
dV_block[i,j] <- 1 - tanh(par[p])^2
}
}
}
}
}
## Assign blocks to full matrices
V[inds, inds] <- V_block
dV[inds, inds] <- dV_block
}
## GLM Weights based on current model fit (all 1s for normal)
glm_weights <- rep(1, model_fit$N)
## Quadratic form contribution
resid <- model_fit$y - model_fit$ytilde
VinvResid <- model_fit$VhalfInv %**% cbind(resid) / glm_weights
quad_term <- -0.5 * ((t(VinvResid) %**% dV) %**% VinvResid) /
model_fit$sigmasq_tilde
## Log|V| contribution - trace term
trace_term <- 0.5 * sum(diag(model_fit$VhalfInv %**%
model_fit$VhalfInv %**%
dV))
## Information matrix contribution
U <- t(t(model_fit$U) * rep(c(1, model_fit$expansion_scales),
model_fit$K + 1)) /
model_fit$sd_y
VhalfInvX <- model_fit$VhalfInv %**%
collapse_block_diagonal(model_fit$X)[unlist(
model_fit$og_order
),] %**%
U
## Lambda computation for GLMs
if(length(model_fit$penalties$L_partition_list) != (model_fit$K + 1)){
model_fit$penalties$L_partition_list <- lapply(
1:(model_fit$K + 1), function(k)0
)
}
Lambda <- U %**% collapse_block_diagonal(
lapply(1:(model_fit$K + 1),
function(k)
c(1, model_fit$expansion_scales) * (
model_fit$penalties$L_partition_list[[k]] +
model_fit$penalties$Lambda) %**%
diag(c(1, model_fit$expansion_scales)) /
model_fit$sd_y^2
)
) %**% t(U)
XVinvX_inv <- invert(gramMatrix(t(t(VhalfInvX)*c(glm_weights))) +
Lambda)
VInvX <- model_fit$VhalfInv %**% VhalfInvX
sc <- sqrt(norm(VInvX, '2'))
VInvX <- VInvX/sc
dXVinvX <-
(XVinvX_inv %**% t(VInvX)) %**%
(dV %**% VInvX)
XVinvX_term <- -0.5 * colSums(cbind(c(diag(dXVinvX) * sc))) * sc
## Store gradient component (all three terms)
gradient[p] <- as.numeric(quad_term + trace_term + XVinvX_term)
}
## Return normalized gradient
return(gradient / model_fit$N)
}
## Visualization
plot(time_var, y, col = correlation_id_var,
main = "Simulated Data with Toeplitz Correlation")
## Fit model with custom Toeplitz (takes ~ 5-10 minutes on my laptop)
# Note, for GLMs, efficiency would be improved by supplying a Vhalf_fxn
# although strictly only VhalfInv_fxn and VhalfInv_par_init are needed
model_fit <- lgspline(
response = y,
predictors = Tmat[,2],
K = 4,
VhalfInv_fxn = VhalfInv_fxn,
VhalfInv_logdet = VhalfInv_logdet,
REML_grad = REML_grad,
VhalfInv_par_init = c(0, 0, 0),
include_warnings = FALSE
)
## Print comparison of true and estimated correlations
rho_true <- rep(0.5, 3)
rho_est <- tanh(model_fit$VhalfInv_params_estimates)
cat("Correlation Estimates:\n")
print(data.frame(
"True Correlation" = rho_true,
"Estimated Correlation" = rho_est
))
## Estimated residual standard deviation; should be ~ 2
print(sqrt(model_fit$sigmasq_tilde))
## ## ## ## Parallelism ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ##
## Generate simulated data
a <- runif(500000, -9, 9)
b <- runif(500000, -9, 9)
c <- rnorm(500000)
d <- rpois(500000, 1)
y <- sin(a) + cos(b) - 0.2*sqrt(a^2 + b^2) +
abs(a) + b +
0.5*(a^2 + b^2) +
(1/6)*(a^3 + b^3) +
a*b*c -
c +
d +
rnorm(500000, 0, 5)
## Set up cores
cl <- parallel::makeCluster(1)
## (on.exit() only takes effect inside functions; the cluster is stopped
## explicitly with parallel::stopCluster() below)
## This example shows some options for what operations can be parallelized
# By default, only parallel_eigen and parallel_unconstrained are TRUE
# G, G^{-1/2}, and G^{1/2} are computed in parallel across each of the
# K+1 partitions.
# However, parallel_unconstrained only affects GLMs without corr. components
# - it does not affect fitting here
system.time({
parfit <- lgspline(y ~ spl(a, b) + a*b*c + d,
data = data.frame(y = y,
a = a,
b = b,
c = c,
d = d),
cl = cl,
K = 1,
parallel_eigen = TRUE,
parallel_unconstrained = TRUE,
parallel_aga = FALSE,
parallel_find_neighbors = FALSE,
parallel_trace = FALSE,
parallel_matmult = FALSE,
parallel_make_constraint = FALSE,
parallel_penalty = FALSE)
})
parallel::stopCluster(cl)
print(summary(parfit))
lgspline: Lagrangian Multiplier Smoothing Splines
Description
Allows for common S3 methods including print, summary, coef, plot, and predict, with additional inference methods provided.
Details
This package implements various methods for working with lgspline models, including printing, summarizing, plotting, and conducting statistical inference.
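For orientation, here is a minimal sketch of that S3 workflow on simulated data (variable names are illustrative, not from the package):
## Minimal sketch of the S3 interface (hypothetical data)
set.seed(1)
t <- runif(200, -5, 5)
y <- sin(t) + rnorm(200, 0, 0.5)
fit <- lgspline(t, y)   # fit with default settings
print(fit)              # concise overview of the fitted model
summary(fit)            # dimensions, dispersion, coefficient table
head(coef(fit))         # estimated coefficients
preds <- predict(fit)   # in-sample fitted values
plot(fit)               # visualize the fitted function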
Low-Level Fitting for Lagrangian Smoothing Splines
Description
The core function for fitting Lagrangian smoothing splines, with fewer user-facing conveniences than lgspline.
Usage
lgspline.fit(predictors, y = NULL, standardize_response = TRUE,
standardize_predictors_for_knots = TRUE,
standardize_expansions_for_fitting = TRUE, family = gaussian(),
glm_weight_function = function(mu, y, order_indices, family,
dispersion, observation_weights,
...) {
if(any(!is.null(observation_weights))){
family$variance(mu) * observation_weights
} else {
family$variance(mu)
}
},
shur_correction_function = function(X, y, B, dispersion, order_list,
K, family, observation_weights,
...) {
lapply(1:(K+1), function(k) 0)
},
need_dispersion_for_estimation = FALSE,
dispersion_function = function(mu, y, order_indices, family,
observation_weights, ...) { 1 },
K = NULL, custom_knots = NULL, cluster_on_indicators = FALSE,
make_partition_list = NULL, previously_tuned_penalties = NULL,
smoothing_spline_penalty = NULL, opt = TRUE, use_custom_bfgs = TRUE,
delta = NULL, tol = 10*sqrt(.Machine$double.eps),
invsoftplus_initial_wiggle = c(-25, 20, -15, -10, -5),
invsoftplus_initial_flat = c(-14, -7), wiggle_penalty = 2e-07,
flat_ridge_penalty = 0.5, unique_penalty_per_partition = TRUE,
unique_penalty_per_predictor = TRUE, meta_penalty = 1e-08,
predictor_penalties = NULL, partition_penalties = NULL,
include_quadratic_terms = TRUE, include_cubic_terms = TRUE,
include_quartic_terms = FALSE, include_2way_interactions = TRUE,
include_3way_interactions = TRUE,
include_quadratic_interactions = FALSE,
offset = c(), just_linear_with_interactions = NULL,
just_linear_without_interactions = NULL,
exclude_interactions_for = NULL,
exclude_these_expansions = NULL, custom_basis_fxn = NULL,
include_constrain_fitted = TRUE,
include_constrain_first_deriv = TRUE,
include_constrain_second_deriv = TRUE,
include_constrain_interactions = TRUE, cl = NULL, chunk_size = NULL,
parallel_eigen = TRUE, parallel_trace = FALSE, parallel_aga = FALSE,
parallel_matmult = FALSE, parallel_unconstrained = FALSE,
parallel_find_neighbors = FALSE, parallel_penalty = FALSE,
parallel_make_constraint = FALSE,
unconstrained_fit_fxn = unconstrained_fit_default,
keep_weighted_Lambda = FALSE, iterate_tune = TRUE,
iterate_final_fit = TRUE, blockfit = FALSE,
qp_score_function = function(X, y, mu, order_list, dispersion,
VhalfInv, observation_weights, ...) {
if(!is.null(observation_weights)) {
crossprod(X, cbind((y - mu)*observation_weights))
} else {
crossprod(X, cbind(y - mu))
}
},
qp_observations = NULL, qp_Amat = NULL, qp_bvec = NULL, qp_meq = 0,
qp_positive_derivative = FALSE, qp_negative_derivative = FALSE,
qp_monotonic_increase = FALSE, qp_monotonic_decrease = FALSE,
qp_range_upper = NULL, qp_range_lower = NULL, qp_Amat_fxn = NULL,
qp_bvec_fxn = NULL, qp_meq_fxn = NULL, constraint_values = cbind(),
constraint_vectors = cbind(), return_G = TRUE, return_Ghalf = TRUE,
return_U = TRUE, estimate_dispersion = TRUE,
unbias_dispersion = TRUE,
return_varcovmat = TRUE, custom_penalty_mat = NULL,
cluster_args = c(custom_centers = NA, nstart = 10),
dummy_dividor = 1.2345672152894e-22,
dummy_adder = 2.234567210529e-18, verbose = FALSE,
verbose_tune = FALSE, expansions_only = FALSE,
observation_weights = NULL, do_not_cluster_on_these = c(),
neighbor_tolerance = 1 + 1e-16, no_intercept = FALSE,
VhalfInv = NULL, Vhalf = NULL, include_warnings = TRUE, ...)
Arguments
predictors |
Default: NULL. Numeric matrix or data frame of predictor variables. Supports direct matrix input or formula interface when used with 'data' argument. Must contain numeric predictors, with categorical variables pre-converted to numeric indicators. |
y |
Default: NULL. Numeric response variable vector representing the target/outcome/dependent variable etc. to be modeled. |
standardize_response |
Default: TRUE. Logical indicator controlling whether the response variable should be centered by mean and scaled by standard deviation before model fitting. When TRUE, tends to improve numerical stability. Only offered for identity link functions (when family$link == 'identity') |
standardize_predictors_for_knots |
Default: TRUE. Logical flag determining whether predictor variables should be standardized before knot placement. Ensures consistent knot selection across different predictor scales, and that no one predictor dominates knot placement. For all expansions x, standardization corresponds to dividing by the difference between the 69th and 31st percentiles: x / (quantile(x, 0.69) - quantile(x, 0.31)). |
standardize_expansions_for_fitting |
Default: TRUE. Logical switch to standardize polynomial basis expansions during model fitting. Provides computational stability during penalty tuning without affecting statistical inference, as design matrices, variance-covariance matrices, and coefficient estimates are systematically backtransformed after fitting to account for the standardization. |
family |
Default: gaussian(). Generalized linear model (GLM) distribution family specifying the error distribution and link function for model fitting. Defaults to Gaussian distribution with identity link. Supports custom family specifications, including user-defined link functions and optional custom tuning loss criteria. Minimally requires 1) family name (family), 2) link name (link), 3) linkfun (link function), 4) linkinv (link function inverse), and 5) variance (the mean-variance relationship function). |
glm_weight_function |
Default: function that returns family$variance(mu) * observation_weights if weights exist, family$variance(mu) otherwise. Codes the mean-variance relationship of a GLM or GLM-like model, supplying the diagonal of the GLM weight matrix used in fitting. |
shur_correction_function |
Default: function that returns a list of zeros. Advanced function for computing Schur complement corrections to the information matrix, for models where coefficient estimates are not independent of other estimated parameters (such as dispersion). |
need_dispersion_for_estimation |
Default: FALSE. Logical indicator specifying whether a dispersion parameter is required for coefficient estimation. This is not needed for canonical regular exponential family models, but is often needed otherwise (such as fitting Weibull AFT models). |
dispersion_function |
Default: function that returns 1. Custom function for estimating the dispersion parameter; unless need_dispersion_for_estimation is TRUE, it is not used. |
K |
Default: NULL. Integer specifying the number of knot locations for spline partitions. This can intuitively be considered the total number of partitions - 1. |
custom_knots |
Default: NULL. Optional matrix providing user-specified knot locations in 1-D. |
cluster_on_indicators |
Default: FALSE. Logical flag determining whether indicator variables should be used for clustering knot locations. |
make_partition_list |
Default: NULL. Optional list allowing direct specification of custom partition assignments. The intent is that the make_partition_list returned by one model can be supplied here to keep the same knot locations for another. |
previously_tuned_penalties |
Default: NULL. Optional list of pre-computed penalty components from previous model fits. |
smoothing_spline_penalty |
Default: NULL. Optional custom smoothing spline penalty matrix for fine-tuned complexity control. |
opt |
Default: TRUE. Logical switch controlling whether model penalties should be automatically optimized via generalized cross-validation. Turn this off if previously_tuned_penalties are supplied AND desired, otherwise, the previously_tuned_penalties will be ignored. |
use_custom_bfgs |
Default: TRUE. Logical indicator selecting between a native implementation of damped-BFGS optimization method with analytical gradients or base R's BFGS implementation with finite-difference approximation of gradients. |
delta |
Default: NULL. Numeric pseudocount used for stabilizing optimization in non-identity link function scenarios. |
tol |
Default: 10*sqrt(.Machine$double.eps). Numeric convergence tolerance controlling the precision of optimization algorithms. |
invsoftplus_initial_wiggle |
Default: c(-25, 20, -15, -10, -5). Numeric vector of initial grid points for wiggle penalty optimization, specified on the inverse-softplus (log(exp(x) - 1)) scale. |
invsoftplus_initial_flat |
Default: c(-14, -7). Numeric vector of initial grid points for ridge penalty optimization, specified on the inverse-softplus (log(exp(x) - 1)) scale. |
wiggle_penalty |
Default: 2e-7. Numeric penalty controlling the integrated squared second derivative, governing function smoothness. Applied to the smoothing spline penalty on its own, and multiplied by flat_ridge_penalty to form the flat ridge penalty. |
flat_ridge_penalty |
Default: 0.5. Numeric flat ridge penalty for additional regularization on intercepts and linear terms only (won't affect interactions or quadratic/cubic/quartic terms by default). |
unique_penalty_per_partition |
Default: TRUE. Logical flag allowing the magnitude of the smoothing spline penalty to differ across partition. |
unique_penalty_per_predictor |
Default: TRUE. Logical flag allowing the magnitude of the smoothing spline penalty to differ between predictors. |
meta_penalty |
Default: 1e-8. Numeric "meta-penalty" applied to predictor and partition penalties during tuning. The minimization of GCV is modified to be a penalized minimization problem, with a ridge penalty pulling the log-scale predictor and partition penalties towards 0 (1 on the raw scale). |
predictor_penalties |
Default: NULL. Optional list of custom penalties specified per predictor. |
partition_penalties |
Default: NULL. Optional list of custom penalties specified per partition. |
include_quadratic_terms |
Default: TRUE. Logical switch to include squared predictor terms in basis expansions. |
include_cubic_terms |
Default: TRUE. Logical switch to include cubic predictor terms in basis expansions. |
include_quartic_terms |
Default: NULL. Logical switch to include quartic predictor terms in basis expansions. This is highly recommended for fitting models with multiple predictors to avoid over-specified constraints. When NULL (by default), will internally set to FALSE if only one predictor present, and TRUE otherwise. |
include_2way_interactions |
Default: TRUE. Logical switch to include linear two-way interactions between predictors. |
include_3way_interactions |
Default: TRUE. Logical switch to include three-way interactions between predictors. |
include_quadratic_interactions |
Default: FALSE. Logical switch to include linear-quadratic interaction terms. |
offset |
Default: Empty vector. When non-missing, this is a vector of column indices/names to include as offsets. |
just_linear_with_interactions |
Default: NULL. Integer vector specifying columns to retain linear terms with interactions. |
just_linear_without_interactions |
Default: NULL. Integer vector specifying columns to retain only linear terms without interactions. |
exclude_interactions_for |
Default: NULL. Integer vector indicating columns to exclude from all interaction terms. |
exclude_these_expansions |
Default: NULL. Character vector specifying basis expansions to be excluded from the model. These must be named columns of the data, or in the form "_1_", "_2_", "_1_x_2_", "_2_^2" etc. where "1" and "2" indicate column indices of predictor matrix input. |
custom_basis_fxn |
Default: NULL. Optional user-defined function for generating custom basis expansions. |
include_constrain_fitted |
Default: TRUE. Logical switch to constrain fitted values at knot points. |
include_constrain_first_deriv |
Default: TRUE. Logical switch to constrain first derivatives at knot points. |
include_constrain_second_deriv |
Default: TRUE. Logical switch to constrain second derivatives at knot points. |
include_constrain_interactions |
Default: TRUE. Logical switch to constrain interaction terms at knot points. |
cl |
Default: NULL. Parallel processing cluster object for distributed computation (use parallel::makeCluster() to create one). |
chunk_size |
Default: NULL. Integer specifying custom fixed chunk size for parallel processing. |
parallel_eigen |
Default: TRUE. Logical flag to enable parallel processing for eigenvalue decomposition computations. |
parallel_trace |
Default: FALSE. Logical flag to enable parallel processing for trace computation. |
parallel_aga |
Default: FALSE. Logical flag to enable parallel processing for specific matrix operations involving G and A. |
parallel_matmult |
Default: FALSE. Logical flag to enable parallel processing for block-diagonal matrix multiplication. |
parallel_unconstrained |
Default: TRUE. Logical flag to enable parallel processing for unconstrained maximum likelihood estimation. |
parallel_find_neighbors |
Default: FALSE. Logical flag to enable parallel processing for neighbor identification (which partitions are neighbors). |
parallel_penalty |
Default: FALSE. Logical flag to enable parallel processing for penalty matrix construction. |
parallel_make_constraint |
Default: FALSE. Logical flag to enable parallel processing for constraint matrix generation. |
unconstrained_fit_fxn |
Default: unconstrained_fit_default. Function used to obtain unconstrained (per-partition) estimates prior to imposing smoothing constraints. |
keep_weighted_Lambda |
Default: FALSE. Logical flag to retain generalized linear model weights in penalty constraints using the Tikhonov parameterization. It is advised to set this to TRUE when fitting non-canonical GLMs. |
iterate_tune |
Default: TRUE. Logical switch to use iterative optimization during penalty tuning. |
iterate_final_fit |
Default: TRUE. Logical switch to use iterative optimization for final model fitting. |
blockfit |
Default: FALSE. Logical switch to abandon per-partition fitting for non-spline effects without interactions, collapse all matrices into block-diagonal single-matrix form, and fit agnostic to partition. This is more efficient with many non-spline effects without interactions and relatively few spline effects or non-spline effects with interactions. |
qp_score_function |
Default: a function returning crossprod(X, y - mu), multiplied by observation weights when supplied (see Usage above). Score function used for the quadratic programming formulation. |
qp_observations |
Default: NULL. Numeric vector of observations to apply constraints to for monotonic and range quadratic programming constraints. Useful for saving computational resources. |
qp_Amat |
Default: NULL. Constraint matrix for the quadratic programming formulation; corresponds to the Amat argument of quadprog::solve.QP. |
qp_bvec |
Default: NULL. Constraint vector for the quadratic programming formulation; corresponds to the bvec argument of quadprog::solve.QP. |
qp_meq |
Default: 0. Number of equality constraints in the quadratic programming setup; corresponds to the meq argument of quadprog::solve.QP. |
qp_positive_derivative , qp_monotonic_increase |
Default: FALSE. Logical flags to constrain the function to have positive first derivatives/be monotonically increasing using quadratic programming with respect to the order (ascending rows) of the input data set. |
qp_negative_derivative , qp_monotonic_decrease |
Default: FALSE. Logical flags to constrain the function to have negative first derivatives/be monotonically decreasing using quadratic programming with respect to the order (ascending rows) of the input data set. |
qp_range_upper |
Default: NULL. Numeric upper bound for constrained fitted values using quadratic programming. |
qp_range_lower |
Default: NULL. Numeric lower bound for constrained fitted values using quadratic programming. |
qp_Amat_fxn |
Default: NULL. Custom function for generating Amat matrix in quadratic programming. |
qp_bvec_fxn |
Default: NULL. Custom function for generating bvec vector in quadratic programming. |
qp_meq_fxn |
Default: NULL. Custom function for determining meq equality constraints in quadratic programming. |
constraint_values |
Default: cbind() (empty). Values for additional user-supplied linear equality constraints. |
constraint_vectors |
Default: cbind() (empty). Vectors (as matrix columns) for additional user-supplied linear equality constraints. |
return_G |
Default: TRUE. Logical switch to return the unscaled variance-covariance matrix without smoothing constraints (G). |
return_Ghalf |
Default: TRUE. Logical switch to return the matrix square root of the unscaled variance-covariance matrix without smoothing constraints (G^{1/2}). |
return_U |
Default: TRUE. Logical switch to return the constraint projection matrix U. |
estimate_dispersion |
Default: TRUE. Logical flag to estimate the dispersion parameter after fitting. |
unbias_dispersion |
Default: TRUE. Logical switch to multiply final dispersion estimates by N/(N - trace(XUGX^T)), where the trace term serves as effective degrees of freedom. |
return_varcovmat |
Default: TRUE. Logical switch to return the variance-covariance matrix of the estimated coefficients. This is needed for performing Wald inference. |
custom_penalty_mat |
Default: NULL. Optional custom penalty matrix used in place of the default flat ridge penalty. |
cluster_args |
Default: c(custom_centers = NA, nstart = 10). Named vector of arguments controlling the clustering used for partitioning; custom centers may be supplied directly, and nstart is passed to the k-means routine. |
dummy_dividor |
Default: 1.2345672152894e-22. Small numeric constant to prevent division by zero in computational routines. |
dummy_adder |
Default: 2.234567210529e-18. Small numeric constant to prevent division by zero in computational routines. |
verbose |
Default: FALSE. Logical flag to print general progress messages during model fitting (does not include during tuning). |
verbose_tune |
Default: FALSE. Logical flag to print detailed progress messages during penalty tuning specifically. |
expansions_only |
Default: FALSE. Logical switch to return only basis expansions without full model fitting. Useful for setting up custom constraints and penalties. |
observation_weights |
Default: NULL. Numeric vector of observation-specific weights for generalized least squares estimation. |
do_not_cluster_on_these |
Default: c(). Vector specifying predictor columns to exclude from clustering procedures, in addition to the non-spline effects by default. |
neighbor_tolerance |
Default: 1 + 1e-16. Numeric tolerance for determining neighboring partitions using k-means clustering. Greater values mean more partitions are likely to be considered neighbors. Intended for internal use only (modify at your own risk!). |
no_intercept |
Default: FALSE. Logical flag to remove the intercept, constraining it to 0. The function automatically constructs constraint_vectors and constraint_values to achieve this. Calling formulas with a "0 +" in them (for example, y ~ 0 + spl(x)) is equivalent to setting this to TRUE. |
VhalfInv |
Default: NULL. Matrix representing a fixed, custom square-root-inverse covariance structure for the response variable, for longitudinal and spatial modeling. Must be an N × N matrix. |
Vhalf |
Default: NULL. Matrix representing a fixed, custom square-root covariance structure for the response variable, for longitudinal and spatial modeling. Must be an N × N matrix. |
include_warnings |
Default: TRUE. Logical switch to control display of warning messages during model fitting. |
... |
Additional arguments passed to the unconstrained model fitting function. |
Value
A list containing the fitted model components, forming the core
structure used internally by lgspline
and its associated methods.
This function is primarily intended for internal use or advanced users needing
direct access to fitting components. The returned list contains numerous elements,
typically including:
- y
The original response vector provided.
- ytilde
The fitted values on the original response scale.
- X
A list, with each element the design matrix (\textbf{X}_k) for partition k.
- A
The constraint matrix (\textbf{A}) encoding smoothness and any other linear equality constraints.
- B
A list of the final fitted coefficient vectors (\boldsymbol{\beta}_k) for each partition k, on the original predictor/response scale.
- B_raw
A list of fitted coefficient vectors on the internally standardized scale used during fitting.
- K, p, q, P, N
Key dimensions: number of internal knots (K), basis functions per partition (p), original predictors (q), total coefficients (P), and sample size (N).
- penalties
A list containing the final penalty components used (e.g., Lambda, L1, L2, L_predictor_list, L_partition_list). See compute_Lambda.
- knot_scale_transf, knot_scale_inv_transf
Functions to transform predictors to/from the scale used for knot placement.
- knots
Matrix or vector of knot locations on the original predictor scale (NULL if K=0 or q > 1).
- partition_codes
Vector assigning each original observation to a partition.
- partition_bounds
Internal representation of partition boundaries.
- make_partition_list
List containing centers, knot midpoints, neighbor info, and assignment function from partitioning (NULL if K=0 or 1D). See make_partitions.
- knot_expand_function, assign_partition
Internal functions for partitioning data. See knot_expand_list.
- predict
The primary function embedded in the object for generating predictions on new data. See predict.lgspline.
- family
The family object or custom list used.
- estimate_dispersion, unbias_dispersion
Logical flags related to dispersion estimation settings.
- sigmasq_tilde
The estimated (or fixed, if estimate_dispersion = FALSE) dispersion parameter \tilde{\sigma}^2.
- backtransform_coefficients, forwtransform_coefficients
Functions to convert coefficients between standardized and original scales.
- mean_y, sd_y
Mean and standard deviation used for standardizing the response.
- og_order, order_list
Information mapping original data order to partitioned order.
- constraint_values, constraint_vectors
User-supplied additional linear equality constraints.
- expansion_scales
Scaling factors applied to basis expansions during fitting (if standardize_expansions_for_fitting = TRUE).
- take_derivative, take_interaction_2ndderivative, get_all_derivatives_insample
Functions related to computing derivatives of the fitted spline. See take_derivative, take_interaction_2ndderivative, and make_derivative_matrix.
- numerics, power1_cols, ..., nonspline_cols
Integer vectors storing column indices identifying different types of terms in the basis expansion.
- return_varcovmat
Logical indicating if variance matrix calculation was requested.
- raw_expansion_names
Original generated names for basis expansion columns (before potential renaming if input predictors had names).
- std_X, unstd_X
Functions to standardize/unstandardize design matrices according to expansion_scales.
- parallel_cluster_supplied
Logical indicating if a parallel cluster was used.
- weights
The original observation weights provided (potentially reformatted).
- VhalfInv
The fixed \mathbf{V}^{-1/2} matrix if supplied for GEEs.
- quadprog_list
List containing components related to quadratic programming constraints, if used.
- G, Ghalf, U
Matrices related to the variance-covariance structure (\mathbf{G}, \mathbf{G}^{1/2}, \mathbf{U}), returned if requested via the corresponding arguments. See compute_G_eigen and get_U.
- trace_XUGX
The trace term \text{trace}(\mathbf{X}\mathbf{U}\mathbf{G}\mathbf{X}^{T}), used for effective degrees of freedom. See compute_trace_UGXX_wrapper.
- varcovmat
The final variance-covariance matrix of the estimated coefficients, \sigma^2 \mathbf{U}\mathbf{G}, returned if return_varcovmat = TRUE.
Note that the exact components returned depend heavily on the function
arguments (e.g., values of return_G
, return_varcovmat
, etc.).
If expansions_only = TRUE
, a much smaller list is returned containing
only pre-fitting components needed for inspection or setup (see lgspline
).
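For illustration, a minimal sketch of calling the low-level fitter directly on a hypothetical 1-D problem, leaving most arguments at their defaults:
## Hypothetical direct call to the low-level fitter
set.seed(1)
x <- runif(300, -3, 3)
y <- cos(x) + rnorm(300, 0, 0.25)
fit <- lgspline.fit(predictors = cbind(x), y = y, K = 3)
str(fit$B)          # per-partition coefficient vectors
fit$sigmasq_tilde   # estimated dispersion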
Compute Log-Likelihood for Weibull Accelerated Failure Time Model
Description
Calculates the log-likelihood for a Weibull accelerated failure time (AFT) survival model, supporting right-censored survival data.
Usage
loglik_weibull(log_y, log_mu, status, scale, weights = 1)
Arguments
log_y |
Numeric vector of logarithmic response/survival times |
log_mu |
Numeric vector of logarithmic predicted survival times |
status |
Numeric vector of censoring indicators (1 = event, 0 = censored) Indicates whether an event of interest occurred (1) or the observation was right-censored (0). In survival analysis, right-censoring occurs when the full survival time is unknown, typically because the study ended or the subject was lost to follow-up before the event of interest occurred. |
scale |
Numeric scalar representing the Weibull scale parameter |
weights |
Optional numeric vector of observation weights (default = 1) |
Details
The function computes log-likelihood contributions for a Weibull AFT model, explicitly accounting for right-censored observations. It supports optional observation weighting to accommodate complex sampling designs.
This provides both a tool for fitting Weibull AFT models and boilerplate code for users who wish to incorporate Lagrangian multiplier smoothing splines into their own custom models.
Value
A numeric scalar representing the total log-likelihood of the model
Examples
## Minimal example of fitting a Weibull Accelerated Failure Time model
# Simulating survival data with right-censoring
set.seed(1234)
x1 <- rnorm(1000)
x2 <- rbinom(1000, 1, 0.5)
yraw <- rexp(1000, rate = exp(0.01*x1 + 0.01*x2)) # rate depends on covariates
# status: 1 = event occurred, 0 = right-censored
status <- rbinom(1000, 1, 0.25)
# censored observations are observed before their true event time
yobs <- ifelse(status == 1, yraw, runif(1000, 0, yraw))
df <- data.frame(
y = yobs,
x1 = x1,
x2 = x2
)
## Fit model using lgspline with Weibull AFT specifics
model_fit <- lgspline(y ~ spl(x1) + x2,
df,
unconstrained_fit_fxn = unconstrained_fit_weibull,
family = weibull_family(),
need_dispersion_for_estimation = TRUE,
dispersion_function = weibull_dispersion_function,
glm_weight_function = weibull_glm_weight_function,
shur_correction_function = weibull_shur_correction,
status = status,
opt = FALSE,
K = 1)
loglik_weibull(log(model_fit$y), log(model_fit$ytilde), status,
sqrt(model_fit$sigmasq_tilde))
Create Smoothing Spline Constraint Matrix
Description
Constructs constraint matrix \textbf{A}
enforcing continuity and smoothness at knot boundaries
by constraining function values, derivatives, and interactions between partitions.
Usage
make_constraint_matrix(
nc,
CKnots,
power1_cols,
power2_cols,
nonspline_cols,
interaction_single_cols,
interaction_quad_cols,
triplet_cols,
K,
include_constrain_fitted,
include_constrain_first_deriv,
include_constrain_second_deriv,
include_constrain_interactions,
include_2way_interactions,
include_3way_interactions,
include_quadratic_interactions,
colnm_expansions,
expansion_scales
)
Arguments
nc |
Integer; number of columns in basis expansion |
CKnots |
Matrix; basis expansions evaluated at knot points |
power1_cols |
Integer vector; indices of linear terms |
power2_cols |
Integer vector; indices of quadratic terms |
nonspline_cols |
Integer vector; indices of non-spline terms |
interaction_single_cols |
Integer vector; indices of linear interaction terms |
interaction_quad_cols |
Integer vector; indices of quadratic interaction terms |
triplet_cols |
Integer vector; indices of three-way interaction terms |
K |
Integer; number of interior knots (K) |
include_constrain_fitted |
Logical; constrain function values at knots |
include_constrain_first_deriv |
Logical; constrain first derivatives at knots |
include_constrain_second_deriv |
Logical; constrain second derivatives at knots |
include_constrain_interactions |
Logical; constrain interaction terms at knots |
include_2way_interactions |
Logical; include two-way interactions |
include_3way_interactions |
Logical; include three-way interactions |
include_quadratic_interactions |
Logical; include quadratic interactions |
colnm_expansions |
Character vector; column names for basis expansions |
expansion_scales |
Numeric vector; scaling factors for standardization |
Value
Matrix \textbf{A}
of constraint coefficients. Columns correspond to
constraints, rows to coefficients across all K+1
partitions.
Compute First and Second Derivative Matrices
Description
Compute First and Second Derivative Matrices
Usage
make_derivative_matrix(
nc,
Cpredictors,
power1_cols,
power2_cols,
nonspline_cols,
interaction_single_cols,
interaction_quad_cols,
triplet_cols,
K,
include_2way_interactions,
include_3way_interactions,
include_quadratic_interactions,
colnm_expansions,
expansion_scales,
just_first_derivatives = FALSE,
just_spline_effects = TRUE
)
Arguments
nc |
Number of columns |
Cpredictors |
Predictor matrix |
power1_cols |
Indices of linear terms of spline effects |
power2_cols |
Indices of quadratic terms of spline effects |
nonspline_cols |
Indices of non-spline effects |
interaction_single_cols |
Indices of first-order interactions |
interaction_quad_cols |
Indices of quadratic interactions |
triplet_cols |
Indices of three-way interactions |
K |
Number of partitions minus 1 (K) |
include_2way_interactions |
Include 2-way interactions |
include_3way_interactions |
Include 3-way interactions |
include_quadratic_interactions |
Include quadratic interactions |
colnm_expansions |
Column names |
expansion_scales |
Scale factors |
just_first_derivatives |
Only compute first derivatives |
just_spline_effects |
Only compute derivatives for spline effects |
Value
List containing first and second derivative matrices
Create Data Partitions Using Clustering
Description
Partitions data support into clusters using Voronoi-like diagrams.
Usage
make_partitions(
data,
cluster_args,
cluster_on_indicators,
K,
parallel,
cl,
do_not_cluster_on_these,
neighbor_tolerance
)
Arguments
data |
Numeric matrix of predictor variables |
cluster_args |
Parameters for clustering |
cluster_on_indicators |
Logical to include binary predictors |
K |
Number of partitions minus 1 (K) |
parallel |
Logical to enable parallel processing |
cl |
Cluster object for parallel computation |
do_not_cluster_on_these |
Columns to exclude from clustering |
neighbor_tolerance |
Scaling factor for neighbor detection |
Value
A list containing:
- centers
Cluster center coordinates
- knots
Knot points between centers
- assign_partition
Function to assign new data to partitions
- neighbors
List of neighboring partition indices
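A hedged sketch of direct use follows; this is an internal function, and the argument values below simply mirror the documented defaults of lgspline.fit (they are assumptions for illustration, not a recommendation):
## Partition 2-D support into 4 regions (K = 3); illustrative only
set.seed(1)
dat <- cbind(runif(500), runif(500))
parts <- make_partitions(dat,
                         cluster_args = c(custom_centers = NA, nstart = 10),
                         cluster_on_indicators = FALSE,
                         K = 3,
                         parallel = FALSE,
                         cl = NULL,
                         do_not_cluster_on_these = c(),
                         neighbor_tolerance = 1 + 1e-16)
str(parts$centers)   # cluster center coordinates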
Calculate Matrix Inverse Square Root
Description
Calculate Matrix Inverse Square Root
Usage
matinvsqrt(mat)
Arguments
mat |
A symmetric, positive-definite matrix |
Details
For matrix \textbf{M}
, computes \textbf{B}
where \textbf{B}\textbf{B} = \textbf{M}^{-1}
using eigenvalue decomposition:
1. Compute eigendecomposition \textbf{M} = \textbf{V}\textbf{D}\textbf{V}^T
2. Set eigenvalues below sqrt(.Machine$double.eps)
to 0
3. Take elementwise reciprocal square root: \textbf{D}^{-1/2}
4. Reconstruct as \textbf{B} = \textbf{V} \textbf{D}^{-1/2} \textbf{V}^T
For nearly singular matrices, eigenvalues below the numerical threshold
are set to 0, and their reciprocals in \textbf{D}^{-1/2}
are also set to 0.
This implementation is particularly useful for whitening procedures in GLMs with correlation structures and for computing variance-covariance matrices under constraints.
You can use this to help construct a custom VhalfInv_fxn
for fitting
correlation structures, see lgspline
.
Value
A matrix \textbf{B}
such that \textbf{B}\textbf{B} = \textbf{M}^{-1}
Examples
## Identity matrix
m1 <- diag(2)
matinvsqrt(m1) # Returns identity matrix
## Compound symmetry correlation matrix
rho <- 0.5
m2 <- matrix(rho, 3, 3) + diag(1-rho, 3)
B <- matinvsqrt(m2)
# Verify: B %**% B approximately equals solve(m2)
all.equal(B %**% B, solve(m2))
## Example for GLM correlation structure
n_blocks <- 2 # Number of subjects
block_size <- 3 # Measurements per subject
rho <- 0.7 # Within-subject correlation
# Correlation matrix for one subject
R <- matrix(rho, block_size, block_size) +
diag(1-rho, block_size)
## Full correlation matrix for all subjects
V <- kronecker(diag(n_blocks), R)
## Create whitening matrix
VhalfInv <- matinvsqrt(V)
# Example construction of VhalfInv_fxn for lgspline
VhalfInv_fxn <- function(par) {
rho <- tanh(par) # Transform parameter to (-1, 1)
R <- matrix(rho, block_size, block_size) +
diag(1-rho, block_size)
kronecker(diag(n_blocks), matinvsqrt(R))
}
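As a cross-check of the steps listed under Details, the same recipe can be sketched in base R (an illustration only, not the package's internal implementation):
## Eigendecomposition recipe for the inverse matrix square root
matinvsqrt_sketch <- function(M) {
  eig <- eigen(M, symmetric = TRUE)
  d <- eig$values
  d[d < sqrt(.Machine$double.eps)] <- 0    # threshold small eigenvalues
  dinvhalf <- ifelse(d > 0, 1/sqrt(d), 0)  # reciprocal square root; 0 stays 0
  eig$vectors %*% diag(dinvhalf) %*% t(eig$vectors)
}
all.equal(matinvsqrt_sketch(m2), matinvsqrt(m2))  # m2 from the example above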
Left-Multiply a List of Block-Diagonal Matrices by U
Description
Useful for computing UG where G is a list of block-diagonal matrices for each partition, and U is a square P by P dense matrix
Usage
matmult_U(U, G, nc, K)
Arguments
U |
Matrix of dimension P by P that projects onto the null space of A transpose |
G |
List of G matrices for each partition |
nc |
Integer; number of basis expansion columns per partition |
K |
Number of blocks |
Value
The matrix product UG
Multiply Block Diagonal Matrices in Parallel
Description
Multiplies two lists of matrices that form block diagonal structures, with optional parallel processing.
Usage
matmult_block_diagonal(
A,
B,
K,
parallel,
cl,
chunk_size,
num_chunks,
rem_chunks
)
Arguments
A |
List of matrices forming first block diagonal matrix |
B |
List of matrices forming second block diagonal matrix |
K |
Number of blocks minus 1 (K) |
parallel |
Logical; whether to use parallel processing |
cl |
Cluster object for parallel processing |
chunk_size |
Number of blocks per chunk for parallel processing |
num_chunks |
Number of chunks for parallel processing |
rem_chunks |
Remaining blocks after chunking |
Details
When parallel=TRUE
, splits computation into chunks processed in parallel.
Handles remainder chunks separately. Uses matmult_block_diagonal_cpp()
for
actual multiplication.
The function expects A and B to contain corresponding blocks that can be matrix multiplied.
Value
List containing products of corresponding blocks
Examples
A <- list(matrix(1:4,2,2), matrix(5:8,2,2))
B <- list(matrix(1:4,2,2), matrix(5:8,2,2))
matmult_block_diagonal(A, B, K=1, parallel=FALSE, cl=NULL,
chunk_size=1, num_chunks=1, rem_chunks=0)
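As a quick sanity check, the blockwise result matches the ordinary matrix products of the corresponding blocks:
res <- matmult_block_diagonal(A, B, K=1, parallel=FALSE, cl=NULL,
                              chunk_size=1, num_chunks=1, rem_chunks=0)
all.equal(res[[1]], A[[1]] %*% B[[1]])
all.equal(res[[2]], A[[2]] %*% B[[2]])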
Calculate Matrix Square Root
Description
Calculate Matrix Square Root
Usage
matsqrt(mat)
Arguments
mat |
A symmetric, positive-definite matrix |
Details
For matrix \textbf{M}
, computes \textbf{B}
where \textbf{B}\textbf{B} = \textbf{M}
using eigenvalue decomposition:
1. Compute eigendecomposition \textbf{M} = \textbf{V}\textbf{D}\textbf{V}^T
2. Set eigenvalues below sqrt(.Machine$double.eps)
to 0 for stability
3. Take elementwise square root of eigenvalues: \textbf{D}^{1/2}
4. Reconstruct as \textbf{B} = \textbf{V} \textbf{D}^{1/2} \textbf{V}^T
This provides the unique symmetric positive-definite square root.
You can use this to help construct a custom Vhalf_fxn
for fitting
correlation structures, see lgspline
.
Value
A matrix \textbf{B}
such that \textbf{B}\textbf{B} = \textbf{M}
Examples
## Identity matrix
m1 <- diag(2)
matsqrt(m1) # Returns identity matrix
## Compound symmetry correlation matrix
rho <- 0.5
m2 <- matrix(rho, 3, 3) + diag(1-rho, 3)
B <- matsqrt(m2)
# Verify: B %**% B approximately equals m2
all.equal(B %**% B, m2)
## Example for correlation structure
n_blocks <- 2 # Number of subjects
block_size <- 3 # Measurements per subject
rho <- 0.7 # Within-subject correlation
# Correlation matrix for one subject
R <- matrix(rho, block_size, block_size) +
diag(1-rho, block_size)
# Full correlation matrix for all subjects
V <- kronecker(diag(n_blocks), R)
Vhalf <- matsqrt(V)
Compute Newton-Raphson Parameter Update with Numerical Stabilization
Description
Performs the parameter update in iterative optimization. Called by damped_newton_r in the update step.
Usage
nr_iterate(gradient_val, neghessian_val)
Arguments
gradient_val |
Numeric vector of gradient values (the score vector \textbf{u}) |
neghessian_val |
Negative Hessian matrix (whose inverse gives \textbf{G}) |
Details
This helper function is a core component of Newton-Raphson optimization.
It provides a computationally-stable approach to computing \textbf{G}\textbf{u}
, for
information matrix \textbf{G}
and score vector \textbf{u}
, where the Newton-Raphson update
can be expressed as \boldsymbol{\beta}^{(m+1)} = \boldsymbol{\beta}^{(m)} + \textbf{G}\textbf{u}
.
Value
Numeric vector of parameter updates (\textbf{G}\textbf{u}
)
See Also
damped_newton_r
for the full optimization routine
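A hypothetical sketch of a single update step (the gradient and negative Hessian below are placeholders, not taken from a real model):
## One Newton-Raphson step using nr_iterate (placeholder values)
beta <- c(0, 0)                       # current iterate
u <- c(0.5, -0.2)                     # score (gradient) at beta
H <- matrix(c(2, 0.3, 0.3, 1), 2, 2)  # negative Hessian at beta
beta_new <- beta + nr_iterate(u, H)   # update step beta + Gu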
Plot Method for Lagrangian Multiplier Smoothing Spline Models
Description
Creates visualizations of fitted spline models, supporting both 1D line plots and 2D surface plots with optional formula annotations and customizable aesthetics. (Wrapper for internal plot method)
Usage
## S3 method for class 'lgspline'
plot(
x,
show_formulas = FALSE,
digits = 4,
legend_pos = "topright",
custom_response_lab = "y",
custom_predictor_lab = NULL,
custom_predictor_lab1 = NULL,
custom_predictor_lab2 = NULL,
custom_formula_lab = NULL,
custom_title = "Fitted Function",
text_size_formula = NULL,
legend_args = list(),
new_predictors = NULL,
xlim = NULL,
ylim = NULL,
color_function = NULL,
add = FALSE,
vars = c(),
...
)
Arguments
x |
A fitted lgspline model object containing the model fit to be plotted |
show_formulas |
Logical; whether to display analytical formulas for each partition. Default FALSE |
digits |
Integer; Number of decimal places for coefficient display in formulas. Default 4 |
legend_pos |
Character; Position of legend for 1D plots ("top", "bottom", "left", "right", "topleft", etc.). Default "topright" |
custom_response_lab |
Character; Label for response variable axis. Default "y" |
custom_predictor_lab |
Character; Label for predictor axis in 1D plots. If NULL (default), uses predictor column name |
custom_predictor_lab1 |
Character; Label for first predictor axis (x1) in 2D plots. If NULL (default), uses first predictor column name |
custom_predictor_lab2 |
Character; Label for second predictor axis (x2) in 2D plots. If NULL (default), uses second predictor column name |
custom_formula_lab |
Character; Label for fitted response on link function scale. If NULL (default), uses "link(E[custom_response_lab])" for non-Gaussian models with non-identity link, otherwise uses custom_response_lab |
custom_title |
Character; Main plot title. Default "Fitted Function" |
text_size_formula |
Numeric; Text size for formula display. Passed to cex in legend() for 1D plots and hover font size for 2D plots. If NULL (default), uses 0.8 for 1D and 8 for 2D |
legend_args |
List; Additional arguments passed to legend() for 1D plots |
new_predictors |
Matrix; Optional new predictor values for prediction. If NULL (default), uses original fitting data |
xlim |
Numeric vector; Optional x-axis limits for 1D plots. Default NULL |
ylim |
Numeric vector; Optional y-axis limits for 1D plots. Default NULL |
color_function |
Function; Returns colors for plotting by partition, must return a vector of K+1 valid colors. Defaults to NULL, in which case a default palette is used. |
add |
Logical; If TRUE, adds to an existing plot (1D only). Similar to the add argument in base R plotting functions. |
vars |
Numeric or character vector; Optional indices for selecting variables to plot. Can either be numeric (the column indices of "predictors" or "data") or character (the column names, if available from "predictors" or "data") |
... |
Additional arguments passed to underlying plot functions. |
Details
Produces different visualizations based on model dimensionality:
1D models: Line plot showing fitted function across partitions, with optional data points and formula annotations
2D models: Interactive 3D surface plot using plotly, with hover text showing predicted values and optional formula display
Partition boundaries are indicated by color changes in both 1D and 2D plots.
When plotting with the "vars" option, it is recommended to use the "new_predictors" argument to set all terms not involved in plotting to 0, to avoid nonsensical results. In some cases, though, it may be useful to hold other predictors fixed at particular values. By default, observed values in the data set are used.
The function relies on linear expansions being present - if (for example) a user includes "_1_" or "_2_" in "exclude_these_expansions", then this function will not be able to extract the predictors needed for plotting.
For this case, try constraining the effects of these terms to 0 instead using the "constraint_vectors" and "constraint_values" arguments, so they are kept in the expansions but their corresponding coefficients will be 0.
Value
Returns
- 1D
Invisibly returns NULL (base R plot is drawn to device).
- 2D
Plotly object showing interactive surface plot.
See Also
lgspline
for model fitting,
plot
for additional 1D plot parameters,
plot_ly
for additional 2D plot parameters
Examples
## Generate example data
set.seed(1234)
t_data <- runif(1000, -10, 10)
y_data <- 2*sin(t_data) + -0.06*t_data^2 + rnorm(length(t_data))
## Fit model with 10 partitions
model_fit <- lgspline(t_data, y_data, K = 9)
## Basic plot
plot(model_fit)
## Customized plot with formulas
plot(model_fit,
show_formulas = TRUE, # Show partition formulas
custom_response_lab = 'Price', # Custom axis labels
custom_predictor_lab = 'Size',
custom_title = 'Price vs Size', # Custom title
digits = 2, # Round coefficients
legend_pos = 'bottom', # Move legend
text_size_formula = 0.375, # Adjust formula text size
pch = 16, # Point style
cex.main = 1.25) # Title size
Predict Method for Fitted Lagrangian Multiplier Smoothing Spline
Description
Generates predictions, derivatives, and basis expansions from a fitted lgspline model. Supports both in-sample and out-of-sample prediction with optional parallel processing. (Wrapper for internal predict method)
Usage
## S3 method for class 'lgspline'
predict(
object,
newdata = NULL,
parallel = FALSE,
cl = NULL,
chunk_size = NULL,
num_chunks = NULL,
rem_chunks = NULL,
B_predict = NULL,
take_first_derivatives = FALSE,
take_second_derivatives = FALSE,
expansions_only = FALSE,
new_predictors = NULL,
...
)
Arguments
object |
A fitted lgspline model object containing model parameters and fit |
newdata |
Matrix or data.frame; New predictor values for out-of-sample prediction. If NULL (default), uses training data |
parallel |
Logical; whether to use parallel processing for prediction computations. Experimental feature - use with caution. Default FALSE |
cl |
Optional cluster object for parallel processing. Required if parallel=TRUE. Default NULL |
chunk_size |
Integer; Size of computational chunks for parallel processing. Default NULL |
num_chunks |
Integer; Number of chunks for parallel processing. Default NULL |
rem_chunks |
Integer; Number of remainder chunks for parallel processing. Default NULL |
B_predict |
Matrix; Optional custom coefficient matrix for prediction. Default NULL (uses object$B internally). |
take_first_derivatives |
Logical; whether to compute first derivatives of the fitted function. Default FALSE |
take_second_derivatives |
Logical; whether to compute second derivatives of the fitted function. Default FALSE |
expansions_only |
Logical; whether to return only basis expansions without computing predictions. Default FALSE |
new_predictors |
Matrix or data frame; overrides 'newdata' if provided. |
... |
Additional arguments passed to internal prediction methods. |
Details
Implements multiple prediction capabilities:
Standard prediction: Returns fitted values for new data points
Derivative computation: Calculates first and/or second derivatives
Basis expansion: Returns design matrix of basis functions
Correlation structures: Supports non-Gaussian GLM correlation via variance-covariance matrices
If newdata and new_predictors are left NULL, the data used for model fitting will be used. new_predictors takes priority over newdata when both are non-NULL.
To obtain fitted values, users may also call model_fit$predict() or model_fit$ytilde for an lgspline object "model_fit".
The parallel processing feature is experimental and should be used with caution. When enabled, computations are split across chunks and processed in parallel, which may improve performance for large datasets.
Value
Depending on the options selected, returns the following:
- predictions
Numeric vector of predicted values (default case, or if derivatives requested).
- first_deriv
Numeric vector of first derivatives (if take_first_derivatives = TRUE).
- second_deriv
Numeric vector of second derivatives (if take_second_derivatives = TRUE).
- expansions
List of basis expansions (if expansions_only = TRUE).
With derivatives included, output is in the form of a list with elements "preds", "first_deriv", and "second_deriv" for the vector of predictions, first derivatives, and second derivatives respectively.
See Also
lgspline
for model fitting,
plot.lgspline
for visualizing predictions
Examples
## Generate example data
set.seed(1234)
t <- runif(1000, -10, 10)
y <- 2*sin(t) + -0.06*t^2 + rnorm(length(t))
## Fit model
model_fit <- lgspline(t, y)
## Generate predictions for new data
newdata <- matrix(sort(rnorm(10000)), ncol = 1) # Ensure matrix format
preds <- predict(model_fit, newdata)
## Compute derivative
deriv1_res <- predict(model_fit, newdata,
take_first_derivatives = TRUE)
deriv2_res <- predict(model_fit, newdata,
take_second_derivatives = TRUE)
## Visualize results
oldpar <- par(no.readonly = TRUE) # Save current par settings
layout(matrix(c(1,1,2,2,3,3), byrow = TRUE, ncol = 2))
## Plot function
plot(newdata[,1], preds,
main = 'Fitted Function',
xlab = 't',
ylab = "f(t)", type = 'l')
## Plot first derivative
plot(newdata[,1],
deriv1_res$first_deriv,
main = 'First Derivative',
xlab = 't',
ylab = "f'(t)", type = 'l')
## Plot second derivative
plot(newdata[,1],
deriv2_res$second_deriv,
main = 'Second Derivative',
xlab = 't',
ylab = "f''(t)", type = 'l')
par(oldpar) # Reset to original par settings
Print Method for lgspline Objects
Description
Provides a standard print method for lgspline model objects to display key model characteristics.
Usage
## S3 method for class 'lgspline'
print(x, ...)
Arguments
x |
An lgspline model object |
... |
Additional arguments (not used) |
Value
Invisibly returns the original lgspline
object x
. This
function is called for printing a concise summary of the fitted model's key
characteristics (family, link, N, predictors, partitions, basis functions) to
the console.
Print Method for lgspline Object Summaries
Description
Print Method for lgspline Object Summaries
Usage
## S3 method for class 'summary.lgspline'
print(x, ...)
Arguments
x |
A summary.lgspline object, the result of calling summary on an lgspline object. |
... |
Not used. |
Value
Invisibly returns the original summary.lgspline
object x
.
Like other print methods, this function is called to display a formatted
summary of the fitted lgspline
model to the console.
This includes model dimensions, family information, dispersion estimate,
effective degrees of freedom, and a coefficient table for univariate inference
(if available) analogous to output from summary.glm
.
Log-Prior Distribution Evaluation for lgspline Models
Description
Evaluates the log-prior distribution on beta coefficients, conditional upon dispersion and penalties.
Usage
prior_loglik(model_fit, sigmasq = NULL)
Arguments
model_fit |
An lgspline model object |
sigmasq |
A scalar numeric representing the dispersion parameter. By default it is NULL, and the sigmasq_tilde associated with model_fit will be used. Otherwise, custom values can be supplied. |
Details
Returns the quadratic form \beta^{T}\Lambda\beta evaluated at the tuned or fixed penalties, scaled by negative one-half times the inverse dispersion.
Assuming fixed penalties, the prior distribution of \beta is given as follows:
\beta | \sigma^2 \sim \mathcal{N}(\textbf{0}, \sigma^2\Lambda^{-1})
The log-likelihood obtained from this can be shown to be equivalent to the following,
with C
a constant with respect to \beta
.
\implies \log P(\beta|\sigma^2) = C-\frac{1}{2\sigma^2}\beta^{T}\Lambda\beta
This is useful for computing joint log-likelihoods and performing valid likelihood ratio tests between nested lgspline models.
Value
A numeric scalar for the prior-loglikelihood (the penalty on beta coefficients actually computed)
Examples
## Data
t <- sort(runif(100, -5, 5))
y <- sin(t) - 0.1*t^2 + rnorm(100)
## Model keeping penalties fixed
model_fit <- lgspline(t, y, opt = FALSE)
## Full joint log-likelihood, conditional upon known sigma^2 = 1
jntloglik <- sum(dnorm(model_fit$y,
model_fit$ytilde,
1,
log = TRUE)) +
prior_loglik(model_fit, sigmasq = 1)
print(jntloglik)
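The same joint log-likelihood can also be evaluated at the model's own dispersion estimate by leaving sigmasq at its NULL default (a small extension of the example above):
jntloglik2 <- sum(dnorm(model_fit$y,
                        model_fit$ytilde,
                        sqrt(model_fit$sigmasq_tilde),
                        log = TRUE)) +
  prior_loglik(model_fit)
print(jntloglik2)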
Compute softplus transform
Description
Computes the softplus transform, equivalent to the cumulant generating function
of a logistic regression model: \log(1+e^x)
.
Usage
softplus(x)
Arguments
x |
Numeric vector to apply softplus to |
Value
Softplus transformed vector
Examples
x <- runif(5)
softplus(x)
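Since several tuning arguments (for example, invsoftplus_initial_wiggle) are specified on the inverse-softplus scale, the inverse transform is worth noting. Below, invsoftplus is a hypothetical one-line helper, not an exported function:
invsoftplus <- function(x) log(expm1(x))  # inverse of log(1 + exp(x))
invsoftplus(softplus(2))                  # recovers 2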
Standardize Vector to Z-Scores
Description
Centers a vector by its sample mean, then scales it by its sample standard deviation
(\text{x}-\text{mean}(\text{x}))/\text{sd}(\text{x})
.
Usage
std(x)
Arguments
x |
Numeric vector to standardize |
Value
Standardized vector with sample mean 0 and standard deviation 1
Examples
x <- c(1, 2, 3, 4, 5)
z <- std(x)
print(mean(z)) # ~ 0
print(sd(z))   # 1
Summary method for lgspline Objects
Description
Summary method for lgspline Objects
Usage
## S3 method for class 'lgspline'
summary(object, ...)
Arguments
object |
An lgspline model object |
... |
Not used. |
Value
An object of class summary.lgspline
. This object is a list
containing detailed information from lgspline
fit, prepared for
display. Its main components are:
- model_family
The
family
object or custom list specifying the distribution and link.
- observations
The number of observations (N) used in the fit.
- predictors
The number of original predictor variables (q) supplied.
- knots
The number of partitions (K+1) minus 1.
- basis_functions
The number of basis functions (coefficients) estimated per partition (p).
- estimate_dispersion
A character string ("Yes" or "No") indicating if the dispersion parameter was estimated.
- cv
The critical value (
critical_value
from the fit) used by theprint.summary.lgspline
method for confidence intervals.- coefficients
A matrix summarizing univariate inference results. Columns typically include 'Estimate', 'Std. Error', test statistic ('t value' or 'z value'), 'Pr(>|t|)' or 'Pr(>|z|)', and confidence interval bounds ('CI LB', 'CI UB'). This table is fully populated only if
return_varcovmat=TRUE
was set in the originallgspline
call. Otherwise, it defaults to a single column of estimates.- sigmasq_tilde
The estimated (or fixed) dispersion parameter,
\tilde{\sigma}^2
.- trace_XUGX
The calculated trace term
\text{trace}(\mathbf{XUGX}^T)
, related to effective degrees of freedom.- N
Number of observations (N), re-included for convenience and printing.
Calculate Derivatives of Polynomial Terms
Description
Computes first or second derivatives of polynomial terms in a design matrix with respect to a specified variable. Handles polynomial terms up to fourth degree.
Usage
take_derivative(dat, var, second = FALSE, scale)
Arguments
dat |
Numeric matrix; design matrix containing polynomial basis expansions |
var |
Character; column name of variable to differentiate with respect to |
second |
Logical; if TRUE compute second derivative, if FALSE compute first derivative (default FALSE) |
scale |
Numeric; scaling factor for normalization |
Value
Numeric matrix containing derivatives of polynomial terms, with same dimensions as input matrix
Calculate Second Derivatives of Interaction Terms
Description
Computes partial second derivatives for interaction terms including two-way linear, quadratic, and three-way interactions. Handles special cases for each type.
This function is necessary to compute total second derivatives as the sum of second partial "pure" derivatives (d^2/dx^2) plus second partial "mixed" derivatives (d^2/dxdz), for a predictor x and all other predictors z.
Usage
take_interaction_2ndderivative(
dat,
var,
interaction_single_cols,
interaction_quad_cols,
triplet_cols,
colnm_expansions,
power1_cols,
power2_cols
)
Arguments
dat |
Numeric matrix; design matrix containing basis expansions |
var |
Character; variable name to differentiate with respect to |
interaction_single_cols |
Integer vector; column indices for linear-linear interactions |
interaction_quad_cols |
Integer vector; column indices for linear-quadratic interactions |
triplet_cols |
Integer vector; column indices for three-way interactions |
colnm_expansions |
Character vector; column names of expansions for each partition |
power1_cols |
Integer vector; column indices of linear terms |
power2_cols |
Integer vector; column indices of quadratic terms |
Value
Numeric matrix of second derivatives, same dimensions as input
Tune Smoothing and Ridge Penalties via Generalized Cross Validation
Description
Optimizes smoothing spline and ridge regression penalties by minimizing GCV criterion. Uses BFGS optimization with analytical gradients or finite differences.
Usage
tune_Lambda(
y,
X,
X_gram,
smoothing_spline_penalty,
A,
K,
nc,
nr,
opt,
use_custom_bfgs,
C,
colnm_expansions,
wiggle_penalty,
flat_ridge_penalty,
invsoftplus_initial_wiggle,
invsoftplus_initial_flat,
unique_penalty_per_predictor,
unique_penalty_per_partition,
invsoftplus_penalty_vec,
meta_penalty,
family,
unconstrained_fit_fxn,
keep_weighted_Lambda,
iterate,
qp_score_function,
quadprog,
qp_Amat,
qp_bvec,
qp_meq,
tol,
sd_y,
delta,
constraint_value_vectors,
parallel,
parallel_eigen,
parallel_trace,
parallel_aga,
parallel_matmult,
parallel_unconstrained,
cl,
chunk_size,
num_chunks,
rem_chunks,
shared_env,
custom_penalty_mat,
order_list,
glm_weight_function,
shur_correction_function,
need_dispersion_for_estimation,
dispersion_function,
observation_weights,
homogenous_weights,
blockfit,
just_linear_without_interactions,
Vhalf,
VhalfInv,
verbose,
include_warnings,
...
)
Arguments
y |
List; response vectors by partition |
X |
List; design matrices by partition |
X_gram |
List; Gram matrices by partition |
smoothing_spline_penalty |
Matrix; integrated squared second derivative penalty |
A |
Matrix; smoothness constraints at knots |
K |
Integer; number of interior knots in 1-D, number of partitions - 1 in higher dimensions |
nc |
Integer; columns per partition |
nr |
Integer; total sample size |
opt |
Logical; TRUE to optimize penalties, FALSE to use initial values |
use_custom_bfgs |
Logical; TRUE for analytic-gradient BFGS as natively implemented, FALSE for finite differences as implemented by base R's optim |
wiggle_penalty , flat_ridge_penalty |
Initial penalty values |
invsoftplus_initial_wiggle , invsoftplus_initial_flat |
Initial grid search values (log scale) |
unique_penalty_per_predictor , unique_penalty_per_partition |
Logical; allow predictor/partition-specific penalties |
invsoftplus_penalty_vec |
Initial values for predictor/partition penalties (log scale) |
meta_penalty |
The "meta" ridge penalty, a regularization for predictor/partition penalties to pull them on log-scale towards 0 (1 on raw scale) |
family |
GLM family with optional custom tuning loss |
keep_weighted_Lambda , iterate |
Logical controlling GLM fitting |
qp_score_function , quadprog , qp_Amat , qp_bvec , qp_meq |
Quadratic programming parameters (see the corresponding arguments of lgspline.fit) |
tol |
Numeric; convergence tolerance |
sd_y , delta |
Response standardization parameters |
constraint_value_vectors |
List; constraint values |
parallel |
Logical; enable parallel computation |
cl , chunk_size , num_chunks , rem_chunks |
Parallel computation parameters |
custom_penalty_mat |
Optional custom penalty matrix |
order_list |
List; observation ordering by partition |
glm_weight_function , shur_correction_function |
Functions for GLM weights and corrections |
need_dispersion_for_estimation , dispersion_function |
Control dispersion estimation |
observation_weights |
Optional observation weights |
homogenous_weights |
Logical; TRUE if all weights equal |
blockfit |
Logical; when TRUE, block-fitting (not per-partition fitting) approach is used, analogous to quadratic programming. |
just_linear_without_interactions |
Numeric; vector of columns of input predictor matrix that correspond to non-spline effects without interactions, used for block-fitting. |
Vhalf , VhalfInv |
Square root and inverse square root correlation structures for fitting GEEs. |
verbose |
Logical; print progress |
include_warnings |
Logical; print warnings/try-errors |
... |
Additional arguments passed to fitting functions |
Details
Uses BFGS optimization to minimize GCV criterion for penalty selection. Supports analytical gradients for efficiency with standard GLM families. Can optimize unique penalties per predictor/partition. Handles custom loss functions and GLM weights. Parallel computation available for large problems.
Value
List containing:
Lambda - Final combined penalty matrix
flat_ridge_penalty - Optimized ridge penalty
wiggle_penalty - Optimized smoothing penalty
other_penalties - Optimized predictor/partition penalties
L_predictor_list - Predictor-specific penalty matrices
L_partition_list - Partition-specific penalty matrices
See Also
optim for Hessian-free optimization
Unconstrained Generalized Linear Model Estimation
Description
Fits generalized linear models without smoothing constraints using penalized maximum likelihood estimation. This is applied to each partition to obtain the unconstrained estimates, prior to imposing the smoothing constraints.
Usage
unconstrained_fit_default(
X,
y,
LambdaHalf,
Lambda,
keep_weighted_Lambda,
family,
tol,
K,
parallel,
cl,
chunk_size,
num_chunks,
rem_chunks,
order_indices,
weights,
...
)
Arguments
X |
Design matrix of predictors |
y |
Response variable vector |
LambdaHalf |
Square root of the penalty matrix (\Lambda^{1/2}) |
Lambda |
Penalty matrix (\Lambda) |
keep_weighted_Lambda |
Logical flag controlling penalty matrix handling: 'TRUE' returns coefficients directly from weighted penalty fitting; 'FALSE' applies damped Newton-Raphson optimization to refine estimates. |
family |
Distribution family specification |
tol |
Convergence tolerance |
K |
Number of partitions minus one (K) |
parallel |
Flag for parallel processing |
cl |
Cluster object for parallel computation |
chunk_size |
Processing chunk size |
num_chunks |
Number of computational chunks |
rem_chunks |
Remaining chunks |
order_indices |
Observation ordering indices |
weights |
Optional observation weights |
... |
Additional arguments passed to fitting functions |
Value
Optimized parameter estimates for canonical generalized linear models.
For fitting non-canonical GLMs, use keep_weighted_Lambda = TRUE, since the canonical score and Hessian equations are no longer valid.
For Gamma(link='log'), the "weighted" label of keep_weighted_Lambda = TRUE is somewhat misleading: the information is weighted by a constant (the shape parameter) rather than a mean-variance relationship. That constant flushes into the penalty terms, so the formulation of the information matrix remains valid, and keep_weighted_Lambda = TRUE is highly recommended for log-link Gamma models.
For other scenarios, such as probit regression, diagonal weights are incorporated into the penalty matrix when providing initial MLE estimates, which technically imposes an unintended prior distribution on the beta coefficients.
Heuristically, this should not affect much: the weights are updated to their proper form when estimating under constraint, and lgspline otherwise uses the correct form of score and information afterwards, regardless of canonical/non-canonical status, as long as 'glm_weight_function' and 'qp_score_function' are properly specified.
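As a hedged illustration of the damped Newton-Raphson refinement mentioned above, the following sketch fits a ridge-penalized logistic regression with step-halving; the fixed penalty matrix and convergence rule are simplifying assumptions, not the package internals.
## Sketch: damped Newton-Raphson for a ridge-penalized logistic model
set.seed(1)
n <- 200
X <- cbind(1, rnorm(n))
y <- rbinom(n, 1, plogis(X %*% c(-0.5, 1)))
Lambda <- 0.1 * diag(ncol(X)) # assumed fixed penalty matrix
penloglik <- function(b) {
  sum(dbinom(y, 1, plogis(X %*% b), log = TRUE)) -
    0.5 * as.numeric(t(b) %*% Lambda %*% b)
}
beta <- rep(0, ncol(X))
for (iter in 1:50) {
  mu <- as.vector(plogis(X %*% beta))
  score <- crossprod(X, y - mu) - Lambda %*% beta
  info <- crossprod(X, (mu * (1 - mu)) * X) + Lambda # penalized information
  step <- as.vector(solve(info, score))
  damp <- 1 # halve the step until the penalized log-likelihood improves
  while (damp > 1e-4 && penloglik(beta + damp * step) < penloglik(beta)) {
    damp <- damp / 2
  }
  beta <- beta + damp * step
  if (max(abs(damp * step)) < 1e-8) break
}
beta # refined penalized estimates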
Unconstrained Weibull Accelerated Failure Time Model Estimation
Description
Estimates parameters for an unconstrained Weibull accelerated failure time (AFT) model supporting right-censored survival data.
This provides both a tool for fitting Weibull AFT models and boilerplate code for users who wish to incorporate Lagrangian multiplier smoothing splines into their own custom models.
Usage
unconstrained_fit_weibull(
X,
y,
LambdaHalf,
Lambda,
keep_weighted_Lambda,
family,
tol = 1e-08,
K,
parallel,
cl,
chunk_size,
num_chunks,
rem_chunks,
order_indices,
weights,
status
)
Arguments
X |
Design matrix of predictors |
y |
Survival/response times |
LambdaHalf |
Square root of the penalty matrix (\Lambda^{1/2}) |
Lambda |
Penalty matrix (\Lambda) |
keep_weighted_Lambda |
Flag to retain weighted penalties |
family |
Distribution family specification |
tol |
Convergence tolerance (default 1e-8) |
K |
Number of partitions minus one (K) |
parallel |
Flag for parallel processing |
cl |
Cluster object for parallel computation |
chunk_size |
Processing chunk size |
num_chunks |
Number of computational chunks |
rem_chunks |
Remaining chunks |
order_indices |
Observation ordering indices |
weights |
Optional observation weights |
status |
Censoring status indicator (1 = event, 0 = censored). Indicates whether an event of interest occurred (1) or the observation was right-censored (0). In survival analysis, right-censoring occurs when the full survival time is unknown, typically because the study ended or the subject was lost to follow-up before the event of interest occurred. |
Details
Estimation Approach: The function employs a two-stage optimization strategy for fitting accelerated failure time models via maximum likelihood:
1. Outer loop: estimate the scale parameter using Brent's method.
2. Inner loop: estimate the regression coefficients (given the scale) using damped Newton-Raphson.
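Below is a minimal runnable sketch of the outer Brent step, where the inner coefficient fit is replaced by a fixed intercept-only predictor (a hypothetical simplification; the real inner loop is damped Newton-Raphson).
## Sketch: Brent's method over the scale, given fixed predictions
set.seed(1)
y <- rexp(200)
status <- rbinom(200, 1, 0.75)
log_mu <- rep(log(mean(y)), 200) # stand-in for X %*% beta from the inner loop
neg_loglik <- function(s) { # Weibull AFT log-likelihood in the scale s
  z <- (log(y) - log_mu) / s
  -sum(status * (z - log(s)) - exp(z))
}
optimize(neg_loglik, c(0.1, 10))$minimum # Brent's method for the scale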
Value
Optimized parameter estimates (\boldsymbol{\beta}) for the Weibull AFT model
Examples
## Simulate survival data with covariates
set.seed(1234)
n <- 1000
t1 <- rnorm(n)
t2 <- rbinom(n, 1, 0.5)
## Generate survival times with Weibull-like structure
lambda <- exp(0.5 * t1 + 0.3 * t2)
yraw <- rexp(n, rate = 1/lambda)
## Introduce right-censoring
status <- rbinom(n, 1, 0.75)
y <- ifelse(status, yraw, runif(n, 0, yraw))
df <- data.frame(y = y, t1 = t1, t2 = t2)
## Fit model using lgspline with Weibull AFT unconstrained estimation
model_fit <- lgspline(y ~ spl(t1) + t2,
df,
unconstrained_fit_fxn = unconstrained_fit_weibull,
family = weibull_family(),
need_dispersion_for_estimation = TRUE,
dispersion_function = weibull_dispersion_function,
glm_weight_function = weibull_glm_weight_function,
shur_correction_function = weibull_shur_correction,
status = status,
opt = FALSE,
K = 1)
## Print model summary
summary(model_fit)
Vector-Matrix Multiplication for Block Diagonal Matrices
Description
Performs vector-matrix multiplication for block diagonal matrices
Usage
vectorproduct_block_diagonal(A, b, K)
Arguments
A |
List of matrices A |
b |
List of vectors b |
K |
Number of blocks |
Value
List of resulting vectors
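A base-R sketch of the assumed behavior, multiplying each block's matrix by its corresponding vector:
## Per-block product for a block diagonal system (K = 2 blocks)
A <- list(matrix(1:4, 2, 2), matrix(5:8, 2, 2))
b <- list(c(1, 1), c(2, 0))
Map(function(Ak, bk) Ak %*% bk, A, b) # list of resulting vectors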
Univariate Wald Tests and Confidence Intervals for Lagrangian Multiplier Smoothing Splines
Description
Performs coefficient-specific Wald tests and constructs confidence intervals for fitted lgspline models (a wrapper for the internal wald_univariate method). For the Gaussian family with identity link, the t-distribution replaces the normal distribution (t-intervals, t-tests, etc.) wherever Wald quantities are mentioned.
Usage
wald_univariate(object, scale_vcovmat_by = 1, cv, ...)
Arguments
object |
A fitted lgspline model object containing coefficient estimates and variance-covariance matrix (requires return_varcovmat = TRUE in fitting). |
scale_vcovmat_by |
Numeric; Scaling factor for variance-covariance matrix. Adjusts standard errors and test statistics. Default 1. |
cv |
Numeric; Critical value for confidence interval construction. If missing, defaults to the value specified in the lgspline() fit ('object$critical_value') or 'qnorm(0.975)' as a fallback. Common choices include 'qnorm(0.975)' for normal-based intervals and 'qt(0.975, df)' for t-based intervals.
 |
... |
Additional arguments passed to the internal 'wald_univariate' method. |
Details
For each coefficient, provides:
- Point estimates
- Standard errors from the model's variance-covariance matrix
- Two-sided test statistics and p-values
- Confidence intervals using specified critical values
Value
A data frame with rows for each coefficient (across all partitions) and columns:
- estimate
Numeric; Coefficient estimate.
- std_error
Numeric; Standard error.
- statistic
Numeric; Wald or t-statistic (estimate/std_error).
- p_value
Numeric; Two-sided p-value based on normal or t-distribution.
- lower_ci
Numeric; Lower confidence bound (estimate - cv*std_error).
- upper_ci
Numeric; Upper confidence bound (estimate + cv*std_error).
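A hand computation matching the columns above, using made-up numbers:
## Sketch: Wald statistic, p-value, and interval from an estimate and SE
est <- 1.2; se <- 0.4
cv <- qnorm(0.975)
statistic <- est / se # Wald statistic
p_value <- 2 * pnorm(-abs(statistic)) # two-sided p-value
c(lower_ci = est - cv * se, upper_ci = est + cv * se)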
Examples
## Simulate some data and fit using default settings
set.seed(1234)
t <- runif(1000, -10, 10)
y <- 2*sin(t) - 0.06*t^2 + rnorm(length(t))
# Ensure varcovmat is returned for Wald tests
model_fit <- lgspline(t, y, return_varcovmat = TRUE)
## Use default critical value (likely qnorm(0.975) if not set in fit)
wald_default <- wald_univariate(model_fit)
print(wald_default)
## Specify t-distribution critical value
eff_df <- NA
if(!is.null(model_fit$N) && !is.null(model_fit$trace_XUGX)) {
eff_df <- model_fit$N - model_fit$trace_XUGX
}
if (!is.na(eff_df) && eff_df > 0) {
wald_t <- wald_univariate(
model_fit,
cv = stats::qt(0.975, eff_df)
)
print(wald_t)
} else {
warning("Effective degrees of freedom invalid.")
}
Estimate Weibull Dispersion for Accelerated Failure Time Model
Description
Computes the scale parameter for a Weibull accelerated failure time (AFT) model, supporting right-censored survival data.
This provides both a tool for fitting Weibull AFT models and boilerplate code for users who wish to incorporate Lagrangian multiplier smoothing splines into their own custom models.
Usage
weibull_dispersion_function(
mu,
y,
order_indices,
family,
observation_weights,
status
)
Arguments
mu |
Predicted survival times |
y |
Observed response/survival times |
order_indices |
Indices to align status with response |
family |
Weibull AFT model family specification |
observation_weights |
Optional observation weights |
status |
Censoring indicator (1 = event, 0 = censored). Indicates whether an event of interest occurred (1) or the observation was right-censored (0). In survival analysis, right-censoring occurs when the full survival time is unknown, typically because the study ended or the subject was lost to follow-up before the event of interest occurred. |
Value
Squared scale estimate for the Weibull AFT model (dispersion)
See Also
weibull_scale for the underlying scale estimation function
Examples
## Simulate survival data with covariates
set.seed(1234)
n <- 1000
t1 <- rnorm(n)
t2 <- rbinom(n, 1, 0.5)
## Generate survival times with Weibull-like structure
lambda <- exp(0.5 * t1 + 0.3 * t2)
yraw <- rexp(n, rate = 1/lambda)
## Introduce right-censoring
status <- rbinom(n, 1, 0.75)
y <- ifelse(status, yraw, runif(n, 0, yraw))
## Example of using dispersion function
mu <- mean(y)
order_indices <- seq_along(y)
weights <- rep(1, n)
## Estimate dispersion
dispersion_est <- weibull_dispersion_function(
mu = mu,
y = y,
order_indices = order_indices,
family = weibull_family(),
observation_weights = weights,
status = status
)
print(dispersion_est)
Weibull Family for Survival Model Specification
Description
Creates a compatible family object for Weibull accelerated failure time (AFT) models with customizable tuning options.
This provides both a tool for fitting Weibull AFT models and boilerplate code for users who wish to incorporate Lagrangian multiplier smoothing splines into their own custom models.
Usage
weibull_family()
Details
Provides a comprehensive family specification for Weibull AFT models, including the family name, link function, inverse link function, and a custom loss function for model tuning.
Supports right-censored survival data with flexible parameter estimation.
Value
A list containing family-specific components for survival model estimation
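To inspect the components actually returned (family name, link functions, and the custom tuning loss), one can examine the object directly:
## View the family components returned by weibull_family()
str(weibull_family())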
Examples
## Simulate survival data with covariates
set.seed(1234)
n <- 1000
t1 <- rnorm(n)
t2 <- rbinom(n, 1, 0.5)
## Generate survival times with Weibull-like structure
lambda <- exp(0.5 * t1 + 0.3 * t2)
yraw <- rexp(n, rate = 1/lambda)
## Introduce right-censoring
status <- rbinom(n, 1, 0.75)
y <- ifelse(status, yraw, runif(n, 0, yraw))
## Prepare data
df <- data.frame(y = y, t1 = t1, t2 = t2, status = status)
## Fit model using custom Weibull family
model_fit <- lgspline(y ~ spl(t1) + t2,
df,
unconstrained_fit_fxn = unconstrained_fit_weibull,
family = weibull_family(),
need_dispersion_for_estimation = TRUE,
dispersion_function = weibull_dispersion_function,
glm_weight_function = weibull_glm_weight_function,
shur_correction_function = weibull_shur_correction,
status = status,
opt = FALSE,
K = 1)
summary(model_fit)
Weibull GLM Weight Function for Constructing Information Matrix
Description
Computes the diagonal weight matrix \textbf{W} for the information matrix \textbf{G} = (\textbf{X}^{T}\textbf{W}\textbf{X} + \textbf{L})^{-1} in Weibull accelerated failure time (AFT) models.
Usage
weibull_glm_weight_function(
mu,
y,
order_indices,
family,
dispersion,
observation_weights,
status
)
Arguments
mu |
Predicted survival times |
y |
Observed response/survival times |
order_indices |
Order of observations when partitioned, used to match "status" to the response |
family |
Weibull AFT family |
dispersion |
Estimated dispersion parameter (scale^2) |
observation_weights |
Weights of observations submitted to function |
status |
Censoring indicator (1 = event, 0 = censored). Indicates whether an event of interest occurred (1) or the observation was right-censored (0). In survival analysis, right-censoring occurs when the full survival time is unknown, typically because the study ended or the subject was lost to follow-up before the event of interest occurred. |
Details
This function generates weights used in constructing the information matrix after unconstrained estimates have been found. Specifically, it is used in the construction of the \textbf{U} and \textbf{G} matrices following initial unconstrained parameter estimation.
These weights are analogous to the variance terms in generalized linear models (GLMs). Just as logistic regression uses \mu(1-\mu), Poisson regression uses e^{\mu}, and linear regression uses constant weights, Weibull AFT models use \exp((\log y - \log \mu)/s), where s is the scale parameter (= \sqrt{\text{dispersion}}).
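For illustration, the weights described above can be computed directly; this is a sketch assuming s = sqrt(dispersion) = 1 and constant predicted survival times.
## Sketch: diagonal entries of W for a Weibull AFT model
y <- rexp(5)
mu <- rep(mean(y), 5) # stand-in predicted survival times
s <- 1 # scale = sqrt(dispersion)
exp((log(y) - log(mu)) / s) # diagonal entries of W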
Value
Vector of weights for constructing the diagonal weight matrix \textbf{W} in the information matrix \textbf{G} = (\textbf{X}^{T}\textbf{W}\textbf{X} + \textbf{L})^{-1}.
Examples
## Demonstration of glm weight function in constrained model estimation
set.seed(1234)
n <- 1000
t1 <- rnorm(n)
t2 <- rbinom(n, 1, 0.5)
## Generate survival times
lambda <- exp(0.5 * t1 + 0.3 * t2)
yraw <- rexp(n, rate = 1/lambda)
## Introduce right-censoring
status <- rbinom(n, 1, 0.75)
y <- ifelse(status, yraw, runif(n, 0, yraw))
## Fit model demonstrating use of custom glm weight function
model_fit <- lgspline(y ~ spl(t1) + t2,
data.frame(y = y, t1 = t1, t2 = t2),
unconstrained_fit_fxn = unconstrained_fit_weibull,
family = weibull_family(),
need_dispersion_for_estimation = TRUE,
dispersion_function = weibull_dispersion_function,
glm_weight_function = weibull_glm_weight_function,
shur_correction_function = weibull_shur_correction,
status = status,
opt = FALSE,
K = 1)
print(summary(model_fit))
Compute Gradient of the Weibull Accelerated Failure Time Log-Likelihood Without Penalization
Description
Calculates the gradient of log-likelihood for a Weibull accelerated failure time (AFT) survival model, supporting right-censored survival data.
Usage
weibull_qp_score_function(
X,
y,
mu,
order_list,
dispersion,
VhalfInv,
observation_weights,
status
)
Arguments
X |
Design matrix |
y |
Response vector |
mu |
Predicted mean vector |
order_list |
List of observation indices per partition |
dispersion |
Dispersion parameter (scale^2) |
VhalfInv |
Inverse square root of correlation matrix (if applicable) |
observation_weights |
Observation weights |
status |
Censoring indicator (1 = event, 0 = censored) |
Details
Needed if using "blockfit", correlation structures, or quadratic programming with Weibull AFT models.
Value
A numeric vector representing the gradient with respect to coefficients.
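A derivation-based sketch of this gradient for the unpenalized, identity-correlation case (not the package source; see the Usage above for the actual interface):
## With z_i = (log y_i - log mu_i)/s, the unpenalized score is
## t(X) %*% ((exp(z) - status) / s)
set.seed(1)
X <- cbind(1, rnorm(10))
beta <- c(0, 0.5)
y <- rexp(10)
status <- rbinom(10, 1, 0.75)
s <- 1 # scale = sqrt(dispersion)
z <- (log(y) - X %*% beta) / s
drop(t(X) %*% ((exp(z) - status) / s)) # gradient w.r.t. coefficients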
Examples
set.seed(1234)
t1 <- rnorm(1000)
t2 <- rbinom(1000, 1, 0.5)
yraw <- rexp(1000, rate = 1/exp(0.01*t1 + 0.01*t2))
status <- rbinom(1000, 1, 0.25)
yobs <- ifelse(status, yraw, runif(1000, 0, yraw))
df <- data.frame(
y = yobs,
t1 = t1,
t2 = t2
)
## Example using blockfit for t2 as a linear term. The output does not look
## different, but the internal methods used for fitting change.
model_fit <- lgspline(y ~ spl(t1) + t2,
df,
unconstrained_fit_fxn = unconstrained_fit_weibull,
family = weibull_family(),
need_dispersion_for_estimation = TRUE,
qp_score_function = weibull_qp_score_function,
dispersion_function = weibull_dispersion_function,
glm_weight_function = weibull_glm_weight_function,
shur_correction_function = weibull_shur_correction,
K = 1,
blockfit = TRUE,
opt = FALSE,
status = status,
verbose = TRUE)
print(summary(model_fit))
Estimate Scale for Weibull Accelerated Failure Time Model
Description
Computes the maximum-likelihood scale estimate of a Weibull accelerated failure time (AFT) survival model.
This provides both a tool for fitting Weibull AFT models and boilerplate code for users who wish to incorporate Lagrangian multiplier smoothing splines into their own custom models.
Usage
weibull_scale(log_y, log_mu, status, weights = 1)
Arguments
log_y |
Logarithm of response/survival times |
log_mu |
Logarithm of predicted survival times |
status |
Censoring indicator (1 = event, 0 = censored). Indicates whether an event of interest occurred (1) or the observation was right-censored (0). In survival analysis, right-censoring occurs when the full survival time is unknown, typically because the study ended or the subject was lost to follow-up before the event of interest occurred. |
weights |
Optional observation weights (default = 1) |
Details
Calculates the maximum-likelihood estimate of the scale for a Weibull AFT model, accounting for right-censored observations, using Brent's method for optimization.
Value
Scalar representing the estimated scale
Examples
## Simulate exponential data with censoring
set.seed(1234)
mu <- 2 # mean of exponential distribution
n <- 500
y <- rexp(n, rate = 1/mu)
## Introduce censoring (25% of observations)
status <- rbinom(n, 1, 0.75)
## Compute scale estimate using all observations,
## with status marking which times are right-censored
scale_est <- weibull_scale(
  log_y = log(y),
  log_mu = log(mu),
  status = status
)
print(scale_est)
Correction for the Variance-Covariance Matrix for Uncertainty in Scale
Description
Computes the Schur complement \textbf{S} such that \textbf{G}^* = (\textbf{G}^{-1} + \textbf{S})^{-1} properly accounts for uncertainty in estimating the dispersion when estimating the variance-covariance matrix. Otherwise, the variance-covariance matrix is optimistic and assumes the scale is known, when it was in fact estimated. Note that the parameterization adds the output of this function elementwise (it does not subtract), so in most cases the output will be a negative definite or negative semi-definite matrix.
Usage
weibull_shur_correction(
X,
y,
B,
dispersion,
order_list,
K,
family,
observation_weights,
status
)
Arguments
X |
Block-diagonal matrices of spline expansions |
y |
Block-vector of response |
B |
Block-vector of coefficient estimates |
dispersion |
Scalar; estimate of dispersion (scale^2) |
order_list |
List of partition orders |
K |
Number of partitions minus 1 (K) |
family |
Distribution family |
observation_weights |
Optional observation weights (default = 1) |
status |
Censoring indicator (1 = event, 0 = censored) Indicates whether an event of interest occurred (1) or the observation was right-censored (0). In survival analysis, right-censoring occurs when the full survival time is unknown, typically because the study ended or the subject was lost to follow-up before the event of interest occurred. |
Details
Adjusts the unscaled variance-covariance matrix of the coefficients to account for uncertainty in estimating the Weibull scale parameter, which would otherwise be ignored if simply using \textbf{G} = (\textbf{X}^{T}\textbf{W}\textbf{X} + \textbf{L})^{-1}. This is accomplished using a correction based on the Schur complement, so we avoid having to construct the entire variance-covariance matrix or modify the procedure for lgspline substantially.
For any model with nuisance parameters whose uncertainty must be accounted for, this tool will be helpful.
This provides both a tool for fitting Weibull accelerated failure time (AFT) models and boilerplate code for users who wish to incorporate Lagrangian multiplier smoothing splines into their own custom models.
Value
List of p \times p matrices representing the Schur-complement corrections \textbf{S}_k to be added elementwise to each block of the information matrix before inversion.
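A generic numeric sketch of the idea with made-up information blocks (not the Weibull-specific formula): partitioning the full information for (coefficients, scale) and folding in the scale block via the Schur complement yields a negative semi-definite correction that is added before inversion.
## Sketch: Schur-complement correction for one nuisance parameter
I_bb <- matrix(c(4, 1, 1, 3), 2, 2) # coefficient block of the information
I_bs <- matrix(c(0.5, 0.2), 2, 1) # coefficient-scale cross terms
I_ss <- matrix(2) # scale block
S <- -I_bs %*% solve(I_ss) %*% t(I_bs) # negative semi-definite correction
solve(I_bb + S) # corrected covariance, vs. the optimistic solve(I_bb)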
Examples
## Minimal example of fitting a Weibull Accelerated Failure Time model
# Simulating survival data with right-censoring
set.seed(1234)
t1 <- rnorm(1000)
t2 <- rbinom(1000, 1, 0.5)
yraw <- rexp(1000, rate = 1/exp(0.01*t1 + 0.01*t2))
# status: 1 = event occurred, 0 = right-censored
status <- rbinom(1000, 1, 0.25)
yobs <- ifelse(status, yraw, runif(1000, 0, yraw))
df <- data.frame(
y = yobs,
t1 = t1,
t2 = t2
)
## Fit model using lgspline with Weibull shur correction
model_fit <- lgspline(y ~ spl(t1) + t2,
df,
unconstrained_fit_fxn = unconstrained_fit_weibull,
family = weibull_family(),
need_dispersion_for_estimation = TRUE,
dispersion_function = weibull_dispersion_function,
glm_weight_function = weibull_glm_weight_function,
shur_correction_function = weibull_shur_correction,
status = status,
opt = FALSE,
K = 1)
print(summary(model_fit))
## Fit model using lgspline without Weibull shur correction
naive_fit <- lgspline(y ~ spl(t1) + t2,
df,
unconstrained_fit_fxn = unconstrained_fit_weibull,
family = weibull_family(),
need_dispersion_for_estimation = TRUE,
dispersion_function = weibull_dispersion_function,
glm_weight_function = weibull_glm_weight_function,
status = status,
opt = FALSE,
K = 1)
print(summary(naive_fit))