Title: Regularized Principal Component Analysis for Spatial Data
Version: 1.3.5
Description: Provide regularized principal component analysis incorporating smoothness, sparseness and orthogonality of eigen-functions by using the alternating direction method of multipliers algorithm (Wang and Huang, 2017, <doi:10.1080/10618600.2016.1157483>). The method can be applied to either regularly or irregularly spaced data, including 1D, 2D, and 3D.
License: GPL-3
ByteCompile: true
BugReports: https://github.com/egpivo/SpatPCA/issues
Depends: R (≥ 3.4.0)
Imports: Rcpp (≥ 1.0.10), RcppParallel (≥ 5.1.7), ggplot2
LinkingTo: Rcpp, RcppArmadillo, RcppParallel
Suggests: knitr, rmarkdown, testthat (≥ 2.1.0), dplyr (≥ 1.0.3), gifski, tidyr, fields, scico, plot3D, pracma, RColorBrewer, maps, covr, styler, V8
SystemRequirements: GNU make
VignetteBuilder: knitr
Encoding: UTF-8
RoxygenNote: 7.2.3
URL: https://github.com/egpivo/SpatPCA
Config/testthat/edition: 3
NeedsCompilation: yes
Packaged: 2023-11-12 12:12:19 UTC; joseph
Author: Wen-Ting Wang ORCID iD [aut, cre], Hsin-Cheng Huang ORCID iD [aut]
Maintainer: Wen-Ting Wang <egpivo@gmail.com>
Repository: CRAN
Date/Publication: 2023-11-13 09:33:19 UTC

Regularized Principal Component Analysis for Spatial Data

Description

A new regularization approach to estimate the leading spatial patterns via smoothness and sparseness penalties, and spatial predictions for spatial data that may be irregularly located in space (including 1D, 2D and 3D), and obtain the spatial prediction at the designated locations.

Details

Package: SpatPCA
Type: Package
Version: 1.3.3.4
Date: 2021-02-11
License: GPL version 3

Author(s)

Wen-Ting Wang <egpivo@gmail.com> and Hsin-Cheng Huang <hchuang@stat.sinica.edu.tw>


Internal function: Validate input data for a spatpca object

Description

Internal function: Validate input data for a spatpca object

Usage

checkInputData(Y, x, M)

Arguments

Y

Data matrix

x

Location matrix.

M

Number of folds for cross-validation

Value

NULL.


Internal function: Validate new locations for a spatpca object

Description

Internal function: Validate new locations for a spatpca object

Usage

checkNewLocationsForSpatpcaObject(spatpca_object, x_new)

Arguments

spatpca_object

An spatpca class object

x_new

New location matrix.

Value

NULL.


Internal function: Detrend Y by column-wise centering

Description

Internal function: Detrend Y by column-wise centering

Usage

detrend(Y, is_Y_detrended)

Arguments

Y

Data matrix

Value

Detrended data matrix


Interpolated Eigen-function

Description

Produce Eigen-function values based on new locations

Usage

eigenFunction(new_location, original_location, Phi)

Arguments

new_location

A location matrix

original_location

A location matrix

Phi

An eigenvector matrix

Value

A predictive estimate matrix

Examples

pesudo_sequence <- seq(-5, 5, length = 2)
original_location <- as.matrix(expand.grid(x = pesudo_sequence, y = pesudo_sequence))
new_location <- matrix(c(0.1, 0.2), nrow = 1, ncol = 2)
Phi <- matrix(c(1, 0, 0, 0), nrow = 4, ncol = 1)
thin_plate_matrix <- eigenFunction(new_location, original_location, Phi)

Internal function: Fetch the upper bound of the number of eigenfunctions

Description

Internal function: Fetch the upper bound of the number of eigenfunctions

Usage

fetchUpperBoundNumberEigenfunctions(Y, M)

Arguments

Y

Data matrix

M

Number of folds for cross-validation

Value

integer


Display the cross-validation results

Description

Display the M-fold cross-validation results

Usage

## S3 method for class 'spatpca'
plot(x, ...)

Arguments

x

An spatpca class object for plot method

...

Not used directly

Value

NULL.

See Also

spatpca

Examples

x_1D <- as.matrix(seq(-5, 5, length = 10))
Phi_1D <- exp(-x_1D^2) / norm(exp(-x_1D^2), "F")
set.seed(1234)
Y_1D <- rnorm(n = 100, sd = 3) %*% t(Phi_1D) + matrix(rnorm(n = 100 * 10), 100, 10)
cv_1D <- spatpca(x = x_1D, Y = Y_1D, num_cores = 2)
plot(cv_1D)

Spatial predictions on new locations

Description

Predict the response on new locations with the estimated spatial structures.

Usage

predict(spatpca_object, x_new, eigen_patterns_on_new_site = NULL)

Arguments

spatpca_object

An spatpca class object

x_new

New location matrix.

eigen_patterns_on_new_site

Eigen-patterns on x_new

Value

A prediction matrix of Y at the new locations, x_new.

See Also

spatpca

Examples

# 1D: artificial irregular locations
x_1D <- as.matrix(seq(-5, 5, length = 10))
Phi_1D <- exp(-x_1D^2) / norm(exp(-x_1D^2), "F")
set.seed(1234)
Y_1D <- rnorm(n = 100, sd = 3) %*% t(Phi_1D) + matrix(rnorm(n = 100 * 10), 100, 10)
removed_location <- sample(1:10, 3)
removed_x_1D <- x_1D[-removed_location]
removed_Y_1D <- Y_1D[, -removed_location]
new_x_1D <- as.matrix(seq(-5, 5, length = 20))
cv_1D <- spatpca(x = removed_x_1D, Y = removed_Y_1D, tau2 = 1:100, num_cores = 2)
predictions <- predict(cv_1D, x_new = new_x_1D)


Spatial dominant patterns on new locations

Description

Estimate K eigenfunctions on new locations

Usage

predictEigenfunction(spatpca_object, x_new)

Arguments

spatpca_object

An spatpca class object

x_new

New location matrix.

Value

A matrix with K Eigenfunction values on new locations.

See Also

spatpca

Examples

# 1D: artificial irregular locations
x_1D <- as.matrix(seq(-5, 5, length = 10))
Phi_1D <- exp(-x_1D^2) / norm(exp(-x_1D^2), "F")
set.seed(1234)
Y_1D <- rnorm(n = 100, sd = 3) %*% t(Phi_1D) + matrix(rnorm(n = 100 * 10), 100, 10)
rm_loc <- sample(1:10, 2)
x_1Drm <- x_1D[-rm_loc]
Y_1Drm <- Y_1D[, -rm_loc]
x_1Dnew <- as.matrix(seq(-5, 5, length = 20))
cv_1D <- spatpca(x = x_1Drm, Y = Y_1Drm, tau2 = 1:100, num_cores = 2)
dominant_patterns <- predictEigenfunction(cv_1D, x_new = x_1Dnew)


Internal function: Scale one-dimension locations

Description

Internal function: Scale one-dimension locations

Usage

scaleLocation(location)

Arguments

location

Location matrix

Value

scaled location matrix


Internal function: Set the number of cores for parallel computing

Description

Internal function: Set the number of cores for parallel computing

Usage

setCores(num_cores = NULL)

Arguments

num_cores

Number of number of cores for parallel computing. Default is NULL.

Value

Logical


Internal function: Set tuning parameter - gamma

Description

Internal function: Set tuning parameter - gamma

Usage

setGamma(gamma, Y)

Arguments

gamma

Vector of a nonnegative hyper parameter sequence for tuning eigenvalues. Default is NULL.

Y

Data matrix

Value

Modified vector of a nonnegative hyper parameter sequence for tuning eigenvalues.


Internal function: Set tuning parameter - l2

Description

Internal function: Set tuning parameter - l2

Usage

setL2(tau2)

Arguments

tau2

Vector of a nonnegative sparseness parameter sequence. Default is NULL.

Value

Modified vector of a nonnegative tuning parameter sequence for ADMM use


Internal function: Set the number of eigenfunctions for a spatpca object

Description

Internal function: Set the number of eigenfunctions for a spatpca object

Usage

setNumberEigenfunctions(K, Y, M)

Arguments

K

Optional user-supplied number of eigenfunctions.

Y

Data matrix

M

Number of folds for cross-validation

Value

integer


Internal function: Set tuning parameter - tau1

Description

Internal function: Set tuning parameter - tau1

Usage

setTau1(tau1, M)

Arguments

tau1

Vector of a nonnegative smoothness parameter sequence. Default is NULL.

M

Number of folds for cross-validation

Value

Modified vector of a nonnegative smoothness parameter sequence.


Internal function: Set tuning parameter - tau2

Description

Internal function: Set tuning parameter - tau2

Usage

setTau2(tau2, M)

Arguments

tau2

Vector of a nonnegative sparseness parameter sequence. Default is NULL.

M

Number of folds for cross-validation

Value

Modified vector of a nonnegative sparseness parameter sequence.


Internal function: Spatial prediction

Description

Internal function: Spatial prediction

Usage

spatialPrediction(phir, Yr, gamma, predicted_eignefunction)

Arguments

phir

A matrix of estimated eigenfunctions based on original locations

Yr

A data matrix

gamma

A gamma value

predicted_eignefunction

A vector of values of an eigenfunction on new locations

Value

A list of objects

prediction

A vector of spatial predictions

estimated_covariance

An estimated covariance matrix.

eigenvalue

A vector of estimated eigenvalues.

error

Error rate for the ADMM algorithm


Regularized PCA for spatial data

Description

Produce spatial dominant patterns and spatial predictions at the designated locations according to the specified tuning parameters or the selected tuning parameters by the M-fold cross-validation.

Usage

spatpca(
  x,
  Y,
  M = 5,
  K = NULL,
  is_K_selected = ifelse(is.null(K), TRUE, FALSE),
  tau1 = NULL,
  tau2 = NULL,
  gamma = NULL,
  is_Y_detrended = FALSE,
  maxit = 100,
  thr = 1e-04,
  num_cores = NULL
)

Arguments

x

Location matrix (p \times d). Each row is a location. d is the dimension of locations

Y

Data matrix (n \times p) stores the values at p locations with sample size n.

M

Optional number of folds for cross validation; default is 5.

K

Optional user-supplied number of eigenfunctions; default is NULL. If K is NULL or is_K_selected is TRUE, K is selected automatically.

is_K_selected

If TRUE, K is selected automatically; otherwise, is_K_selected is set to be user-supplied K. Default depends on user-supplied K.

tau1

Optional user-supplied numeric vector of a non-negative smoothness parameter sequence. If NULL, 10 tau1 values in a range are used.

tau2

Optional user-supplied numeric vector of a non-negative sparseness parameter sequence. If NULL, none of tau2 is used.

gamma

Optional user-supplied numeric vector of a non-negative tuning parameter sequence. If NULL, 10 values in a range are used.

is_Y_detrended

If TRUE, center the columns of Y. Default is FALSE.

maxit

Maximum number of iterations. Default value is 100.

thr

Threshold for convergence. Default value is 10^{-4}.

num_cores

Number of cores used to parallel computing. Default value is NULL (See RcppParallel::defaultNumThreads())

Details

An ADMM form of the proposed objective function is written as

\min_{\mathbf{\Phi}} \|\mathbf{Y}-\mathbf{Y}\mathbf{\Phi}\mathbf{\Phi}'\|^2_F +\tau_1\mbox{tr}(\mathbf{\Phi}^T\mathbf{\Omega}\mathbf{\Phi})+\tau_2\sum_{k=1}^K\sum_{j=1}^p |\phi_{jk}|,

\mbox{subject to $ \mathbf{\Phi}^T\mathbf{\Phi}=\mathbf{I}_K$,} where \mathbf{Y} is a data matrix, {\mathbf{\Omega}} is a smoothness matrix, and \mathbf{\Phi}=\{\phi_{jk}\}.

Value

A list of objects including

eigenfn

Estimated eigenfunctions at the new locations, x_new.

selected_K

Selected K based on CV. Execute the algorithm when is_K_selected is TRUE.

selected_tau1

Selected tau1.

selected_tau2

Selected tau2.

selected_gamma

Selected gamma.

cv_score_tau1

cv scores for tau1.

cv_score_tau2

cv scores for tau2.

cv_score_gamma

cv scores for gamma.

tau1

Sequence of tau1-values used in the process.

tau2

Sequence of tau2-values used in the process.

gamma

Sequence of gamma-values used in the process.

detrended_Y

If is_Y_detrended is TRUE, detrended_Y means Y is detrended; else, detrended_Y is equal to Y.

scaled_x

Input location matrix. Only scale when it is one-dimensional

Author(s)

Wen-Ting Wang and Hsin-Cheng Huang

References

Wang, W.-T. and Huang, H.-C. (2017). Regularized principal component analysis for spatial data. Journal of Computational and Graphical Statistics 26 14-25.

See Also

predict

Examples

# The following examples only use two threads for parallel computing.
## 1D: regular locations
x_1D <- as.matrix(seq(-5, 5, length = 50))
Phi_1D <- exp(-x_1D^2) / norm(exp(-x_1D^2), "F")
set.seed(1234)
Y_1D <- rnorm(n = 100, sd = 3) %*% t(Phi_1D) + matrix(rnorm(n = 100 * 50), 100, 50)
cv_1D <- spatpca(x = x_1D, Y = Y_1D, num_cores = 2)
plot(x_1D, cv_1D$eigenfn[, 1], type = "l", main = "1st eigenfunction")
lines(x_1D, svd(Y_1D)$v[, 1], col = "red")
legend("topleft", c("SpatPCA", "PCA"), lty = 1:1, col = 1:2)


## 2D: Daily 8-hour ozone averages for sites in the Midwest (USA)
library(fields)
library(pracma)
library(maps)
data(ozone2)
x <- ozone2$lon.lat
Y <- ozone2$y
date <- as.Date(ozone2$date, format = "%y%m%d")
rmna <- !colSums(is.na(Y))
YY <- matrix(Y[, rmna], nrow = nrow(Y))
YY <- detrend(YY, "linear")
xx <- x[rmna, ]
cv <- spatpca(x = xx, Y = YY)
quilt.plot(xx, cv$eigenfn[, 1])
map("state", xlim = range(xx[, 1]), ylim = range(xx[, 2]), add = TRUE)
map.text("state", xlim = range(xx[, 1]), ylim = range(xx[, 2]), cex = 2, add = TRUE)
plot(date, YY %*% cv$eigenfn[, 1], type = "l", ylab = "1st Principal Component")
### new loactions
new_p <- 200
x_lon <- seq(min(xx[, 1]), max(xx[, 1]), length = new_p)
x_lat <- seq(min(xx[, 2]), max(xx[, 2]), length = new_p)
xx_new <- as.matrix(expand.grid(x = x_lon, y = x_lat))
eof <- spatpca(x = xx,
               Y = YY,
               K = cv$selected_K,
               tau1 = cv$selected_tau1,
               tau2 = cv$selected_tau2)
predicted_eof <- predictEigenfunction(eof, xx_new)
quilt.plot(xx_new,
           predicted_eof[,1],
           nx = new_p,
           ny = new_p,
           xlab = "lon.",
           ylab = "lat.")
map("state", xlim = range(x_lon), ylim = range(x_lat), add = TRUE)
map.text("state", xlim = range(x_lon), ylim = range(x_lat), cex = 2, add = TRUE)
## 3D: regular locations
p <- 10
x <- y <- z <- as.matrix(seq(-5, 5, length = p))
d <- expand.grid(x, y, z)
Phi_3D <- rowSums(exp(-d^2)) / norm(as.matrix(rowSums(exp(-d^2))), "F")
Y_3D <- rnorm(n = 100, sd = 3) %*% t(Phi_3D) + matrix(rnorm(n = 100 * p^3), 100, p^3)
cv_3D <- spatpca(x = d, Y = Y_3D, tau2 = seq(0, 1000, length = 10))
library(plot3D)
library(RColorBrewer)
cols <- colorRampPalette(brewer.pal(9, "Blues"))(p)
isosurf3D(x, y, z,
         colvar = array(cv_3D$eigenfn[, 1], c(p, p, p)),
         level= seq(min(cv_3D$eigenfn[, 1]), max(cv_3D$eigenfn[, 1]), length = p),
         ticktype = "detailed",
         colkey = list(side = 1),
         col = cols)


Internal function: M-fold Cross-validation

Description

Internal function: M-fold Cross-validation

Usage

spatpcaCV(sxyr, Yr, M, K, tau1r, tau2r, gammar, nkr, maxit, tol, l2r)

Arguments

sxyr

A location matrix

Yr

A data matrix

M

The number of folds for CV

K

The number of estimated eigen-functions

tau1r

A range of tau1

tau2r

A range of tau2

gammar

A range of gamma

nkr

A vector of fold numbers

maxit

A maximum number of iteration

tol

A tolerance rate

l2r

A given tau2

Value

A list of selected parameters


Internal function: M-fold CV of SpatPCA with selected K

Description

Internal function: M-fold CV of SpatPCA with selected K

Usage

spatpcaCVWithSelectedK(
  x,
  Y,
  M,
  tau1,
  tau2,
  gamma,
  shuffle_split,
  maxit,
  thr,
  l2
)

Arguments

x

Location matrix

Y

Data matrix

M

The number of folds for cross validation; default is 5.

tau1

Vector of a non-negative smoothness parameter sequence. If NULL, 10 tau1 values in a range are used.

tau2

Vector of a non-negative sparseness parameter sequence. If NULL, none of tau2 is used.

gamma

Vector of a non-negative hyper parameter sequence for tuning eigenvalues. If NULL, 10 values in a range are used.

shuffle_split

Vector of indices for random splitting Y into training and test sets

maxit

Maximum number of iterations. Default value is 100.

thr

Threshold for convergence. Default value is 10^{-4}.

l2

Vector of a non-negative tuning parameter sequence for ADMM use

Value

A list of objects including

cv_result

A list of resultant objects produced by spatpcaCV

selected_K

Selected K based on CV.


Thin-plane spline matrix

Description

Produce a thin-plane spline matrix based on a given location matrix

Usage

thinPlateSplineMatrix(location)

Arguments

location

A location matrix

Value

A thin-plane spline matrix

Examples

pesudo_sequence <- seq(-5, 5, length = 5)
two_dim_location <- as.matrix(expand.grid(x = pesudo_sequence, y = pesudo_sequence))
thin_plate_matrix <- thinPlateSplineMatrix(two_dim_location)

mirror server hosted at Truenetwork, Russian Federation.