Type: | Package |
Title: | Spatial Constrained Clusterwise Regression |
Version: | 1.2 |
Date: | 2025-06-11 |
Maintainer: | Francesco Vidoli <fvidoli@gmail.com> |
Description: | A collection of functions for estimating spatial regimes, aggregations of neighboring spatial units that are homogeneous in functional terms. The term spatial regime, therefore, should not be understood as a synonym for cluster. More precisely, the term cluster does not presuppose any functional relationship between the variables considered, while the term regime is linked to a regressive relationship underlying the spatial process. |
Depends: | spdep, quantreg, GWmodel, plm, spatialreg |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
Suggests: | R.rsp, utils |
VignetteBuilder: | R.rsp |
NeedsCompilation: | no |
Packaged: | 2025-06-11 13:19:38 UTC; Fra |
Author: | Francesco Vidoli [aut, cre], Roberto Benedetti [aut] |
Repository: | CRAN |
Date/Publication: | 2025-06-11 13:40:02 UTC |
Spatial clusterwise regression by an iterated spatially weighted regression algorithm
Description
This function implements a spatial clusterwise regression based on the procedure suggested by Andreano et al. (2017) and Bille' et al. (2017).
Usage
Awsreg(data, coly, colx, kernel, kernel2, coords, bw, tau, niter, conv, eta, numout, sout)
Arguments
data |
A data.frame. |
coly |
The dependent variable in the c("y_ols") form. |
colx |
The covariates in the c("x1","x2") form. |
kernel |
Kernel function used to weight the distances between units (default is "bisquare"; other values: "exponential", "gaussian", "tricube").
kernel2 |
Kernel function used to weight the distances between units in the second step (default is "gaussian"; other value: "exponential").
coords |
The coordinates in terms of longitude and latitude. |
bw |
The bandwidth parameter of the initial weights. |
tau |
The confidence test parameter of the difference between regression parameters. |
niter |
The maximum number of iterations. |
conv |
The smallest accepted difference between the weights in two successive iterations. |
eta |
The weight given to the previous iteration's weights in the moving average that updates the weights.
numout |
The minimum number of areal units accepted for each cluster. |
sout |
Threshold below which a weight is treated as zero. Used mainly to control clusters consisting of too few areal units.
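As a rough illustration of the weighting arguments, the four kernels named above can be written in their usual textbook forms, and eta can be read as the share of the previous weights kept in a moving average. This is a minimal sketch under those assumptions, not the internal code of Awsreg: the helpers kernel_weight and update_weights are hypothetical, and a fixed bandwidth in distance units is assumed.

```r
# Hypothetical helper: usual textbook forms of the four kernels.
# d = distances, bw = bandwidth (assumed fixed, same units as d).
kernel_weight <- function(d, bw, kernel = "bisquare") {
  u <- d / bw
  switch(kernel,
    gaussian    = exp(-0.5 * u^2),
    exponential = exp(-u),
    bisquare    = ifelse(u < 1, (1 - u^2)^2, 0),
    tricube     = ifelse(u < 1, (1 - u^3)^3, 0),
    stop("unknown kernel: ", kernel)
  )
}

# Hypothetical helper: moving-average update of the weights.
# Assumptions: eta is the share given to the previous iteration,
# and weights below sout are set to zero.
update_weights <- function(w_old, w_new, eta = 0.5, sout = 1e-05) {
  w <- eta * w_old + (1 - eta) * w_new
  w[w < sout] <- 0
  w
}

d  <- c(0, 1, 2, 4)
w0 <- kernel_weight(d, bw = 2, kernel = "bisquare")
w1 <- kernel_weight(d, bw = 2, kernel = "gaussian")
w  <- update_weights(w0, w1, eta = 0.5)
```

With eta close to 1 the weights change slowly between iterations; with eta close to 0 each iteration mostly replaces the previous weights.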
Details
The author thanks A.G. Bille' for her contribution to revising the original code.
Value
An object of class Awsreg with:
groups |
Estimated clusters. |
Author(s)
R. Benedetti
References
Andreano, M.S., Benedetti, R., and Postiglione, P. (2017). "Spatial regimes in regional European growth: an iterated spatially weighted regression approach", Quality & Quantity. 51, 6, 2665-2684.
Bille', A.G., Benedetti, R., and Postiglione, P. (2017). "A two-step approach to account for unobserved spatial heterogeneity", Spatial Economic Analysis, 12, 4, 452-471.
Examples
data(SimData)
SimData = SimData[1:50,]
coords = cbind(SimData$long, SimData$lat)
# Distance matrix and AIC-based adaptive bandwidth (GWmodel)
dmat<-gw.dist(coords,focus=0,p=2,theta=0,longlat=FALSE)
bw<-bw.gwr(y_ols~A+L+K,
data=SpatialPointsDataFrame(coords,SimData),
approach="AIC",kernel="bisquare",
adaptive=TRUE,p=2,theta=0,longlat=FALSE,dMat=dmat)
# Iterated spatially weighted regression
aws<-Awsreg(data=SimData,
coly=c("y_ols"),
colx=c("A","L","K"),
kernel="bisquare",
kernel2="gaussian",
coords=coords,
bw=bw,
tau=0.001,
niter=200,
conv=0.001,
eta=0.5,
numout=15,
sout=1e-05)
SimData$regimes = aws$groups
plot(lat~long,SimData,col=regimes,pch=16)
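Once a regimes column is available, one way to inspect the result is to fit a separate regression within each regime and compare the coefficients. The sketch below uses simulated toy data in place of SimData so that it runs on its own; the data frame toy and its values are illustrative assumptions, not package output.

```r
# Toy data standing in for SimData plus an estimated 'regimes' column
# (all values here are simulated for illustration only).
set.seed(1)
n <- 60
toy <- data.frame(A = runif(n, 1.5, 4), regimes = rep(1:2, each = n / 2))
toy$y_ols <- ifelse(toy$regimes == 1, 13 + 0.5 * toy$A, 5 + 0.2 * toy$A) +
  rnorm(n, sd = 0.1)

# One OLS fit per regime; rows = regimes, columns = coefficients
coef_by_regime <- t(sapply(split(toy, toy$regimes),
                           function(d) coef(lm(y_ols ~ A, data = d))))
coef_by_regime
```

Clearly different rows in coef_by_regime suggest that the groups differ in their functional relationship, which is the distinction between a regime and a plain cluster.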
Spatial clusterwise regression by a constrained version of the Simulated Annealing
Description
This function implements a spatial clusterwise regression based on the procedure suggested by Postiglione et al. (2013).
Usage
Sareg(data, coly, colx, cont, intemp, rho, niter, subit, ncl, bcont)
Arguments
data |
A data.frame.
coly |
The dependent variable in the c("y_ols") form. |
colx |
The covariates in the c("x1","x2") form. |
cont |
The contiguity matrix. |
intemp |
The initial temperature. |
rho |
The temperature decay rate parameter. |
niter |
The maximum number of iterations. |
subit |
The number of sub-iterations for each iteration. |
ncl |
The number of clusters. |
bcont |
A parameter that regulates the penalty of simulated annealing in non-contiguous configurations of the clusters. |
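To give a feel for how intemp, rho and niter interact, the sketch below assumes a geometric cooling schedule and the standard Metropolis acceptance rule. Both are common choices in simulated annealing and are assumptions here; the schedule and the contiguity penalty actually used inside Sareg are not described in this page.

```r
# Assumption: geometric cooling, temp_k = intemp * rho^(k - 1).
intemp <- 0.5
rho    <- 0.96
niter  <- 30
temps  <- intemp * rho^(0:(niter - 1))

# Metropolis rule: always accept an improvement, otherwise accept
# a worse configuration with probability exp(-delta / temp).
accept_prob <- function(delta, temp) {
  ifelse(delta <= 0, 1, exp(-delta / temp))
}

p_early <- accept_prob(delta = 0.1, temp = temps[1])      # first iteration
p_late  <- accept_prob(delta = 0.1, temp = temps[niter])  # last iteration
c(early = p_early, late = p_late)
```

Under these assumptions a rho closer to 1 cools more slowly, so worse configurations keep being accepted for longer before the search settles.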
Value
An object of class Sareg with:
groups |
Estimated clusters. |
Author(s)
R. Benedetti
References
Postiglione, P., Benedetti, R., and Andreano, M.S. (2013). "Using Constrained Optimization for the Identification of Convergence Clubs", Computational Economics, 42, 151-174.
Examples
data(SimData)
SimData = SimData[1:50,]
coords = cbind(SimData$long, SimData$lat)
# Distance-threshold contiguity matrix
dmat <-gw.dist(coords,focus=0,p=2,theta=0,longlat=FALSE)
W <- matrix(0,nrow(dmat),ncol(dmat))
W[dmat < 0.2] <- 1
diag(W) <- 0
# Constrained simulated annealing
sa <- Sareg(data=SimData,
coly = c("y_ols"),
colx = c("A"),
cont = W,
intemp=0.5,
rho=0.96,
niter=30,
subit=3,
ncl=2,
bcont=-4)
SimData$regimes = sa$groups
plot(lat~long,SimData,col=regimes,pch=16)
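Before passing a distance-threshold contiguity matrix to Sareg, it can be useful to check that it is symmetric and to see how many neighbours each unit has. The sketch below rebuilds such a matrix with base R on simulated toy coordinates; the coordinates and the 0.2 threshold are illustrative assumptions.

```r
# Toy coordinates (simulated; stand-ins for SimData$long and SimData$lat)
set.seed(1)
coords <- cbind(runif(20), runif(20))

# Euclidean distance matrix and distance-threshold contiguity matrix
dmat <- as.matrix(dist(coords))
W <- matrix(0, nrow(dmat), ncol(dmat))
W[dmat < 0.2] <- 1
diag(W) <- 0

isSymmetric(W)          # a distance threshold gives a symmetric matrix
n_neigh <- rowSums(W)   # number of neighbours per unit
which(n_neigh == 0)     # units without neighbours under this threshold
```

If many units have no neighbours, the threshold is probably too small for the scale of the coordinates.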
Simulated data for estimating spatial regimes.
Description
Simulated production-function-like data for estimating spatial regimes. The data were generated for the paper by F. Vidoli, G. Pignataro, R. Benedetti and F. Pammolli, "Spatially constrained cluster-wise regression: optimal territorial areas in Italian health care", forthcoming.
Usage
data(SimData)
Format
SimData is a simulated dataset with 500 observations and 7 variables.
- long
Longitude
- lat
Latitude
- A
Land input
- L
Labour input
- K
Capital input
- clu
Real regime
- y_ols
Production output
500 units (100 units for each of the 5 regimes) are generated. For each unit, the longitude and latitude coordinates are drawn at random from two Uniform distributions, U(0,50) and U(-70,20), respectively. The matrix of covariates includes the constant and the variables A, L and K, each drawn from U(1.5,4). Finally, a different linear function (differing in its coefficients) is set for each regime. In particular, 5 different vectors of parameters (including the intercept) are used: beta1 = (13,0.5,0.3,0.2), beta2 = (11,0.8,0.1,0.1), beta3 = (9,0.3,0.2,0.5), beta4 = (7,0.4,0.3,0.3) and beta5 = (5,0.2,0.6,0.2), together with a normally distributed error term, N(0,1).
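The generating process described above can be sketched in a few lines of base R. This is an illustrative reconstruction, not the script used to build SimData: the seed is arbitrary, and because the description does not say how regimes are assigned to locations, the sketch attaches regime labels without any spatial pattern.

```r
set.seed(123)   # arbitrary seed, for reproducibility only
n_reg <- 5
n_per <- 100
n     <- n_reg * n_per

# One coefficient vector per regime: (intercept, A, L, K)
betas <- rbind(c(13, 0.5, 0.3, 0.2),
               c(11, 0.8, 0.1, 0.1),
               c( 9, 0.3, 0.2, 0.5),
               c( 7, 0.4, 0.3, 0.3),
               c( 5, 0.2, 0.6, 0.2))

sim <- data.frame(long = runif(n, 0, 50),
                  lat  = runif(n, -70, 20),
                  A    = runif(n, 1.5, 4),
                  L    = runif(n, 1.5, 4),
                  K    = runif(n, 1.5, 4),
                  clu  = rep(seq_len(n_reg), each = n_per))

# Linear response with regime-specific coefficients and N(0, 1) errors
X <- cbind(1, sim$A, sim$L, sim$K)
sim$y_ols <- rowSums(X * betas[sim$clu, ]) + rnorm(n)

str(sim)
```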
Author(s)
Vidoli F.
References
F. Vidoli, G. Pignataro and R. Benedetti "Identification of spatial regimes of the production function of Italian hospitals through spatially constrained cluster-wise regression", Socio-Economic Planning Sciences (in press) https://doi.org/10.1016/j.seps.2022.101223
Examples
data(SimData)
Spatial constrained clusterwise regression by Spatial 'K'luster Analysis by Tree Edge Removal
Description
This function implements a spatial constrained clusterwise regression based on the Skater procedure by Assuncao et al. (2002).
Usage
SkaterF(edges, data, coly, colx, ncuts, crit, method = 1, ind_col, lat, long, tau.ch)
Arguments
edges |
A matrix with 2 columns, where each row is an edge.
data |
A data.frame with the information on the nodes.
coly |
The dependent variable in the c("y_ols") form. |
colx |
The covariates in the c("x1","x2") form. |
ncuts |
The number of cuts. |
crit |
A scalar or two-dimensional vector with criteria for groups, for example limits on group size or on population size. If scalar, it is the minimum criterion for groups.
method |
1 (default) for OLS, 2 for quantile regression, 3 for logit.
ind_col |
Not yet used in this version.
lat |
Not yet used in this version.
long |
Not yet used in this version.
tau.ch |
Chosen quantile (for method = 2). |
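The roles of edges and ncuts can be illustrated on a toy tree: removing k edges from a tree leaves k + 1 connected groups. The sketch below is a base-R illustration of that idea only; the helper components is hypothetical, and the edges to remove are picked by hand, whereas SkaterF chooses them according to the regression criterion.

```r
# Hypothetical helper: label the connected components of a graph
# given as a 2-column edge matrix over nodes 1..n.
components <- function(edges, n) {
  lab <- seq_len(n)
  repeat {
    changed <- FALSE
    for (i in seq_len(nrow(edges))) {
      m <- min(lab[edges[i, ]])
      if (any(lab[edges[i, ]] != m)) {
        lab[edges[i, ]] <- m   # propagate the smallest label along the edge
        changed <- TRUE
      }
    }
    if (!changed) break
  }
  match(lab, unique(lab))      # relabel groups as 1, 2, ...
}

# A toy tree on 6 nodes (a path): 1-2-3-4-5-6
edges <- cbind(1:5, 2:6)

# Removing 2 edges (rows 2 and 4) leaves 3 groups
groups <- components(edges[-c(2, 4), , drop = FALSE], n = 6)
groups
```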
Details
The author thanks Renato M. Assuncao and Elias T. Krainski for their original code (skater, library spdep).
Value
An object of class skaterF with:
groups |
A vector with length equal to the number of nodes. Each position identifies the group of the corresponding node.
edges.groups |
A list with length equal to the number of groups, where each element is a set of edges.
not.prune |
A vector identifying the groups that are not candidates for partitioning.
candidates |
A vector identifying the groups that are candidates for partitioning.
ssto |
The total dissimilarity in each step of edge removal. |
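One way to read a regression-based dissimilarity is as the total residual sum of squares obtained when a separate regression is fitted within each group. The sketch below assumes this interpretation for method = 1; the helper total_rss is hypothetical and is not the measure SkaterF is documented to compute, only an illustration of why a well-placed cut lowers the total.

```r
# Hypothetical helper: total within-group residual sum of squares,
# fitting one OLS regression per group (assumed cost for method = 1).
total_rss <- function(data, groups, coly, colx) {
  f <- reformulate(colx, response = coly)
  sum(sapply(split(data, groups),
             function(d) sum(resid(lm(f, data = d))^2)))
}

# Toy data (simulated): two groups with different slopes
set.seed(1)
toy <- data.frame(A = runif(40, 1.5, 4), g = rep(1:2, each = 20))
toy$y_ols <- ifelse(toy$g == 1, 1 + 2 * toy$A, 8 - toy$A) +
  rnorm(40, sd = 0.2)

rss_one <- total_rss(toy, rep(1, 40), "y_ols", "A")  # no cut: one group
rss_two <- total_rss(toy, toy$g,      "y_ols", "A")  # after one cut
c(one_group = rss_one, two_groups = rss_two)
```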
Author(s)
F. Vidoli
References
For method = 1: Vidoli, F., Pignataro, G., and Benedetti, R. (2022). "Identification of spatial regimes of the production function of Italian hospitals through spatially constrained cluster-wise regression", Socio-Economic Planning Sciences, 101223. https://doi.org/10.1016/j.seps.2022.101223
For method = 2: Vidoli, F., Sacchi, A., and Sanchez Carrera, E. (2025). "Spatial regimes in heterogeneous territories: The efficiency of local public spending", Economic Modelling. https://doi.org/10.1016/j.econmod.2025.107139
Examples
data(SimData)
coords = cbind(SimData$long, SimData$lat)
# Delaunay neighbours, edge costs and minimum spanning tree (spdep)
neighbours = tri2nb(coords, row.names = NULL)
bh.nb <- neighbours
lcosts <- nbcosts(bh.nb, SimData)
nb <- nb2listw(bh.nb, lcosts, style="B")
mst.bh <- mstree(nb,5)
edges1 = mst.bh[,1:2]
# Settings for the constrained clusterwise regression
ncuts1 = 4
crit1 = 10
coly1 = c("y_ols")
colx1 = c("A","L","K")
# OLS
sk = SkaterF(edges = edges1,
data= SimData,
coly = coly1,
colx= colx1,
ncuts=ncuts1,
crit=crit1,
method=1)
SimData$regimes = sk$groups
# plot(lat~long,SimData,col=regimes,pch=16)
## quantile 0.8
# sk2 = SkaterF(edges = edges1,
# data= SimData,
# coly = coly1,
# colx= colx1,
# ncuts=ncuts1,
# crit=crit1,
# method=2,tau.ch=0.8)
#
# SimData$regimes_q = sk2$groups
# plot(lat~long,SimData,col=regimes_q,pch=16)