unsurv: Unsupervised Clustering of Individualized Survival Curves

License: MIT

unsurv provides tools for unsupervised clustering of individualized survival curves using a medoid-based (PAM) algorithm. It is designed for applications where each observation is represented by a full survival probability trajectory over time, such as:

The package provides:

Installation

From GitHub

install.packages("remotes")
remotes::install_github("ielbadisy/unsurv")

From CRAN (after submission)

install.packages("unsurv")

Overview

The core function is:

unsurv()

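which clusters survival curves represented as an n × m matrix, with one row per individual and one column per time point (the printed output below reports the number of time points as Q):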

Example: Clustering survival curves

library(unsurv)

set.seed(123)

n <- 100
Q <- 50
times <- seq(0, 5, length.out = Q)

rates <- c(0.2, 0.5, 0.9)
group <- sample(1:3, n, TRUE)

S <- sapply(times, function(t)
  exp(-rates[group] * t)
)

S <- S + matrix(rnorm(n * Q, 0, 0.01), nrow = n)
S[S < 0] <- 0
S[S > 1] <- 1

fit <- unsurv(S, times, K = NULL, K_max = 6)

fit
#> unsurv (PAM) fit
#>   K:3
#>   distance:L2 silhouette_mean:0.915
#>   n:100 Q:50

Plot cluster medoids

plot(fit)

Each line represents the medoid survival curve for a cluster.

Predict cluster membership for new curves

predict(fit, S[1:5, ])
#> [1] 1 1 1 2 1
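
For a medoid-based model, a natural way to label new curves is to assign each one to its closest medoid. The README does not spell out the rule that predict() uses, so the following is only a sketch under that assumption; assign_to_medoids() is a hypothetical helper written in base R and is not part of unsurv.

```r
# Hypothetical helper, not part of unsurv: assign each new curve to the
# closest medoid curve under the L2 (Euclidean) distance.
assign_to_medoids <- function(new_curves, medoids) {
  new_curves <- as.matrix(new_curves)
  medoids <- as.matrix(medoids)
  stopifnot(ncol(new_curves) == ncol(medoids))
  # d[i, k] = squared L2 distance between new curve i and medoid k
  d <- vapply(
    seq_len(nrow(medoids)),
    function(k) rowSums(sweep(new_curves, 2, medoids[k, ])^2),
    numeric(nrow(new_curves))
  )
  d <- matrix(d, nrow = nrow(new_curves))
  # Index of the smallest distance in each row
  max.col(-d, ties.method = "first")
}

# Toy data: two exponential medoid curves and two new curves
times <- seq(0, 5, length.out = 50)
medoids <- rbind(exp(-0.2 * times), exp(-0.9 * times))
new_curves <- rbind(exp(-0.25 * times), exp(-0.8 * times))
labels <- assign_to_medoids(new_curves, medoids)
```

Squared distances are enough here because only the ranking of medoids matters, not the distance values themselves.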

Stability assessment

Cluster stability can be evaluated using resampling:

stab <- unsurv_stability(
  S, times, fit,
  B = 20,
  frac = 0.7,
  mode = "subsample"
)

stab$mean
#> [1] 0.9384743

Higher values indicate more stable clustering.
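
To make the idea of subsample-based stability concrete, here is a minimal sketch that reclusters random subsets and compares the labels with a reference clustering. It assumes a plain Rand index as the agreement score and uses cluster::pam() directly; the README does not say which score unsurv_stability() computes, and rand_index() and subsample_stability() are hypothetical names.

```r
library(cluster)  # provides pam()

# Plain Rand index between two label vectors (hypothetical helper):
# the share of observation pairs on which the two labelings agree.
rand_index <- function(a, b) {
  same_a <- outer(a, a, "==")
  same_b <- outer(b, b, "==")
  pairs <- upper.tri(same_a)
  mean(same_a[pairs] == same_b[pairs])
}

# Recluster a fraction of the rows B times and compare each result with
# the reference labels restricted to the same rows.
subsample_stability <- function(X, k, B = 20, frac = 0.7) {
  ref <- pam(dist(X), k = k)$clustering
  n <- nrow(X)
  scores <- vapply(seq_len(B), function(b) {
    idx <- sample.int(n, size = floor(frac * n))
    sub <- pam(dist(X[idx, , drop = FALSE]), k = k)$clustering
    rand_index(ref[idx], sub)
  }, numeric(1))
  list(scores = scores, mean = mean(scores))
}

# Toy data: two well-separated groups of exponential survival curves
set.seed(1)
times <- seq(0, 5, length.out = 20)
rates <- rep(c(0.2, 0.9), each = 15)
X <- t(sapply(rates, function(r) exp(-r * times))) + rnorm(30 * 20, 0, 0.01)
stab_sketch <- subsample_stability(X, k = 2, B = 10)
stab_sketch$mean
```

The Rand index is invariant to label permutations, so cluster numbers do not need to be matched between the reference fit and each subsample fit.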

Visualization with ggplot2

library(ggplot2)
library(unsurv)
autoplot(fit)

Methodological details

Given survival curves:

\[ S_i(t_1), S_i(t_2), \dots, S_i(t_m) \]

the algorithm:

  1. Optionally enforces monotonicity
  2. Applies weighted feature transformation
  3. Computes pairwise distances (L1 or L2)
  4. Applies PAM clustering
  5. Selects optimal K via silhouette (if not specified)

Cluster medoids represent prototype survival profiles.
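
The five steps above can be sketched with base R and cluster::pam(). This is an illustration under stated assumptions, not the package's implementation: cluster_curves_sketch() and its arguments are hypothetical, monotonicity is enforced with a running minimum, and the weighting is assumed to be one non-negative weight per time point.

```r
library(cluster)  # provides pam()

# Sketch only: mirrors the five listed steps; not the unsurv source code.
cluster_curves_sketch <- function(S, K_max = 6, weights = NULL,
                                  distance = c("L2", "L1")) {
  distance <- match.arg(distance)
  S <- as.matrix(S)
  # 1. Enforce monotonicity: running minimum along each row, so every
  #    curve is non-increasing in time.
  S <- t(apply(S, 1, cummin))
  # 2. Weighted feature transformation (assumption: per-time-point weights).
  if (is.null(weights)) weights <- rep(1, ncol(S))
  scale_by <- if (distance == "L2") sqrt(weights) else weights
  X <- sweep(S, 2, scale_by, "*")
  # 3. Pairwise distances between curves.
  D <- dist(X, method = if (distance == "L2") "euclidean" else "manhattan")
  # 4-5. Run PAM for each candidate K and keep the fit with the highest
  #      mean silhouette width.
  fits <- lapply(2:K_max, function(k) pam(D, k = k))
  sil <- vapply(fits, function(f) f$silinfo$avg.width, numeric(1))
  best <- fits[[which.max(sil)]]
  list(K = length(best$id.med),
       clusters = best$clustering,
       medoids = S[best$id.med, , drop = FALSE],
       silhouette_mean = max(sil))
}

# Simulated curves in the style of the example earlier in this README
set.seed(123)
times <- seq(0, 5, length.out = 50)
group <- sample(1:3, 60, TRUE)
S_sim <- sapply(times, function(t) exp(-c(0.2, 0.5, 0.9)[group] * t))
S_sim <- pmin(pmax(S_sim + rnorm(length(S_sim), 0, 0.01), 0), 1)
sk <- cluster_curves_sketch(S_sim, K_max = 6)
sk$K
```

Scaling by the square root of the weights in the L2 case makes the Euclidean distance on the transformed curves equal to a weighted L2 distance on the original ones; in the L1 case the weights apply directly.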

Typical workflow

fit <- unsurv(S, times)

clusters <- fit$clusters

pred <- predict(fit, new_S)

stab <- unsurv_stability(S, times, fit)

Vignette

A full walkthrough (simulation, fitting, visualization, prediction, stability) is available in the vignette:

vignette("unsurv-intro", package = "unsurv")

Applications

unsurv is useful for:

Relationship to other methods

Unlike clustering on covariates, unsurv clusters on the survival function itself, enabling:

Package structure

Core functions:

Function            Description
unsurv              fit clustering model
predict             assign new curves
plot                visualize medoids
summary             summarize clustering
unsurv_stability    evaluate stability
autoplot            ggplot visualization

Citation

If you use unsurv, please cite:

citation("unsurv")
#> To cite package 'unsurv' in publications use:
#> 
#>   EL BADISY I (2026). _unsurv: Unsupervised Clustering of
#>   Individualized Survival Curves_. R package version 0.1.0.
#> 
#> A BibTeX entry for LaTeX users is
#> 
#>   @Manual{,
#>     title = {unsurv: Unsupervised Clustering of Individualized Survival Curves},
#>     author = {Imad {EL BADISY}},
#>     year = {2026},
#>     note = {R package version 0.1.0},
#>   }

Development status

Links

GitHub: https://github.com/ielbadisy/unsurv

License

MIT License © Imad EL BADISY
