The smoothedIPW package implements methods to estimate
effects of generalized time-varying treatment strategies on the mean of
an outcome at one or more selected follow-up times of interest. The
package allows for treatment strategies with the following
components:
The package considers the setting where outcomes may be repeatedly, non-monotonically, informatively, and sparsely measured in the data source. The package also supports settings where outcomes are truncated by death, i.e., some individuals die during follow-up, which renders the outcome of interest undefined at the follow-up time of interest.
Specifically, this package implements the time-smoothed inverse probability weighted (IPW) methods described in McGrath et al. (2025). Time-smoothing refers to using outcome measurements at intermediate time-points in order to gain precision. In settings with truncation by death, two different types of approaches for time-smoothing are available (i.e., the stacked and nonstacked methods), which rely on different model assumptions. Further details are given in McGrath et al. (2025).
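To build intuition for why pooling measurements across time points can improve precision, here is a minimal base R sketch on simulated data. It is not the estimator implemented in the package: there is no treatment, no weighting, and no death. The linear-in-time outcome model and all object names (dat, measured, est_point, est_smooth) are assumptions made only for this illustration.

```r
set.seed(1)

# Simulated long-format data: 500 individuals, time points 0..24 (all values made up)
n <- 500
times <- 0:24
dat <- expand.grid(id = seq_len(n), time = times)

# Sparse measurement: each record has a 10% chance of an outcome measurement
dat$R <- rbinom(nrow(dat), 1, 0.1)

# Outcome is only observed when R == 1; its true mean is 1 + 0.05 * time
dat$Y <- ifelse(dat$R == 1, 1 + 0.05 * dat$time + rnorm(nrow(dat)), NA)

measured <- dat[dat$R == 1, ]
t_star <- 12

# Non-smoothed estimate: uses only the measurements taken exactly at t_star
est_point <- mean(measured$Y[measured$time == t_star])

# Smoothed estimate: fit an (assumed) outcome model across all measured
# time points, then predict the mean at t_star
fit <- lm(Y ~ time, data = measured)
est_smooth <- unname(predict(fit, newdata = data.frame(time = t_star)))
```

Because est_smooth borrows information from every measured time point, it is typically less variable than est_point when measurements are sparse, at the price of relying on the assumed outcome model being correct.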
You can install the development version of smoothedIPW
from GitHub with:
# install.packages("devtools")
devtools::install_github("stmcg/smoothedIPW")
We first load the package.
library(smoothedIPW)
We will estimate the effect of treatment strategies with the following three components:
We denote the follow-up time of interest by \(t^*\) and consider \(t^* \in \{6, 12, 18, 24\}\). In this first example, there are no deaths over the study period; the second example considers a setting with deaths.
We will use the example data set data_null which
contains longitudinal data on 1,000 individuals over 25 time points.
This data set was generated so that the treatment has no effect on the
outcome at all time points. The data set data_null contains
the following columns:
id: Participant ID
time: Follow-up time index
L: Time-varying covariate
Z: Medication initiated at baseline
A: Adherence to the medication initiated at baseline
R: Indicator of outcome measurement
Y: Outcome
The first 10 rows of data_null are:
data_null[1:10,]
#> id time L Z A R Y
#> <num> <int> <int> <int> <num> <int> <num>
#> 1: 1 0 1 0 1 0 NA
#> 2: 1 1 0 0 1 0 NA
#> 3: 1 2 1 0 1 0 NA
#> 4: 1 3 0 0 1 0 NA
#> 5: 1 4 0 0 1 1 -4.367446
#> 6: 1 5 1 0 1 0 NA
#> 7: 1 6 1 0 1 0 NA
#> 8: 1 7 1 0 1 0 NA
#> 9: 1 8 0 0 0 0 NA
#> 10: 1 9 0 0 1 0 NA
The package generally expects users to follow these naming conventions for the columns of the observed data set. The columns for the time-varying covariate(s) are an exception: they can have any names.
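As a rough illustration of this long format (one row per individual per time point), the sketch below builds a tiny data set by hand using the column names listed above. All values are made up, and the object name toy is hypothetical. The printed output above suggests that the shipped data sets are data.table objects; whether the package functions also accept a plain data.frame like this one is not confirmed here.

```r
# Toy long-format data: 2 individuals, 3 time points each (made-up values)
toy <- data.frame(
  id   = rep(1:2, each = 3),          # participant ID
  time = rep(0:2, times = 2),         # follow-up time index
  L    = c(1, 0, 1, 0, 0, 1),         # time-varying covariate
  Z    = rep(c(0, 1), each = 3),      # medication initiated at baseline
  A    = c(1, 1, 0, 1, 1, 1),         # adherence in each interval
  R    = c(0, 1, 0, 1, 0, 0),         # outcome measurement indicator
  Y    = c(NA, -1.2, NA, 0.4, NA, NA) # outcome, missing when R == 0
)

# Consistency checks one might run before analysis:
# the outcome is missing exactly when it was not measured
stopifnot(all(is.na(toy$Y) == (toy$R == 0)))
# the baseline medication does not change within an individual
stopifnot(all(tapply(toy$Z, toy$id, function(z) length(unique(z)) == 1)))
```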
We first need to add some variables to the data set before applying inverse probability weighting. Specifically, we need to add:
C_artificial: An indicator specifying when an individual should be artificially censored from the data
A_model_eligible: An indicator specifying what records should be used for fitting the treatment adherence model
We will also need to add columns for the baseline value of the
time-varying covariates. In our case, we will add a column
L_baseline for the baseline value of L.
These columns can be added by the prep_data function, as
shown below:
data_null_processed <- prep_data(data = data_null, grace_period_length = 2,
                                 baseline_vars = 'L')
To see the added columns, let us inspect the processed data set for individual
id = 2 in the first 10 time intervals:
data_null_processed[id == 2 & time < 10]
#> id time L Z A R Y A_model_eligible
#> <num> <int> <int> <int> <num> <int> <num> <num>
#> 1: 2 0 1 0 1 1 -6.896476 0
#> 2: 2 1 1 0 1 0 NA 0
#> 3: 2 2 1 0 1 1 -10.390042 0
#> 4: 2 3 1 0 1 0 NA 0
#> 5: 2 4 1 0 1 0 NA 0
#> 6: 2 5 1 0 1 0 NA 0
#> 7: 2 6 1 0 0 0 NA 0
#> 8: 2 7 0 0 0 1 -4.286635 0
#> 9: 2 8 1 0 0 1 11.385070 1
#> 10: 2 9 0 0 1 0 NA 0
#> C_artificial L_baseline
#> <num> <int>
#> 1: 0 1
#> 2: 0 1
#> 3: 0 1
#> 4: 0 1
#> 5: 0 1
#> 6: 0 1
#> 7: 0 1
#> 8: 0 1
#> 9: 1 1
#> 10: 1 1
Observe that A_model_eligible becomes 1 when
time = 8 because the individual is at the end of their
grace period (i.e., has already gone two consecutive intervals without
adhering to the medication); C_artificial switches to 1 in
this time interval because the individual did not adhere to the
medication in this interval (the end of their grace period), thus
violating the treatment strategy of interest.
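The rule described above can be sketched in a few lines of base R. This is only an illustration of the logic as stated in the text, not the implementation used by prep_data: the function name grace_flags and its arguments are hypothetical, and it handles a single individual's adherence history.

```r
# Hypothetical helper: derive the two indicators from one individual's
# adherence history A (1 = adhered, 0 = did not) and grace period length m.
grace_flags <- function(A, m) {
  n <- length(A)
  eligible <- integer(n)
  censored <- integer(n)
  run <- 0L              # consecutive non-adherent intervals before the current one
  is_censored <- FALSE
  for (k in seq_len(n)) {
    if (!is_censored && run == m) {
      # End of the grace period: this record informs the adherence model
      eligible[k] <- 1L
      # Failing to adhere now violates the strategy
      if (A[k] == 0) is_censored <- TRUE
    }
    censored[k] <- as.integer(is_censored)
    run <- if (A[k] == 0) run + 1L else 0L
  }
  data.frame(time = seq_len(n) - 1L, A = A,
             A_model_eligible = eligible, C_artificial = censored)
}

# Adherence history of individual id = 2 at times 0..9, taken from the output above
flags <- grace_flags(A = c(1, 1, 1, 1, 1, 1, 0, 0, 0, 1), m = 2)
```

For this adherence history, the sketch reproduces the A_model_eligible and C_artificial columns printed above: both switch to 1 at time = 8, and only C_artificial remains 1 at time = 9.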
We will use the time-smoothed IPW method, which is implemented in the
ipw function. This method involves specifying the following
models:
A_model: Treatment adherence model
R_model_denominator: Outcome measurement indicator model (used in the denominator of weights)
R_model_numerator: (Optional) Outcome measurement indicator model (used in the numerator of weights for stabilization)
Y_model: Outcome (marginal structural) model
An example application of ipw is below:
res_est <- ipw(data = data_null_processed,
               time_smoothed = TRUE,
               outcome_times = c(6, 12, 18, 24),
               A_model = A ~ L + Z,
               R_model_denominator = R ~ L + A + Z,
               R_model_numerator = R ~ L_baseline + Z,
               Y_model = Y ~ L_baseline * (time + Z))
The estimated counterfactual outcome mean for each medication at each follow-up time of interest (\(t^*\)) is given below.
res_est
#>
#> =======================================================================
#> Point Estimates: Counterfactual Outcome Mean
#> =======================================================================
#>
#> Settings:
#> -----------------------------------------------------------------------
#> Method: Time-smoothed IPW
#> Outcome times: 6, 12, 18, 24
#>
#> Estimates:
#> -----------------------------------------------------------------------
#> time Z=0 Z=1
#> 6 0.007124865 -0.03536354
#> 12 -0.019445369 -0.06193377
#> 18 -0.046015603 -0.08850401
#> 24 -0.072585838 -0.11507424
To obtain 95% confidence intervals around our estimates, we can apply
the get_CI function. It constructs percentile-based
bootstrap confidence intervals using n_boot bootstrap
replicates. We use only 10 bootstrap replicates here for ease of computation; in practice, many more replicates are typically needed for stable intervals.
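For readers unfamiliar with percentile-based bootstrap intervals, the following base R sketch shows the general idea on simulated data. It is a generic illustration, not the internals of get_CI. The estimator (a simple mean), the object names, and the choice to resample whole individuals rather than single rows are assumptions. Resampling individuals is a common choice for longitudinal data, but the text does not state how get_CI resamples.

```r
set.seed(1234)

# Simulated long-format data: 100 individuals with 3 records each (made up)
dat <- data.frame(id = rep(1:100, each = 3), y = rnorm(300))

# Stand-in estimator; in a real analysis this would refit all models
estimate <- function(d) mean(d$y)

ids <- unique(dat$id)
rows_by_id <- split(seq_len(nrow(dat)), dat$id)

n_boot <- 200
boot_est <- replicate(n_boot, {
  # Resample individuals with replacement and keep all of their records
  ids_star <- sample(ids, replace = TRUE)
  rows <- unlist(rows_by_id[as.character(ids_star)], use.names = FALSE)
  estimate(dat[rows, ])
})

# Percentile interval: the 2.5% and 97.5% quantiles of the bootstrap estimates
ci <- quantile(boot_est, probs = c(0.025, 0.975))
```

In the example here, the intervals are obtained from get_CI itself, as in the call that follows.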
set.seed(1234)
res_ci <- get_CI(res_est, data = data_null_processed, n_boot = 10)
res_ci
#>
#> =======================================================================
#> Confidence Intervals: Counterfactual Outcome Mean
#> =======================================================================
#>
#> Settings:
#> -----------------------------------------------------------------------
#> Method: Time-smoothed IPW
#> Outcome times: 6, 12, 18, 24
#> Bootstrap samples: 10
#> Confidence level: 95%
#>
#> Confidence Intervals:
#> -----------------------------------------------------------------------
#>
#> Outcome Mean under Z = 0:
#> Time Estimate CI Lower CI Upper
#> 6 0.007124865 -0.1774043 0.13580696
#> 12 -0.019445369 -0.2534594 0.08688074
#> 18 -0.046015603 -0.3295146 0.04689198
#> 24 -0.072585838 -0.4055698 0.05887030
#>
#> Outcome Mean under Z = 1:
#> Time Estimate CI Lower CI Upper
#> 6 -0.03536354 -0.2027095 0.05405786
#> 12 -0.06193377 -0.2137749 0.02011806
#> 18 -0.08850401 -0.2910591 0.03815230
#> 24 -0.11507424 -0.4132964 0.06315579
We next consider an example where some participants die during follow-up. We consider the same treatment strategies as in the first example.
We use the example data set data_null_deaths, which is
similar to data_null but includes deaths during follow-up.
This results in fewer total observations (21,713 vs 25,000) because
individuals who die have no records after their death time. The data set
contains an additional column:
D: Indicator of whether death occurred at that time point
The rows of data_null_deaths for one individual who died
at time 5 are shown below:
data_null_deaths[id == 151,]
#> id time L Z A R Y D
#> <num> <int> <int> <int> <num> <int> <num> <num>
#> 1: 151 0 0 0 1 0 NA 0
#> 2: 151 1 0 0 1 0 NA 0
#> 3: 151 2 0 0 1 1 -2.826145 0
#> 4: 151 3 1 0 0 0 NA 0
#> 5: 151 4 1 0 1 1 -8.082282 0
#> 6: 151 5 1 0 1 NA NA 1
We prepare the data set in the same way as before using
prep_data:
data_null_deaths_processed <- prep_data(data = data_null_deaths, grace_period_length = 2, baseline_vars = 'L')
When deaths are present, we can choose between two different
time-smoothing methods: the nonstacked method and the stacked method. Users
can specify the smoothing method via the smoothing_method
argument (options: 'nonstacked' and 'stacked')
in the ipw function.
res_est_deaths <- ipw(data = data_null_deaths_processed,
                      time_smoothed = TRUE,
                      smoothing_method = 'nonstacked',
                      outcome_times = c(6, 12, 18, 24),
                      A_model = A ~ L + Z,
                      R_model_denominator = R ~ L + A + Z,
                      R_model_numerator = R ~ L_baseline + Z,
                      Y_model = Y ~ L_baseline * (time + Z))
The estimated counterfactual outcome mean for each medication at each follow-up time of interest is given below.
res_est_deaths
#>
#> =======================================================================
#> Point Estimates: Counterfactual Outcome Mean
#> =======================================================================
#>
#> Settings:
#> -----------------------------------------------------------------------
#> Method: Time-smoothed IPW (nonstacked)
#> Outcome times: 6, 12, 18, 24
#>
#> Estimates:
#> -----------------------------------------------------------------------
#> time Z=0 Z=1
#> 6 0.03219486 -0.2047087
#> 12 -0.33221811 -0.3572665
#> 18 -0.38437663 -0.3460489
#> 24 -0.44631635 -0.3221055
Confidence intervals can be obtained using the bootstrap in the same way as in the case without deaths:
set.seed(1234)
res_ci_deaths <- get_CI(res_est_deaths, data = data_null_deaths_processed, n_boot = 10)
res_ci_deaths
#>
#> =======================================================================
#> Confidence Intervals: Counterfactual Outcome Mean
#> =======================================================================
#>
#> Settings:
#> -----------------------------------------------------------------------
#> Method: Time-smoothed IPW (nonstacked)
#> Outcome times: 6, 12, 18, 24
#> Bootstrap samples: 10
#> Confidence level: 95%
#>
#> Confidence Intervals:
#> -----------------------------------------------------------------------
#>
#> Outcome Mean under Z = 0:
#> Time Estimate CI Lower CI Upper
#> 6 0.03219486 -0.5188776 0.37035189
#> 12 -0.33221811 -0.5737509 -0.02752269
#> 18 -0.38437663 -0.6908641 -0.02397294
#> 24 -0.44631635 -0.6551153 -0.22480044
#>
#> Outcome Mean under Z = 1:
#> Time Estimate CI Lower CI Upper
#> 6 -0.2047087 -0.7996861 0.18012098
#> 12 -0.3572665 -0.7339370 -0.13441412
#> 18 -0.3460489 -0.6605846 -0.05504764
#> 24 -0.3221055 -0.7281114 -0.07134614
If you use smoothedIPW in your research, please
cite:
McGrath S, Kawahara T, Petimar J, Rifas-Shiman SL, Díaz I, Block JP, Young JG. (2025). Time-smoothed inverse probability weighted estimation of effects of generalized time-varying treatment strategies on repeated outcomes truncated by death. arXiv preprint arXiv:2509.13971.
BibTeX entry:
@article{mcgrath2025time,
title={Time-smoothed inverse probability weighted estimation of effects of generalized time-varying treatment strategies on repeated outcomes truncated by death},
author={McGrath, Sean and Kawahara, Takuya and Petimar, Joshua and Rifas-Shiman, Sheryl L and D{\'\i}az, Iv{\'a}n and Block, Jason P and Young, Jessica G},
journal={arXiv preprint arXiv:2509.13971},
year={2025}
}