smoothedIPW

Description

The smoothedIPW package implements methods to estimate effects of generalized time-varying treatment strategies on the mean of an outcome at one or more selected follow-up times of interest. The package allows for treatment strategies with the following components:

The package considers the setting where outcomes may be repeatedly, non-monotonically, informatively, and sparsely measured in the data source. The package also supports settings where outcomes are truncated by death, i.e. some individuals die during follow-up which renders the outcome of interest undefined at the follow-up time of interest.

Specifically, this package implements the time-smoothed inverse probability weighted (IPW) methods described in McGrath et al. (2025). Time-smoothing refers to using outcome measurements at intermediate time-points in order to gain precision. In settings with truncation by death, two different types of approaches for time-smoothing are available (i.e., the stacked and nonstacked methods), which rely on different model assumptions. Further details are given in McGrath et al. (2025).
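
As rough intuition for why pooling over time can help, the following R sketch compares two ways of estimating an outcome mean at \(t^* = 12\) in simulated, sparsely measured data: averaging only the outcomes measured at time 12, and fitting a model for the outcome over time to all measured outcomes and predicting at time 12. The simulated data, the variable names, and the unweighted linear model are assumptions made for illustration; this is not the weighted estimator implemented in the package.

```r
set.seed(42)

# Simulated long-format data (hypothetical): 500 individuals, times 0..24,
# with the outcome measured in roughly 10% of intervals
n <- 500
toy <- expand.grid(id = seq_len(n), time = 0:24)
toy$R <- rbinom(nrow(toy), 1, 0.1)
toy$Y <- ifelse(toy$R == 1,
                1 + 0.05 * toy$time + rnorm(nrow(toy), sd = 5),
                NA)

t_star <- 12

# Unsmoothed: use only outcomes measured exactly at t_star
at_t_star <- toy[toy$time == t_star & toy$R == 1, ]
unsmoothed <- mean(at_t_star$Y)

# Smoothed: pool measured outcomes from all times through a model in time
# (lm drops rows with missing Y by default)
fit <- lm(Y ~ time, data = toy)
smoothed <- unname(predict(fit, newdata = data.frame(time = t_star)))

# Number of outcome measurements each approach draws on
n_used <- c(unsmoothed = nrow(at_t_star), smoothed = nobs(fit))
n_used
```

The pooled fit draws on measurements from every time point, which is the source of the precision gain, at the cost of relying on the assumed model for the outcome over time.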

Installation

You can install the development version of smoothedIPW from GitHub with:

# install.packages("devtools")
devtools::install_github("stmcg/smoothedIPW")

Example 1: No Deaths

We first load the package.

library(smoothedIPW)

We will estimate the effect of treatment strategies with the following three components:

We consider the follow-up times of interest \(t^* \in \{6, 12, 18, 24\}\). In this example, we assume that no deaths occur over the study period. The second example considers a setting with deaths over the study period.

Data Set

We will use the example data set data_null which contains longitudinal data on 1,000 individuals over 25 time points. This data set was generated so that the treatment has no effect on the outcome at all time points. The data set data_null contains the following columns:

The first 10 rows of data_null are:

data_null[1:10,]
#>        id  time     L     Z     A     R         Y
#>     <num> <int> <int> <int> <num> <int>     <num>
#>  1:     1     0     1     0     1     0        NA
#>  2:     1     1     0     0     1     0        NA
#>  3:     1     2     1     0     1     0        NA
#>  4:     1     3     0     0     1     0        NA
#>  5:     1     4     0     0     1     1 -4.367446
#>  6:     1     5     1     0     1     0        NA
#>  7:     1     6     1     0     1     0        NA
#>  8:     1     7     1     0     1     0        NA
#>  9:     1     8     0     0     0     0        NA
#> 10:     1     9     0     0     1     0        NA

The package generally expects users to follow these naming conventions for the columns of the observed data set. The columns for the time-varying covariate(s) are the exception: they can take any names.

Applying IPW

Preparing the data set

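We first need to add some variables to the data set before applying inverse probability weighting. Specifically, we need to add the columns A_model_eligible and C_artificial, both of which appear in the processed data set shown later in this section.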

We will also need to add columns for the baseline value of the time-varying covariates. In our case, we will add a column L_baseline for the baseline value of L.
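
For intuition, a baseline column of this kind can be built by hand in base R. The sketch below uses a small toy data set with hypothetical values and assumes that each individual's first record (time 0) holds the baseline value; it illustrates the idea only and is not the code used inside the package.

```r
# Toy long-format data (hypothetical values): one row per individual and interval
toy <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2),
  time = c(0, 1, 2, 0, 1, 2),
  L    = c(1, 0, 1, 0, 0, 1)
)

# Sort by individual and time so the first row per id is the baseline record
toy <- toy[order(toy$id, toy$time), ]

# Copy each individual's time-0 value of L onto all of their rows
toy$L_baseline <- ave(toy$L, toy$id, FUN = function(x) x[1])
toy
```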

These columns can be added by the prep_data function, as shown below:

data_null_processed <- prep_data(data = data_null, grace_period_length = 2,
                                 baseline_vars = 'L')

To see the added columns, let us inspect the processed data set for individual id = 2 in the first 10 time intervals:

data_null_processed[id == 2 & time < 10]
#>        id  time     L     Z     A     R          Y A_model_eligible
#>     <num> <int> <int> <int> <num> <int>      <num>            <num>
#>  1:     2     0     1     0     1     1  -6.896476                0
#>  2:     2     1     1     0     1     0         NA                0
#>  3:     2     2     1     0     1     1 -10.390042                0
#>  4:     2     3     1     0     1     0         NA                0
#>  5:     2     4     1     0     1     0         NA                0
#>  6:     2     5     1     0     1     0         NA                0
#>  7:     2     6     1     0     0     0         NA                0
#>  8:     2     7     0     0     0     1  -4.286635                0
#>  9:     2     8     1     0     0     1  11.385070                1
#> 10:     2     9     0     0     1     0         NA                0
#>     C_artificial L_baseline
#>            <num>      <int>
#>  1:            0          1
#>  2:            0          1
#>  3:            0          1
#>  4:            0          1
#>  5:            0          1
#>  6:            0          1
#>  7:            0          1
#>  8:            0          1
#>  9:            1          1
#> 10:            1          1

Observe that A_model_eligible becomes 1 at time = 8 because the individual is at the end of their grace period (i.e., has already gone two consecutive intervals without adhering to the medication). C_artificial switches to 1 in this time interval because the individual did not adhere to the medication in this interval (the end of their grace period), thus violating the treatment strategy of interest.
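
The grace-period logic described above can be sketched in base R. The helper flag_grace_period below is hypothetical and is not part of the package; its rules are inferred from the rows shown for individual id = 2 (eligible once the preceding grace_period_length intervals were all non-adherent, censored from the first eligible interval with non-adherence onward), so it should be read as an approximation of the idea rather than a description of prep_data.

```r
# Hypothetical helper: flags the end of a grace period and artificial censoring
# for one individual's adherence history A (1 = adherent, 0 = non-adherent)
flag_grace_period <- function(A, grace_period_length) {
  n <- length(A)
  eligible <- integer(n)
  censored <- integer(n)
  run <- 0L              # consecutive non-adherent intervals just before t
  is_censored <- FALSE
  for (t in seq_len(n)) {
    if (is_censored) {   # once censored, stay censored
      censored[t] <- 1L
      next
    }
    if (run >= grace_period_length) {
      eligible[t] <- 1L  # grace period is used up: adherence is required now
      if (A[t] == 0) {
        is_censored <- TRUE
        censored[t] <- 1L
      }
    }
    run <- if (A[t] == 0) run + 1L else 0L
  }
  data.frame(time = seq_len(n) - 1L, A = A,
             A_model_eligible = eligible, C_artificial = censored)
}

# Adherence history of individual id = 2 at times 0 to 9, copied from the output above
A_id2 <- c(1, 1, 1, 1, 1, 1, 0, 0, 0, 1)
flags <- flag_grace_period(A_id2, grace_period_length = 2)
flags[flags$time >= 6, ]
```

For this history the sketch sets A_model_eligible to 1 only at time 8 and C_artificial to 1 at times 8 and 9, matching the processed rows shown above.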

Point estimation

We will use the time-smoothed IPW method, which is implemented in the ipw function. This method involves specifying the following models:

An example application of ipw is below:

res_est <- ipw(data = data_null_processed,
               time_smoothed = TRUE,
               outcome_times = c(6, 12, 18, 24),
               A_model = A ~ L + Z,
               R_model_denominator = R ~ L + A + Z,
               R_model_numerator = R ~ L_baseline + Z,
               Y_model = Y ~ L_baseline * (time + Z))

The estimated counterfactual outcome mean for each medication at each follow-up time of interest (\(t^*\)) is given below.

res_est
#> 
#> =======================================================================
#>   Point Estimates: Counterfactual Outcome Mean
#> =======================================================================
#> 
#> Settings:
#> -----------------------------------------------------------------------
#>   Method: Time-smoothed IPW
#>   Outcome times: 6, 12, 18, 24
#> 
#> Estimates:
#> -----------------------------------------------------------------------
#>  time          Z=0         Z=1
#>     6  0.007124865 -0.03536354
#>    12 -0.019445369 -0.06193377
#>    18 -0.046015603 -0.08850401
#>    24 -0.072585838 -0.11507424
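
Because data_null was generated with no treatment effect, it can be useful to look at the estimated difference between the two columns. The sketch below copies the printed estimates into a data frame by hand; the object est and its column names are made up for illustration and do not reflect the structure of the object returned by ipw.

```r
# Point estimates copied from the printed output above
est <- data.frame(
  time = c(6, 12, 18, 24),
  Z0   = c(0.007124865, -0.019445369, -0.046015603, -0.072585838),
  Z1   = c(-0.03536354, -0.06193377, -0.08850401, -0.11507424)
)

# Estimated difference in counterfactual outcome means (Z = 1 minus Z = 0)
est$difference <- est$Z1 - est$Z0
est
```

The difference is about -0.042 at every follow-up time. It is the same across times up to rounding, which is what one would expect when the outcome model contains no interaction between time and Z.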

Interval estimation

To obtain 95% confidence intervals around our estimates, we can apply the get_CI function. It constructs percentile-based bootstrap confidence intervals using n_boot bootstrap replicates. We use only 10 bootstrap replicates here for ease of computation; in practice, many more replicates are typically needed for stable percentile intervals.

set.seed(1234)
res_ci <- get_CI(res_est, data = data_null_processed, n_boot = 10)
res_ci
#> 
#> =======================================================================
#>   Confidence Intervals: Counterfactual Outcome Mean
#> =======================================================================
#> 
#> Settings:
#> -----------------------------------------------------------------------
#>   Method: Time-smoothed IPW
#>   Outcome times: 6, 12, 18, 24
#>   Bootstrap samples: 10
#>   Confidence level: 95%
#> 
#> Confidence Intervals:
#> -----------------------------------------------------------------------
#> 
#> Outcome Mean under Z = 0:
#>  Time     Estimate   CI Lower   CI Upper
#>     6  0.007124865 -0.1774043 0.13580696
#>    12 -0.019445369 -0.2534594 0.08688074
#>    18 -0.046015603 -0.3295146 0.04689198
#>    24 -0.072585838 -0.4055698 0.05887030
#> 
#> Outcome Mean under Z = 1:
#>  Time    Estimate   CI Lower   CI Upper
#>     6 -0.03536354 -0.2027095 0.05405786
#>    12 -0.06193377 -0.2137749 0.02011806
#>    18 -0.08850401 -0.2910591 0.03815230
#>    24 -0.11507424 -0.4132964 0.06315579
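
The idea behind a percentile bootstrap interval can be sketched in base R. The function percentile_ci, the toy data, and the choice to resample whole individuals with replacement are assumptions made for illustration; whether get_CI resamples in exactly this way is not stated here.

```r
set.seed(1)

# Toy long-format data (hypothetical): 50 individuals with 4 records each
toy <- data.frame(id = rep(1:50, each = 4), y = rnorm(200))

# Hypothetical helper: percentile bootstrap CI that resamples individuals
percentile_ci <- function(data, stat_fn, n_boot = 200, level = 0.95) {
  ids <- unique(data$id)
  rows_by_id <- split(seq_len(nrow(data)), data$id)
  boot_stats <- vapply(seq_len(n_boot), function(b) {
    # Draw individuals with replacement and keep all of their rows
    sampled <- sample(ids, length(ids), replace = TRUE)
    idx <- unlist(rows_by_id[as.character(sampled)], use.names = FALSE)
    stat_fn(data[idx, , drop = FALSE])
  }, numeric(1))
  alpha <- 1 - level
  # Interval endpoints are empirical quantiles of the bootstrap statistics
  quantile(boot_stats, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)
}

ci <- percentile_ci(toy, function(d) mean(d$y))
ci
```

Resampling individuals rather than single rows keeps each person's repeated records together, which is the usual choice for longitudinal data.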

Example 2: With Deaths

We next consider an example where some participants die during follow-up. We consider the same treatment strategies as in the first example.

Data Set

We use the example data set data_null_deaths, which is similar to data_null but includes deaths during follow-up. It has fewer total observations (21,713 vs. 25,000) because individuals who die have no records after their death time. The data set contains an additional column, D, which serves as a death indicator in the rows shown below.

The rows of data_null_deaths for one individual who died at time 5 are shown below:

data_null_deaths[id == 151,]
#>       id  time     L     Z     A     R         Y     D
#>    <num> <int> <int> <int> <num> <int>     <num> <num>
#> 1:   151     0     0     0     1     0        NA     0
#> 2:   151     1     0     0     1     0        NA     0
#> 3:   151     2     0     0     1     1 -2.826145     0
#> 4:   151     3     1     0     0     0        NA     0
#> 5:   151     4     1     0     1     1 -8.082282     0
#> 6:   151     5     1     0     1    NA        NA     1
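
To make the truncation concrete, the sketch below rebuilds a few columns of these rows by hand in base R and extracts the death time, the last recorded interval, and the number of outcome measurements. The data frame person is a manual copy of the printed values and is used for illustration only.

```r
# Selected columns for individual id = 151, copied from the output above
person <- data.frame(
  id   = 151,
  time = 0:5,
  R    = c(0, 0, 1, 0, 1, NA),
  Y    = c(NA, NA, -2.826145, NA, -8.082282, NA),
  D    = c(0, 0, 0, 0, 0, 1)
)

death_time  <- person$time[person$D == 1]       # interval in which D equals 1
last_record <- max(person$time)                 # no rows exist after death
n_measured  <- sum(person$R == 1, na.rm = TRUE) # outcome measurements before death

c(death_time = death_time, last_record = last_record, n_measured = n_measured)
```

The record at the death time has missing R and Y, and there are no later rows, so the outcome at any later follow-up time of interest, such as \(t^* = 6\), is undefined for this individual.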

Applying IPW

Preparing the data set

We prepare the data set in the same way as before using prep_data:

data_null_deaths_processed <- prep_data(data = data_null_deaths,
                                        grace_period_length = 2,
                                        baseline_vars = 'L')

Point estimation

When deaths are present, we can choose between two time-smoothing methods: the nonstacked method and the stacked method. The method is selected through the smoothing_method argument of the ipw function (options: 'nonstacked' and 'stacked'). The call below uses the nonstacked method:

res_est_deaths <- ipw(data = data_null_deaths_processed,
                      time_smoothed = TRUE,
                      smoothing_method = 'nonstacked',
                      outcome_times = c(6, 12, 18, 24),
                      A_model = A ~ L + Z,
                      R_model_denominator = R ~ L + A + Z,
                      R_model_numerator = R ~ L_baseline + Z,
                      Y_model = Y ~ L_baseline * (time + Z))

The estimated counterfactual outcome mean for each medication at each follow-up time of interest is given below.

res_est_deaths
#> 
#> =======================================================================
#>   Point Estimates: Counterfactual Outcome Mean
#> =======================================================================
#> 
#> Settings:
#> -----------------------------------------------------------------------
#>   Method: Time-smoothed IPW (nonstacked)
#>   Outcome times: 6, 12, 18, 24
#> 
#> Estimates:
#> -----------------------------------------------------------------------
#>  time         Z=0        Z=1
#>     6  0.03219486 -0.2047087
#>    12 -0.33221811 -0.3572665
#>    18 -0.38437663 -0.3460489
#>    24 -0.44631635 -0.3221055

Interval estimation

Confidence intervals can be obtained by bootstrapping, in the same way as in the case without deaths:

set.seed(1234)
res_ci_deaths <- get_CI(res_est_deaths, data = data_null_deaths_processed, n_boot = 10)
res_ci_deaths
#> 
#> =======================================================================
#>   Confidence Intervals: Counterfactual Outcome Mean
#> =======================================================================
#> 
#> Settings:
#> -----------------------------------------------------------------------
#>   Method: Time-smoothed IPW (nonstacked)
#>   Outcome times: 6, 12, 18, 24
#>   Bootstrap samples: 10
#>   Confidence level: 95%
#> 
#> Confidence Intervals:
#> -----------------------------------------------------------------------
#> 
#> Outcome Mean under Z = 0:
#>  Time    Estimate   CI Lower    CI Upper
#>     6  0.03219486 -0.5188776  0.37035189
#>    12 -0.33221811 -0.5737509 -0.02752269
#>    18 -0.38437663 -0.6908641 -0.02397294
#>    24 -0.44631635 -0.6551153 -0.22480044
#> 
#> Outcome Mean under Z = 1:
#>  Time   Estimate   CI Lower    CI Upper
#>     6 -0.2047087 -0.7996861  0.18012098
#>    12 -0.3572665 -0.7339370 -0.13441412
#>    18 -0.3460489 -0.6605846 -0.05504764
#>    24 -0.3221055 -0.7281114 -0.07134614

Citation

If you use smoothedIPW in your research, please cite:

McGrath S, Kawahara T, Petimar J, Rifas-Shiman SL, Díaz I, Block JP, Young JG. (2025). Time-smoothed inverse probability weighted estimation of effects of generalized time-varying treatment strategies on repeated outcomes truncated by death. arXiv preprint arXiv:2509.13971.

BibTeX entry:

@article{mcgrath2025time,
  title={Time-smoothed inverse probability weighted estimation of effects of generalized time-varying treatment strategies on repeated outcomes truncated by death},
  author={McGrath, Sean and Kawahara, Takuya and Petimar, Joshua and Rifas-Shiman, Sheryl L and D{\'\i}az, Iv{\'a}n and Block, Jason P and Young, Jessica G},
  journal={arXiv preprint arXiv:2509.13971},
  year={2025}
}
