| Title: | Analyzing Randomized Experiments as Multi-Arm Bandits | 
| Version: | 0.3.0 | 
| Description: | Simulates the results of completed randomized controlled trials, as if they had been conducted as adaptive Multi-Arm Bandit (MAB) trials instead. Augmented inverse probability weighted estimation (AIPW), outlined by Hadad et al. (2021) <doi:10.1073/pnas.2014602118>, is used to robustly estimate the probability of success for each treatment arm under the adaptive design. Provides customization options to simulate perfect/imperfect information, stationary/non-stationary bandits, blocked treatment assignments, along with control augmentation, and other hybrid strategies for assigning treatment arms. The methods used in simulation were inspired by Offer-Westort et al. (2021) <doi:10.1111/ajps.12597>. | 
| License: | GPL (≥ 3) | 
| URL: | https://github.com/Noch05/whatifbandit | 
| BugReports: | https://github.com/Noch05/whatifbandit/issues | 
| Depends: | R (≥ 4.1.0) | 
| Imports: | bandit, data.table, dplyr, furrr, ggplot2, lubridate, purrr, randomizr, rlang, tibble, tidyr | 
| Suggests: | future, knitr, rmarkdown, testthat (≥ 3.0.0) | 
| VignetteBuilder: | knitr | 
| Config/testthat/edition: | 3 | 
| Encoding: | UTF-8 | 
| LazyData: | true | 
| RoxygenNote: | 7.3.2 | 
| NeedsCompilation: | no | 
| Packaged: | 2025-10-29 16:28:47 UTC; noaho | 
| Author: | Noah Ochital  | 
| Maintainer: | Noah Ochital <noahochital@icloud.com> | 
| Repository: | CRAN | 
| Date/Publication: | 2025-11-03 10:20:02 UTC | 
whatifbandit: Analyzing Randomized Experiments as Multi-Arm Bandits
Description
Simulates the results of completed randomized controlled trials, as if they had been conducted as adaptive Multi-Arm Bandit (MAB) trials instead. Augmented inverse probability weighted estimation (AIPW), outlined by Hadad et al. (2021) doi:10.1073/pnas.2014602118, is used to robustly estimate the probability of success for each treatment arm under the adaptive design. Provides customization options to simulate perfect/imperfect information, stationary/non-stationary bandits, blocked treatment assignments, along with control augmentation, and other hybrid strategies for assigning treatment arms. The methods used in simulation were inspired by Offer-Westort et al. (2021) doi:10.1111/ajps.12597.
Author(s)
Maintainer: Noah Ochital noahochital@icloud.com (ORCID) [copyright holder]
Other contributors:
Ryan T. Moore rtm@american.edu (ORCID) [contributor, copyright holder]
See Also
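Useful links:
-  https://github.com/Noch05/whatifbandit
-  Report bugs at https://github.com/Noch05/whatifbandit/issues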
Calculate Adaptive AIPW Estimates
Description
Averages the observation-level AIPW scores created by get_iaipw() across each period, then assigns
each estimate an adaptive weight based on Hadad et al. (2021), and
another weight based on the size of each period to calculate the final AIPW estimate and variance for each treatment.
Sample means and variances are also provided for comparison.
Usage
adaptive_aipw(data, assignment_probs, conditions, periods, verbose)
Arguments
data | 
 A data.frame, data.table, or tibble containing input data from the trial. This should be the results of a traditional Randomized Controlled Trial (RCT). Any data.frames will be converted to tibbles internally.  | 
assignment_probs | 
 A tibble/data.table containing the probabilities of being assigned each treatment at a given period.  | 
periods | 
 Numeric value of length 1; number of total periods in the simulation.  | 
verbose | 
 Logical; whether or not to print intermediate messages. Default is FALSE.  | 
Details
The formulas for the calculations in this function can be found in Hadad et al. (2021) at equation 5 (estimate), equation 11 (variance), equation 15 (allocation rate).
The formulas specified assume that each period contains one observation. In this simulation, periods can contain multiple observations, so the individual estimates from each period are averaged and weighted by period size before being used in the final calculations.
The AIPW estimator is unbiased, consistent, and asymptotically normal under the conditions of the simulated trial, so it can be used for valid inference with a normal distribution. The sample means and variances are provided for comparison, but these are biased and inconsistent under the conditions of the simulated trial.
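The period-level averaging and size weighting described above can be sketched in base R. This is only an illustration: the helper name combine_period_estimates() is hypothetical, the adaptive weights are passed in rather than derived from Hadad et al. (2021), and multiplying the adaptive weight by the period size is an assumption about how the two weights combine.

```r
# Hypothetical helper: average scores within each period, then combine the
# period means with weights = adaptive weight * period size (assumed rule).
combine_period_estimates <- function(scores, period, adaptive_weight) {
  period_mean <- as.vector(tapply(scores, period, mean))
  period_size <- as.vector(tapply(scores, period, length))
  w <- adaptive_weight * period_size
  sum(w * period_mean) / sum(w)
}

scores <- c(1, 1, 0, 0, 0, 1)   # made-up observation-level scores
period <- c(1, 1, 2, 2, 2, 2)   # period of each observation
# With equal adaptive weights the result is the pooled mean of the scores
est <- combine_period_estimates(scores, period, adaptive_weight = c(1, 1))
est
```

With unequal adaptive weights, periods with larger weights pull the estimate toward their own period mean.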
Value
A tibble/data.table containing the AIPW estimate of treatment success, AIPW variance, sample proportion of successful treatments (sample mean), and sample mean variance.
References
Hadad, Vitor, David A. Hirshberg, Ruohan Zhan, Stefan Wager, and Susan Athey. 2021. "Confidence Intervals for Policy Evaluation in Adaptive Experiments." Proceedings of the National Academy of Sciences of the United States of America 118 (15): e2014602118. doi:10.1073/pnas.2014602118.
Adaptively Assign Treatments in a Period
Description
Assigns new treatments for an assignment wave based on the assignment probabilities provided from
get_bandit(), and the proportion of randomly assigned observations specified in random_assign_prop.
Assignments are made randomly with the given probabilities using randomizr::block_ra() or
randomizr::complete_ra().
Usage
assign_treatments(
  current_data,
  probs,
  blocking = NULL,
  conditions,
  condition_col,
  random_assign_prop
)
Arguments
current_data | 
 A tibble/data.table with only observations from the current sampling period.  | 
probs | 
 Named numeric Vector; probability of assignment for each treatment condition.  | 
blocking | 
 Logical; whether or not to use treatment blocking. Treatment blocking is used to ensure an even-enough distribution of treatment conditions across blocks. For example, blocking by gender would mean the randomized assignment should split treatments evenly not just throughout the sample (so for 4 arms, 25-25-25-25), but also within each block, so 25% of men would receive each treatment and 25% of women the same.  | 
condition_col | 
 Column in   | 
random_assign_prop | 
 A numeric value ranging from 0 to 1; proportion of each wave to be assigned new treatments randomly,
1 -   | 
Details
The number of rows which are randomly assigned in each period is random_assign_prop multiplied by
the number of rows in the period. If this number is less than 1, then Bernoulli draws are made for each row
with probability random_assign_prop to determine if that row will be assigned randomly. Otherwise, the number of random
rows is rounded to the nearest whole number, and then that many rows are selected to be assigned through
complete random assignment. The row selections are also random.
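The row-selection rule above can be sketched in base R as follows. This is an illustrative sketch and not the package's internal code; the helper name select_random_rows() is hypothetical.

```r
# Hypothetical helper illustrating the row-selection rule described above.
select_random_rows <- function(n_rows, random_assign_prop) {
  expected <- random_assign_prop * n_rows
  if (expected < 1) {
    # Fewer than one expected row: one Bernoulli draw per row
    which(stats::rbinom(n_rows, size = 1, prob = random_assign_prop) == 1)
  } else {
    # Otherwise round to a whole number and pick that many rows at random
    sort(sample.int(n_rows, size = round(expected)))
  }
}

set.seed(1)
rows <- select_random_rows(n_rows = 40, random_assign_prop = 0.25)
length(rows)  # 10 rows are selected for random assignment
```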
Value
Updated tibble/data.table with the new treatment conditions for each observation, and whether imputation is required. If the new treatment differs from the one assigned in the original experiment, impute_req is 1 for the observation; otherwise it is 0.
See Also
Checks Validity of Thompson sampling probabilities
Description
Checks whether the Thompson sampling probabilities sum to a value arbitrarily close to 0 or whether any of them are NA, either of which indicates that the direct calculation failed or did not converge.
Usage
bandit_invalid(bandit)
Arguments
bandit | 
 a numeric vector of Thompson sampling probabilities.  | 
Value
Logical; TRUE if the vector is invalid, FALSE if valid
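A check of this kind might look like the sketch below. The function name is_invalid_bandit() and the tolerance of 1e-8 are assumptions made for illustration; the package's own threshold is not stated here.

```r
# Hypothetical re-implementation for illustration; tol is an assumed value.
is_invalid_bandit <- function(bandit, tol = 1e-8) {
  anyNA(bandit) || abs(sum(bandit)) < tol
}

is_invalid_bandit(c(A = 0.7, B = 0.3))  # FALSE: sums to 1, no NA
is_invalid_bandit(c(A = 0, B = 0))      # TRUE: sums to 0
is_invalid_bandit(c(A = NA, B = 0.4))   # TRUE: contains NA
```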
Checking For Valid Assignment Methods
Description
Helper to validate_inputs(). This function accepts arguments relating
to how treatment waves are assigned, and checks if they are valid, and if all
supporting arguments are passed as necessary.
Usage
check_assign_method(assignment_method, time_unit, verbose, period_length)
Arguments
assignment_method | 
 A character string; one of "date", "batch", or "individual", to define the assignment into treatment waves. When using "batch" or "individual", ensure your dataset is pre-arranged in the proper order observations should be considered so that groups are assigned correctly. For "date", observations will be considered in chronological order. "individual" assignment can be computationally intensive for larger datasets.  | 
time_unit | 
 A character string specifying the unit of time for assigning periods when   | 
verbose | 
 Logical; whether or not to print intermediate messages. Default is FALSE.  | 
period_length | 
 A numeric value of length 1; represents the length of each treatment period.
If assignment method is "date", this length refers to the number of units specified in   | 
Value
Throws an error if the user is missing necessary arguments to assign treatments or passes invalid ones.
Checking existence and declaration of columns
Description
Helper to validate_inputs(). This function accepts the user's
settings for the Multi-Arm-Bandit trial, and checks whether columns in the data have been properly
specified based on these settings.
Usage
check_cols(
  assignment_method,
  time_unit,
  perfect_assignment,
  data_cols,
  data,
  verbose
)
Arguments
assignment_method | 
 A character string; one of "date", "batch", or "individual", to define the assignment into treatment waves. When using "batch" or "individual", ensure your dataset is pre-arranged in the proper order observations should be considered so that groups are assigned correctly. For "date", observations will be considered in chronological order. "individual" assignment can be computationally intensive for larger datasets.  | 
time_unit | 
 A character string specifying the unit of time for assigning periods when   | 
perfect_assignment | 
 Logical; if TRUE, assumes perfect information for treatment assignment (i.e., all outcomes are observed regardless of the date). If FALSE, hides outcomes not yet theoretically observed, based on the dates treatments would have been assigned for each wave. This is useful when simulating batch-based assignment where treatments were assigned on a given day whether or not all the information from a prior batch was available and you have exact dates treatments were assigned.  | 
data_cols | 
 A named character vector containing the names of columns in  
  | 
data | 
 A data.frame, data.table, or tibble containing input data from the trial. This should be the results of a traditional Randomized Controlled Trial (RCT). Any data.frames will be converted to tibbles internally.  | 
verbose | 
 Logical; whether or not to print intermediate messages. Default is FALSE.  | 
Value
Throws an error if columns which are required have not been declared
or are not present in the data, or are the wrong primitive data type. Additionally throws warning messages,
if unnecessary columns have been provided, only when verbose is TRUE.
Checking for Valid Input Data
Description
Helper to validate_inputs(). This function accepts the data and checks
whether it has unique IDs and whether the period length is valid.
Usage
check_data(
  data,
  data_cols,
  assignment_method,
  period_length,
  time_unit,
  perfect_assignment
)
Arguments
data | 
 A data.frame, data.table, or tibble containing input data from the trial. This should be the results of a traditional Randomized Controlled Trial (RCT). Any data.frames will be converted to tibbles internally.  | 
data_cols | 
 A named character vector containing the names of columns in  
  | 
assignment_method | 
 A character string; one of "date", "batch", or "individual", to define the assignment into treatment waves. When using "batch" or "individual", ensure your dataset is pre-arranged in the proper order observations should be considered so that groups are assigned correctly. For "date", observations will be considered in chronological order. "individual" assignment can be computationally intensive for larger datasets.  | 
period_length | 
 A numeric value of length 1; represents the length of each treatment period.
If assignment method is "date", this length refers to the number of units specified in   | 
time_unit | 
 A character string specifying the unit of time for assigning periods when   | 
perfect_assignment | 
 Logical; if TRUE, assumes perfect information for treatment assignment (i.e., all outcomes are observed regardless of the date). If FALSE, hides outcomes not yet theoretically observed, based on the dates treatments would have been assigned for each wave. This is useful when simulating batch-based assignment where treatments were assigned on a given day whether or not all the information from a prior batch was available and you have exact dates treatments were assigned.  | 
Value
Throws an error if the data does not meet the specifications of the trial based on user settings.
Checking Imputation Info
Description
Subsets or adds to the tibble/data.frame created by imputation_precompute(),
and sorts it to ensure compatibility with randomizr::block_ra().
Usage
check_impute(imputation_information, current_data, current_period)
Arguments
imputation_information | 
 The   | 
current_data | 
 A tibble/data.table with only observations from the current sampling period.  | 
current_period | 
 Numeric value of length 1; current treatment wave of the simulation.  | 
Details
randomizr::block_ra() does not see the names
of the probabilities passed per block, so the imputation information must be subsetted
to only contain blocks which are present in a period, and sorted to comply with
randomizr::block_ra()'s internal ordering.
When blocks are required but do not exist in the information provided, they are added to the tibble/data.table, with the estimated conditional probability of success set to the average across other blocks.
When blocks are present but not required, they are removed from the tibble/data.table.
Check Level
Description
Checking if the level argument in the S3 generic methods
is valid for a confidence interval.
Usage
check_level(level)
Arguments
level | 
 Numeric value of length 1; indicates the confidence level of the interval (e.g., 0.90, 0.95, 0.99). Defaults to 0.95.  | 
Value
Throws an error if level is invalid, else does nothing.
Checking if Inputs are Logical Values (TRUE and FALSE)
Description
Helper to validate_inputs(). This function accepts the user's
settings for logical values in the Multi-Arm-Bandit trial, and checks whether they are valid.
Usage
check_logical(...)
Arguments
... | 
 Arguments to check.  | 
Value
Throws an error if any input is not TRUE or FALSE
Checking If Inputs Are Positive Integers or a Valid String
Description
Helper to validate_inputs(). This function accepts the user's
settings for integer arguments and checks if they are valid positive
integers or one of the valid strings for the argument.
Usage
check_posint(...)
Arguments
... | 
 Arguments to check.  | 
Value
Throws an error if any input is not a positive whole number or a valid string.
Checking for Proportions
Description
Helper to validate_inputs(). This function accepts the user's
settings for proportion arguments and checks if they are valid proportions between 0 and 1.
Usage
check_prop(...)
Arguments
... | 
 Arguments to check.  | 
Value
Throws an error if any input is not a valid proportion between 0 and 1
Column arguments shared across functions
Description
Topic holding common arguments across many functions. Used to expedite documentation, through
inheritParams tag from roxygen2.
Arguments
id_col | 
 Column in   | 
success_col | 
 Column in   | 
condition_col | 
 Column in   | 
date_col | 
 Column in   | 
month_col | 
 Column in   | 
success_date_col | 
 Column in   | 
assignment_date_col | 
 Column in   | 
Condenses results into a list for multiple_mab_simulation()
Description
Takes the output from furrr::future_map() in multiple_mab_simulation()
and condenses it to return to the user.
Usage
condense_results(data, keep_data, mabs, times)
Arguments
data | 
 A data.frame, data.table, or tibble containing input data from the trial. This should be the results of a traditional Randomized Controlled Trial (RCT). Any data.frames will be converted to tibbles internally.  | 
keep_data | 
 Logical; Whether or not to keep the final data from each trial. Recommended FALSE.  | 
mabs | 
 output from   | 
times | 
 A numeric value of length 1, the number of simulations to conduct.  | 
Details
This function iterates over every element in the output from furrr::future_map()
and extracts the elements needed to build the final list returned to the user
by multiple_mab_simulation(). It condenses the long list into tibbles or data.tables, keeping like elements
together. For example, it extracts all the bandits objects from the output lists across all trials and
binds them into a single tibble/data.table.
Value
multiple.mab class object, which is a named list containing:
-  final_data_nest: A tibble or data.table containing the nested tibbles/data.tables from each trial. Only provided when keep_data is TRUE.
-  bandits: A tibble or data.table containing the UCB1 values or Thompson sampling posterior distributions for each period. Wide format; each row is a period, and each column is a treatment.
-  assignment_probs: A tibble or data.table containing the probability of being assigned each treatment arm at a given period. Wide format; each row is a period, and each column is a treatment.
-  estimates: A tibble or data.table containing the AIPW (Augmented Inverse Probability Weighting) treatment effect estimates and variances, and traditional sample means and variances, for each treatment arm. Long format; treatment arm and estimate type are columns, along with the mean and variance.
-  settings: A named list of the configuration settings used in the trial.
Creating proper conditions vector
Description
This function creates a character vector of treatment conditions
using the conditions column in the provided data, and if control_augment is greater
than 0, it also labels the control condition. Throws an error if control_condition is not
present.
Usage
create_conditions(control_condition, data, condition_col, control_augment)
Arguments
control_condition | 
 Value of the control condition. Only necessary when   | 
data | 
 A data.frame, data.table, or tibble containing input data from the trial. This should be the results of a traditional Randomized Controlled Trial (RCT). Any data.frames will be converted to tibbles internally.  | 
condition_col | 
 Column in   | 
control_augment | 
 A numeric value ranging from 0 to 1; proportion of each wave guaranteed to receive the "Control" treatment.
Default is 0. It is not recommended to use this in conjunction with   | 
Value
Character vector of unique treatment conditions. Throws error if an invalid specification is used.
Create Treatment Wave Cutoffs
Description
Used to assign each observation a new treatment assignment period, based
on user-supplied specifications, and user supplied data from
date_col and month_col in data_cols, and the period_length. Creates a new
column indicating which period each observation belongs to.
Usage
create_cutoff(
  data,
  data_cols,
  period_length = NULL,
  assignment_method,
  time_unit
)
Arguments
data | 
 A data.frame, data.table, or tibble containing input data from the trial. This should be the results of a traditional Randomized Controlled Trial (RCT). Any data.frames will be converted to tibbles internally.  | 
data_cols | 
 A named character vector containing the names of columns in  
  | 
period_length | 
 A numeric value of length 1; represents the length of each treatment period.
If assignment method is "date", this length refers to the number of units specified in   | 
assignment_method | 
 A character string; one of "date", "batch", or "individual", to define the assignment into treatment waves. When using "batch" or "individual", ensure your dataset is pre-arranged in the proper order observations should be considered so that groups are assigned correctly. For "date", observations will be considered in chronological order. "individual" assignment can be computationally intensive for larger datasets.  | 
time_unit | 
 A character string specifying the unit of time for assigning periods when   | 
Details
The assignment periods do not strictly have to line up with the original experiment; it is up to the researcher to test the possible options.
Month-based assignment can be specified using the months in either month_col or date_col;
if month_col is passed into the function, it will be used.
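As a rough illustration of how period numbers might be derived for "batch" assignment, the sketch below groups consecutive rows into sets of period_length. The helper name batch_periods() is hypothetical, and treating a batch as a fixed number of consecutive rows is an assumption rather than a description of the package's code.

```r
# Illustration only: assumes "batch" groups consecutive rows in sets of
# period_length, in the order the data is already arranged.
batch_periods <- function(n_rows, period_length) {
  as.integer(ceiling(seq_len(n_rows) / period_length))
}

period_number <- batch_periods(n_rows = 7, period_length = 3)
period_number  # 1 1 1 2 2 2 3
```

Under this assumption the last period can be smaller than the others when the number of rows is not a multiple of period_length.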
Value
Updated tibble/data.table with the new period_number column. period_number is an integer
representing an observation's new assignment period.
Create Necessary Columns for Multi-Arm Bandit Trial
Description
Initializes the partially empty columns in data that are needed for the simulation.
These are initialized as NA except for observations with period_number = 1, whose values are copied
from the provided columns, and used as the starting point for the simulation.
Usage
create_new_cols(data, data_cols, block_cols, blocking, perfect_assignment)
Arguments
data | 
 A data.frame, data.table, or tibble containing input data from the trial. This should be the results of a traditional Randomized Controlled Trial (RCT). Any data.frames will be converted to tibbles internally.  | 
data_cols | 
 A named character vector containing the names of columns in  
  | 
block_cols | 
 A character vector of variables to block by. This vector should not be named.  | 
blocking | 
 Logical; whether or not to use treatment blocking. Treatment blocking is used to ensure an even-enough distribution of treatment conditions across blocks. For example, blocking by gender would mean the randomized assignment should split treatments evenly not just throughout the sample (so for 4 arms, 25-25-25-25), but also within each block, so 25% of men would receive each treatment and 25% of women the same.  | 
perfect_assignment | 
 Logical; if TRUE, assumes perfect information for treatment assignment (i.e., all outcomes are observed regardless of the date). If FALSE, hides outcomes not yet theoretically observed, based on the dates treatments would have been assigned for each wave. This is useful when simulating batch-based assignment where treatments were assigned on a given day whether or not all the information from a prior batch was available and you have exact dates treatments were assigned.  | 
Value
Updated tibble/data.table with 6 new columns:
-  mab_success: New variable to hold new success from Multi-arm bandit procedure, NA until assigned.
-  mab_condition: New variable to hold new treatment condition from Multi-arm bandit procedure, NA until assigned.
-  impute_req: Binary indicator for imputation requirement, NA until assigned.
-  new_success_date: New variable to hold the new success date under Multi-arm bandit procedure, NA until assigned.
-  block: New variable indicating the variables to block by for assignment.
-  treatment_block: New variable combining block with original treatment condition.
Create Prior Periods
Description
Used during run_mab_trial() to create a vector of prior periods dynamically based on the specified
number of prior periods.
Usage
create_prior(prior_periods, current_period)
Arguments
prior_periods | 
 A numeric value of length 1, or the character string "All"; number of previous periods to use in the treatment assignment model. This is used to implement the stationary/non-stationary bandit. For example, a non-stationary bandit assumes the true probability of success for each treatment changes over time, so to account for that, not all prior data should be used when making decisions because it could be "out of date".  | 
current_period | 
 The current period of the simulation. Defined by loop structure inside   | 
Value
Numeric vector containing the prior treatment periods to be used when aggregating the results for the current treatment assignment period.
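The windowing behaviour described for prior_periods can be sketched as below. The helper name prior_window() is hypothetical, and the sketch assumes that periods are numbered 1, 2, ... and that "All" means every period before the current one.

```r
# Hypothetical sketch of building the vector of prior periods.
prior_window <- function(prior_periods, current_period) {
  if (current_period <= 1) {
    return(integer(0))  # nothing precedes the first period
  }
  if (identical(prior_periods, "All")) {
    return(seq_len(current_period - 1))
  }
  # Keep only the most recent prior_periods periods, never going below 1
  first <- max(1, current_period - prior_periods)
  seq.int(first, current_period - 1)
}

prior_window("All", 5)  # 1 2 3 4
prior_window(2, 5)      # 3 4
```

A short window such as prior_periods = 2 is the kind of setting that would suit a non-stationary bandit, since older periods are dropped from the assignment model.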
See Also
Ends Multi-Arm Bandit Trial
Description
Condenses output from run_mab_trial() into
manageable structure.
Usage
end_mab_trial(data, bandits, algorithm, periods, conditions, ndraws)
Arguments
data | 
 Finalized data from   | 
bandits | 
 Finalized bandits list from   | 
algorithm | 
 A character string specifying the MAB algorithm to use. Options are "thompson" or "ucb1". Algorithm defines the adaptive assignment process. Mathematical details on these algorithms can be found in Kuleshov and Precup 2014 and Slivkins 2024.  | 
periods | 
 Numeric value of length 1; total number of periods in Multi-Arm-Bandit trial.  | 
ndraws | 
 A numeric value; When Thompson sampling direct calculations fail, draws from a simulated posterior
will be used to approximate the Thompson sampling probabilities. This is the number of simulations to use, the default
is 5000 to match the default parameter   | 
Details
Takes the bandit lists provided, and condenses them using dplyr::bind_rows()
into tibbles or data.tables, and then pivots the table
to wide format where each treatment arm is a column, and the rows
represent periods.
At this step the final UCB1 values or Thompson sampling probabilities are calculated. The entire table is shifted backward by one period so that each row reflects the calculation that occurs after completing a period. For example, before this shift, row 11 would hold the calculations made in period 11 before assignment; after the shift, it holds the calculations made after period 11's imputations.
This has the impact of removing the original first row, where all the assignment probabilities are equal, and modifying the last row to represent the final calculation after the conclusion of the simulation.
The assignment probabilities are not changed in this way, so for each period they still reflect the assignment probabilities used in that period.
Value
A named list containing:
-  final_data: The processed tibble or data.table, containing new columns pertaining to the results of the trial.
-  bandits: A tibble or data.table containing the UCB1 values or Thompson sampling posterior distributions for each period.
-  assignment_probs: A tibble or data.table containing the probability of being assigned each treatment arm at a given period.
See Also
Calculates Number of Observations Assigned to Each Treatment
Description
Takes the output from mab_simulation(), and
calculates the number of observations assigned to each treatment group in the adaptive trial.
Usage
get_assignment_quantities(simulation, conditions)
Arguments
simulation | 
 Output from   | 
conditions | 
 Character vector containing the names of all the treatment conditions in the trial.  | 
Value
Named numeric vector containing number of observations assigned to each treatment group
Calculate Multi-Arm Bandit Decision Based on Algorithm
Description
Calculates the best treatment for a given period using either a UCB1 or Thompson sampling algorithm.
Thompson sampling is done using bandit::best_binomial_bandit() from the bandit package,
and UCB1 values are calculated using the formula given
in Kuleshov and Precup (2014).
Usage
get_bandit(
  past_results,
  algorithm,
  conditions,
  current_period,
  control_augment = 0,
  ndraws
)
Arguments
past_results | 
 A tibble/data.table containing summary of prior periods, with
successes, number of observations, and success rates, which is created by   | 
algorithm | 
 A character string specifying the MAB algorithm to use. Options are "thompson" or "ucb1". Algorithm defines the adaptive assignment process. Mathematical details on these algorithms can be found in Kuleshov and Precup 2014 and Slivkins 2024.  | 
current_period | 
 Numeric value of length 1; current period of the adaptive trial simulation.  | 
control_augment | 
 A numeric value ranging from 0 to 1; proportion of each wave guaranteed to receive the "Control" treatment.
Default is 0. It is not recommended to use this in conjunction with   | 
ndraws | 
 A numeric value; When Thompson sampling direct calculations fail, draws from a simulated posterior
will be used to approximate the Thompson sampling probabilities. This is the number of simulations to use, the default
is 5000 to match the default parameter   | 
Details
The Thompson assignment_probabilities are the same as the bandit vector except when
control_augment or random_assign_prop are greater than 0, as these arguments will alter the probabilities
of assignment.
Thompson sampling is calculated using the bandit package, but the direct calculation can result in errors or overflow.
If this occurs, a simulation-based method from the same package is used instead to estimate the posterior distribution,
and a warning is issued. ndraws specifies the number of iterations for the
simulation-based method, and the default value is 5000.
The UCB1 algorithm only selects 1 treatment at each period, with no probability matching
so assignment_probabilities will always have 1 element equal to 1, and the rest equal to 0, unless
control_augment or random_assign_prop are greater than 0, which will alter the probabilities of assignment.
For example, if the original vector is (0, 0, 1), and control_augment = 0.2,
the new vector is (0.2, 0, 0.8) assuming the first element is control. If instead the 3rd element
were the control group the resulting vector would not be changed because it already meets the
control group threshold.
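The control augmentation example above can be reproduced with the sketch below. The helper name augment_control() is hypothetical, and the proportional rescaling of the non-control arms is an assumption chosen because it matches the worked example; the package may adjust the probabilities differently.

```r
# Illustration only: enforce a minimum probability for the control arm.
augment_control <- function(probs, control, control_augment) {
  if (probs[[control]] >= control_augment) {
    return(probs)  # floor already met, nothing to change
  }
  others <- setdiff(names(probs), control)
  # Shrink the other arms proportionally so the vector still sums to 1
  probs[others] <- probs[others] * (1 - control_augment) / sum(probs[others])
  probs[[control]] <- control_augment
  probs
}

first <- augment_control(c(control = 0, t1 = 0, t2 = 1), "control", 0.2)
first   # control = 0.2, t1 = 0, t2 = 0.8
last <- augment_control(c(t1 = 0, t2 = 0, control = 1), "control", 0.2)
last    # unchanged, control already exceeds the threshold
```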
Value
A list of length 2 containing:
-  bandit: Bandit object, either a named numeric vector of Thompson sampling probabilities or a tibble/data.table of UCB1 values.
-  assignment_probabilities: Named numeric vector with a value for each condition containing the probability of being assigned that treatment.
References
Kuleshov, Volodymyr, and Doina Precup. 2014. "Algorithms for Multi-Armed Bandit Problems." arXiv. doi:10.48550/arXiv.1402.6028.
Lotze, Thomas, and Markus Loecher. 2022. "bandit: Functions for Simple A/B Split Test and Multi-Armed Bandit Analysis." https://cran.r-project.org/package=bandit.
Thompson sampling Algorithm
Description
Thompson sampling Algorithm
Usage
## S3 method for class 'thompson'
get_bandit(past_results, conditions, current_period, ndraws)
Arguments
past_results | 
 A tibble/data.table containing summary of prior periods, with
successes, number of observations, and success rates, which is created by   | 
current_period | 
 Numeric value of length 1; current period of the adaptive trial simulation.  | 
ndraws | 
 A numeric value; When Thompson sampling direct calculations fail, draws from a simulated posterior
will be used to approximate the Thompson sampling probabilities. This is the number of simulations to use, the default
is 5000 to match the default parameter   | 
Details
Thompson sampling is calculated using the bandit package but the direct calculation can fail. If this occurs, a simulation based method is used instead to estimate the posterior distribution, and the user receives a warning.
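To give a sense of what a simulation-based approximation involves, the sketch below draws from each arm's posterior and counts how often each arm has the highest draw. It is written in base R and does not call the bandit package; the helper name thompson_sim(), the binary outcome model, and the independent Beta(1, 1) priors are all assumptions made for illustration.

```r
# Sketch of a simulation-based approximation of Thompson sampling
# probabilities, assuming binary outcomes and Beta(1, 1) priors.
thompson_sim <- function(successes, trials, ndraws = 5000) {
  # One column of posterior draws per arm
  draws <- vapply(
    seq_along(successes),
    function(i) {
      stats::rbeta(ndraws, 1 + successes[i], 1 + trials[i] - successes[i])
    },
    numeric(ndraws)
  )
  # Share of draws in which each arm has the highest sampled success rate
  best <- max.col(draws, ties.method = "first")
  probs <- tabulate(best, nbins = length(successes)) / ndraws
  stats::setNames(probs, names(successes))
}

set.seed(42)
ts_probs <- thompson_sim(
  successes = c(A = 10, B = 25),
  trials = c(A = 50, B = 50)
)
ts_probs
```

A larger ndraws reduces the Monte Carlo noise in the estimated probabilities at the cost of more computation.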
Value
A named list of length 2, where element 1 is the named numeric vector of Thompson
sampling probabilities, and element 2 is a reference to the same vector. The second element is
adjusted later in the simulation based on what the user has set for control_augment and random_assign_prop to reflect the
probability of assignment to a given treatment at that period.
UCB1 Sampling Algorithm
Description
Calculates upper confidence bounds for each treatment arm
Usage
## S3 method for class 'ucb1'
get_bandit(past_results, conditions, current_period)
Arguments
past_results | 
 A tibble/data.table containing summary of prior periods, with
successes, number of observations, and success rates, which is created by   | 
current_period | 
 Numeric value of length 1; current period of the adaptive trial simulation.  | 
Value
A named list with 2 elements: a tibble/data.table containing UCB1 and success rate for each condition, and a named numeric vector of assignment probabilities, where the highest UCB1 out of the treatments is assigned 1, and the rest 0.
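The sketch below shows the textbook UCB1 index from Kuleshov and Precup (2014), the observed success rate plus sqrt(2 * log(t) / n), followed by the all-or-nothing assignment described above. The helper name ucb1_assign() is hypothetical, and taking t as the total number of observations so far is an assumption about how plays are counted.

```r
# Sketch of UCB1 values and the resulting 0/1 assignment probabilities.
ucb1_assign <- function(successes, n) {
  # Success rate plus an exploration bonus that shrinks as n grows
  ucb <- successes / n + sqrt(2 * log(sum(n)) / n)
  probs <- stats::setNames(numeric(length(n)), names(n))
  probs[which.max(ucb)] <- 1  # the arm with the highest bound gets everything
  list(ucb = ucb, assignment_probabilities = probs)
}

res <- ucb1_assign(
  successes = c(A = 10, B = 25, C = 12),
  n = c(A = 50, B = 50, C = 50)
)
res$assignment_probabilities  # A = 0, B = 1, C = 0
```

Because every arm here has the same number of observations, the exploration bonus is identical across arms and the arm with the highest success rate is chosen.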
Calculate Observation Level AIPW For Each Treatment Condition
Description
Calculates the augmented inverse probability weighted (AIPW) estimate of treatment success for each observation and treatment (i.e., at the level of a single unit). This method scales the estimated probabilities of success by the probability of being assigned the treatment, and weights them by the conditional expectation of success from prior periods of an adaptive trial. The conditional expectation function used is a grouped mean by treatment arm.
Usage
get_iaipw(data, assignment_probs, periods, conditions, verbose)
Arguments
data | 
 A data.frame, data.table, or tibble containing input data from the trial. This should be the results of a traditional Randomized Controlled Trial (RCT). Any data.frames will be converted to tibbles internally.  | 
assignment_probs | 
 A tibble/data.table containing the probabilities of being assigned each treatment at a given period.  | 
periods | 
 Numeric value of length 1; number of total periods in the simulation.  | 
verbose | 
 Logical; whether or not to print intermediate messages. Default is FALSE.  | 
Details
The specification for the individual AIPW estimates can be found in Hadad et al. (2021). The formulas in equation 5 formed the basis for this function's calculations. Here, the regression adjustment used is the grouped mean of success by treatment, up until the current period of estimation (so at period 5, the grouped mean would be calculated using the results from periods 1 through 4).
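A minimal sketch of a unit-level AIPW score in its general form (regression adjustment plus an inverse-probability-weighted residual) is shown below. The helper name aipw_score and its arguments are hypothetical, and the exact specification should be taken from Hadad et al. (2021), not from this example.

```r
# Hypothetical helper: unit-level AIPW score for one treatment arm `w`.
aipw_score <- function(y, assigned, w, assign_prob, m_hat) {
  # y: observed outcome (0/1); assigned: arm each unit actually received
  # assign_prob: probability each unit had of being assigned arm `w`
  # m_hat: grouped mean of success for arm `w` from prior periods
  m_hat + (assigned == w) / assign_prob * (y - m_hat)
}

scores <- aipw_score(
  y = c(1, 0, 1),
  assigned = c("a", "b", "a"),
  w = "a",
  assign_prob = c(0.5, 0.5, 0.25),
  m_hat = 0.4
)
scores
```

Units that did not receive arm `w` contribute only the regression adjustment m_hat, while units that did receive it have their residual scaled up by the inverse of their assignment probability.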
Value
A tibble/data.frame, containing the data used in the Multi-Arm-Bandit, with new columns pertaining to the individual AIPW estimate for each person and condition, and probability of assignment for each treatment at each period.
References
Hadad, Vitor, David A. Hirshberg, Ruohan Zhan, Stefan Wager, and Susan Athey. 2021. "Confidence Intervals for Policy Evaluation in Adaptive Experiments." Proceedings of the National Academy of Sciences of the United States of America 118 (15): e2014602118. doi:10.1073/pnas.2014602118.
Gather Past Results for Given Assignment Period
Description
Summarizes results of prior periods to use for the current Multi-Arm-Bandit assignment. This function calculates the number of successes under each treatment and the total number of observations assigned to each treatment, which are used to calculate UCB1 values or Thompson sampling probabilities.
Usage
get_past_results(
  current_data,
  prior_data,
  perfect_assignment,
  assignment_date_col = NULL,
  conditions
)
Arguments
current_data | 
 A tibble/data.table with only observations from the current sampling period.  | 
prior_data | 
 A tibble/data.table with only the observations from the prior index.  | 
perfect_assignment | 
 Logical; if TRUE, assumes perfect information for treatment assignment (i.e., all outcomes are observed regardless of the date). If FALSE, hides outcomes not yet theoretically observed, based on the dates treatments would have been assigned for each wave. This is useful when simulating batch-based assignment where treatments were assigned on a given day whether or not all the information from a prior batch was available and you have exact dates treatments were assigned.  | 
assignment_date_col | 
 Column in   | 
Details
When perfect_assignment is FALSE, the maximum value from the specified
assignment_date_col in the current data is taken as the last possible date
the researchers conducting the experiment could have learned about a treatment outcome.
All successes that occur past this date are masked and treated as failures for the purposes
of assigning treatments in this period, as this simulates the researchers not having
received that information yet.
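The masking step can be sketched as follows. The data and column names (success_date, assignment_date, masked_success) are hypothetical and chosen only for illustration; the package's internals may differ.

```r
# Hypothetical data and column names, for illustration only.
prior <- data.frame(
  condition = c("a", "a", "b", "b"),
  success = c(1, 1, 1, 0),
  success_date = as.Date(c("2024-01-05", "2024-02-20", "2024-01-10", NA))
)
current <- data.frame(
  assignment_date = as.Date(c("2024-01-28", "2024-02-01"))
)

# Last date on which an outcome could have been known to the researchers
cutoff <- max(current$assignment_date)

# Successes observed after the cutoff are treated as failures for assignment
known <- !is.na(prior$success_date) & prior$success_date <= cutoff
prior$masked_success <- ifelse(prior$success == 1 & known, 1, 0)
prior$n <- 1

# Successes and observations per condition, the inputs to UCB1 or Thompson sampling
past_results <- aggregate(cbind(masked_success, n) ~ condition, data = prior, FUN = sum)
past_results
```

Here the second success in condition "a" is dated after the cutoff, so it is counted as a failure when the current period's treatments are assigned.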
Value
A tibble/data.table containing the number of successes, and number of people for each treatment condition.
See Also
Precomputing Key Values for Outcome Imputation
Description
Pre-computes key values required for the outcome imputation step of the Multi-Arm-Bandit
procedure. Calculates the probabilities of success for each treatment block (treatment arm + any blocking specified),
using the grouped means of the original experimental data. When perfect_assignment is FALSE, the average date of
success is calculated for each treatment block at every period.
Usage
imputation_precompute(data, whole_experiment, perfect_assignment, data_cols)
Arguments
data | 
 A data.frame, data.table, or tibble containing input data from the trial. This should be the results of a traditional Randomized Controlled Trial (RCT). Any data.frames will be converted to tibbles internally.  | 
whole_experiment | 
 Logical; if TRUE, uses all past experimental data for imputing outcomes. If FALSE, uses only data available up to the current period. In large datasets or with a high number of periods, setting this to FALSE can be more computationally intensive, though not a significant contributor to total run time.  | 
perfect_assignment | 
 Logical; if TRUE, assumes perfect information for treatment assignment (i.e., all outcomes are observed regardless of the date). If FALSE, hides outcomes not yet theoretically observed, based on the dates treatments would have been assigned for each wave. This is useful when simulating batch-based assignment where treatments were assigned on a given day whether or not all the information from a prior batch was available and you have exact dates treatments were assigned.  | 
data_cols | 
 A named character vector containing the names of columns in  
  | 
Details
imputation_precompute() is an optimization, meant to reduce the cost of calculating these variables
within the simulation loop. When whole_experiment is TRUE, original_summary is a single tibble/data.table
that is used throughout the simulation. When whole_experiment is FALSE, original_summary is a list of tibbles/data.tables,
each containing the cumulative probabilities of all periods up to the index i.
If perfect_assignment is TRUE, dates_summary is not calculated, and is NULL.
No covariates are used in the calculation; these are all simply grouped averages.
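The difference between the two whole_experiment settings can be sketched with base R grouped means. The data and column names below are hypothetical, and the example ignores blocking for brevity.

```r
# Hypothetical data and column names, for illustration only.
trial <- data.frame(
  period = c(1, 1, 1, 1, 2, 2, 2, 2),
  condition = c("a", "b", "a", "b", "a", "b", "a", "b"),
  success = c(1, 0, 0, 0, 1, 1, 1, 0)
)

# whole_experiment = TRUE: one summary computed from all periods
whole <- aggregate(success ~ condition, data = trial, FUN = mean)

# whole_experiment = FALSE: one cumulative summary per period
cumulative <- lapply(sort(unique(trial$period)), function(i) {
  aggregate(success ~ condition, data = trial[trial$period <= i, ], FUN = mean)
})

whole
cumulative[[1]]
```

The last element of the cumulative list matches the whole-experiment summary, since by then every period has been included.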
Value
A named list containing:
- original_summary: The tibble(s)/data.table(s) containing the probability of success for each treatment block, at each period.
- dates_summary: A tibble/data.table containing the average success date for each treatment block at each treatment period.
Outcome Imputation Preparation
Description
Executes all preparations necessary to impute outcomes for
each iteration of the simulation loop. Adds an additional column to the current data,
subsets necessary information from the imputation_precompute() output, and checks to ensure
compatibility with randomizr::block_ra().
Usage
imputation_preparation(
  current_data,
  block_cols,
  imputation_information,
  whole_experiment,
  blocking,
  perfect_assignment,
  current_period
)
Arguments
current_data | 
 A tibble/data.table with only observations from the current sampling period.  | 
block_cols | 
 A character vector of variables to block by. This vector should not be named.  | 
imputation_information | 
 Object created by   | 
whole_experiment | 
 Logical; if TRUE, uses all past experimental data for imputing outcomes. If FALSE, uses only data available up to the current period. In large datasets or with a high number of periods, setting this to FALSE can be more computationally intensive, though not a significant contributor to total run time.  | 
blocking | 
 Logical; whether or not to use treatment blocking. Treatment blocking is used to ensure an even-enough distribution of treatment conditions across blocks. For example, blocking by gender would mean the randomized assignment should split treatments evenly not just throughout the sample (so for 4 arms, 25-25-25-25), but also within each block, so 25% of men would receive each treatment and 25% of women the same.  | 
perfect_assignment | 
 Logical; if TRUE, assumes perfect information for treatment assignment (i.e., all outcomes are observed regardless of the date). If FALSE, hides outcomes not yet theoretically observed, based on the dates treatments would have been assigned for each wave. This is useful when simulating batch-based assignment where treatments were assigned on a given day whether or not all the information from a prior batch was available and you have exact dates treatments were assigned.  | 
current_period | 
 Numeric value of length 1; current treatment wave of the simulation.  | 
Details
The goal of this function is to set up the imputation procedure and prevent
errors from occurring. randomizr::block_ra() does not see the names
of the probabilities passed per block, so the imputation information must be subsetted
to contain only the treatment blocks which exist in a given period.
These checks are not implemented in a tryCatch() block because they have to happen
in every iteration.
impute_block is the observation's new treatment block, combining any
blocking variables with their new treatment assigned via the Multi-Arm-Bandit
procedure.
Value
A named list containing:
- current_data: A tibble/data.table containing the impute_block column to guide the outcome imputations.
- impute_success: A tibble/data.table object containing probabilities of success by treatment_block used to impute outcomes. Subsetted from the imputation_precompute() output.
- impute_dates: Named date vector by treatment condition, containing the dates of success to impute if perfect_assignment is FALSE. Subsetted from the imputation_precompute() output.
Imputing New Outcomes of Multi-Arm-Bandit Trial
Description
Imputes outcomes for the current treatment assignment period.
Uses randomizr::block_ra() to impute the outcomes for observations
who were assigned new treatments. The probabilities used to guide the imputation
of the outcomes are pre-computed using the existing data from the original randomized experiment.
Usage
impute_success(
  current_data,
  imputation_info,
  id_col,
  success_col,
  prior_data = NULL,
  perfect_assignment,
  dates = NULL,
  success_date_col,
  current_period = NULL
)
Arguments
current_data | 
 Updated tibble/data.frame object containing new treatments from   | 
imputation_info | 
 A tibble/data.frame containing conditional probability of success by treatment block, for each
combination that exists in   | 
id_col | 
 Column in   | 
success_col | 
 Column in   | 
prior_data | 
 A tibble/data.frame containing all the data from previous periods. Used to join together at the end for the next iteration of the simulation.  | 
perfect_assignment | 
 Logical; if TRUE, assumes perfect information for treatment assignment (i.e., all outcomes are observed regardless of the date). If FALSE, hides outcomes not yet theoretically observed, based on the dates treatments would have been assigned for each wave. This is useful when simulating batch-based assignment where treatments were assigned on a given day whether or not all the information from a prior batch was available and you have exact dates treatments were assigned.  | 
dates | 
 Named date vector containing average success date by treatment block to impute new success dates for observations whose change in treatment also changes their outcome from failure to success.  | 
success_date_col | 
 Column in   | 
current_period | 
 Numeric value of length 1; current treatment wave of the simulation.  | 
Details
When perfect_assignment is FALSE, dates of success are imputed according to the average
by each period and treatment block (treatment arm + any blocking). These imputations are required because
these observations do not currently have dates of success, as no success was observed during the original experiment.
Therefore if they go through the next iteration of the simulation without being imputed,
the new successes will still be treated as failures because of the date masking mechanism.
Observations that were successful in the original experiment, were assigned a new treatment, and were then imputed as successes again will keep their original date. This assumes that the treatment has no individual treatment effect on the date of success, which may or may not be valid depending on the context of the experiment.
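The date rule described above can be sketched as follows. The data, the column names, and the avg_dates vector (standing in for the dates argument) are hypothetical and for illustration only.

```r
# Hypothetical data; `avg_dates` plays the role of the `dates` argument.
obs <- data.frame(
  impute_block = c("a", "a", "b"),
  new_success = c(1, 1, 0),
  success_date = as.Date(c(NA, "2024-03-01", NA))
)
avg_dates <- as.Date(c("2024-03-10", "2024-03-15"))
names(avg_dates) <- c("a", "b")

# Newly imputed successes with no recorded date receive the block average;
# existing dates are kept, and failures stay without a date.
needs_date <- obs$new_success == 1 & is.na(obs$success_date)
obs$success_date[needs_date] <- avg_dates[obs$impute_block[needs_date]]
obs
```

Only the first observation changes: it is a new success with no date, so it takes the average date of block "a".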
See Also
Simulates Multi-Arm Bandit Trial From Prepared Inputs
Description
Internal helper to single_mab_simulation()
and multiple_mab_simulation(). Centralizes necessary functions to conduct a
single Multi-Arm-Bandit Trial with adaptive inference. It assumes all inputs have
been preprocessed by pre_mab_simulation().
Usage
mab_simulation(
  data,
  time_unit,
  perfect_assignment,
  algorithm,
  period_length,
  prior_periods,
  whole_experiment,
  conditions,
  blocking,
  block_cols,
  data_cols,
  verbose,
  assignment_method,
  control_augment,
  imputation_information,
  ndraws,
  random_assign_prop
)
Arguments
data | 
 A data.frame, data.table, or tibble containing input data from the trial. This should be the results of a traditional Randomized Controlled Trial (RCT). Any data.frames will be converted to tibbles internally.  | 
time_unit | 
 A character string specifying the unit of time for assigning periods when   | 
perfect_assignment | 
 Logical; if TRUE, assumes perfect information for treatment assignment (i.e., all outcomes are observed regardless of the date). If FALSE, hides outcomes not yet theoretically observed, based on the dates treatments would have been assigned for each wave. This is useful when simulating batch-based assignment where treatments were assigned on a given day whether or not all the information from a prior batch was available and you have exact dates treatments were assigned.  | 
algorithm | 
 A character string specifying the MAB algorithm to use. Options are "thompson" or "ucb1". Algorithm defines the adaptive assignment process. Mathematical details on these algorithms can be found in Kuleshov and Precup 2014 and Slivkins 2024.  | 
period_length | 
 A numeric value of length 1; represents the length of each treatment period.
If assignment method is "date", this length refers to the number of units specified in   | 
prior_periods | 
 A numeric value of length 1, or the character string "All"; number of previous periods to use in the treatment assignment model. This is used to implement the stationary/non-stationary bandit. For example, a non-stationary bandit assumes the true probability of success for each treatment changes over time, so to account for that, not all prior data should be used when making decisions because it could be "out of date".  | 
whole_experiment | 
 Logical; if TRUE, uses all past experimental data for imputing outcomes. If FALSE, uses only data available up to the current period. In large datasets or with a high number of periods, setting this to FALSE can be more computationally intensive, though not a significant contributor to total run time.  | 
blocking | 
 Logical; whether or not to use treatment blocking. Treatment blocking is used to ensure an even-enough distribution of treatment conditions across blocks. For example, blocking by gender would mean the randomized assignment should split treatments evenly not just throughout the sample (so for 4 arms, 25-25-25-25), but also within each block, so 25% of men would receive each treatment and 25% of women the same.  | 
block_cols | 
 A character vector of variables to block by. This vector should not be named.  | 
data_cols | 
 A named character vector containing the names of columns in  
  | 
verbose | 
 Logical; whether or not to print intermediate messages. Default is FALSE.  | 
assignment_method | 
 A character string; one of "date", "batch", or "individual", to define the assignment into treatment waves. When using "batch" or "individual", ensure your dataset is pre-arranged in the proper order observations should be considered so that groups are assigned correctly. For "date", observations will be considered in chronological order. "individual" assignment can be computationally intensive for larger datasets.  | 
control_augment | 
 A numeric value ranging from 0 to 1; proportion of each wave guaranteed to receive the "Control" treatment.
Default is 0. It is not recommended to use this in conjunction with   | 
imputation_information | 
 Object created by   | 
ndraws | 
 A numeric value; When Thompson sampling direct calculations fail, draws from a simulated posterior
will be used to approximate the Thompson sampling probabilities. This is the number of simulations to use, the default
is 5000 to match the default parameter   | 
random_assign_prop | 
 A numeric value ranging from 0 to 1; proportion of each wave to be assigned new treatments randomly,
1 -   | 
Value
A named list containing:
- final_data: The processed tibble or data.table, containing new columns pertaining to the results of the trial.
- bandits: A tibble or data.table containing the UCB1 values or Thompson sampling posterior distributions for each period.
- assignment_probs: A tibble or data.table containing the probability of being assigned each treatment arm at a given period.
- estimates: A tibble or data.table containing the AIPW (Augmented Inverse Probability Weighting) treatment effect estimates and variances, and traditional sample means and variances, for each treatment arm.
- settings: A named list of the configuration settings used in the trial.
See Also
Run Multiple Multi-Arm-Bandit Trials with Inference in Parallel
Description
Performs multiple Multi-Arm Bandit Trials using the same
simulation and inference backend as single_mab_simulation(). Allows for
easy execution of multiple trials under the same settings to gauge the variance
of the procedure across execution states. Additionally supports parallel processing
through the future and
furrr packages.
Usage
multiple_mab_simulation(
  data,
  assignment_method,
  algorithm,
  prior_periods,
  perfect_assignment,
  whole_experiment,
  blocking,
  data_cols,
  times,
  seeds,
  control_augment = 0,
  random_assign_prop = 0,
  ndraws = 5000,
  control_condition = NULL,
  time_unit = NULL,
  period_length = NULL,
  block_cols = NULL,
  verbose = FALSE,
  check_args = TRUE,
  keep_data = FALSE
)
Arguments
data | 
 A data.frame, data.table, or tibble containing input data from the trial. This should be the results of a traditional Randomized Controlled Trial (RCT). Any data.frames will be converted to tibbles internally.  | 
assignment_method | 
 A character string; one of "date", "batch", or "individual", to define the assignment into treatment waves. When using "batch" or "individual", ensure your dataset is pre-arranged in the proper order observations should be considered so that groups are assigned correctly. For "date", observations will be considered in chronological order. "individual" assignment can be computationally intensive for larger datasets.  | 
algorithm | 
 A character string specifying the MAB algorithm to use. Options are "thompson" or "ucb1". Algorithm defines the adaptive assignment process. Mathematical details on these algorithms can be found in Kuleshov and Precup 2014 and Slivkins 2024.  | 
prior_periods | 
 A numeric value of length 1, or the character string "All"; number of previous periods to use in the treatment assignment model. This is used to implement the stationary/non-stationary bandit. For example, a non-stationary bandit assumes the true probability of success for each treatment changes over time, so to account for that, not all prior data should be used when making decisions because it could be "out of date".  | 
perfect_assignment | 
 Logical; if TRUE, assumes perfect information for treatment assignment (i.e., all outcomes are observed regardless of the date). If FALSE, hides outcomes not yet theoretically observed, based on the dates treatments would have been assigned for each wave. This is useful when simulating batch-based assignment where treatments were assigned on a given day whether or not all the information from a prior batch was available and you have exact dates treatments were assigned.  | 
whole_experiment | 
 Logical; if TRUE, uses all past experimental data for imputing outcomes. If FALSE, uses only data available up to the current period. In large datasets or with a high number of periods, setting this to FALSE can be more computationally intensive, though not a significant contributor to total run time.  | 
blocking | 
 Logical; whether or not to use treatment blocking. Treatment blocking is used to ensure an even-enough distribution of treatment conditions across blocks. For example, blocking by gender would mean the randomized assignment should split treatments evenly not just throughout the sample (so for 4 arms, 25-25-25-25), but also within each block, so 25% of men would receive each treatment and 25% of women the same.  | 
data_cols | 
 A named character vector containing the names of columns in  
  | 
times | 
 A numeric value of length 1, the number of simulations to conduct.  | 
seeds | 
 An integer vector of   | 
control_augment | 
 A numeric value ranging from 0 to 1; proportion of each wave guaranteed to receive the "Control" treatment.
Default is 0. It is not recommended to use this in conjunction with   | 
random_assign_prop | 
 A numeric value ranging from 0 to 1; proportion of each wave to be assigned new treatments randomly,
1 -   | 
ndraws | 
 A numeric value; When Thompson sampling direct calculations fail, draws from a simulated posterior
will be used to approximate the Thompson sampling probabilities. This is the number of simulations to use, the default
is 5000 to match the default parameter   | 
control_condition | 
 Value of the control condition. Only necessary when   | 
time_unit | 
 A character string specifying the unit of time for assigning periods when   | 
period_length | 
 A numeric value of length 1; represents the length of each treatment period.
If assignment method is "date", this length refers to the number of units specified in   | 
block_cols | 
 A character vector of variables to block by. This vector should not be named.  | 
verbose | 
 Logical; Toggles progress bar from   | 
check_args | 
 Logical; Whether or not to robustly check whether arguments are valid. Default is TRUE, and recommended not to be changed.  | 
keep_data | 
 Logical; Whether or not to keep the final data from each trial. Recommended FALSE.  | 
Details
Note that if data.table has not been attached already when this function is called, it will be attached when future.map() runs, and a message
may print. This does not mean that data.table will be used if you pass a tibble or data.frame.
Implementation
This function simulates multiple adaptive Multi-Arm-Bandit Trials, using experimental
data from a traditional randomized experiment. It follows the same core procedure as
single_mab_simulation() (see the details there for a description), but conducts
more than one simulation. This allows researchers to gauge the variance
of the simulation procedure itself, and use that to form an empirical sampling distribution
of the AIPW estimates, instead of relying on asymptotic normality
[Hadad et al. (2021)] for inference.
The settings specified here have the same meaning as in single_mab_simulation(), apart from the additional
parameters times and seeds, which define the number of trials and the random seeds that ensure reproducibility.
Note that seeds can only take integer values, so they must be declared or coerced as valid integers;
passing doubles (even ones that are mathematical integers) will result in an error. It is recommended to use sample.int(),
with a known seed set beforehand, to generate the values. Additionally, it is highly recommended to
set keep_data to FALSE, because otherwise the memory used by the function grows substantially with the number of trials. This can cause
significant performance issues, especially if your system must swap to disk because memory is full.
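The recommendation on seeds can be illustrated with base R alone. The seed value 2025 is arbitrary and chosen only for the example.

```r
set.seed(2025)                      # fix the generator so the seeds are reproducible
times <- 5
seeds <- sample.int(10000, times)   # sample.int() returns an integer vector
is.integer(seeds)

# Doubles that hold whole numbers are still not integers in R
is.integer(c(1, 2, 3))

# Explicit coercion produces a valid integer vector
as.integer(c(1, 2, 3))
```

The length of seeds should match times, so generating both from the same variable avoids a mismatch.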
Parallel Processing
The function provides support for parallel processing via the future and
furrr packages. When conducting a large
number of simulations, parallelization can improve performance if sufficient system resources are available.
Parallel processing must be explicitly set by the user, through future::plan().
Windows users should set the plan to "multisession", while Linux and macOS users can use "multicore" or "multisession".
Users running in a High Performance Computing (HPC) environment are encouraged to use
future.batchtools
for their respective HPC scheduler.
Note that parallel processing is not guaranteed to work on all systems, and may require additional setup or debugging effort
from the user. For any issues, users are encouraged to consult the documentation of the above packages.
Value
An object of class multiple.mab, containing:
- final_data_nest: A tibble or data.table containing the nested tibbles/data.tables from each trial. Only provided when keep_data is TRUE.
- bandits: A tibble or data.table containing the UCB1 values or Thompson sampling posterior distributions for each period. Wide format; each row is a period, and each column is a treatment. Each row in this table represents the calculation from the given period after its values were imputed, so row 2 represents the calculations made in period 3, but reflects the impact of period 2's new assignments.
- assignment_probs: A tibble or data.table containing the probability of being assigned each treatment arm at a given period. Wide format; each row is a period, and each column is a treatment. Each row represents the probability of being assigned each treatment at each period; these have not been shifted like the bandits table.
- estimates: A tibble or data.table containing the AIPW (Augmented Inverse Probability Weighting) treatment effect estimates and variances, and traditional sample means and variances, for each treatment arm. Long format; treatment arm and estimate type are columns, along with the mean and variance.
- assignment_quantities: A tibble or data.table containing the number of units assigned to each treatment for each simulation in the set of repeated simulations.
- settings: A named list of the configuration settings used in the trial.
References
Bengtsson, Henrik. 2025. "Future: Unified Parallel and Distributed Processing in R for Everyone." https://cran.r-project.org/package=future.
Bengtsson, Henrik. 2025. "Future.Batchtools: A Future API for Parallel and Distributed Processing Using ‘Batchtools.’" https://cran.r-project.org/package=future.batchtools.
Hadad, Vitor, David A. Hirshberg, Ruohan Zhan, Stefan Wager, and Susan Athey. 2021. "Confidence Intervals for Policy Evaluation in Adaptive Experiments." Proceedings of the National Academy of Sciences of the United States of America 118 (15): e2014602118. doi:10.1073/pnas.2014602118.
Kuleshov, Volodymyr, and Doina Precup. 2014. "Algorithms for Multi-Armed Bandit Problems." arXiv. doi:10.48550/arXiv.1402.6028.
Lotze, Thomas, and Markus Loecher. 2022. "Bandit: Functions for Simple a/B Split Test and Multi-Armed Bandit Analysis." https://cran.r-project.org/package=bandit.
Offer‐Westort, Molly, Alexander Coppock, and Donald P. Green. 2021. "Adaptive Experimental Design: Prospects and Applications in Political Science." American Journal of Political Science 65 (4): 826–44. doi:10.1111/ajps.12597.
Slivkins, Aleksandrs. 2024. "Introduction to Multi-Armed Bandits." arXiv. doi:10.48550/arXiv.1904.07272.
Vaughan, Davis, Matt Dancho, and RStudio. 2022. "Furrr: Apply Mapping Functions in Parallel Using Futures." https://cran.r-project.org/package=furrr.
See Also
single_mab_simulation(), furrr, future
Examples
# multiple_mab_simulation() is a useful tool for running multiple trials
# using the same configuration settings, in different random states
data(tanf)
tanf <- tanf[1:50, ]
# The seeds passed must be integers, so it is highly recommended to create them
# beforehand using `sample.int()`
seeds <- sample.int(10000, 5)
## Sequential Execution
x <- multiple_mab_simulation(
  data = tanf,
  assignment_method = "Batch",
  period_length = 25,
  whole_experiment = TRUE,
  blocking = FALSE,
  perfect_assignment = TRUE,
  algorithm = "Thompson",
  prior_periods = "All",
  control_augment = 0,
  data_cols = c(
    condition_col = "condition",
    id_col = "ic_case_id",
    success_col = "success"
  ),
  verbose = FALSE, times = 5, seeds = seeds, keep_data = FALSE
)
print(x)
## Parallel Execution using future:
## Check the future and furrr documentation for more details on possible options
if (requireNamespace("future", quietly = TRUE)) {
    # Set a Proper "plan"
    future::plan("multisession", workers = 2)
    multiple_mab_simulation(
      data = tanf,
      assignment_method = "Batch",
      period_length = 25,
      whole_experiment = TRUE,
      blocking = FALSE,
      perfect_assignment = TRUE,
      algorithm = "Thompson",
      prior_periods = "All",
      control_augment = 0,
      data_cols = c(
        condition_col = "condition",
        id_col = "ic_case_id",
        success_col = "success"
      ),
      verbose = FALSE, times = 5, seeds = seeds, keep_data = TRUE
    )
    # Always Set back to sequential to close processes
    future::plan("sequential")
}
Plot Generic for mab objects
Description
Uses ggplot2::ggplot() to plot the results of a single
Multi-Arm-Bandit trial. Provides options to select the type of plot,
and to change how the plot looks. Objects created can be added to
with + like any other ggplot plot, but arguments to change
the underlying geom must be passed to the function initially.
Usage
## S3 method for class 'mab'
plot(x, type, level = 0.95, save = FALSE, path = NULL, ...)
Arguments
x | 
 A   | 
type | 
 String; Type of plot requested; valid types are: 
  | 
level | 
 Numeric value of length 1; indicates the confidence level of the interval (e.g., 0.90, 0.95, 0.99). Defaults to 0.95.  | 
save | 
 Logical; Whether or not to save the plot to disk; FALSE by default.  | 
path | 
 String; File directory to save file if necessary.  | 
... | 
 Arguments to pass to   | 
Details
This function provides minimalist plots to quickly view the results of any
Multi-Arm-Bandit trial, and can be customized through the ...
inside the call and + afterwards. However, all the data necessary is
provided in the output of single_mab_simulation(). For extreme
customization or professional plots, it is highly recommended
to start completely from scratch and not use the generic.
The confidence intervals applied follow a standard normal distribution because it is assumed the AIPW estimators are asymptotically normal, as shown in Hadad et al. (2021).
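A normal-approximation interval of this kind can be sketched as below. The estimates and variances are made-up values, and the example assumes the variance is that of the estimator itself, so that its square root is the standard error.

```r
# Hypothetical estimates and variances, for illustration only.
estimate <- c(control = 0.20, arm_a = 0.35)
variance <- c(control = 0.0016, arm_a = 0.0025)
level <- 0.95

# Two-sided critical value from the standard normal distribution
z <- qnorm(1 - (1 - level) / 2)
ci <- cbind(
  lower = estimate - z * sqrt(variance),
  upper = estimate + z * sqrt(variance)
)
ci
```

Raising level widens the interval, since the critical value z grows with the confidence level.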
Value
Minimal ggplot object, that can be customized and added to with + (to change scales, labels, legend, theme, etc.).
References
Hadad, Vitor, David A. Hirshberg, Ruohan Zhan, Stefan Wager, and Susan Athey. 2021. "Confidence Intervals for Policy Evaluation in Adaptive Experiments." Proceedings of the National Academy of Sciences of the United States of America 118 (15): e2014602118. doi:10.1073/pnas.2014602118.
Examples
# Objects returned by `single_mab_simulation()` have a `mab` class.
# This class has a plot generic that has several minimal plots to examine
# the trial quickly
# Loading Data and running a quick simulation
data(tanf)
x <- single_mab_simulation(
  data = tanf,
  algorithm = "Thompson",
  assignment_method = "Batch",
  period_length = 600,
  whole_experiment = TRUE,
  perfect_assignment = TRUE,
  blocking = FALSE,
  prior_periods = "All",
  data_cols = c(
    condition_col = "condition",
    id_col = "ic_case_id",
    success_col = "success"
  )
)
# View best treatment arms over the simulation
y <- plot(x, type = "arm")
y
# Adding a new title
y + ggplot2::labs(title = "Your New Title")
# type = assign creates a similar plot, but shows probability of assignment instead
# Plotting Augmented Inverse Probability Estimates with confidence interval
# By default it provides 95% Normal Confidence Intervals but this can be adjusted
plot(x, type = "estimate")
# Adjusting height of internal geom* argument. (`geom_errorbarh()`)
plot(x, type = "estimate", height = 0.4)
Plot Generic For multiple.mab Objects
Description
Uses ggplot2::ggplot() to plot the results of multiple
Multi-Arm-Bandit trials.
Usage
## S3 method for class 'multiple.mab'
plot(
  x,
  type,
  quantity,
  cdf = NULL,
  level = 0.95,
  save = FALSE,
  path = NULL,
  ...
)
Arguments
x | 
 A   | 
type | 
 String; Type of plot requested; valid types are: 
  | 
quantity | 
 The quantities to plot when   | 
cdf | 
 String; specifies the type of CDF to use when analyzing the estimates.
valid CDFs are the 'empirical' CDF, the 'normal' CDF. Used when type =   | 
level | 
 Numeric value of length 1; indicates the confidence level of the interval (e.g., 0.90, 0.95, 0.99). Defaults to 0.95.  | 
save | 
 Logical; Whether or not to save the plot to disk; FALSE by default.  | 
path | 
 String; File directory to save file.  | 
... | 
 Arguments to pass to   | 
Details
This function provides minimalist plots to quickly view the results of the procedure.
The plots can be customized through the ...
in the call and with + afterwards. However, all the necessary data is
provided in the output of multiple_mab_simulation(); for extensive
customization or professional plots, it is highly recommended
to start completely from scratch and not use the generic.
Value
Minimal ggplot object, that can be customized and added to with + (to change scales, labels, legend, theme, etc.).
Examples
# Objects returned by `multiple_mab_simulation()` have a `multiple.mab` class.
# This class has a plot generic that has several minimal plots to examine the trials
# quickly
data(tanf)
tanf <- tanf[1:20, ]
# Simulating a few trials
seeds <- sample.int(100, 5)
conditions <- as.character(unique(tanf$condition))
x <- multiple_mab_simulation(
  data = tanf,
  assignment_method = "Batch",
  period_length = 10,
  whole_experiment = TRUE,
  blocking = FALSE,
  perfect_assignment = TRUE,
  algorithm = "Thompson",
  prior_periods = "All",
  control_augment = 0,
  data_cols = c(
    condition_col = "condition",
    id_col = "ic_case_id",
    success_col = "success"
  ),
  verbose = FALSE,
  times = 5,
  seeds = seeds,
  keep_data = FALSE
)
# View number of times each treatment was the best.
plot(x, type = "summary")
# View a histogram of the AIPW estimates for each treatment.
plot(x, type = "hist", quantity = "estimate")
# Plotting AIPW confidence intervals using the empirical cdf, from the simulated
# trials.
plot(x, type = "estimate", cdf = "empirical")
# Changing the title, like any ggplot2 object.
plot(x, type = "summary") + ggplot2::labs(title = "Your New Title")
# Changing the bin width of the histograms.
plot(x, type = "hist", quantity = "assignment", geom = list(binwidth = 0.05))
Plot Treatment Arms Over Time
Description
Helper to plot.mab(); plots treatment arms over time.
Usage
plot_arms(x, ...)
Arguments
x | 
 A   | 
... | 
 Arguments to pass to   | 
Value
Minimal ggplot object, that can be customized and added to with + (to change scales, labels, legend, theme, etc.).
Plot Cumulative Assignment Probability Over Time
Description
Plot Cumulative Assignment Probability Over Time
Usage
plot_assign(x, ...)
Arguments
x | 
 A   | 
... | 
 Arguments to pass to   | 
Value
Minimal ggplot object, that can be customized and added to with + (to change scales, labels, legend, theme, etc.).
Plot AIPW Estimates
Description
Plot summary of AIPW estimates and variances for each treatment arm.
Usage
plot_estimates(x, level = 0.95, ...)
Arguments
x | 
 A   | 
level | 
 Numeric value of length 1; indicates the confidence level of the interval (e.g., 0.90, 0.95, 0.99). Defaults to 0.95.  | 
... | 
 Arguments to pass to   | 
Value
Minimal ggplot object, that can be customized and added to with + (to change scales, labels, legend, theme, etc.).
Plots Histograms of multiple_mab_simulation() Results
Description
Plots distribution of AIPW estimates over trials for plot.multiple.mab() or the distribution of the number of observations assigned to each treatment arm.
Usage
plot_hist(x, quantity, params)
Arguments
x | 
 A   | 
quantity | 
 The quantities to plot when   | 
params | 
 The dynamic dots (  | 
Value
Minimal ggplot object, that can be customized and added to with + (to change scales, labels, legend, theme, etc.).
Plots AIPW Confidence Intervals
Description
Plots the uncertainty of the AIPW estimates for each arm using the specified variance from the repeated trials for plot.multiple.mab().
Usage
plot_mult_estimates(x, cdf, level, ...)
Arguments
x | 
 A   | 
cdf | 
 String; specifies the type of CDF to use when analyzing the estimates.
valid CDFs are the 'empirical' CDF, the 'normal' CDF. Used when type =   | 
level | 
 Numeric value of length 1; indicates the confidence level of the interval (e.g., 0.90, 0.95, 0.99). Defaults to 0.95.  | 
... | 
 Arguments to pass to   | 
Value
Minimal ggplot object, that can be customized and added to with + (to change scales, labels, legend, theme, etc.).
Plot Treatment Arms Over Multiple Trials
Description
Plots summary results for plot.multiple.mab(); shows the number of times each arm was selected as the best in a bar chart.
Usage
plot_summary(x, ...)
Arguments
x | 
 A   | 
... | 
 Arguments to pass to   | 
Value
Minimal ggplot object, that can be customized and added to with + (to change scales, labels, legend, theme, etc.).
Pre-Simulation Setup for an adaptive Multi-Arm-Bandit Trial
Description
Common function for all the actions that need to take place before running the Multi-Arm-Bandit simulation. Intakes the data and column names to check for valid arguments, format and create new columns as needed, and pre-compute key values to avoid doing so within the simulation loop.
Usage
pre_mab_simulation(
  data,
  assignment_method,
  algorithm,
  control_condition,
  prior_periods,
  perfect_assignment,
  whole_experiment,
  blocking,
  data_cols,
  control_augment,
  time_unit,
  period_length,
  block_cols,
  verbose,
  ndraws,
  random_assign_prop,
  check_args
)
Arguments
data | 
 A data.frame, data.table, or tibble containing input data from the trial. This should be the results of a traditional Randomized Controlled Trial (RCT). Any data.frames will be converted to tibbles internally.  | 
assignment_method | 
 A character string; one of "date", "batch", or "individual", to define the assignment into treatment waves. When using "batch" or "individual", ensure your dataset is pre-arranged in the proper order observations should be considered so that groups are assigned correctly. For "date", observations will be considered in chronological order. "individual" assignment can be computationally intensive for larger datasets.  | 
algorithm | 
 A character string specifying the MAB algorithm to use. Options are "thompson" or "ucb1". Algorithm defines the adaptive assignment process. Mathematical details on these algorithms can be found in Kuleshov and Precup 2014 and Slivkins 2024.  | 
control_condition | 
 Value of the control condition. Only necessary when   | 
prior_periods | 
 A numeric value of length 1, or the character string "All"; number of previous periods to use in the treatment assignment model. This is used to implement the stationary/non-stationary bandit. For example, a non-stationary bandit assumes the true probability of success for each treatment changes over time, so to account for that, not all prior data should be used when making decisions because it could be "out of date".  | 
perfect_assignment | 
 Logical; if TRUE, assumes perfect information for treatment assignment (i.e., all outcomes are observed regardless of the date). If FALSE, hides outcomes not yet theoretically observed, based on the dates treatments would have been assigned for each wave. This is useful when simulating batch-based assignment where treatments were assigned on a given day whether or not all the information from a prior batch was available and you have exact dates treatments were assigned.  | 
whole_experiment | 
 Logical; if TRUE, uses all past experimental data for imputing outcomes. If FALSE, uses only data available up to the current period. In large datasets or with a high number of periods, setting this to FALSE can be more computationally intensive, though not a significant contributor to total run time.  | 
blocking | 
 Logical; whether or not to use treatment blocking. Treatment blocking is used to ensure an even-enough distribution of treatment conditions across blocks. For example, blocking by gender would mean the randomized assignment should split treatments evenly not just throughout the sample (so for 4 arms, 25-25-25-25), but also within each block, so 25% of men would receive each treatment and 25% of women the same.  | 
data_cols | 
 A named character vector containing the names of columns in  
  | 
control_augment | 
 A numeric value ranging from 0 to 1; proportion of each wave guaranteed to receive the "Control" treatment.
Default is 0. It is not recommended to use this in conjunction with   | 
time_unit | 
 A character string specifying the unit of time for assigning periods when   | 
period_length | 
 A numeric value of length 1; represents the length of each treatment period.
If assignment method is "date", this length refers to the number of units specified in   | 
block_cols | 
 A character vector of variables to block by. This vector should not be named.  | 
verbose | 
 Logical; whether or not to print intermediate messages. Default is FALSE.  | 
ndraws | 
 A numeric value; When Thompson sampling direct calculations fail, draws from a simulated posterior
will be used to approximate the Thompson sampling probabilities. This is the number of simulations to use, the default
is 5000 to match the default parameter   | 
random_assign_prop | 
 A numeric value ranging from 0 to 1; proportion of each wave to be assigned new treatments randomly,
1 -   | 
check_args | 
 Logical; Whether or not to robustly check whether arguments are valid. Default is TRUE, and recommended not to be changed.  | 
Details
If a data.frame is passed as input data, it is internally converted into a tibble. If a data.table is passed, it is copied to avoid modifying the original dataset in the user's environment.
Value
Named list containing:
- data_cols: List of necessary columns in data as strings and symbols.
- block_cols: List of columns to block by in data as strings and symbols.
- data: Prepared tibble/data.table containing all the columns required by mab_simulation() to conduct the adaptive trial simulation.
- imputation_information: List containing necessary information for outcome and date imputation for mab_simulation().
See Also
Print Generic For mab
Description
Custom Print Display for objects of mab class returned by single_mab_simulation().
Prevents the large list from being printed directly, and provides
useful information about the settings of each trial.
Usage
## S3 method for class 'mab'
print(x, ...)
Arguments
x | 
 A   | 
... | 
 Further arguments passed to or from other methods.  | 
Details
The items used to create the text summary can be found in the settings element of the output object.
... is provided to be compatible with print(), but no other arguments
change the output.
Value
Text summary of settings used for the Multi-Arm Bandit trial.
Examples
# Running a Trial
x <- single_mab_simulation(
  data = tanf,
  algorithm = "thompson",
  assignment_method = "batch",
  period_length = 1750,
  prior_periods = "All",
  blocking = FALSE,
  whole_experiment = TRUE,
  perfect_assignment = TRUE,
  data_cols = c(
    id_col = "ic_case_id",
    success_col = "success",
    condition_col = "condition"
  )
)
print(x)
Print Generic For multiple.mab
Description
Custom Print Display for multiple.mab objects returned by multiple_mab_simulation().
Prevents the large list output from being printed directly, and provides
useful information about the settings for the trials.
Usage
## S3 method for class 'multiple.mab'
print(x, ...)
Arguments
x | 
 A   | 
... | 
 Further arguments passed to or from other methods.  | 
Details
The items used to create the text summary can be found in the settings element of the output object.
... is provided to be compatible with print(), but no other arguments
change the output.
Value
Text summary of settings used for the Multi-Arm Bandit trials.
Examples
# Running Multiple Simulations
x <- multiple_mab_simulation(
  data = tanf,
  algorithm = "thompson",
  assignment_method = "Batch",
  period_length = 1750,
  prior_periods = "All",
  blocking = FALSE,
  whole_experiment = TRUE,
  perfect_assignment = TRUE,
  data_cols = c(
    id_col = "ic_case_id",
    success_col = "success",
    condition_col = "condition"
  ),
  times = 5, seeds = sample.int(5)
)
print(x)
Print Helper for mab and multiple.mab
Description
Common items for the print generics for mab and multiple.mab classes
Usage
print_mab(mab)
Arguments
mab | 
 A   | 
Value
Text summary of settings used for the Multi-Arm Bandit trial.
Runs Multi-Arm Bandit Trial
Description
Performs a full Multi-Arm Bandit (MAB) trial using Thompson sampling or UCB1.
The function loops over each step of the process for each treatment wave, performing adaptive
treatment assignment and outcome imputation. Supports flexible customization passed
from single_mab_simulation() and multiple_mab_simulation(), including treatment blocking strategy,
stationary/non-stationary bandits, control augmentation, and hybrid assignment.
Usage
run_mab_trial(
  data,
  time_unit,
  period_length = NULL,
  data_cols,
  block_cols,
  blocking,
  prior_periods,
  algorithm,
  whole_experiment,
  perfect_assignment,
  conditions,
  verbose,
  control_augment,
  imputation_information,
  ndraws,
  random_assign_prop
)
Arguments
data | 
 A data.frame, data.table, or tibble containing input data from the trial. This should be the results of a traditional Randomized Controlled Trial (RCT). Any data.frames will be converted to tibbles internally.  | 
time_unit | 
 A character string specifying the unit of time for assigning periods when   | 
period_length | 
 A numeric value of length 1; represents the length of each treatment period.
If assignment method is "date", this length refers to the number of units specified in   | 
data_cols | 
 A named character vector containing the names of columns in  
  | 
block_cols | 
 A character vector of variables to block by. This vector should not be named.  | 
blocking | 
 Logical; whether or not to use treatment blocking. Treatment blocking is used to ensure an even-enough distribution of treatment conditions across blocks. For example, blocking by gender would mean the randomized assignment should split treatments evenly not just throughout the sample (so for 4 arms, 25-25-25-25), but also within each block, so 25% of men would receive each treatment and 25% of women the same.  | 
prior_periods | 
 A numeric value of length 1, or the character string "All"; number of previous periods to use in the treatment assignment model. This is used to implement the stationary/non-stationary bandit. For example, a non-stationary bandit assumes the true probability of success for each treatment changes over time, so to account for that, not all prior data should be used when making decisions because it could be "out of date".  | 
algorithm | 
 A character string specifying the MAB algorithm to use. Options are "thompson" or "ucb1". Algorithm defines the adaptive assignment process. Mathematical details on these algorithms can be found in Kuleshov and Precup 2014 and Slivkins 2024.  | 
whole_experiment | 
 Logical; if TRUE, uses all past experimental data for imputing outcomes. If FALSE, uses only data available up to the current period. In large datasets or with a high number of periods, setting this to FALSE can be more computationally intensive, though not a significant contributor to total run time.  | 
perfect_assignment | 
 Logical; if TRUE, assumes perfect information for treatment assignment (i.e., all outcomes are observed regardless of the date). If FALSE, hides outcomes not yet theoretically observed, based on the dates treatments would have been assigned for each wave. This is useful when simulating batch-based assignment where treatments were assigned on a given day whether or not all the information from a prior batch was available and you have exact dates treatments were assigned.  | 
verbose | 
 Logical; whether or not to print intermediate messages. Default is FALSE.  | 
control_augment | 
 A numeric value ranging from 0 to 1; proportion of each wave guaranteed to receive the "Control" treatment.
Default is 0. It is not recommended to use this in conjunction with   | 
imputation_information | 
 Object created by   | 
ndraws | 
 A numeric value; When Thompson sampling direct calculations fail, draws from a simulated posterior
will be used to approximate the Thompson sampling probabilities. This is the number of simulations to use, the default
is 5000 to match the default parameter   | 
random_assign_prop | 
 A numeric value ranging from 0 to 1; proportion of each wave to be assigned new treatments randomly,
1 -   | 
Details
The first period is used to start the trial, so the MAB loop starts at period number 2.
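The period structure described above can be sketched as follows. This only illustrates which earlier periods would inform each decision; the helper `periods_used()` is hypothetical, and the windowing rule for a numeric `prior_periods` is an assumption rather than the package's internal code.

```r
# Hypothetical helper: periods whose outcomes inform the decision at `period`.
# prior_periods = "All" uses every earlier period; a number k is assumed here
# to mean the k most recent earlier periods.
periods_used <- function(period, prior_periods = "All") {
  first <- if (identical(prior_periods, "All")) 1 else max(1, period - prior_periods)
  seq(first, period - 1)
}

n_periods <- 5
# Period 1 keeps its original assignments; adaptive assignment starts at 2.
for (period in 2:n_periods) {
  cat("Period", period, "uses periods:", periods_used(period, prior_periods = 2), "\n")
}
```

With `prior_periods = "All"` the window always starts at period 1, which corresponds to the stationary bandit.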
Value
A named list containing:
- final_data: The processed tibble or data.table, containing new columns pertaining to the results of the trial.
- bandits: A tibble or data.table containing the UCB1 values or Thompson sampling posterior distributions for each period.
- assignment_probs: A tibble or data.table containing the probability of being assigned each treatment arm at a given period.
Run One Adaptive Simulation With Inference.
Description
Performs a single Multi-Arm Bandit (MAB) trial using experimental data from an original randomized controlled trial, and adaptive inference strategies as described in Hadad et al. (2021). Wraps around the internal implementation functions, and performs the full MAB pipeline: preparing inputs, assigning treatments and imputing successes, and adaptively weighted estimation. See the details and vignettes to learn more.
Usage
single_mab_simulation(
  data,
  assignment_method,
  algorithm,
  prior_periods,
  perfect_assignment,
  whole_experiment,
  blocking,
  data_cols,
  control_augment = 0,
  random_assign_prop = 0,
  ndraws = 5000,
  control_condition = NULL,
  time_unit = NULL,
  period_length = NULL,
  block_cols = NULL,
  verbose = FALSE,
  check_args = TRUE
)
Arguments
data | 
 A data.frame, data.table, or tibble containing input data from the trial. This should be the results of a traditional Randomized Controlled Trial (RCT). Any data.frames will be converted to tibbles internally.  | 
assignment_method | 
 A character string; one of "date", "batch", or "individual", to define the assignment into treatment waves. When using "batch" or "individual", ensure your dataset is pre-arranged in the proper order observations should be considered so that groups are assigned correctly. For "date", observations will be considered in chronological order. "individual" assignment can be computationally intensive for larger datasets.  | 
algorithm | 
 A character string specifying the MAB algorithm to use. Options are "thompson" or "ucb1". Algorithm defines the adaptive assignment process. Mathematical details on these algorithms can be found in Kuleshov and Precup 2014 and Slivkins 2024.  | 
prior_periods | 
 A numeric value of length 1, or the character string "All"; number of previous periods to use in the treatment assignment model. This is used to implement the stationary/non-stationary bandit. For example, a non-stationary bandit assumes the true probability of success for each treatment changes over time, so to account for that, not all prior data should be used when making decisions because it could be "out of date".  | 
perfect_assignment | 
 Logical; if TRUE, assumes perfect information for treatment assignment (i.e., all outcomes are observed regardless of the date). If FALSE, hides outcomes not yet theoretically observed, based on the dates treatments would have been assigned for each wave. This is useful when simulating batch-based assignment where treatments were assigned on a given day whether or not all the information from a prior batch was available and you have exact dates treatments were assigned.  | 
whole_experiment | 
 Logical; if TRUE, uses all past experimental data for imputing outcomes. If FALSE, uses only data available up to the current period. In large datasets or with a high number of periods, setting this to FALSE can be more computationally intensive, though not a significant contributor to total run time.  | 
blocking | 
 Logical; whether or not to use treatment blocking. Treatment blocking is used to ensure an even-enough distribution of treatment conditions across blocks. For example, blocking by gender would mean the randomized assignment should split treatments evenly not just throughout the sample (so for 4 arms, 25-25-25-25), but also within each block, so 25% of men would receive each treatment and 25% of women the same.  | 
data_cols | 
 A named character vector containing the names of columns in  
  | 
control_augment | 
 A numeric value ranging from 0 to 1; proportion of each wave guaranteed to receive the "Control" treatment.
Default is 0. It is not recommended to use this in conjunction with   | 
random_assign_prop | 
 A numeric value ranging from 0 to 1; proportion of each wave to be assigned new treatments randomly,
1 -   | 
ndraws | 
 A numeric value; When Thompson sampling direct calculations fail, draws from a simulated posterior
will be used to approximate the Thompson sampling probabilities. This is the number of simulations to use, the default
is 5000 to match the default parameter   | 
control_condition | 
 Value of the control condition. Only necessary when   | 
time_unit | 
 A character string specifying the unit of time for assigning periods when   | 
period_length | 
 A numeric value of length 1; represents the length of each treatment period.
If assignment method is "date", this length refers to the number of units specified in   | 
block_cols | 
 A character vector of variables to block by. This vector should not be named.  | 
verbose | 
 Logical; whether or not to print intermediate messages. Default is FALSE.  | 
check_args | 
 Logical; Whether or not to robustly check whether arguments are valid. Default is TRUE, and recommended not to be changed.  | 
Details
For all the items labelled as a tibble or data.table, data.tables will be used if the user-supplied data is a
data.table, and tibbles otherwise.
Implementation
At each period, either the Thompson sampling probabilities or UCB1 values are calculated based on
the outcomes from the number of prior_periods specified. New treatments are then assigned randomly using the Thompson
sampling probabilities via the randomizr
package, or as the treatment with the highest UCB1 value, while implementing the specific
treatment blocking and control augmentation specified. More details on bandit algorithms can be found in
Kuleshov and Precup 2014 and
Slivkins 2024.
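To make the two decision rules concrete, here is a minimal base R sketch for binary outcomes. It approximates the Thompson sampling probabilities by simulating from Beta(1 + successes, 1 + failures) posteriors and computes the textbook UCB1 index. The uniform prior, the function names, and the counts are assumptions for illustration; the package's own calculations may differ.

```r
# Assumed Beta(1, 1) prior; probability each arm is best, by Monte Carlo.
thompson_probs <- function(successes, trials, ndraws = 5000) {
  draws <- vapply(
    seq_along(trials),
    function(i) stats::rbeta(ndraws, 1 + successes[i], 1 + trials[i] - successes[i]),
    numeric(ndraws)
  )
  best <- max.col(draws, ties.method = "first") # winning arm in each draw
  tabulate(best, nbins = length(trials)) / ndraws
}

# Textbook UCB1 index: observed mean plus an exploration bonus.
ucb1_values <- function(successes, trials) {
  successes / trials + sqrt(2 * log(sum(trials)) / trials)
}

set.seed(1)
successes <- c(a = 30, b = 45, c = 38) # made-up counts
trials <- c(a = 100, b = 100, c = 100)
thompson_probs(successes, trials)
ucb1_values(successes, trials)
```

Under Thompson sampling the resulting probabilities drive random assignment, whereas under UCB1 the arm with the largest index would simply be chosen.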
If a hybrid assignment is specified, here is where it is implemented in the simulation.
control_augment is a threshold probability for the control group, and the assignment probabilities
are changed to ensure that threshold is met. The other hybrid assignment is random_assign_prop. Here, the specified
proportion of the data is set aside to assign treatments randomly, while the rest of the data is assigned through the bandit procedure.
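One way to picture the two hybrid strategies is sketched below. The rescaling rule in `augment_control()` (raise the control arm to the threshold and shrink the other arms proportionally) and the split in `split_wave()` are assumptions made for illustration, not the package's internal implementation.

```r
# Hypothetical: enforce a minimum assignment probability for the control arm.
augment_control <- function(probs, control, control_augment = 0) {
  if (probs[[control]] >= control_augment) {
    return(probs)
  }
  others <- setdiff(names(probs), control)
  # Shrink the remaining arms so the probabilities still sum to 1
  probs[others] <- probs[others] / sum(probs[others]) * (1 - control_augment)
  probs[[control]] <- control_augment
  probs
}

# Hypothetical: set aside a share of a wave for purely random assignment.
split_wave <- function(ids, random_assign_prop = 0) {
  n_random <- round(length(ids) * random_assign_prop)
  random_ids <- ids[sample.int(length(ids), n_random)]
  list(random = random_ids, bandit = setdiff(ids, random_ids))
}

probs <- c(no_letter = 0.05, arm_a = 0.60, arm_b = 0.35) # made-up values
augment_control(probs, control = "no_letter", control_augment = 0.25)
```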
After assigning treatments, observations with new treatments have their outcomes imputed, respecting any
specified treatment blocking. The probabilities of success used for imputation
are estimated via the grouped means of successes from the original data, either from the whole trial or
up to that period, as defined by whole_experiment.
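A minimal sketch of this imputation step, using base R and made-up data, is shown below. The column names (`condition`, `mab_condition`, `success`, `mab_success`) and the choice to keep the original outcome when the treatment is unchanged are assumptions for illustration, and blocking is ignored.

```r
set.seed(1)
# Made-up trial data: original arm, observed outcome, and simulated new arm
trial <- data.frame(
  condition = rep(c("a", "b"), each = 50),
  success = c(rbinom(50, 1, 0.3), rbinom(50, 1, 0.6))
)
trial$mab_condition <- sample(c("a", "b"), 100, replace = TRUE)

# Success rate per arm in the original data (whole-experiment version)
rates <- tapply(trial$success, trial$condition, mean)

# Only observations whose arm changed need an imputed outcome
impute_req <- trial$mab_condition != trial$condition
trial$mab_success <- trial$success
trial$mab_success[impute_req] <- rbinom(
  sum(impute_req), 1, as.numeric(rates[trial$mab_condition[impute_req]])
)
```

Restricting `trial` to rows up to the current period before computing `rates` would mimic `whole_experiment = FALSE`.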
If perfect_assignment is FALSE, new dates of success will be imputed using averages
of those dates in the period, grouped by treatment block. Observations whose
treatment changed, but whose outcome was a success in both the original trial and the simulation, do not have their date changed.
When the next period starts, the success dates are checked against the maximum/latest assignment_date for the period, and
if any success occurs after that, it is treated as a failure for the purpose of the bandit decision algorithms.
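The masking rule for imperfect information can be sketched like this. The data, column names, and cutoff date are invented for illustration; only the rule itself (a success dated after the latest assignment date of the period counts as a failure for the bandit) comes from the description above.

```r
# Made-up observations from earlier periods
prior <- data.frame(
  success = c(1, 1, 0, 1),
  success_date = as.Date(c("2023-01-10", "2023-02-20", NA, "2023-01-25"))
)
# Latest assignment date in the period being assigned (hypothetical)
latest_assignment <- as.Date("2023-02-01")

# Successes not yet observable at assignment time are treated as failures
# for the bandit decision only; the recorded outcome is left untouched.
observed <- prior$success == 1 &
  !is.na(prior$success_date) &
  prior$success_date <= latest_assignment
prior$success_for_bandit <- as.integer(observed)
prior$success_for_bandit
```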
At the end of the simulation the results are aggregated together to calculate the Adaptively Weighted Augmented Inverse Probability Estimator (Hadad et al. 2021) using the mean and variance formulas provided, under the constant allocation rate adaptive schema. These estimators are unbiased and asymptotically normal under the adaptive conditions which is why they are used. For a complete view of their properties, reading the paper is recommended.
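For orientation, the estimator can be sketched for a single arm as below. This is a simplified reading of Hadad et al. (2021), assuming the constant allocation weights take the form h_t = sqrt(e_t / T), where e_t is the probability that observation t is assigned to the arm and m_hat is a plug-in outcome estimate built from earlier data. The function and argument names are hypothetical, and the package's implementation may differ in detail.

```r
# Hypothetical single-arm sketch of an adaptively weighted AIPW estimator.
aipw_arm <- function(y, assigned, e, m_hat) {
  n <- length(y)
  # AIPW score: plug-in estimate plus inverse-probability-weighted residual
  gamma <- m_hat + (assigned / e) * (y - m_hat)
  h <- sqrt(e / n) # assumed constant-allocation weights
  estimate <- sum(h * gamma) / sum(h)
  variance <- sum(h^2 * (gamma - estimate)^2) / sum(h)^2
  c(estimate = estimate, variance = variance)
}

# Made-up data for one arm
y <- c(1, 0, 1, 1, 0, 1)               # outcomes
assigned <- c(1, 0, 1, 1, 0, 0)        # 1 if the observation received the arm
e <- c(0.5, 0.5, 0.6, 0.6, 0.7, 0.7)   # assignment probabilities
m_hat <- c(0.5, 0.5, 0.6, 0.6, 0.7, 0.7) # plug-in estimates from earlier data
aipw_arm(y, assigned, e, m_hat)
```

When every observation is assigned to the arm with probability 1 and `m_hat` is 0, the estimate reduces to the plain sample mean, which is a convenient sanity check.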
Performance Concerns
This procedure has the potential to be computationally expensive and time-consuming. Performance depends on the relative size of each period, number of periods, and overall size of the dataset. This function has separate support for data.frames and data.tables. If a data.frame is passed, the function uses a combination of dplyr, tidyr and base R to shape data, and run the simulation. However, if a data.table is passed the function exclusively uses the data.table code for all the same operations.
In general, smaller batches run faster under base R, while larger ones could benefit from the performance
and memory efficiencies provided by data.table. However, we've observed larger datasets can cause numerical
instability with some calculations in the Thompson sampling procedure. Internal safeguards exist to prevent this, but
the best way to preempt any issues is to set prior_periods to a low number.
For more information about how to use the function, please view the vignette.
Value
An object of class mab, containing:
- final_data: The processed tibble or data.table, containing new columns pertaining to the results of the trial. Specifically contains:
  - period_number: Assigned period for simulation.
  - mab_*: New treatment conditions and outcomes under the simulation.
  - impute_req: Whether the observation required an imputed outcome.
  - *block: Variables relating to the block specified for treatment blocking, and the concatenation of that block with an observation's original treatment and new treatment.
  - aipw_*: Columns containing individual Augmented Inverse Probability Weighted estimates for each observation and treatment arm.
  - prior_rate_*: Columns containing the success rate for each treatment arm, from all periods before the observation's period of the simulation.
  - *_assign_prob: Columns containing the probability of being assigned each treatment at the given period.
- bandits: A tibble or data.table containing the UCB1 values or Thompson sampling posterior distributions for each period. Wide format: each row is a period, and each column is a treatment. Each row in this table represents the calculation from the given period after its values were imputed, so row 2 represents the calculations made in period 3, but reflects the impact of period 2's new assignments.
- assignment_probs: A tibble or data.table containing the probability of being assigned each treatment arm at a given period. Wide format: each row is a period, and each column is a treatment. Each row represents the probability of being assigned each treatment at each period; these have not been shifted like the bandits table.
- estimates: A tibble or data.table containing the AIPW (Augmented Inverse Probability Weighting) treatment effect estimates and variances, and traditional sample means and variances, for each treatment arm. Long format: treatment arm and estimate type are columns, along with the mean and variance.
- settings: A named list of the configuration settings used in the trial.
References
Hadad, Vitor, David A. Hirshberg, Ruohan Zhan, Stefan Wager, and Susan Athey. 2021. "Confidence Intervals for Policy Evaluation in Adaptive Experiments." Proceedings of the National Academy of Sciences of the United States of America 118 (15): e2014602118. doi:10.1073/pnas.2014602118.
Kuleshov, Volodymyr, and Doina Precup. 2014. "Algorithms for Multi-Armed Bandit Problems." arXiv. doi:10.48550/arXiv.1402.6028.
Lotze, Thomas, and Markus Loecher. 2022. "Bandit: Functions for Simple A/B Split Test and Multi-Armed Bandit Analysis." https://cran.r-project.org/package=bandit.
Offer‐Westort, Molly, Alexander Coppock, and Donald P. Green. 2021. "Adaptive Experimental Design: Prospects and Applications in Political Science." American Journal of Political Science 65 (4): 826-44. doi:10.1111/ajps.12597.
Slivkins, Aleksandrs. 2024. "Introduction to Multi-Armed Bandits." arXiv. doi:10.48550/arXiv.1904.07272.
See Also
multiple_mab_simulation(), summary.mab(), plot.mab().
Examples
# Loading Example Data and defining conditions
data(tanf)
## Running Thompson sampling with 500 person large batches,
## with no blocks and imperfect assignment
single_mab_simulation(
  data = tanf,
  assignment_method = "Batch",
  algorithm = "Thompson",
  period_length = 500,
  prior_periods = "All",
  blocking = FALSE,
  whole_experiment = TRUE,
  perfect_assignment = FALSE,
  data_cols = c(
    condition_col = "condition",
    id_col = "ic_case_id",
    success_col = "success",
    success_date_col = "date_of_recert",
    assignment_date_col = "letter_sent_date"
  )
)
## Running UCB1 Sampling with 1 Month based batches and
## control augmentation set to 0.25, with perfect_assignment.
## When using control_augment > 0, conditions need to have proper names
# no_letter is control, the others are treatments
single_mab_simulation(
  data = tanf,
  assignment_method = "Date",
  time_unit = "Month",
  algorithm = "UCB1",
  period_length = 1,
  prior_periods = "All",
  blocking = FALSE,
  whole_experiment = TRUE,
  perfect_assignment = TRUE,
  control_condition = "no_letter",
  control_augment = 0.25,
  data_cols = c(
    condition_col = "condition",
    id_col = "ic_case_id",
    success_col = "success",
    date_col = "appt_date",
    month_col = "recert_month"
  )
)
## 5 Day Periods with Thompson, Treatment Blocking by Service Center,
## whole_experiment = TRUE, and hybrid assignment 10% random, 90% bandit.
single_mab_simulation(
  data = tanf,
  assignment_method = "Date",
  time_unit = "Day",
  algorithm = "Thompson",
  period_length = 5,
  prior_periods = "All",
  blocking = TRUE,
  block_cols = c("service_center"),
  whole_experiment = TRUE,
  perfect_assignment = TRUE,
  random_assign_prop = 0.1,
  data_cols = c(
    condition_col = "condition",
    id_col = "ic_case_id",
    success_col = "success",
    date_col = "appt_date"
  )
)
Summary Generic For "mab" Class
Description
Summarizes the Results of a Single Multi-Arm Bandit Trial. Provides confidence intervals around the AIPW estimates, final calculations of the Thompson sampling probabilities or UCB1 values, and the number of observations assigned for each arm.
Usage
## S3 method for class 'mab'
summary(object, level = 0.95, ...)
Arguments
object | 
 A   | 
level | 
 Numeric value of length 1; indicates the confidence level of the interval (e.g., 0.90, 0.95, 0.99). Defaults to 0.95.  | 
... | 
 Additional arguments.  | 
Details
The confidence intervals applied follow a standard normal distribution because it is assumed the AIPW estimators are asymptotically normal as shown in Hadad et al. (2021).
... is provided to be compatible with summary(), the function
does not have any additional arguments.
All of the data needed to create a table like this is present in the object
created by single_mab_simulation(), but
this provides a simple shortcut, which is useful when testing many
different simulations.
Value
A tibble containing summary information from the trial with the columns:
- Treatment_Arm: Contains the treatment condition.
- Probability_Of_Best_Arm/UCB1_Value: Final Thompson sampling probabilities or UCB1 values for each treatment.
- estimated_probability_of_success: The AIPW estimates for the probability of success for each treatment.
- SE: The standard error for the AIPW estimates.
- lower_bound: The lower bound on the normal confidence interval for the estimated_probability_of_success. Default is 95%.
- upper_bound: The upper bound on the normal confidence interval for the estimated_probability_of_success. Default is 95%.
- num_assigned: The number of observations assigned to each treatment under the simulated trial.
- level: The confidence level for the confidence interval, default is 95%.
- periods: The total number of periods of the simulation.
References
Hadad, Vitor, David A. Hirshberg, Ruohan Zhan, Stefan Wager, and Susan Athey. 2021. "Confidence Intervals for Policy Evaluation in Adaptive Experiments." Proceedings of the National Academy of Sciences of the United States of America 118 (15): e2014602118. doi:10.1073/pnas.2014602118.
Examples
# Objects returned by `single_mab_simulation()` have a `mab` class.
# This class has a summary generic that can produce quick results of the trial.
# Loading Data and running a quick simulation
data(tanf)
x <- single_mab_simulation(
  data = tanf,
  algorithm = "Thompson",
  assignment_method = "Batch",
  period_length = 600,
  whole_experiment = TRUE,
  perfect_assignment = TRUE,
  blocking = FALSE,
  prior_periods = "All",
  data_cols = c(
    condition_col = "condition",
    id_col = "ic_case_id",
    success_col = "success"
  )
)
# Creating summary table
## Defaults to 95% confidence interval
summary(x) |> print(width = Inf)
## 70% confidence level
summary(x, level = 0.7) |> print(width = Inf)
Summary Generic For "multiple.mab" Class
Description
Summarizes the results of multiple Multi-Arm Bandit trials. Provides empirically estimated and normally approximated confidence intervals on the AIPW estimates for probability of success, the number of times each arm was chosen as the best treatment across all simulations, and the average number of units assigned to each treatment across all simulations.
Usage
## S3 method for class 'multiple.mab'
summary(object, level = 0.95, ...)
Arguments
object | 
 A multiple.mab class object, as returned by multiple_mab_simulation().  | 
level | 
 Numeric value of length 1; the confidence level for the intervals (e.g., 0.90, 0.95, 0.99). Defaults to 0.95.  | 
... | 
 Additional arguments.  | 
Details
The empirically estimated variances and confidence intervals use the variance measured directly in the AIPW estimates for each treatment over all the simulations. The normal confidence intervals are estimated using an average of the measured variances across the simulations.
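The difference between the two kinds of intervals can be sketched in base R for a single treatment arm. This is an illustration under stated assumptions, not the package's code: the estimates and variances are simulated placeholder values, and the empirical bounds are taken here as percentiles of the observed estimates, which is one plausible reading of "the observed distribution" rather than a confirmed detail of the implementation.

```r
set.seed(1)
# Placeholder results for one arm across 200 simulated trials.
estimates <- rnorm(200, mean = 0.5, sd = 0.05)  # one AIPW estimate per trial
variances <- rep(0.05^2, 200)                   # one estimated variance per trial

level <- 0.95
alpha <- 1 - level
z <- qnorm(1 - alpha / 2)

avg_estimate <- mean(estimates)
se_avg <- sqrt(mean(variances))  # square root of the average variance
se_empirical <- sd(estimates)    # spread of the estimates themselves

# Normal interval built from the averaged variance.
normal_bounds <- avg_estimate + c(-1, 1) * z * se_avg
# Assumption: empirical interval as percentiles of the observed estimates.
empirical_bounds <- unname(quantile(estimates, c(alpha / 2, 1 - alpha / 2)))

print(rbind(normal = normal_bounds, empirical = empirical_bounds))
```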
The best arm at the end of each trial is chosen by the highest UCB1 value or Thompson sampling probability. These values indicate which treatment would be chosen next, or have the highest probability of being chosen next, therefore representing the current best treatment.
Additionally, an average and standard deviation for the number of units assigned to each treatment across all the simulations is provided.
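The best-arm count and the assignment summaries described above can be sketched as follows. The matrices hold made-up values (one row per simulated trial, one column per arm), the arm names are hypothetical, and ties are broken by taking the first maximum, which is an assumption rather than documented package behavior.

```r
arms <- c("control", "letter_a", "letter_b")  # hypothetical arm names

# Made-up final Thompson sampling probabilities, one row per trial.
final_probs <- matrix(
  c(0.10, 0.60, 0.30,
    0.05, 0.35, 0.60,
    0.20, 0.50, 0.30,
    0.15, 0.70, 0.15),
  ncol = 3, byrow = TRUE, dimnames = list(NULL, arms)
)

# Made-up number of units assigned to each arm in each trial.
num_assigned <- matrix(
  c(20, 50, 30,
    15, 35, 50,
    25, 45, 30,
    20, 60, 20),
  ncol = 3, byrow = TRUE, dimnames = list(NULL, arms)
)

# Best arm per trial: the arm with the highest final value.
best <- arms[apply(final_probs, 1, which.max)]
times_best <- table(factor(best, levels = arms))

average_num_assigned <- colMeans(num_assigned)
sd_num_assigned <- apply(num_assigned, 2, sd)

print(times_best)
print(rbind(average_num_assigned, sd_num_assigned))
```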
... is provided for compatibility with summary(); the function
does not use any additional arguments.
Value
A tibble containing summary information from the repeated trials with the columns:
- Treatment_Arm: Contains the treatment condition.
- average_probability_of_success: The average of the AIPW estimates for the probability of success for each treatment across the trials.
- SE_avg: The standard error for the AIPW estimates, calculated as the square root of the average of the variances.
- SE_empirical: The standard error estimated empirically as the standard deviation of all the calculated AIPW estimates for probability of success.
- lower_normal: The lower bound on the normal confidence interval for the estimated_probability_of_success. Default is 95%.
- upper_normal: The upper bound on the normal confidence interval for the estimated_probability_of_success. Default is 95%.
- lower_empirical: The lower bound on the empirical confidence interval for the estimated_probability_of_success. Calculated using the observed distribution of AIPW estimated probabilities of success. Default is 95%.
- upper_empirical: The upper bound on the empirical confidence interval for the estimated_probability_of_success. Calculated using the observed distribution of AIPW estimated probabilities of success. Default is 95%.
- times_best: The number of times each treatment arm was selected as the best for an individual simulation.
- average_num_assigned: The average number of observations assigned to each treatment under the simulated trials.
- sd_num_assigned: The standard deviation for the number of observations assigned to each treatment under the simulated trials.
- level: The confidence level for the confidence interval, default is 95%.
Examples
# Objects returned by `multiple_mab_simulation()` have a `multiple.mab` class.
# This class has a summary generic that can produce quick results of the trials
data(tanf)
tanf <- tanf[1:100, ]
# Simulating a few trials
seeds <- sample.int(10000, 5)
x <- multiple_mab_simulation(
  data = tanf,
  assignment_method = "Batch",
  period_length = 20,
  whole_experiment = TRUE,
  blocking = FALSE,
  perfect_assignment = TRUE,
  algorithm = "Thompson",
  prior_periods = "All",
  control_augment = 0,
  data_cols = c(
    condition_col = "condition",
    id_col = "ic_case_id",
    success_col = "success"
  ),
  verbose = FALSE,
  times = 5,
  seeds = seeds,
  keep_data = FALSE
)
# Creating summary table
## Defaults to 95% confidence interval
summary(x) |> print(width = Inf)
## 70% confidence level
summary(x, level = 0.7) |> print(width = Inf)
Public TANF Recipient Data From Washington, D.C.
Description
A modified version of the data set used in https://thelabprojects.dc.gov/benefits-reminder-letter with one additional column added for analysis.
Usage
data(tanf)
Format
An object of class tbl_df (inherits from tbl, data.frame) with 3517 rows and 21 columns.
Details
Variables are as follows:
- ic_case_id
 Unique, anonymized case identifier.
- service_center
 DC Department of Human Services Center assigned to each case.
- condition
 The assigned letter condition: "No Letter", "Open Appointment", or "Specific Appointment".
- recert_month
 Recertification Month.
- letter_sent_date
 Date the second (treatment) letter was sent.
- recert_id
 Administrative recertification identifier.
- return_to_sender
 Indicates whether the letter was returned as undeliverable.
- pdc_status
 PDC Status
- renewal_date
 Date by which renewal must be completed.
- notice_date.x
 Date the first notice was sent (initial legal communication).
- days_betwn_notice_and_recert_due
 Number of days between the first notice and the recertification due date.
- cert_period_start
 Start date of the recertification period.
- cert_period_end
 End date of recertification period.
- recert_status
 Status of the recertification process (Pending, Denied, etc.).
- denial_reason
 Reason for denial if recertification was not approved.
- recert_month_year
 Combined recertification month and year.
- notice_date.y
 Alternate record of first notice date.
- recert_status_dcas
 Official recertification status from DCAS.
- date_of_recert
 Date the recertification was successfully submitted (if applicable).
- success
 Binary variable indicating successful recertification based on recert_status (newly added column).
Source
Validates Inputs For single_mab_simulation() and multiple_mab_simulation()
Description
This function checks that all required arguments
have been properly passed before the simulation continues. When
errors are thrown, user-friendly messages indicate which argument
was misspecified. When verbose = TRUE, warning
messages may also be shown if unnecessary arguments are passed.
Usage
validate_inputs(
  data,
  assignment_method,
  algorithm,
  prior_periods,
  perfect_assignment,
  whole_experiment,
  blocking,
  data_cols,
  block_cols,
  time_unit,
  period_length,
  control_augment,
  verbose,
  ndraws,
  random_assign_prop
)
Arguments
data | 
 A data.frame, data.table, or tibble containing input data from the trial. This should be the results of a traditional Randomized Controlled Trial (RCT). Any data.frames will be converted to tibbles internally.  | 
assignment_method | 
 A character string; one of "date", "batch", or "individual", to define the assignment into treatment waves. When using "batch" or "individual", ensure your dataset is pre-arranged in the order in which observations should be considered, so that groups are assigned correctly. For "date", observations will be considered in chronological order. "individual" assignment can be computationally intensive for larger datasets.  | 
algorithm | 
 A character string specifying the MAB algorithm to use. Options are "thompson" or "ucb1". Algorithm defines the adaptive assignment process. Mathematical details on these algorithms can be found in Kuleshov and Precup 2014 and Slivkins 2024.  | 
prior_periods | 
 A numeric value of length 1, or the character string "All"; number of previous periods to use in the treatment assignment model. This is used to implement the stationary/non-stationary bandit. For example, a non-stationary bandit assumes the true probability of success for each treatment changes over time, so to account for that, not all prior data should be used when making decisions because it could be "out of date".  | 
perfect_assignment | 
 Logical; if TRUE, assumes perfect information for treatment assignment (i.e., all outcomes are observed regardless of the date). If FALSE, hides outcomes not yet theoretically observed, based on the dates treatments would have been assigned for each wave. This is useful when simulating batch-based assignment where treatments were assigned on a given day whether or not all the information from a prior batch was available and you have exact dates treatments were assigned.  | 
whole_experiment | 
 Logical; if TRUE, uses all past experimental data for imputing outcomes. If FALSE, uses only data available up to the current period. In large datasets or with a high number of periods, setting this to FALSE can be more computationally intensive, though not a significant contributor to total run time.  | 
blocking | 
 Logical; whether or not to use treatment blocking. Treatment blocking is used to ensure an even-enough distribution of treatment conditions across blocks. For example, blocking by gender would mean the randomized assignment should split treatments evenly not just throughout the sample (so for 4 arms, 25-25-25-25), but also within each block, so 25% of men would receive each treatment and 25% of women the same.  | 
data_cols | 
 A named character vector containing the names of columns in  
  | 
block_cols | 
 A character vector of variables to block by. This vector should not be named.  | 
time_unit | 
 A character string specifying the unit of time for assigning periods when   | 
period_length | 
 A numeric value of length 1; represents the length of each treatment period.
If assignment method is "date", this length refers to the number of units specified in   | 
control_augment | 
 A numeric value ranging from 0 to 1; proportion of each wave guaranteed to receive the "Control" treatment.
Default is 0. It is not recommended to use this in conjunction with   | 
verbose | 
 Logical; whether or not to print intermediate messages. Default is FALSE.  | 
ndraws | 
 A numeric value; when Thompson sampling direct calculations fail, draws from a simulated posterior
will be used to approximate the Thompson sampling probabilities. This is the number of simulations to use, the default
is 5000 to match the default parameter   | 
random_assign_prop | 
 A numeric value ranging from 0 to 1; proportion of each wave to be assigned new treatments randomly,
1 -   | 
Value
Throws an error if an argument is missing or misspecified.
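The kind of check described above, failing early with a message that names the offending argument, can be sketched in base R. These helpers are hypothetical and are not the package's validate_inputs() internals. The function names, the error wording, and the toy data frame are all assumptions made for illustration.

```r
# Hypothetical check for proportion-style arguments such as
# control_augment or random_assign_prop.
check_proportion <- function(x, arg_name) {
  if (!is.numeric(x) || length(x) != 1 || is.na(x) || x < 0 || x > 1) {
    stop(
      sprintf("`%s` must be a single numeric value between 0 and 1.", arg_name),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Hypothetical check that every column named in data_cols exists in data.
check_columns <- function(data, data_cols) {
  missing_cols <- setdiff(unname(data_cols), names(data))
  if (length(missing_cols) > 0) {
    stop(
      sprintf("Columns not found in `data`: %s", paste(missing_cols, collapse = ", ")),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Toy data for illustration only.
toy <- data.frame(condition = c("a", "b"), success = c(1, 0))
check_proportion(0.1, "random_assign_prop")
check_columns(toy, c(condition_col = "condition", success_col = "success"))

# An out-of-range value produces an error that names the argument.
result <- tryCatch(
  check_proportion(1.5, "control_augment"),
  error = function(e) conditionMessage(e)
)
print(result)
```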
See Also
Verbose Printer
Description
Shorthand function that checks verbose and prints the message if it is TRUE.
Usage
verbose_log(log, message)
Arguments
log | 
 Logical; Whether or not to print the message, this will always be
the   | 
message | 
 The message to be printed to screen, as a string.  | 
Value
Text output of message to the console when log is TRUE. If
log is FALSE, returns nothing.
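A helper with this behavior could look like the sketch below. It is an assumption-laden illustration rather than the package source: the name verbose_log_sketch() is hypothetical, and the use of base::message() (rather than, say, cat()) for console output is a guess, since the documentation above does not say which printing function is used.

```r
# Hypothetical stand-in for verbose_log(); prints only when log is TRUE.
verbose_log_sketch <- function(log, message) {
  if (isTRUE(log)) {
    # base::message() is called explicitly because the argument is also
    # named `message`.
    base::message(message)
  }
  invisible(NULL)
}

verbose_log_sketch(TRUE, "Starting simulation")  # prints the message
verbose_log_sketch(FALSE, "Starting simulation") # prints nothing
```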