The major version bump is largely to mark the occasion that the package is now considered “released”.
- Removed fippy comparison article since a more comprehensive comparison is now available in xplainfi-benchmark.
- Increased min_permutations default in SAGE methods to 10 rather than 3, since the previous value was found to lead to spurious early stopping.
- Fixed an issue in sim_dgp_ewald leading to erroneous variances when compared to their settings.
- Removed KnockoffSequentialSampler as the seqknockoff package is not available on CRAN or R-universe.
  - Using KnockoffSampler with the corresponding knockoff_fun = seqknockoff::knockoffs_seq still works.
- Simplified sim_dgp_confounded, removing x2 which doesn’t add anything interesting over x1.
- Changed how obs_loss() is computed (see https://github.com/mlr-org/mlr3/pull/1411).
- Allow measure to be unspecified, falling back to a task_type-specific default measure.
- $importance() gains ci_method parameter for variance estimation (#40):
  - "none" (default): Simple aggregation without confidence intervals
  - "raw": Uncorrected variance estimates (informative only, CIs too narrow)
  - "nadeau_bengio": Variance correction by Nadeau & Bengio (2003) as recommended by Molnar et al. (2023)
  - "quantile": Empirical quantile-based confidence intervals
  - "cpi": Conditional Predictive Impact for perturbation methods (PFI/CFI/RFI), supporting t-, Wilcoxon-, Fisher-, and binomial tests
    - PerturbationImportance methods only (not available for WVIM/LOCO or SAGE)
- $importance() gains standardize parameter to normalize scores to [-1, 1] range
- $importance() and $scores() gain relation parameter (default: "difference") to compute importances as difference or ratio of baseline and post-modification loss
  - Moved from $compute() to avoid recomputing predictions/refits when changing aggregation method
- sim_dgp_independent(): Baseline with additive independent effects
- sim_dgp_correlated(): Highly correlated features (PFI fails, CFI succeeds)
- sim_dgp_mediated(): Mediation structure (total vs direct effects)
- sim_dgp_confounded(): Confounding structure
- sim_dgp_interactions(): Interaction effects between features
- $obs_loss() computes observation-wise importance scores when measure has a Measure$obs_loss() method
- $predictions field stores prediction objects for further analysis
- PerturbationImportance and WVIM methods support groups parameter for grouped feature importance:
  - groups = list(effects = c("x1", "x2", "x3"), noise = c("noise1", "noise2"))
  - feature column contains group names instead of individual features
- mlr3fselect for cleaner internals
- iters_refit → n_repeats for consistency
- learner$predict_newdata_fast() for faster predictions (requires mlr3 >= 1.1.0)
- sampler$sample() calls
- batch_size parameter to control memory usage with large datasets
- mirai or future backends
- mirai::daemons() or future::plan()
- iters_perm → n_repeats for consistency
- $sample(feature, row_ids): Samples from stored task using row IDs
- $sample_newdata(feature, newdata): Samples from external data
- PermutationSampler → MarginalPermutationSampler
- ARFSampler → ConditionalARFSampler
- GaussianConditionalSampler → ConditionalGaussianSampler
- KNNConditionalSampler → ConditionalKNNSampler
- CtreeConditionalSampler → ConditionalCtreeSampler
- conditioning_set for features to condition on
- MarginalSampler: Base class for marginal sampling methods
- MarginalReferenceSampler: Samples complete rows from reference data (for SAGE)
- KnockoffSampler: Knockoff-based sampling (#16 via @mnwright)
  - KnockoffGaussianSampler, KnockoffSequentialSampler
  - row_ids-based sampling
  - iters parameter for multiple knockoff iterations
- Bug fix: ConditionalSAGE now properly uses conditional sampling (was accidentally using marginal sampling)
- Performance improvements:
  - learner$predict_newdata_fast() for faster predictions
  - batch_size parameter controls memory usage for large coalitions
- Convergence tracking (#29, #33):
  - early_stopping = TRUE
  - se_threshold (default: 0.01)
  - min_permutations (default: 3)
  - check_interval permutations (default: 1)
  - $converged: Boolean indicating if convergence was reached
  - $n_permutations_used: Actual permutations used (may be less than requested)
  - $convergence_history: Per-feature importance and SE over permutations
  - $plot_convergence(): Visualize convergence curves
- arf-powered conditional sampling
- arf
- fippy
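
To illustrate how the convergence tracking options above might interact, here is a minimal base R sketch. It is not the package's implementation: run_until_converged() and its draw argument are hypothetical, and the stopping rule (absolute standard error of the running mean below se_threshold) is an assumption that may differ from the criterion xplainfi actually uses. The argument names simply mirror the options listed above.

```r
# Hypothetical helper, not part of xplainfi: draw one estimate per permutation
# and stop once the standard error of the running mean is small enough.
run_until_converged <- function(draw, n_permutations = 100, se_threshold = 0.01,
                                min_permutations = 10, check_interval = 1) {
  values <- numeric(0)
  converged <- FALSE
  for (i in seq_len(n_permutations)) {
    values <- c(values, draw(i))
    # Check only after enough permutations (at least 2, so sd() is defined)
    # and only on every check_interval-th permutation.
    if (i >= max(2, min_permutations) && i %% check_interval == 0) {
      se <- sd(values) / sqrt(i)
      if (se < se_threshold) {
        converged <- TRUE
        break
      }
    }
  }
  list(
    estimate = mean(values),
    n_permutations_used = length(values),
    converged = converged
  )
}

set.seed(1)
# Low-variance estimates: expected to stop as soon as min_permutations is reached.
stable <- run_until_converged(function(i) rnorm(1, mean = 0.5, sd = 0.01))
# High-variance estimates: expected to use the full permutation budget.
noisy <- run_until_converged(function(i) rnorm(1, mean = 0.5, sd = 1))
```

In this sketch, min_permutations acts as a floor: with very few draws the standard error estimate is itself unreliable, which is one way a small floor could produce the spurious early stopping mentioned above.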