explainer.sage.interval

This module contains the interval SAGE explainer.

Classes

IntervalSage(model_function, feature_names, ...)

Interval SAGE Explainer

class explainer.sage.interval.IntervalSage(model_function, feature_names, loss_function, n_inner_samples=1, interval_length=1000, storage_length=1000, storage=None, imputer=None)

Bases: BatchSage

Interval SAGE Explainer

Computes SAGE importance values according to their original definition in https://arxiv.org/abs/2004.00668 at set time intervals. A storage of the last n observations (n is specified by storage_length) is kept, and the explanations are computed on these stored observations.
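
The idea of keeping only the last n observations can be pictured with a small stand-in. This is a minimal sketch, not the library's IntervalStorage: the class name LastNStorage and its methods update and get_data are hypothetical and exist only for this illustration.

```python
from collections import deque
from typing import Any, Deque, Dict, List, Tuple


class LastNStorage:
    """Hypothetical stand-in: keeps only the most recent `size` observations."""

    def __init__(self, size: int) -> None:
        # A deque with maxlen silently drops the oldest entry once it is full.
        self._data: Deque[Tuple[Dict[str, Any], Any]] = deque(maxlen=size)

    def update(self, x: Dict[str, Any], y: Any) -> None:
        # Store one observation as a (features, target) pair.
        self._data.append((x, y))

    def get_data(self) -> List[Tuple[Dict[str, Any], Any]]:
        # Return the current window, oldest observation first.
        return list(self._data)


storage = LastNStorage(size=3)
for i in range(5):
    storage.update({"a": i}, y=i % 2)

window = storage.get_data()
print([x["a"] for x, _ in window])  # feature values of the retained observations
```

With a window of size 3 and five updates, only the three most recent observations remain available for an explanation.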

Parameters:
  • model_function (Callable[[Any], Any]) – The model function to be explained (e.g. model.predict_one (river), model.predict_proba (sklearn)).

  • loss_function (Union[Metric, Callable[[Any, Dict], float]]) –

    The loss function for which the importance values are calculated. This can either be a callable function or a predefined river.metrics.base.Metric.

    • river.metrics.base.Metric: Any Metric implemented in river (e.g. river.metrics.CrossEntropy() for classification or river.metrics.MSE() for regression).

    • callable function: The loss_function needs to follow the signature loss_function(y_true, y_pred) and handle the output dimensions of the model function. Smaller values are interpreted as being better unless overridden with loss_bigger_is_better=True. y_pred is passed as a dict.

  • feature_names (Sequence[Union[str, int, float]]) – List of feature names to be explained for the model.

  • storage (IntervalStorage, optional) – Optional incremental data storage mechanism. Defaults to IntervalStorage(size=interval_length).

  • imputer (BaseImputer, optional) – Incremental imputing strategy to be used. Defaults to MarginalImputer(sampling_strategy='joint').

  • n_inner_samples (int) – Number of model evaluations per feature and explanation step (observation). Defaults to 1.

  • interval_length (int) – Length of the explanation interval after which the explanations are created. Defaults to 1000.
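
A callable loss with the loss_function(y_true, y_pred) signature described above might look like the following. This is a sketch under assumptions: the name log_loss_from_dict is hypothetical, and the code assumes that y_pred maps class labels to predicted probabilities. The actual keys of the dict depend on the output of the model function.

```python
import math
from typing import Any, Dict


def log_loss_from_dict(y_true: Any, y_pred: Dict[Any, float]) -> float:
    """Negative log-likelihood of the true label; smaller values are better.

    Assumption for this illustration: y_pred maps class labels to probabilities.
    """
    eps = 1e-12  # avoids log(0) when the true label has zero or no probability
    p = min(max(y_pred.get(y_true, 0.0), eps), 1.0)
    return -math.log(p)


# A confident correct prediction should incur a smaller loss than an unsure one.
confident = log_loss_from_dict(1, {0: 0.1, 1: 0.9})
unsure = log_loss_from_dict(1, {0: 0.6, 1: 0.4})
print(confident < unsure)
```

Because smaller values mean a better prediction here, such a function matches the default interpretation and would not need loss_bigger_is_better=True.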

feature_names#

The feature names of the dataset.

Type:

Sequence[Union[str, int, float]]

n_inner_samples#

Number of model evaluations per feature and explanation step (observation).

Type:

int

seen_samples#

Number of observations seen.

Type:

int

explain_one(x_i, y_i, n_inner_samples=None, update_storage=True, force_explain=False, verbose=True)
Explain one observation (x_i, y_i) if enough time (interval_length) has passed since the last explanation.

Parameters:
  • x_i (dict) – The input features of the current observation as a dict of feature names to feature values.

  • y_i (Any) – Target label of the current observation.

  • n_inner_samples (int, optional) – Number of model evaluations per feature for the current explanation step (observation). Defaults to None.

  • update_storage (bool) – Flag if the underlying incremental data storage mechanism is to be updated with the new observation (True) or not (False). Defaults to True.

  • force_explain (bool) – Overrides the interval_length restriction and explains the current sample. This does not change the globally set interval_length, so the explainer keeps running in the same rhythm as before. Defaults to False.

  • verbose (bool) – Flag indicating if the explanation should print to console (True) or not (False). Defaults to True.

Returns:

The current SAGE feature importance scores.

Return type:

(dict)
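
The interplay of interval_length and force_explain described for explain_one can be sketched with a small counter. This is not the library's implementation: the helper IntervalGate and its method should_explain are hypothetical, and the modulo-based trigger is an assumption about how "enough time has passed" is decided.

```python
class IntervalGate:
    """Hypothetical helper mirroring the interval behaviour described above."""

    def __init__(self, interval_length: int) -> None:
        self.interval_length = interval_length
        self.seen_samples = 0

    def should_explain(self, force_explain: bool = False) -> bool:
        # Every call counts as one seen observation.
        self.seen_samples += 1
        # Assumption: an explanation is due every interval_length observations.
        on_schedule = self.seen_samples % self.interval_length == 0
        # Forcing does not reset the counter, so the regular rhythm is unchanged.
        return on_schedule or force_explain


gate = IntervalGate(interval_length=4)
explained_at = []
for step in range(1, 11):
    # Force an explanation once, at step 6, between two scheduled explanations.
    if gate.should_explain(force_explain=(step == 6)):
        explained_at.append(step)
print(explained_at)
```

In this sketch the forced explanation at step 6 is an extra explanation; the scheduled ones at steps 4 and 8 still happen at their usual positions.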