minimization.GeneralizeToRepresentative

class minimization.GeneralizeToRepresentative(estimator=None, target_accuracy=0.998, features=None, cells=None)[source]

A transformer that generalizes data to representative points.

Learns data generalizations based on an original model’s predictions and a target accuracy. Once the generalizations are learned, the transformer can receive one or more data records and transform them to representative points based on the learned generalization.

Alternatively, cells and features can be supplied to init or set_params, and those are then used to transform data to representatives. In this case, fit must still be called, but it does not need X and y, and no existing estimator needs to be supplied to init.

In summary, supply either estimator and target_accuracy, or cells and features.

Parameters:
estimator : estimator, optional

The original model for which generalization is being performed. Should be pre-fitted.

target_accuracy : float, optional

The required accuracy when applying the original model to the generalized data. Accuracy is measured relative to the original accuracy of the model.

features : list of str, optional

The feature names, in the order that they appear in the data.

cells : list of object, optional

The cells used to generalize records. Each cell must define a range or subset of categories for each feature, as well as a representative value for each feature. This parameter should be used when instantiating a transformer object without first fitting it.
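
As a rough illustration of what a cell might carry, the sketch below uses plain dictionaries and maps a record to the representative of the cell that contains it. The key names (`ranges`, `categories`, `representative`), the interval convention, and the helpers `in_cell` and `to_representative` are assumptions made for this example; this section does not specify the exact cell schema the library expects.

```python
# Hypothetical cell layout: the key names below are assumptions for illustration,
# not a schema confirmed by this reference.
cells = [
    {
        "ranges": {"age": {"start": None, "end": 38}},  # None = unbounded side
        "categories": {"city": ["Paris", "Lyon"]},
        "representative": {"age": 26, "city": "Paris"},
    },
    {
        "ranges": {"age": {"start": 38, "end": None}},
        "categories": {"city": ["Paris", "Lyon"]},
        "representative": {"age": 58, "city": "Lyon"},
    },
]


def in_cell(record, cell):
    """Return True if every feature of the record falls inside the cell."""
    for name, bounds in cell["ranges"].items():
        value = record[name]
        # Assumed convention: intervals are open on the left, closed on the right.
        if bounds["start"] is not None and value <= bounds["start"]:
            return False
        if bounds["end"] is not None and value > bounds["end"]:
            return False
    for name, allowed in cell["categories"].items():
        if record[name] not in allowed:
            return False
    return True


def to_representative(record, cells):
    """Map a record to the representative of the first cell that contains it."""
    for cell in cells:
        if in_cell(record, cell):
            return dict(cell["representative"])
    # No matching cell: leave the record unchanged (a choice made in this sketch).
    return dict(record)


print(to_representative({"age": 30, "city": "Lyon"}, cells))
print(to_representative({"age": 45, "city": "Paris"}, cells))
```

Each cell here covers every feature with either a numeric range or a set of categories, and carries one representative value per feature, which mirrors the requirement stated for the cells parameter.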

Attributes:
cells_ : list of object

The cells used to generalize records, as learned when calling fit.

ncp_ : float

The NCP (information loss) score of the resulting generalization, as measured on the training data.

generalizations_ : object

The generalizations that were learned (actual feature ranges).
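
For intuition about ncp_, one common definition of the Normalized Certainty Penalty for a numeric feature is the width of the generalized interval divided by the width of the feature's full domain, averaged over features. The sketch below implements that definition only. Whether it matches how the library computes ncp_ is an assumption, and the function name `ncp_numeric` is hypothetical.

```python
def ncp_numeric(generalized_ranges, domains):
    """Average normalized interval width over numeric features.

    Both arguments map a feature name to a (low, high) tuple.
    0.0 means no generalization; 1.0 means every feature spans its whole domain.
    """
    total = 0.0
    for name, (low, high) in generalized_ranges.items():
        dom_low, dom_high = domains[name]
        width = dom_high - dom_low
        # A constant feature has no information to lose.
        total += 0.0 if width == 0 else (high - low) / width
    return total / len(generalized_ranges)


domains = {"age": (0, 100), "height": (100, 200)}
cell_ranges = {"age": (20, 40), "height": (150, 200)}
print(ncp_numeric(cell_ranges, domains))  # (0.2 + 0.5) / 2
```

A lower score indicates that the generalized ranges are narrow relative to the feature domains, so less information is lost.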

__init__(self, estimator=None, target_accuracy=0.998, features=None, cells=None)[source]

Initializes the transformer. The arguments are the parameters described above.

fit(self, X=None, y=None)[source]

Learns the generalizations based on training data.

Parameters:
X : {array-like, sparse matrix}, shape (n_samples, n_features), optional

The training input samples.

y : array-like, shape (n_samples,), optional

The target values. An array of int. This should contain the predictions of the original model on X.

Returns:
self : object

Returns self.
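
One way to read the accuracy target that fit works toward: since y holds the original model's predictions on X, a generalization can be scored by how often the model's predictions on the generalized records agree with y. The sketch below shows that check in plain Python. The helper `relative_accuracy` and the toy `model` function are hypothetical stand-ins, not part of the library, and the library's internal scoring may differ.

```python
def relative_accuracy(original_predictions, generalized_predictions):
    """Fraction of records whose prediction is unchanged after generalization."""
    if len(original_predictions) != len(generalized_predictions):
        raise ValueError("prediction lists must have the same length")
    matches = sum(
        o == g for o, g in zip(original_predictions, generalized_predictions)
    )
    return matches / len(original_predictions)


def model(age):
    """Toy stand-in for a pre-fitted estimator: predicts 1 for ages above 35."""
    return int(age > 35)


ages = [22, 30, 36, 41, 58]
generalized_ages = [26, 26, 26, 50, 50]  # each age replaced by a representative

y = [model(a) for a in ages]                          # original predictions
y_generalized = [model(a) for a in generalized_ages]  # predictions after generalization

accuracy = relative_accuracy(y, y_generalized)
print(accuracy, accuracy >= 0.998)
```

In this toy data the record with age 36 changes its prediction once generalized, so the generalization would fall short of the default target_accuracy of 0.998 and a finer set of cells would be needed.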

fit_transform(self, X=None, y=None)[source]

Learns the generalizations based on training data, and applies them to the data.

Parameters:
X : {array-like, sparse matrix}, shape (n_samples, n_features), optional

The training input samples.

y : array-like, shape (n_samples,), optional

The target values. An array of int. This should contain the predictions of the original model on X.

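Returns:
X_transformed : ndarray, shape (n_samples, n_features)

The array containing the representative values to which each record in X is mapped.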

get_params(self, deep=True)[source]

Get parameters for this estimator.

Parameters:
deep : boolean, optional

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : mapping of string to any

Parameter names mapped to their values.

set_params(self, **params)[source]

Set the parameters of this estimator.

Returns:
self : object

Returns self.

transform(self, X)[source]

Transforms data records to representative points.

Parameters:
X : {array-like, sparse matrix}, shape (n_samples, n_features)

The input samples.

Returns:
X_transformed : ndarray, shape (n_samples, n_features)

The array containing the representative values to which each record in X is mapped.