skbold.preproc.label_preproc module

class LabelBinarizer(params)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

__init__(params)

Initializes LabelBinarizer object.

fit(X=None, y=None)

Does nothing; included for compatibility with scikit-learn pipelines.

transform(X, y)

Binarizes the label vector y.

Parameters: X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
Returns: X – Transformed array of shape = [n_samples, n_features].
Return type: ndarray

class LabelFactorizer(grouping)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Transforms labels according to a given factorial grouping.

Factorizes/encodes labels based on part of the string label. For example, the label-vector ['A_1', 'A_2', 'B_1', 'B_2'] can be grouped based on letter (A/B) or number (1/2).

Parameters: grouping (list of str) – List of identifiers for condition names, as strings
Variables: new_labels (list) – List with new labels.

fit(y=None, X=None)

Does nothing; included so the object can be used in scikit-learn’s Pipeline.

get_new_labels()

Returns new labels based on factorization.

transform(y, X=None)

Transforms label-vector given a grouping.

Parameters:
  • y (list/ndarray of str) – List or ndarray of strings indicating label names
  • X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
Returns:

  • y_new (ndarray) – array with transformed y-labels
  • X_new (ndarray) – array with transformed data of shape = [n_samples, n_features] given new factorial grouping/design.
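
The grouping idea described for LabelFactorizer can be illustrated with a small standalone sketch. This is not skbold's implementation: the helper name factorize_labels, the substring matching, and the integer codes it returns are all assumptions made for illustration only.

```python
def factorize_labels(labels, grouping):
    """Map each label to the index of the first grouping identifier it contains.

    Hypothetical helper, not part of skbold; substring matching is an assumption.
    """
    new_labels = []
    for label in labels:
        matches = [i for i, name in enumerate(grouping) if name in label]
        if not matches:
            raise ValueError(f"No grouping identifier found in label {label!r}")
        new_labels.append(matches[0])
    return new_labels


labels = ["A_1", "A_2", "B_1", "B_2"]

# Group on the letter part of each label.
by_letter = factorize_labels(labels, ["A", "B"])

# Group on the number part of each label.
by_number = factorize_labels(labels, ["1", "2"])
```

Under these assumptions, grouping on ["A", "B"] collapses the four labels into two classes by letter, while grouping on ["1", "2"] collapses them by number.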

class MajorityUndersampler(verbose=False)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Undersamples the majority-class(es) by selecting random samples.

Parameters: verbose (bool) – Whether to print the downsampled number of samples.

__init__(verbose=False)

Initializes MajorityUndersampler object.

fit(X=None, y=None)

Does nothing; included for compatibility with scikit-learn pipelines.

transform(X, y)

Downsamples majority-class(es).

Parameters: X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
Returns: X – Transformed array of shape = [n_samples, n_features].
Return type: ndarray
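
Random undersampling of majority classes can be sketched in plain Python as follows. This is a minimal illustration of the general technique, not skbold's code: the function name undersample_majority, the seed argument, the use of lists instead of ndarrays, and the choice to return both X and y are assumptions.

```python
import random
from collections import Counter


def undersample_majority(X, y, seed=None):
    """Randomly drop samples so every class keeps as many as the smallest class.

    Hypothetical helper, not part of skbold.
    """
    rng = random.Random(seed)
    n_min = min(Counter(y).values())  # size of the smallest class
    keep = []
    for label in sorted(set(y)):
        idx = [i for i, lab in enumerate(y) if lab == label]
        # Classes already at the minimum size are kept whole.
        keep.extend(rng.sample(idx, n_min))
    keep.sort()  # preserve the original sample order
    return [X[i] for i in keep], [y[i] for i in keep]


X = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]
y = [0, 0, 0, 1, 1]
X_ds, y_ds = undersample_majority(X, y, seed=0)
```

Here class 0 has three samples and class 1 has two, so one randomly chosen sample of class 0 is dropped and both classes end up with two samples each.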