skbold.preproc.label_preproc module

class LabelBinarizer(params)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

__init__(params): Initializes a LabelBinarizer object.

fit(X=None, y=None): Does nothing; included for compatibility with scikit-learn pipelines.

transform(X, y)

Binarizes y-attribute.

Parameters:	X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
Returns:	X – Transformed array of shape = [n_samples, n_features].
Return type:	ndarray

class LabelFactorizer(grouping)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Transforms labels according to a given factorial grouping.

Factorizes/encodes labels based on part of the string label. For example, the label-vector [‘A_1’, ‘A_2’, ‘B_1’, ‘B_2’] can be grouped based on letter (A/B) or number (1/2).

Parameters:	grouping (List of str) – List with identifiers for condition names as strings
Variables:	new_labels (list) – List with new labels.
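
The grouping idea described above can be sketched in plain Python. This is a minimal illustration, not skbold's implementation: the helper name factorize_labels is hypothetical, and mapping unmatched labels to None is an assumption made for the example.

```python
def factorize_labels(labels, grouping):
    """Map each label to the first grouping identifier it contains.

    Hypothetical helper for illustration; not part of skbold.
    Labels that contain none of the identifiers map to None (assumption).
    """
    new_labels = []
    for label in labels:
        # Substring match: 'A' matches 'A_1', '1' matches 'A_1' and 'B_1'.
        match = next((g for g in grouping if g in label), None)
        new_labels.append(match)
    return new_labels


labels = ["A_1", "A_2", "B_1", "B_2"]
by_letter = factorize_labels(labels, ["A", "B"])
by_number = factorize_labels(labels, ["1", "2"])
print(by_letter)  # ['A', 'A', 'B', 'B']
print(by_number)  # ['1', '2', '1', '2']
```

The same label vector thus yields two different two-class designs, depending on which part of the string the grouping identifiers target.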

fit(y=None, X=None): Does nothing; included for use in scikit-learn's Pipeline.

get_new_labels(): Returns new labels based on factorization.

transform(y, X=None)

Transforms label-vector given a grouping.

Parameters:

y (List/ndarray of str) – List or ndarray of strings indicating label names
X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]

Returns:

y_new (ndarray) – array with transformed y-labels
X_new (ndarray) – array with transformed data of shape = [n_samples, n_features] given new factorial grouping/design.

class MajorityUndersampler(verbose=False)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Undersamples the majority-class(es) by selecting random samples.

Parameters:	verbose (bool) – Whether to print the number of samples after downsampling.

__init__(verbose=False): Initializes a MajorityUndersampler object.

fit(X=None, y=None): Does nothing; included for compatibility with scikit-learn pipelines.

transform(X, y)

Downsamples majority-class(es).

Parameters:	X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
Returns:	X – Transformed (downsampled) array of shape = [n_samples, n_features].
Return type:	ndarray
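
Random undersampling of the majority class(es) can be sketched as follows. This is a standard-library illustration under stated assumptions, not skbold's code: the function name undersample_majority and the seed argument are hypothetical, plain lists stand in for ndarrays, every class is reduced to the size of the smallest class, and the sketch returns the matching labels alongside X so that the two stay aligned.

```python
import random
from collections import Counter


def undersample_majority(X, y, seed=None):
    """Randomly drop samples so each class matches the smallest class.

    Hypothetical helper for illustration; not skbold's implementation.
    X is a list of feature rows, y a list of class labels of equal length.
    """
    rng = random.Random(seed)
    n_min = min(Counter(y).values())  # size of the smallest class

    keep = []
    for cls in set(y):
        idx = [i for i, label in enumerate(y) if label == cls]
        # Sampling n_min of n_min indices keeps a minority class intact;
        # larger classes lose randomly chosen samples.
        keep.extend(rng.sample(idx, n_min))
    keep.sort()  # preserve the original sample order

    return [X[i] for i in keep], [y[i] for i in keep]


X = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]
y = [0, 0, 0, 1, 1]
X_ds, y_ds = undersample_majority(X, y, seed=0)
print(len(X_ds), Counter(y_ds))  # 4 samples, two per class
```

Sampling without replacement guarantees that no sample is duplicated, and sorting the kept indices leaves the remaining rows in their original order.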