skbold.feature_extraction package

This module contains some feature-extraction methods/transformers.

class PatternAverager(method='mean')

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Reduces the set of features to its average.

Parameters:	method (str) – method of averaging (either ‘mean’ or ‘median’)

fit(X=None, y=None): Does nothing; included for compatibility with sklearn’s Pipeline.

transform(X)

Transforms each pattern to its average.

Parameters:	X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
Returns:	X_new – Transformed ndarray of shape = [n_samples, 1]
Return type:	ndarray
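
As a rough illustration of the averaging described above, the following NumPy sketch collapses every row of X to one value. It does not use skbold itself; the function name average_patterns is hypothetical and only mirrors the documented method argument.

```python
import numpy as np

def average_patterns(X, method="mean"):
    """Collapse each row of X to one value; result has shape (n_samples, 1)."""
    if method == "mean":
        reducer = np.mean
    elif method == "median":
        reducer = np.median
    else:
        raise ValueError("method must be 'mean' or 'median'")
    # keepdims=True keeps the feature axis, so the output stays 2-D
    return reducer(X, axis=1, keepdims=True)

X = np.array([[1.0, 2.0, 6.0],
              [4.0, 4.0, 10.0]])
X_mean = average_patterns(X, method="mean")
X_median = average_patterns(X, method="median")
```

Keeping the output two-dimensional matters in a Pipeline, because downstream estimators expect an array of shape [n_samples, n_features].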

class ArrayPermuter

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Permutes (shuffles) rows of matrix.

__init__(): Initializes ArrayPermuter object.

fit(X=None, y=None): Does nothing; included for compatibility with sklearn’s Pipeline.

transform(X)

Permutes rows of data input.

Parameters:	X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
Returns:	X_new – ndarray with permuted rows
Return type:	ndarray
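
Row permutation of this kind can be sketched in a few lines of NumPy. This is an assumption about the behaviour, not skbold's code: the helper permute_rows and its rng argument are hypothetical, and the sketch returns a shuffled copy rather than shuffling in place.

```python
import numpy as np

def permute_rows(X, rng=None):
    """Return a copy of X with its rows in random order."""
    rng = np.random.default_rng(rng)
    # Fancy indexing with a permutation of the row indices copies the data
    return X[rng.permutation(X.shape[0])]

X = np.arange(12, dtype=float).reshape(4, 3)
X_perm = permute_rows(X, rng=0)
```

Each row stays intact; only the order of the samples changes, which breaks the link between rows and their labels.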

class AverageRegionTransformer(atlas='HarvardOxford-All', mask_threshold=0, mvp=None, reg_dir=None, orig_mask=None, data_shape=None, ref_space=None, affine=None, **kwargs)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Transforms a whole-brain voxel pattern into a region-average pattern. Computes the average within each region of a given parcellation and returns those averages as features for X.

Parameters:

mask_type (List[str]) – List with absolute paths to nifti-images of brain masks in MNI152 (2mm) space.
mvp (Mvp-object (see core.mvp)) – Mvp object that provides some metadata about previous masks.
mask_threshold (int (default: 0)) – Minimum threshold for probabilistic masks (such as Harvard-Oxford).

fit(X=None, y=None): Does nothing; included for compatibility with sklearn’s Pipeline.

transform(X, y=None)

Transforms features from X (voxels) to region-average features.

Parameters:

X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
y (Optional[List[str] or numpy ndarray[str]]) – List or ndarray with strings indicating label-names
Returns:	X_new – array with transformed data of shape = [n_samples, n_features] in which features are region-average values.
Return type:	ndarray
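
The region-averaging idea can be sketched without any neuroimaging files. In the example below, prob_masks is a hypothetical array of shape [n_regions, n_voxels] standing in for probabilistic atlas masks that have already been brought into the same voxel space as X; whether skbold compares with > or >= against mask_threshold is an assumption here.

```python
import numpy as np

def region_average(X, prob_masks, mask_threshold=0):
    """Average the voxels of X within each thresholded region mask."""
    features = []
    for prob in prob_masks:
        in_region = prob > mask_threshold  # boolean voxel selector for this region
        features.append(X[:, in_region].mean(axis=1))
    # One column per region: shape (n_samples, n_regions)
    return np.column_stack(features)

X = np.array([[1.0, 3.0, 10.0, 20.0, 7.0],
              [2.0, 4.0, 30.0, 50.0, 9.0]])
prob_masks = np.array([[50, 80, 0, 0, 0],
                       [0, 0, 30, 60, 10]])
X_new = region_average(X, prob_masks, mask_threshold=20)
```

With a threshold of 20, the last voxel (probability 10) is excluded from the second region, so each sample ends up with two region-average features.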

class PCAfilter(n_components=5, reject=None)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Filters out a (set of) PCA component(s) and transforms it back to original representation.

Parameters:

n_components (int) – number of components to retain.
reject (list) – Indices of components which should be additionally removed.
Variables:	pca (scikit-learn PCA object) – Fitted PCA object.

fit(X, y=None, *args)

Fits PCAfilter.

Parameters:

X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
y (List of str) – List or ndarray with floats corresponding to labels

transform(X)

Transforms a pattern (X) by the inverse PCA transform with removed components.

Parameters:	X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
Returns:	X – Transformed array of shape = [n_samples, n_features] given the PCA calculated during fit().
Return type:	ndarray
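
The filtering step can be sketched as: project onto the retained components, zero the rejected ones, and map back to feature space. The sketch below uses a plain NumPy SVD in place of the scikit-learn PCA object that the class stores, and it folds fit and transform into one hypothetical function, pca_filter, for brevity.

```python
import numpy as np

def pca_filter(X, n_components=5, reject=None):
    """Keep the leading components, zero the rejected ones, and back-project."""
    mean = X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by explained variance
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    components = Vt[:n_components]
    scores = (X - mean) @ components.T      # shape (n_samples, n_components)
    if reject is not None:
        scores[:, reject] = 0.0             # drop the unwanted components
    return scores @ components + mean, components

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 6))
X_filtered, components = pca_filter(X, n_components=5, reject=[0])
```

The output has the same shape as the input, but the variance along the rejected component (here the first one) has been removed.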

class RoiIndexer(mask, mask_threshold=0, mvp=None, orig_mask=None, ref_space=None, reg_dir=None, data_shape=None, affine=None, **kwargs)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Indexes a whole-brain pattern with a certain ROI. Given a certain ROI-mask, this class allows transformation from a whole-brain pattern to the mask-subset.

Parameters:

mvp (mvp-object (see scikit_bold.core)) – Mvp-object, necessary to extract some pattern metadata. If no mvp object has been supplied, you have to set which original mask has been used (e.g. graymatter mask) and what the reference-space is (‘epi’ or ‘mni’).
mask (str) – Absolute path to a nifti-image of a brain mask in MNI152 space
mask_threshold (Optional[int, float]) – Threshold to be applied on mask-indexing (given a probabilistic mask).

fit(X=None, y=None)

Fits RoiIndexer.

Parameters:

X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
y (List of str) – List or ndarray with floats corresponding to labels

transform(X, y=None)

Transforms features from X (voxels) to a mask-subset.

Parameters:

X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
y (Optional[List[str] or numpy ndarray[str]]) – List or ndarray with strings indicating label-names

Returns:	X_new – array with transformed data of shape = [n_samples, n_features] in which features are the voxels that fall within the mask.
Return type:	ndarray
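
Conceptually, ROI indexing is a column selection with a thresholded mask. The sketch below assumes the mask has already been loaded and flattened to one value per feature (roi_mask); the helper index_roi is hypothetical, and the registration between reference spaces that the class handles is left out.

```python
import numpy as np

def index_roi(X, roi_mask, mask_threshold=0):
    """Keep only the columns of X whose mask value exceeds the threshold."""
    keep = roi_mask > mask_threshold
    return X[:, keep]

X = np.arange(10, dtype=float).reshape(2, 5)
roi_mask = np.array([0.0, 0.6, 0.2, 0.9, 0.0])  # e.g. probabilistic mask values
X_roi = index_roi(X, roi_mask, mask_threshold=0.5)
```

Only the second and fourth voxels pass the threshold of 0.5, so the result has two features per sample.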

class RowIndexer(mvp, train_idx)

Bases: object

Selects a subset of rows from an Mvp object.

Notes

NOT a scikit-learn style transformer.

Parameters:

idx (ndarray) – Array with indices.
mvp (mvp-object) – Mvp-object to draw metadata from.

transform()

Returns:

mvp (mvp-object) – Indexed mvp-object.
X_not_selected (ndarray) – Data which has not been selected.
y_not_selected (ndarray) – Labels which have not been selected.
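
The split into selected and not-selected rows can be illustrated on plain arrays. This sketch works on X and y directly instead of an Mvp object, so it is only an approximation of what the class returns.

```python
import numpy as np

X = np.arange(12, dtype=float).reshape(6, 2)
y = np.array([0, 0, 1, 1, 2, 2])
train_idx = np.array([0, 2, 4])

# Boolean selector that is True for the requested rows
selected = np.zeros(X.shape[0], dtype=bool)
selected[train_idx] = True

X_selected, y_selected = X[selected], y[selected]
X_not_selected, y_not_selected = X[~selected], y[~selected]
```

A boolean selector makes the complement trivial to obtain, which is convenient when the remaining rows serve as a held-out set.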

class ClusterThreshold(mvp, min_score, selector=<function f_classif>, min_cluster_size=20)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Implements a cluster-based feature selection method. It first performs univariate feature selection to yield a set of voxels, which are then cluster-thresholded using a minimum (contiguous) cluster size. These clusters are then averaged to yield a set of cluster-average features. This method is described in detail in my master’s thesis: github.com/lukassnoek/MSc_thesis.

Parameters:

transformer (scikit-learn style transformer class) – transformer class used to perform some kind of univariate feature selection.
mvp (Mvp-object (see core.mvp)) – Necessary to provide mask metadata (index, shape).
min_cluster_size (int) – minimum cluster size to be set for cluster-thresholding

fit(X, y, *args)

Fits ClusterThreshold transformer.

Parameters:

X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
y (List[str] or numpy ndarray[str]) – List or ndarray with floats corresponding to labels

transform(X)

Transforms a pattern (X) given the indices calculated during fit().

Parameters:	X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
Returns:	X_cl – Transformed array of shape = [n_samples, n_clusters] given the indices calculated during fit().
Return type:	ndarray
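
The threshold, cluster, and average steps can be sketched in one dimension. This is a deliberate simplification: the class operates on voxels in a 3D volume, whereas here features are treated as a 1D line and a cluster is a run of adjacent features. The scores array is a hypothetical stand-in for the output of a univariate selector such as f_classif, and contiguous_clusters is a hypothetical helper.

```python
import numpy as np

def contiguous_clusters(passed, min_cluster_size):
    """Return (start, stop) pairs for runs of True of sufficient length."""
    padded = np.concatenate(([False], passed, [False]))
    # Positions where the boolean value flips mark run starts and stops
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    starts, stops = edges[::2], edges[1::2]
    return [(int(a), int(b)) for a, b in zip(starts, stops)
            if b - a >= min_cluster_size]

scores = np.array([0.1, 5.0, 6.0, 7.0, 0.2, 9.0, 0.3, 4.0, 4.5])
min_score, min_cluster_size = 3.0, 2

passed = scores > min_score                      # univariate threshold
clusters = contiguous_clusters(passed, min_cluster_size)

X = np.arange(18, dtype=float).reshape(2, 9)
# One averaged feature per surviving cluster: shape (n_samples, n_clusters)
X_cl = np.column_stack([X[:, a:b].mean(axis=1) for a, b in clusters])
```

The isolated high-scoring feature at index 5 is discarded because its cluster is smaller than min_cluster_size, leaving two cluster-average features.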

class SelectFeatureset(mvp, featureset_idx)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Selects only columns of a certain featureset. CANNOT be used in a scikit-learn pipeline!

Parameters:

mvp (mvp-object) – Used to extract meta-data.
featureset_idx (ndarray) – Array with indices which map to unique feature-set voxels.

fit(): Does nothing, but included due to scikit-learn API.

transform(X=None): Transforms mvp.

class IncrementalFeatureCombiner(scores, cutoff)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Indexes a set of features with a number of (sorted) features.

Parameters:

scores (ndarray) – Array of shape = [n_features], or [n_features, n_class] in case of soft/hard voting in, e.g., a roi_stacking_classifier (see classifiers.roi_stacking_classifier).
cutoff (int or float) – If int, it refers to the absolute number of features included, sorted from high to low (w.r.t. scores). If float, it selects a proportion of features.

fit(X, y=None)

Fits IncrementalFeatureCombiner transformer.

Parameters:	X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]

transform(X, y=None)

Transforms a pattern (X) given the indices calculated during fit().

Parameters:	X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
Returns:	X – Transformed array of shape = [n_samples, n_features] given the indices calculated during fit().
Return type:	ndarray
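
The cutoff logic for a one-dimensional scores array might look like the sketch below. The helper top_feature_indices is hypothetical, the rounding rule for float cutoffs is an assumption, and the [n_features, n_class] case is not covered.

```python
import numpy as np

def top_feature_indices(scores, cutoff):
    """Indices of the best-scoring features, highest score first."""
    order = np.argsort(scores)[::-1]
    if isinstance(cutoff, float):
        # Float cutoff: proportion of features (rounding is an assumption)
        n_keep = int(round(cutoff * scores.size))
    else:
        # Int cutoff: absolute number of features
        n_keep = int(cutoff)
    return order[:n_keep]

scores = np.array([0.2, 0.9, 0.5, 0.7])
X = np.arange(8, dtype=float).reshape(2, 4)

idx_abs = top_feature_indices(scores, 2)      # absolute count
idx_prop = top_feature_indices(scores, 0.75)  # proportion of features
X_top = X[:, idx_abs]
```

In this sketch the indices are computed once from scores and then applied unchanged to any X with the same feature layout.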

Submodules

skbold.feature_extraction.transformers module