skbold.feature_extraction package

This module contains some feature-extraction methods/transformers.

class PatternAverager(method='mean')[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Reduces the set of features to its average.

Parameters: method (str) – Method of averaging (either ‘mean’ or ‘median’).
fit(X=None, y=None)[source]

Does nothing, but included to be used in sklearn’s Pipeline.

transform(X)[source]

Transforms each pattern to its average.

Parameters: X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
Returns: X_new – Transformed ndarray of shape = [n_samples, 1]
Return type: ndarray
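
The averaging described above can be illustrated with a minimal NumPy sketch. This is not skbold's own code: the helper name average_patterns is hypothetical, and it only assumes that the average is taken across features (columns), which is what the documented output shape of [n_samples, 1] suggests.

```python
import numpy as np


def average_patterns(X, method="mean"):
    """Collapse each sample's features into one average value (hypothetical helper)."""
    if method == "mean":
        X_new = np.mean(X, axis=1)
    elif method == "median":
        X_new = np.median(X, axis=1)
    else:
        raise ValueError("method must be 'mean' or 'median'")
    # Keep a 2-D result of shape [n_samples, 1] so it can feed another estimator.
    return X_new.reshape(-1, 1)


X = np.array([[1.0, 2.0, 6.0],
              [4.0, 4.0, 10.0]])
X_mean = average_patterns(X, method="mean")
X_median = average_patterns(X, method="median")
```

Both results keep one row per sample, so the output can still be passed to a downstream estimator that expects a 2-D array.
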
class AverageRegionTransformer(atlas='HarvardOxford-All', mask_threshold=0, mvp=None, reg_dir=None, orig_mask=None, data_shape=None, ref_space=None, affine=None, **kwargs)[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Transforms a whole-brain voxel pattern into a region-average pattern. Computes the average of each region in a given parcellation and returns those averages as features for X.

Parameters:
  • atlas (str) – Atlas to extract ROIs from. Available: ‘HarvardOxford-Cortical’, ‘HarvardOxford-Subcortical’, ‘HarvardOxford-All’ (combination of cortical/subcortical), ‘Talairach’ (not tested), ‘JHU-labels’, ‘JHU-tracts’, ‘Yeo2011’.
  • mvp (Mvp-object (see core.mvp)) – Mvp object that provides some metadata about previous masks
  • mask_threshold (int (default: 0)) – Minimum threshold for probabilistic masks (such as Harvard-Oxford)
  • reg_dir (str) – Path to directory with registration info (warps/transforms).
  • **kwargs (key-word arguments) – Other arguments that can be passed to skbold.utils.load_roi_mask.
fit(X=None, y=None)[source]

Does nothing, but included to be used in sklearn’s Pipeline.

transform(X, y=None)[source]

Transforms features from X (voxels) to region-average features.

Parameters:
  • X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
  • y (Optional[List[str] or numpy ndarray[str]]) – List or ndarray of strings indicating label names
Returns:

X_new – array with transformed data of shape = [n_samples, n_features] in which features are region-average values.

Return type:

ndarray
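
The region-averaging step can be sketched in plain NumPy. This is only an illustration under stated assumptions: the real transformer loads atlas masks and handles registration, whereas this sketch takes a hypothetical integer array region_labels (one label per voxel, with 0 meaning "outside any region") as a stand-in for the parcellation. The helper name region_average is also hypothetical.

```python
import numpy as np


def region_average(X, region_labels):
    """Average the columns of X within each nonzero region label (hypothetical helper)."""
    region_ids = np.unique(region_labels)
    region_ids = region_ids[region_ids != 0]  # assumption: 0 marks voxels outside any region
    X_new = np.zeros((X.shape[0], region_ids.size))
    for i, region_id in enumerate(region_ids):
        # Mean over all voxels that belong to this region, per sample.
        X_new[:, i] = X[:, region_labels == region_id].mean(axis=1)
    return X_new


X = np.array([[1.0, 3.0, 10.0, 20.0, 99.0],
              [2.0, 4.0, 30.0, 50.0, 99.0]])
region_labels = np.array([1, 1, 2, 2, 0])  # toy parcellation with two regions
X_regions = region_average(X, region_labels)
```

Each column of the result corresponds to one region, so the number of output features equals the number of regions in the parcellation.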

class PCAfilter(n_components=5, reject=None)[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Filters out a (set of) PCA component(s) and transforms the data back to the original representation.

Parameters:
  • n_components (int) – number of components to retain.
  • reject (list) – Indices of components which should be additionally removed.
Variables:

pca (scikit-learn PCA object) – Fitted PCA object.

fit(X, y=None, *args)[source]

Fits PCAfilter.

Parameters:
  • X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
  • y (list or ndarray) – List or ndarray with labels corresponding to the samples in X
transform(X)[source]

Transforms a pattern (X) by the inverse PCA transform with removed components.

Parameters: X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
Returns: X – Transformed array of shape = [n_samples, n_features] given the PCA calculated during fit().
Return type: ndarray
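
The idea of filtering out components and projecting back can be sketched as follows. To keep the example dependency-light it uses NumPy's SVD instead of the scikit-learn PCA object that the class stores, and it assumes that "removing" a component means setting its scores to zero before the inverse transform. The function name pca_filter is hypothetical.

```python
import numpy as np


def pca_filter(X_fit, X, n_components=5, reject=None):
    """Project X onto PCA components of X_fit, zero the rejected ones, project back."""
    mean = X_fit.mean(axis=0)
    # Rows of Vt are the principal axes of the centered fitting data.
    _, _, Vt = np.linalg.svd(X_fit - mean, full_matrices=False)
    components = Vt[:n_components]
    scores = (X - mean) @ components.T
    if reject is not None:
        scores[:, reject] = 0.0  # assumption: rejected components are zeroed out
    # Inverse transform back to the original feature space.
    return scores @ components + mean


rng = np.random.default_rng(0)
X = rng.normal(size=(20, 6))
X_filtered = pca_filter(X, X, n_components=3, reject=[0])
```

The output has the same shape as the input, but no longer carries variance along the rejected component or along the components beyond n_components.
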
class ClusterThreshold(mvp, min_score, selector=<function f_classif>, min_cluster_size=20)[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Implements a cluster-based feature selection method. It first performs univariate feature selection to yield a set of voxels, which are then cluster-thresholded using a minimum (contiguous) cluster size. The surviving clusters are then averaged to yield a set of cluster-average features. This method is described in detail in my master’s thesis: github.com/lukassnoek/MSc_thesis.

Parameters:
  • selector (scikit-learn style scoring function (default: f_classif)) – function used to perform some kind of univariate feature selection.
  • mvp (Mvp-object (see core.mvp)) – Necessary to provide mask metadata (index, shape).
  • min_cluster_size (int) – minimum cluster size to be set for cluster-thresholding
fit(X, y, *args)[source]

Fits ClusterThreshold transformer.

Parameters:
  • X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
  • y (List[str] or numpy ndarray[str]) – List or ndarray with labels corresponding to the samples in X
transform(X)[source]

Transforms a pattern (X) given the indices calculated during fit().

Parameters: X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
Returns: X_cl – Transformed array of shape = [n_samples, n_clusters] given the indices calculated during fit().
Return type: ndarray
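
The threshold, cluster, and average steps can be illustrated with a toy sketch. It makes several simplifying assumptions that are not taken from skbold: the univariate scores are given as a precomputed array rather than computed by the selector, features lie on a 1-D axis so that "contiguous" means adjacent indices (the real method works on voxels in brain space), and a voxel survives when its score is strictly greater than min_score. The helper names find_clusters and cluster_average are hypothetical.

```python
import numpy as np


def find_clusters(keep, min_cluster_size):
    """Return index arrays for runs of adjacent True values that are large enough."""
    clusters, start = [], None
    # A trailing False acts as a sentinel that closes a run ending at the last index.
    for i, flag in enumerate(np.append(keep, False)):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_cluster_size:
                clusters.append(np.arange(start, i))
            start = None
    return clusters


def cluster_average(X, clusters):
    """Average the features of X within each cluster, one output column per cluster."""
    return np.column_stack([X[:, idx].mean(axis=1) for idx in clusters])


# Hypothetical univariate scores, one per feature.
scores = np.array([0.1, 5.0, 6.0, 7.0, 0.2, 8.0, 0.3, 9.0, 9.5])
keep = scores > 1.0  # assumption: strict comparison against min_score
clusters = find_clusters(keep, min_cluster_size=2)

X = np.arange(18, dtype=float).reshape(2, 9)
X_cl = cluster_average(X, clusters)
```

In this toy example the isolated feature at index 5 passes the score threshold but is dropped because its cluster is smaller than min_cluster_size, leaving two cluster-average features.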