skbold.feature_extraction package
This module contains some feature-extraction methods/transformers.

class PatternAverager(method='mean')
Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin
Reduces the set of features to its average.
Parameters: method (str) – method of averaging (either ‘mean’ or ‘median’)
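
The averaging step can be illustrated with plain NumPy. This is only a sketch: `average_pattern` is a hypothetical helper, not part of skbold, and it assumes that the average is taken across the feature axis so that each sample is reduced to a single value.

```python
import numpy as np


def average_pattern(X, method="mean"):
    """Hypothetical helper: collapse each sample's features into one value.

    Assumption: averaging runs over axis 1 (features), giving one value
    per sample.
    """
    if method == "mean":
        return np.mean(X, axis=1)
    if method == "median":
        return np.median(X, axis=1)
    raise ValueError("method must be 'mean' or 'median'")


# Two samples with three features each.
X = np.array([[1.0, 2.0, 6.0],
              [4.0, 4.0, 10.0]])
mean_features = average_pattern(X, method="mean")
median_features = average_pattern(X, method="median")
```

The median variant is less sensitive to a few extreme voxels, which may matter when a pattern contains outliers.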

class AverageRegionTransformer(atlas='HarvardOxfordAll', mask_threshold=0, mvp=None, reg_dir=None, orig_mask=None, data_shape=None, ref_space=None, affine=None, **kwargs)
Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin
Transforms a whole-brain voxel pattern into a region-average pattern. Computes the average within each region of a given parcellation and returns those averages as the features of X.
Parameters:  atlas (str) – Atlas to extract ROIs from. Available: ‘HarvardOxfordCortical’, ‘HarvardOxfordSubcortical’, ‘HarvardOxfordAll’ (combination of cortical/subcortical), ‘Talairach’ (not tested), ‘JHUlabels’, ‘JHUtracts’, ‘Yeo2011’.
 mvp (Mvp-object (see core.mvp)) – Mvp object that provides some metadata about previous masks
 mask_threshold (int (default: 0)) – Minimum threshold for probabilistic masks (such as HarvardOxford)
 reg_dir (str) – Path to directory with registration info (warps/transforms).
 **kwargs (keyword arguments) – Other arguments that can be passed to skbold.utils.load_roi_mask.

transform(X, y=None)
Transforms features from X (voxels) to region-average features.
Parameters:  X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
 y (Optional[List[str] or numpy ndarray[str]]) – List or ndarray with strings indicating label names
Returns: X_new – array with transformed data of shape = [n_samples, n_features] in which features are region-average values.
Return type: ndarray
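
The voxel-to-region reduction can be sketched as follows. The function `region_average` and the boolean `roi_masks` are hypothetical stand-ins: the sketch assumes each ROI has already been loaded from the atlas and flattened to a boolean vector over the voxels in X, and it does not touch the atlas loading or registration steps.

```python
import numpy as np


def region_average(X, roi_masks):
    """Hypothetical helper: average the voxels of X within each ROI.

    X         : array of shape (n_samples, n_voxels)
    roi_masks : list of boolean arrays, each of shape (n_voxels,)
    Returns an array of shape (n_samples, n_rois).
    """
    X_new = np.zeros((X.shape[0], len(roi_masks)))
    for i, mask in enumerate(roi_masks):
        # Mean over the voxels that belong to this region, per sample.
        X_new[:, i] = X[:, mask].mean(axis=1)
    return X_new


# Two samples, four voxels, two non-overlapping toy regions.
X = np.array([[1.0, 3.0, 5.0, 7.0],
              [2.0, 4.0, 6.0, 8.0]])
roi_masks = [np.array([True, True, False, False]),
             np.array([False, False, True, True])]
X_new = region_average(X, roi_masks)
```

After the transform, the number of features equals the number of regions in the parcellation rather than the number of voxels.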

class PCAfilter(n_components=5, reject=None)
Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin
Filters out a (set of) PCA component(s) and transforms it back to original representation.
Parameters:  n_components (int) – number of components to retain.
 reject (list) – Indices of components which should be additionally removed.
Variables: pca (scikit-learn PCA object) – Fitted PCA object.

fit(X, y=None, *args)
Fits PCAfilter.
Parameters:  X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
 y (List of str) – List or ndarray with floats corresponding to labels

transform(X)
Transforms a pattern (X) by the inverse PCA transform with removed components.
Parameters: X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
Returns: X – Transformed array of shape = [n_samples, n_features] given the PCA calculated during fit().
Return type: ndarray
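
The fit-then-filter idea can be sketched with NumPy alone. Note the swap: this uses an SVD of the centered training data rather than a scikit-learn PCA object, and `pca_filter` is a hypothetical function. It also assumes that `reject` indexes into the retained components; the source does not spell this out.

```python
import numpy as np


def pca_filter(X_train, X, n_components=5, reject=None):
    """Hypothetical helper: project X onto the leading principal axes of
    X_train, zero out the rejected components, and map back to the
    original feature space.
    """
    mean = X_train.mean(axis=0)
    # Rows of Vt are principal axes, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    components = Vt[:n_components]

    # Component scores for X, shape (n_samples, n_components).
    scores = (X - mean) @ components.T
    if reject is not None:
        # Assumption: `reject` holds indices into the retained components.
        scores[:, reject] = 0.0

    # Inverse transform: back to shape (n_samples, n_features).
    return scores @ components + mean


rng = np.random.default_rng(0)
X = rng.normal(size=(20, 8))
X_filtered = pca_filter(X, X, n_components=5, reject=[0])
```

The output keeps the input's shape, but it no longer carries variance along the rejected component or along any component beyond `n_components`.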

class ClusterThreshold(mvp, min_score, selector=<function f_classif>, min_cluster_size=20)
Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin
Implements a cluster-based feature selection method. It first performs univariate feature selection to yield a set of voxels, which are then cluster-thresholded using a minimum (contiguous) cluster size. The surviving clusters are averaged to yield a set of cluster-average features. This method is described in detail in my master’s thesis: github.com/lukassnoek/MSc_thesis.
Parameters:  transformer (scikit-learn style transformer class) – transformer class used to perform some kind of univariate feature selection.
 mvp (Mvp-object (see core.mvp)) – Necessary to provide mask metadata (index, shape).
 min_cluster_size (int) – minimum cluster size to be set for cluster-thresholding

fit(X, y, *args)
Fits ClusterThreshold transformer.
Parameters:  X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
 y (List[str] or numpy ndarray[str]) – List or ndarray with floats corresponding to labels

transform(X)
Transforms a pattern (X) given the indices calculated during fit().
Parameters: X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
Returns: X_cl – Transformed array of shape = [n_samples, n_clusters] given the indices calculated during fit().
Return type: ndarray
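
The threshold, cluster, and average steps can be sketched in a deliberately simplified form. Several things here are assumptions rather than skbold behavior: the univariate scores are taken as precomputed, contiguity is defined along a 1D feature axis instead of in 3D voxel space, scores must be strictly greater than `min_score`, and both helper functions are hypothetical.

```python
import numpy as np


def find_clusters_1d(scores, min_score, min_cluster_size):
    """Hypothetical helper: index arrays for runs of adjacent features
    whose score exceeds min_score and whose length is at least
    min_cluster_size. (1D simplification of contiguous voxel clusters.)
    """
    above = scores > min_score
    clusters, start = [], None
    for i, flag in enumerate(above):
        if flag and start is None:
            start = i                      # a new run begins
        elif not flag and start is not None:
            if i - start >= min_cluster_size:
                clusters.append(np.arange(start, i))
            start = None                   # the run has ended
    # Close a run that extends to the last feature.
    if start is not None and len(above) - start >= min_cluster_size:
        clusters.append(np.arange(start, len(above)))
    return clusters


def cluster_average(X, clusters):
    """Average X within each cluster; returns (n_samples, n_clusters).

    Expects at least one cluster.
    """
    return np.column_stack([X[:, idx].mean(axis=1) for idx in clusters])


# Precomputed univariate scores for nine features (toy values).
scores = np.array([0.1, 5.0, 6.0, 7.0, 0.2, 9.0, 0.3, 4.0, 4.5])
clusters = find_clusters_1d(scores, min_score=3.0, min_cluster_size=2)

X = np.arange(18, dtype=float).reshape(2, 9)
X_cl = cluster_average(X, clusters)
```

In this toy case the isolated high-scoring feature at index 5 is dropped because its run is shorter than `min_cluster_size`, which is the point of cluster-thresholding: single supra-threshold features do not become features on their own.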