skbold.estimators package

The classifiers subpackage provides two ensemble-type classifiers that aggregate multivoxel information from multiple local sources in the brain. A model is fit on each of several brain areas, and the resulting predictions are then combined using either a stacked (meta) model (the RoiStackingClassifier) or a voting strategy (the RoiVotingClassifier). The structure and API of these classifiers follow the scikit-learn estimator interface.

class RoiStackingClassifier(mvp, preproc_pipe='default', base_clf=None, meta_clf=None, mask_type='unilateral', proba=True, folds=10, meta_fs='univar', meta_gs=None, n_cores=1)

Bases: sklearn.base.BaseEstimator, sklearn.base.ClassifierMixin

This scikit-learn-style classifier implements a stacking classifier: it fits a base-classifier on multiple brain regions separately and subsequently trains a meta-classifier on the outputs of those base-classifiers.
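
The stacking idea described above can be sketched with plain scikit-learn and synthetic data. This is a minimal illustration under stated assumptions, not skbold's implementation: the helper names (make_data, make_masks, fit_stacking, predict_stacking), the boolean-array masks, the use of LogisticRegression for both levels, and the choice of out-of-fold class probabilities as meta-features are all hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_predict


def make_data(n_samples=120, n_features=30, seed=0):
    """Synthetic two-class data; only the first 10 features carry signal."""
    rng = np.random.default_rng(seed)
    y = np.repeat([0, 1], n_samples // 2)
    X = rng.normal(size=(n_samples, n_features))
    X[:, :10] += 1.5 * y[:, None]
    return X, y


def make_masks(n_features=30, n_rois=3):
    """Hypothetical ROI masks: boolean arrays over the feature (voxel) axis."""
    masks = []
    for idx in np.array_split(np.arange(n_features), n_rois):
        mask = np.zeros(n_features, dtype=bool)
        mask[idx] = True
        masks.append(mask)
    return masks


def fit_stacking(X, y, masks, folds=5):
    """Fit one base classifier per ROI, then a meta-classifier on their outputs."""
    cv = StratifiedKFold(n_splits=folds, shuffle=True, random_state=0)
    base_clfs, stacked = [], []
    for mask in masks:
        clf = LogisticRegression(max_iter=1000)
        # Out-of-fold probabilities, so the meta-classifier is not trained
        # on predictions for samples the base classifier has already seen.
        stacked.append(
            cross_val_predict(clf, X[:, mask], y, cv=cv, method="predict_proba")
        )
        # Refit on all training data for use at prediction time.
        base_clfs.append(clf.fit(X[:, mask], y))
    meta_clf = LogisticRegression(max_iter=1000).fit(np.hstack(stacked), y)
    return base_clfs, meta_clf


def predict_stacking(X, masks, base_clfs, meta_clf):
    """Stack per-ROI probabilities and let the meta-classifier decide."""
    meta_X = np.hstack(
        [clf.predict_proba(X[:, mask]) for clf, mask in zip(base_clfs, masks)]
    )
    return meta_clf.predict(meta_X)


X_train, y_train = make_data(seed=0)
X_test, y_test = make_data(seed=1)
masks = make_masks()
base_clfs, meta_clf = fit_stacking(X_train, y_train, masks)
pred = predict_stacking(X_test, masks, base_clfs, meta_clf)
print("held-out accuracy:", np.mean(pred == y_test))
```

Generating the meta-features out-of-fold is one common way to keep the meta-classifier from simply learning the base classifiers' training-set overfit; whether skbold does exactly this is not stated here.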

Parameters:
  • mvp (mvp-object) – A custom object from the skbold package containing data (X, y) and corresponding meta-data (e.g. mask info).
  • preproc_pipe (object) – A scikit-learn Pipeline object with desired preprocessing steps (e.g. scaling, additional feature selection). Defaults to only scaling and univariate-feature-selection by means of highest mean-euclidean differences (see skbold.transformers.mean_euclidean).
  • base_clf (object) – A scikit-learn style classifier (implementing fit(), predict(), and predict_proba()), that is able to be used in Pipelines.
  • meta_clf (object) – A scikit-learn style classifier.
  • mask_type (str) – Can be ‘unilateral’ or ‘bilateral’, which will use all masks from the corresponding Harvard-Oxford Cortical (lateralized) atlas. Alternatively, it may be an absolute path to a directory containing a custom set of masks as nifti-files (default: ‘unilateral’).
  • meta_gs (list or ndarray) – Optional parameter-grid over which to perform gridsearch.
  • n_cores (int) – Number of CPU-cores on which to perform the fitting procedure (here, outer-folds are parallelized).
Variables:
  • train_scores (ndarray) – Accuracy-scores per brain region (averaged over outer-folds) on the training (fit) phase.
  • test_scores (ndarray) – Accuracy-scores per brain region (averaged over outer- and inner-folds) on the test phase.
  • masks (list of str) – List of absolute paths to found masks.
  • stck_train (ndarray) – Array with outputs from base-classifiers fit on train-set.
  • stck_test (ndarray) – Array with outputs from base-classifiers generalized to test-set.
fit(X, y)

Fits RoiStackingClassifier.

Parameters:
  • X (ndarray) – Array of shape = [n_samples, n_features].
  • y (list or ndarray of int or float) – List or ndarray with floats/ints corresponding to labels.
Returns:

self – RoiStackingClassifier instance with fitted parameters.

Return type:

object

predict(X, y=None)

Predicts classes using the fitted RoiStackingClassifier.

Parameters: X (ndarray) – Array of shape = [n_samples, n_features].
Returns: meta_pred – Array with class predictions.
Return type: ndarray
score(X, y)

Scoring function calculating accuracy given predictions.

Parameters:
  • X (ndarray) – Array of shape = [n_samples, n_features].
  • y (list or ndarray of int or float) – List or ndarray with floats/ints corresponding to labels.
Returns: score – Accuracy of predictions on the test-set.
Return type: float
class RoiVotingClassifier(mvp, preproc_pipeline=None, clf=None, mask_type='unilateral', voting='soft', weights=None)

Bases: sklearn.base.BaseEstimator, sklearn.base.ClassifierMixin

This classifier fits a base-estimator (by default a linear SVM) on different feature sets (i.e. voxels) from different regions of interest (which are drawn from the Harvard-Oxford Cortical atlas), and subsequently the final prediction is derived through a max-voting rule, which can be either ‘soft’ (argmax of mean class probability) or ‘hard’ (max of class prediction).
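
The voting rule can be illustrated with NumPy alone. The sketch below is an assumption-laden illustration, not skbold's code: the function name vote and the input layout [n_rois, n_samples, n_classes] are hypothetical, 'hard' is read here as a (weighted) majority vote over each region's predicted label, and weights are applied as per-region multipliers.

```python
import numpy as np


def vote(probas, voting="soft", weights=None):
    """Combine per-ROI class probabilities into one prediction per sample.

    probas : array-like of shape [n_rois, n_samples, n_classes]
    weights : optional array-like of shape [n_rois]
    """
    probas = np.asarray(probas, dtype=float)
    n_rois, n_samples, n_classes = probas.shape
    if weights is None:
        weights = np.ones(n_rois)
    weights = np.asarray(weights, dtype=float)

    if voting == "soft":
        # Argmax of the (weighted) mean class probability across ROIs.
        mean_proba = np.average(probas, axis=0, weights=weights)
        return mean_proba.argmax(axis=1)

    if voting == "hard":
        # Each ROI casts one (weighted) vote for its most probable class.
        labels = probas.argmax(axis=2)  # shape [n_rois, n_samples]
        counts = np.zeros((n_samples, n_classes))
        for roi_labels, w in zip(labels, weights):
            counts[np.arange(n_samples), roi_labels] += w
        return counts.argmax(axis=1)

    raise ValueError("voting must be 'soft' or 'hard'")


# Three ROIs, two samples, two classes (made-up numbers).
probas = [
    [[0.90, 0.10], [0.40, 0.60]],
    [[0.45, 0.55], [0.40, 0.60]],
    [[0.45, 0.55], [0.90, 0.10]],
]
print(vote(probas, "soft"))                     # [0 0]
print(vote(probas, "hard"))                     # [1 1]
print(vote(probas, "soft", weights=[0, 1, 1]))  # [1 0]
```

In this toy input one confident region outweighs two lukewarm ones under soft voting, while hard voting follows the two-to-one majority, which is why the two rules disagree.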

Notes

This classifier has not been tested!

Parameters:
  • mvp (mvp-object) – A custom object from the skbold package containing data (X, y) and corresponding meta-data (e.g. mask info).
  • preproc_pipeline (object) – A scikit-learn Pipeline object with desired preprocessing steps (e.g. scaling, additional feature selection)
  • clf (object) – A scikit-learn style classifier (implementing fit(), predict(), and predict_proba()), that is able to be used in Pipelines.
  • mask_type (str) – Can be ‘unilateral’ or ‘bilateral’, which will use all masks from the corresponding Harvard-Oxford Cortical (lateralized) atlas. Alternatively, it may be an absolute path to a directory containing a custom set of masks as nifti-files (default: ‘unilateral’).
  • voting (str) – Either ‘hard’ or ‘soft’ (default: ‘soft’).
  • weights (list (or ndarray)) – List/array of shape [n_rois] with a relative weighting factor to be used in the voting procedure.
fit(X=None, y=None)

Fits RoiVotingClassifier.

Parameters:
  • X (ndarray) – Array of shape = [n_samples, n_features].
  • y (list or ndarray of int or float) – List or ndarray with floats/ints corresponding to labels.
Returns:

self – RoiVotingClassifier instance with fitted parameters.

Return type:

object

predict(X)

Predict class given fitted RoiVotingClassifier.

Parameters: X (ndarray) – Array of shape = [n_samples, n_features].
Returns: maxvotes – Array with class predictions for all samples in X.
Return type: ndarray
class MultimodalVotingClassifier(mvp, clf=None, voting='soft', weights=None)

Bases: sklearn.base.BaseEstimator, sklearn.base.ClassifierMixin

This classifier fits a base-estimator (by default a linear SVM) on different feature sets of different modalities (e.g. VBM, TBSS, BOLD), and subsequently the final prediction is derived through a max-voting rule, which can be either ‘soft’ (argmax of mean class probability) or ‘hard’ (max of class prediction).

Notes

This classifier has not been tested!

Parameters:
  • mvp (mvp-object) – A custom object from the skbold package containing data (X, y) and corresponding meta-data (e.g. mask info).
  • preproc_pipeline (object) – A scikit-learn Pipeline object with desired preprocessing steps (e.g. scaling, additional feature selection)
  • clf (object) – A scikit-learn style classifier (implementing fit(), predict(), and predict_proba()), that is able to be used in Pipelines.
  • voting (str) – Either ‘hard’ or ‘soft’ (default: ‘soft’).
  • weights (list (or ndarray)) – List/array of shape [n_rois] with a relative weighting factor to be used in the voting procedure.
fit(X=None, y=None, iterations=1)

Fits MultimodalVotingClassifier.

Parameters:
  • X (ndarray) – Array of shape = [n_samples, n_features].
  • y (list or ndarray of int or float) – List or ndarray with floats/ints corresponding to labels.
Returns:

self – MultimodalVotingClassifier instance with fitted parameters.

Return type:

object

predict(X)

Predict class given fitted MultimodalVotingClassifier.

Parameters: X (ndarray) – Array of shape = [n_samples, n_features].
Returns: maxvotes – Array with class predictions for all samples in X.
Return type: ndarray