skbold.utils package

The utils subpackage contains extra utilities for machine learning pipelines on fMRI data. For example, the CrossvalSplitter creates balanced train/test sets given one or more confounds. The load_roi_mask function loads ROIs from the Harvard-Oxford (sub)cortical atlas; it is also integrated in the RoiIndexer transformer.

Lastly, the ArrayPermuter, RowIndexer, and SelectFeatureset transformers can be used in, for example, permutation analyses.

sort_numbered_list(stat_list)[source]

Sorts a list containing numbers.

Sorts a list of paths to statistic files (e.g. COPEs, VARCOPEs), which are often ordered incorrectly because a plain string sort mixes up single- and double-digit numbers. This function extracts the numbers from the stat files and sorts the original list accordingly.

Parameters:stat_list (list or str) – list with absolute paths to files
Returns:sorted_list – sorted stat_list
Return type:list of str
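
The sorting idea described above can be sketched with the standard library alone. This is an illustrative sketch, not skbold's implementation: the helper name sort_by_embedded_number and the choice to use the last integer in each file name as the sort key are assumptions made for this example.

```python
import os
import re


def sort_by_embedded_number(paths):
    """Sort paths by the last integer in each file name (hypothetical helper)."""

    def numeric_key(path):
        # Look only at the file name, so digits in directories are ignored.
        numbers = re.findall(r"\d+", os.path.basename(path))
        # Files without any number sort first (assumption for this sketch).
        return int(numbers[-1]) if numbers else -1

    return sorted(paths, key=numeric_key)


paths = ["/stats/cope10.nii.gz", "/stats/cope2.nii.gz", "/stats/cope1.nii.gz"]

# A plain string sort places 'cope10' before 'cope2'.
string_sorted = sorted(paths)

# Sorting on the extracted number restores the intended order.
number_sorted = sort_by_embedded_number(paths)
```

Sorting on an integer key rather than on the raw string is what prevents "cope10" from landing between "cope1" and "cope2".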
class CrossvalSplitter(data, train_size, vars, cb_between_splits=False, binarize=None, include=None, exclude=None, interactions=True, sep='t', index_col=0, ignore=None, iterations=1000)[source]

Bases: object

plot_results(out_dir)[source]
save(out_dir, save_plots=True)[source]
split(verbose=False)[source]
parse_roi_labels(atlas_type='Talairach', lateralized=False, debug=False)[source]

Parses xml-files belonging to FSL atlases.

Parameters:
  • atlas_type (str) – String identifying which atlas needs to be parsed.
  • lateralized (bool) – Whether to use the lateralized version of the atlas (only applicable to HarvardOxford masks).
Returns:

info_dict – Dictionary with indices and coordinates (values) per ROI (keys).

Return type:

dict

print_mask_options(atlas_name='HarvardOxford-Cortical')[source]

Prints the options for ROIs given a certain atlas.

Parameters:atlas_name (str) – Name of the atlas. Available: ‘HarvardOxford-Cortical’, ‘HarvardOxford-Subcortical’, ‘Yeo2011’.
class ArrayPermuter[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Permutes (shuffles) rows of matrix.

__init__()[source]

Initializes ArrayPermuter object.

fit(X=None, y=None)[source]

Does nothing, but included to be used in sklearn’s Pipeline.

transform(X)[source]

Permutes rows of data input.

Parameters:X (ndarray) – Numeric (float) array of shape = [n_samples, n_features]
Returns:X_new – ndarray with permuted rows
Return type:ndarray
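
A row-permuting transformer of this kind can be sketched as follows. This is a minimal illustration, not the skbold source: the class name RowShuffler and its random_state parameter are assumptions added here for reproducibility (the documented ArrayPermuter takes no constructor arguments).

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class RowShuffler(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for a transformer that shuffles rows."""

    def __init__(self, random_state=None):
        # random_state is an assumption of this sketch, not a skbold parameter.
        self.random_state = random_state

    def fit(self, X=None, y=None):
        # Nothing to learn; present so the object fits the fit/transform API.
        return self

    def transform(self, X):
        rng = np.random.default_rng(self.random_state)
        # Draw a random ordering of row indices and return the reordered copy.
        order = rng.permutation(X.shape[0])
        return X[order]


X = np.arange(12, dtype=float).reshape(4, 3)
X_new = RowShuffler(random_state=0).fit(X).transform(X)
```

Because whole rows are reordered, each sample keeps its own feature values; only the sample order changes. Note that the transformer does not touch y, so the original row-to-label correspondence is broken, which is the point in a permutation analysis.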
class RowIndexer(mvp, train_idx)[source]

Bases: object

Selects a subset of rows from an Mvp object.

Notes

NOT a scikit-learn style transformer.

Parameters:
  • idx (ndarray) – Array with indices.
  • mvp (mvp-object) – Mvp-object to draw metadata from.
transform()[source]
Returns:
  • mvp (mvp-object) – Indexed mvp-object.
  • X_not_selected (ndarray) – Data which has not been selected.
  • y_not_selected (ndarray) – Labels which have not been selected.
class SelectFeatureset(mvp, featureset_idx)[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Selects only columns of a certain featureset. CANNOT be used in a scikit-learn pipeline!

Parameters:
  • mvp (mvp-object) – Used to extract meta-data.
  • featureset_idx (ndarray) – Array with indices which map to unique feature-set voxels.
fit()[source]

Does nothing, but included due to scikit-learn API.

transform(X=None)[source]

Transforms mvp.