pnpl.datasets.gwilliams2022.dataset.Gwilliams2022
- class pnpl.datasets.gwilliams2022.dataset.Gwilliams2022(data_path, task, preprocessing='notch+bp+ds', preprocessing_config=None, include_subjects=None, exclude_subjects=None, include_sessions=None, exclude_sessions=None, include_tasks=None, exclude_tasks=None, include_run_keys=None, exclude_run_keys=None, standardize=True, clipping_boundary=10.0, channel_means=None, channel_stds=None, include_info=False, create_h5_if_missing=True, download=True, preload_h5=False)
MEG-MASC continuous MEG dataset.
- Parameters:
data_path (str) – Local data directory. Files are arranged in BIDS layout here (matching what OSF provides). Created if missing.
task – Object implementing pnpl.tasks.base.TaskProtocol. See pnpl.tasks.gwilliams2022 for ready-made tasks.
preprocessing (Optional[str]) – Preprocessing string used in derivative filenames (e.g. "notch+bp+ds"). When None, the raw KIT recording is materialized to H5 unchanged.
preprocessing_config (Optional[Dict[str, Dict[str, Any]]]) – Optional preprocessing-step overrides forwarded to pnpl.preprocessing.Pipeline.
exclude_subjects (Optional[Sequence[str]]) – BIDS subject ids without the sub- prefix (e.g. "01").
exclude_sessions (Optional[Sequence[str]]) – "0" or "1".
exclude_tasks (Optional[Sequence[str]]) – Task ids in MEG-MASC's BIDS layout ("0".."3", one per story).
exclude_run_keys (Optional[Sequence[tuple]]) – Tuples of (subject, session, task, run) to include/exclude. MEG-MASC uses a single run per task, so run is always "01".
channel_stds (ndarray | None) – See pnpl.datasets.mixins.StandardizationMixin.
include_info (bool) – If True, __getitem__ returns (x, y, info).
create_h5_if_missing (bool) – If True (default), materialize the cached H5 from a local preprocessed FIF or, failing that, by running the preprocessing pipeline against the raw KIT recording.
download (bool) – If True, fetch missing files from OSF on demand.
include_subjects (Optional[Sequence[str]])
include_sessions (Optional[Sequence[str]])
include_tasks (Optional[Sequence[str]])
include_run_keys (Optional[Sequence[tuple]])
standardize (bool)
clipping_boundary (Optional[float])
channel_means (ndarray | None)
preload_h5 (bool)
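A minimal construction sketch, assuming pnpl is installed and that task is any object implementing pnpl.tasks.base.TaskProtocol. The ready-made task classes are not listed on this page, so none is named here. The helper names gwilliams2022_kwargs and build_dataset and the example filter values are hypothetical; the keyword names and defaults come from the signature above.

```python
from typing import Any, Dict


def gwilliams2022_kwargs(data_path: str, task: Any) -> Dict[str, Any]:
    """Collect keyword arguments for Gwilliams2022 (hypothetical helper)."""
    return dict(
        data_path=data_path,            # BIDS-layout directory, created if missing
        task=task,                      # any object implementing TaskProtocol
        preprocessing="notch+bp+ds",    # documented default
        exclude_subjects=["01"],        # subject ids without the "sub-" prefix
        exclude_sessions=["1"],         # sessions are "0" or "1"
        exclude_run_keys=[("02", "0", "3", "01")],  # (subject, session, task, run)
        standardize=True,
        clipping_boundary=10.0,
        include_info=True,              # __getitem__ then returns (x, y, info)
    )


def build_dataset(data_path: str, task: Any):
    """Instantiate the dataset; the import is deferred so the helper above
    can be inspected without pnpl installed."""
    from pnpl.datasets.gwilliams2022.dataset import Gwilliams2022

    return Gwilliams2022(**gwilliams2022_kwargs(data_path, task))
```

With include_info=True, each item returned by __getitem__ should unpack as x, y, info; with the default include_info=False the info element is presumably omitted.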
- __init__(data_path, task, preprocessing='notch+bp+ds', preprocessing_config=None, include_subjects=None, exclude_subjects=None, include_sessions=None, exclude_sessions=None, include_tasks=None, exclude_tasks=None, include_run_keys=None, exclude_run_keys=None, standardize=True, clipping_boundary=10.0, channel_means=None, channel_stds=None, include_info=False, create_h5_if_missing=True, download=True, preload_h5=False)
- Parameters:
data_path (str)
preprocessing (str | None)
preprocessing_config (Dict[str, Dict[str, Any]] | None)
include_subjects (Sequence[str] | None)
exclude_subjects (Sequence[str] | None)
include_sessions (Sequence[str] | None)
exclude_sessions (Sequence[str] | None)
include_tasks (Sequence[str] | None)
exclude_tasks (Sequence[str] | None)
include_run_keys (Sequence[tuple] | None)
exclude_run_keys (Sequence[tuple] | None)
standardize (bool)
clipping_boundary (float | None)
channel_means (ndarray | None)
channel_stds (ndarray | None)
include_info (bool)
create_h5_if_missing (bool)
download (bool)
preload_h5 (bool)
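The include_* and exclude_* parameters all select among run keys of the form (subject, session, task, run). The sketch below illustrates one plausible way such filters could combine; it is not taken from the pnpl source. The function names are hypothetical, and the rule that an include list is applied before the matching exclude list is an assumption. The key space uses the documented values: sessions "0" or "1", tasks "0" to "3", and run always "01".

```python
from itertools import product
from typing import Iterable, List, Optional, Sequence, Tuple

RunKey = Tuple[str, str, str, str]  # (subject, session, task, run)


def candidate_run_keys(subjects: Sequence[str]) -> List[RunKey]:
    """Enumerate the nominal key space for the given subjects."""
    sessions = ("0", "1")
    tasks = ("0", "1", "2", "3")
    return [(sub, ses, task, "01")
            for sub, ses, task in product(subjects, sessions, tasks)]


def _allowed(value, include: Optional[Iterable], exclude: Optional[Iterable]) -> bool:
    """Assumed rule: keep a value if it is in the include list (when one is
    given) and not in the exclude list (when one is given)."""
    if include is not None and value not in include:
        return False
    return exclude is None or value not in exclude


def filter_run_keys(
    keys: Iterable[RunKey],
    *,
    include_subjects=None, exclude_subjects=None,
    include_sessions=None, exclude_sessions=None,
    include_tasks=None, exclude_tasks=None,
    include_run_keys=None, exclude_run_keys=None,
) -> List[RunKey]:
    """Return the keys that pass every subject, session, task and run-key filter."""
    kept = []
    for key in keys:
        sub, ses, task, _run = key
        if (_allowed(sub, include_subjects, exclude_subjects)
                and _allowed(ses, include_sessions, exclude_sessions)
                and _allowed(task, include_tasks, exclude_tasks)
                and _allowed(key, include_run_keys, exclude_run_keys)):
            kept.append(key)
    return kept
```

For example, filter_run_keys(candidate_run_keys(["01", "02"]), exclude_subjects=["02"], exclude_sessions=["1"]) leaves only the four session-"0" keys of subject "01".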
Methods
__init__(data_path, task[, preprocessing, ...])
calculate_standardization_params(h5_data_loader) – Calculate channel means and stds across all runs.
clip_sample(sample, boundary) – Clip sample values to [-boundary, boundary].
close_h5_files() – Close all open H5 file handles and drop preloaded arrays.
ensure_file(fpath) – Ensure a file exists locally, downloading from OSF if needed.
get_bids_raw_path(subject, session, task, run) – Construct path to raw BIDS MEG file.
get_calibration_files() – Get paths to Maxwell filter calibration files.
get_dataset_manifest([refresh]) – Build a manifest of every file on OSF storage for the configured node(s).
get_derivatives_path(subject, session[, ...]) – Construct path to derivatives directory.
get_elp_path(subject, session)
get_events_path(subject, session, task, run) – Construct path to events TSV file.
get_h5_dataset(run_key) – Get (cached) H5 dataset for a run.
get_h5_path(subject, session, task, run[, ...]) – Construct path to H5 file.
get_headpos_path(subject, session, task, run) – Construct path to cached head position file.
get_hsp_path(subject, session)
get_markers_path(subject, session, task)
get_meg_dir(subject, session)
get_preprocessed_path(subject, session, ...) – Construct path to preprocessed file in derivatives.
get_sfreq_from_h5(h5_path) – Get sampling frequency from H5 file.
init_continuous_h5([preload_h5]) – Initialize the H5 data cache.
list_remote_files([refresh]) – Return dataset-relative file paths advertised by the OSF manifest.
load_continuous_window(subject, session, ...) – Load a time window from continuous H5 data.
load_continuous_window_from_sample(sample) – Load time window from a sample tuple.
load_head_positions(subject, session, task, run) – Load cached head positions from CSV file.
load_preprocessed_bids(subject, session, ...) – Load a preprocessed FIF file from the derivatives directory.
load_raw_bids(subject, session, task, run[, ...]) – Load the raw KIT recording, fetching it (and the marker / head-shape sidecars KIT requires) from OSF if necessary.
prefetch_files(file_paths) – Prefetch multiple files in parallel (skips already-present).
raw_bids_exists(subject, session, task, run) – Check if raw BIDS data exists for given identifiers.
resolve_remote_file(rel_path) – Resolve a single file's OSF location by walking only the folders on its path (~3-4 API calls per new folder, served from cache thereafter).
setup_standardization([standardize, ...]) – Set up standardization parameters.
standardize(data) – Apply z-score normalization and optional clipping to data.
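The standardization methods listed above (calculate_standardization_params, standardize, clip_sample) describe per-channel z-scoring followed by optional clipping. The NumPy sketch below illustrates that arithmetic only; it is not the pnpl implementation. The function names are hypothetical, and it assumes data shaped (n_channels, n_times) and a population standard deviation (ddof=0).

```python
import numpy as np


def channel_stats_across_runs(runs):
    """Per-channel mean and std pooled over all runs, using running sums so
    that no run has to be concatenated with another in memory."""
    total = total_sq = None
    count = 0
    for run in runs:
        run = np.asarray(run, dtype=np.float64)  # assumed (n_channels, n_times)
        s, sq = run.sum(axis=1), (run ** 2).sum(axis=1)
        total = s if total is None else total + s
        total_sq = sq if total_sq is None else total_sq + sq
        count += run.shape[1]
    if count == 0:
        raise ValueError("no samples to compute statistics from")
    means = total / count
    # Guard against tiny negative values caused by floating-point rounding.
    variances = np.maximum(total_sq / count - means ** 2, 0.0)
    return means, np.sqrt(variances)


def standardize_sketch(data, channel_means, channel_stds, clipping_boundary=10.0):
    """Z-score each channel, then clip to [-boundary, boundary] if a boundary is set."""
    means = np.asarray(channel_means, dtype=np.float64)[:, None]  # broadcast over time
    stds = np.asarray(channel_stds, dtype=np.float64)[:, None]
    z = (np.asarray(data, dtype=np.float64) - means) / stds
    if clipping_boundary is not None:
        z = np.clip(z, -clipping_boundary, clipping_boundary)
    return z
```

With the default clipping_boundary=10.0, any sample more than ten standard deviations from its channel mean is capped at ±10, which limits the influence of large artifacts. A channel with zero standard deviation would need separate handling, which this sketch leaves out.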
Attributes
OSF_API_BASE
OSF_FILES_BASE
OSF_PROJECT_FALLBACKS
OSF_PROJECT_ID
OSF_TOKEN_ENV
broadcasted_means
broadcasted_stds
channel_means
channel_stds
label_info
n_channels
n_times