Armeni 2022

pnpl.datasets.Armeni2022 wraps the audiobook-listening MEG dataset released by Armeni et al. (2022) on the Radboud Data Repository (DSC_3011085.05_995_v1).

3 subjects × 10 sessions, ~10 hours of audiobook per subject
CTF275 axial gradiometer system, raw .ds directories
Sessions are large (~8 GB raw CTF each)
Single task: compr (comprehension)

Warning

Armeni 2022 is not open access. You need an approved data-sharing agreement with the dataset owner before you can download.

Auth

export RADBOUD_USERNAME="you@orcid.org"   # often an ORCID
export RADBOUD_PASSWORD="..."

pnpl reads these from the environment (or a project-local .env) and uses HTTP Basic auth against the Radboud WebDAV endpoint.
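To make the mechanism concrete, here is a minimal sketch of how the two variables map to an HTTP Basic `Authorization` header. The helper name `radboud_basic_auth_header` is hypothetical and not part of pnpl, the sketch does not read a `.env` file, and pnpl's own implementation may differ.

```python
import base64
import os


def radboud_basic_auth_header(env=None):
    """Build an HTTP Basic Authorization header from the two variables above.

    Hypothetical helper for illustration; not part of pnpl.
    """
    env = os.environ if env is None else env
    try:
        username = env["RADBOUD_USERNAME"]
        password = env["RADBOUD_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(f"Missing credential variable: {missing.args[0]}") from None
    # HTTP Basic auth sends base64("username:password") in the header.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials, passed explicitly so the real environment is untouched.
header = radboud_basic_auth_header(
    {"RADBOUD_USERNAME": "user", "RADBOUD_PASSWORD": "pass"}
)
print(header)
```

Running a check like this before constructing the dataset surfaces a missing variable immediately, rather than partway through a multi-GB download.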

Quickstart

from pnpl.datasets import Armeni2022
from pnpl.tasks.armeni2022 import PhonemeClassification

ds = Armeni2022(
    data_path="./data/armeni",
    task=PhonemeClassification(tmin=-0.2, tmax=0.6, label_type="phoneme"),
    include_subjects=["001"],
    include_sessions=["001"],
    include_tasks=["compr"],
    preprocessing="notch+bp+ds",
    download=True,
    standardize=True,
)

# Each item is an (input, label) pair: x is the MEG window, y its label.
x, y = ds[0]
print(x.shape, y.item())

The first construction downloads the requested CTF .ds directory (every chunked binary inside it), runs the preprocessing pipeline, and caches the result as H5. Subsequent constructions read directly from the cached H5.
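The build-once, read-from-cache behaviour can be sketched generically as below. This is an illustration of the pattern only, not pnpl's code: `load_or_build` and `expensive_build` are hypothetical names, and plain bytes stand in for the H5 derivative.

```python
import tempfile
from pathlib import Path


def load_or_build(cache_path, build):
    """Return (data, cache_hit). Build and store the data only if no cache exists."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        return cache_path.read_bytes(), True
    data = build()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_bytes(data)
    return data, False


calls = []


def expensive_build():
    # Stands in for download + preprocessing; records each invocation.
    calls.append(1)
    return b"preprocessed"


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "derivatives" / "cache.bin"
    first, first_hit = load_or_build(target, expensive_build)
    second, second_hit = load_or_build(target, expensive_build)

print(first_hit, second_hit, len(calls))
```

The second call returns the stored bytes without invoking the build step again, which mirrors why repeat constructions of the dataset are fast.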

Note

Because raw CTF sessions are multi-GB, the Armeni loader runs the pipeline chunk-by-chunk (default 120-second chunks) so peak memory stays bounded. Reference / EEG / EOG / STIM channels are dropped before filtering — only MEG channels are preserved.
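As a rough illustration of chunked processing, the sketch below computes the sample ranges for fixed-length chunks. The function name `chunk_bounds` and the 1200 Hz, 5-minute recording are assumptions chosen for the example. How pnpl treats chunk edges during filtering is not shown here.

```python
def chunk_bounds(n_samples, sfreq, chunk_sec=120.0):
    """Yield (start, stop) sample indices covering [0, n_samples) in fixed-length chunks."""
    chunk_len = int(round(chunk_sec * sfreq))
    if chunk_len <= 0:
        raise ValueError("chunk_sec * sfreq must be at least one sample")
    for start in range(0, n_samples, chunk_len):
        # The final chunk is shorter when n_samples is not a multiple of chunk_len.
        yield start, min(start + chunk_len, n_samples)


# Hypothetical recording: 5 minutes sampled at 1200 Hz.
bounds = list(chunk_bounds(n_samples=5 * 60 * 1200, sfreq=1200.0))
print(bounds)
```

Only one chunk of samples needs to be in memory at a time, so peak usage scales with the chunk length rather than the session length.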

BIDS axes

Axis	Values
`subject`	`"001"`, `"002"`, `"003"`
`session`	`"001"`–`"010"`
`task`	`"compr"`
`run`	always `"01"` (Armeni has no run dimension)
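The axes above span 3 × 10 × 1 × 1 = 30 recordings. A short sketch that enumerates them follows. The `sub-…_ses-…` label follows general BIDS entity naming and is for display only. It is an assumption, not necessarily the file naming used by pnpl or the repository.

```python
from itertools import product

subjects = [f"{i:03d}" for i in range(1, 4)]   # "001" to "003"
sessions = [f"{i:03d}" for i in range(1, 11)]  # "001" to "010"
tasks = ["compr"]
runs = ["01"]  # constant, since Armeni has no run dimension

# One (subject, session, task, run) tuple per recording.
recordings = list(product(subjects, sessions, tasks, runs))

# BIDS-style display labels (illustrative only).
labels = [
    f"sub-{sub}_ses-{ses}_task-{task}_run-{run}"
    for sub, ses, task, run in recordings
]

print(len(recordings))
print(labels[0], labels[-1])
```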

Selected arguments

task — currently pnpl.tasks.armeni2022.PhonemeClassification.
preprocessing — pipeline string used in derivative filenames (e.g. "notch+bp+ds"). Set to None to materialize H5 unchanged from the raw CTF.
preprocessing_config — per-step overrides forwarded to pnpl.preprocessing.Pipeline. See Preprocessing.
include_subjects, include_sessions, include_tasks, include_run_keys (and their exclude_* counterparts).
standardize, clipping_boundary, channel_means, channel_stds.
create_h5_if_missing (default True).
preload_h5 (default False) — read the H5 fully into RAM on first access. Faster repeat reads at the cost of memory.