pnpl.datasets.mixins.OSFDownloadMixin

class pnpl.datasets.mixins.OSFDownloadMixin

Mixin providing OSF download functionality.

Classes using this mixin should define:

  • OSF_PROJECT_ID: str - primary OSF node id (e.g. "ag3kj").

Optional class attributes:

  • OSF_PROJECT_FALLBACKS: list[str] - additional OSF node ids whose osfstorage is searched after the primary. Useful when a single logical dataset spans multiple OSF components.

  • OSF_API_BASE: str - defaults to https://api.osf.io/v2.

  • OSF_FILES_BASE: str - defaults to https://files.osf.io/v1.

  • OSF_TOKEN_ENV: str - env var name for an optional OAuth token (default "OSF_TOKEN"). Public projects do not need auth.

Expected instance attributes:

  • data_path: str - local data directory (manifest paths are resolved relative to this).

  • download: bool - whether downloading is enabled.
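As a rough illustration of the attributes listed above, a dataset class might be declared along the following lines. The class name `MyDataset`, its constructor signature, and the fallback id "abcde" are hypothetical and not taken from pnpl; the `try`/`except` stand-in exists only so the snippet runs without pnpl installed.

```python
import os

try:
    from pnpl.datasets.mixins import OSFDownloadMixin
except ImportError:
    # Stand-in so the sketch runs without pnpl installed; NOT the real mixin.
    class OSFDownloadMixin:
        OSF_TOKEN_ENV = "OSF_TOKEN"  # documented default


class MyDataset(OSFDownloadMixin):  # hypothetical dataset class
    OSF_PROJECT_ID = "ag3kj"           # example id quoted in the docs
    OSF_PROJECT_FALLBACKS = ["abcde"]  # placeholder id, not a real OSF node

    def __init__(self, data_path, download=True):
        # Instance attributes the mixin expects to find.
        self.data_path = data_path
        self.download = download


ds = MyDataset(data_path="./data", download=False)

# The token is optional: for public projects this is typically None.
token = os.environ.get(ds.OSF_TOKEN_ENV)
```

Setting `download=False` here is just a choice for the sketch; how the mixin behaves when a file is missing and downloading is disabled is not stated on this page.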

__init__()

Methods

__init__()

ensure_file(fpath)

Ensure a file exists locally, downloading from OSF if needed.

get_dataset_manifest([refresh])

Build a manifest of every file on OSF storage for the configured node(s).

list_remote_files([refresh])

Return dataset-relative file paths advertised by the OSF manifest.

prefetch_files(file_paths)

Prefetch multiple files in parallel (skips already-present).

resolve_remote_file(rel_path)

Resolve a single file's OSF location by walking only the folders on its path (~3-4 API calls per new folder, served from cache thereafter).
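The "ensure, then prefetch in parallel" pattern described by `ensure_file` and `prefetch_files` can be sketched in plain Python. This is not the mixin's implementation: the function names `ensure_file_sketch` and `prefetch_files_sketch`, the `fetch` callback, and the `max_workers` parameter are all assumptions made for illustration, and `fake_fetch` writes a local placeholder instead of contacting OSF.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def ensure_file_sketch(fpath, fetch):
    """Return fpath, calling fetch(fpath) only if the file is missing."""
    if not os.path.exists(fpath):
        os.makedirs(os.path.dirname(fpath) or ".", exist_ok=True)
        fetch(fpath)
    return fpath


def prefetch_files_sketch(file_paths, fetch, max_workers=4):
    """Ensure many files concurrently; already-present files are skipped."""
    missing = [p for p in file_paths if not os.path.exists(p)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(lambda p: ensure_file_sketch(p, fetch), missing))
    return missing


def fake_fetch(path):
    # Placeholder for a real OSF download; writes a small local file instead.
    with open(path, "w") as fh:
        fh.write("placeholder")


root = tempfile.mkdtemp()
paths = [os.path.join(root, "sub-01", f"run-{i}.dat") for i in range(3)]

# Pretend the first file is already on disk.
os.makedirs(os.path.dirname(paths[0]), exist_ok=True)
fake_fetch(paths[0])

fetched = prefetch_files_sketch(paths, fake_fetch)
```

Threads suit this kind of work because downloads are I/O-bound; filtering the missing paths up front keeps already-present files from occupying a worker.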

Attributes

OSF_API_BASE

OSF_FILES_BASE

OSF_PROJECT_FALLBACKS

OSF_PROJECT_ID

OSF_TOKEN_ENV
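To make the roles of `OSF_PROJECT_ID`, `OSF_PROJECT_FALLBACKS`, and `OSF_API_BASE` concrete, the sketch below builds osfstorage listing URLs in search order, primary node first. The helper `storage_roots` is hypothetical, the URL layout follows the public OSF API v2 pattern for listing a node's storage and may differ from what the mixin constructs internally, and both the duplicate-id filtering and the "abcde" id are assumptions of this example.

```python
def storage_roots(primary, fallbacks=(), api_base="https://api.osf.io/v2"):
    """Yield osfstorage listing URLs: primary node first, then fallbacks."""
    seen = set()
    for node_id in [primary, *fallbacks]:
        if node_id in seen:
            continue  # assumption: a repeated id is only searched once
        seen.add(node_id)
        yield f"{api_base.rstrip('/')}/nodes/{node_id}/files/osfstorage/"


# "ag3kj" is the example id from the docs; "abcde" is a placeholder.
urls = list(storage_roots("ag3kj", ["abcde", "ag3kj"]))
```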