pnpl.datasets.mixins.HFDownloadMixin

class pnpl.datasets.mixins.HFDownloadMixin

Mixin providing HuggingFace download functionality.

Classes using this mixin should define:

- HUGGINGFACE_REPO: str - Primary HuggingFace repository ID
- HUGGINGFACE_FALLBACK_REPOS: list[str] - Optional fallback repositories
- data_path: str - Local data directory
- download: bool - Whether downloading is enabled
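The attribute contract above can be illustrated with a small stand-in. This sketch deliberately does not import pnpl: `StandInDownloadMixin`, `ToyDataset`, the two helper methods, and the repository IDs are all hypothetical, and the real mixin may structure things differently. It only shows which names a host class is expected to supply and where they would typically live (repositories on the class, `data_path` and `download` on the instance).

```python
from typing import List


class StandInDownloadMixin:
    """Hypothetical stand-in for HFDownloadMixin; NOT the pnpl implementation."""

    # Class-level attributes the host class is expected to override.
    HUGGINGFACE_REPO: str = ""
    HUGGINGFACE_FALLBACK_REPOS: List[str] = []

    def candidate_repos(self) -> List[str]:
        # Hypothetical helper: primary repository first, then the fallbacks.
        return [self.HUGGINGFACE_REPO, *self.HUGGINGFACE_FALLBACK_REPOS]

    def missing_attributes(self) -> List[str]:
        # Hypothetical helper: list required instance attributes that are absent.
        required = ("data_path", "download")
        return [name for name in required if not hasattr(self, name)]


class ToyDataset(StandInDownloadMixin):
    # Placeholder repository IDs, used for illustration only.
    HUGGINGFACE_REPO = "example-org/primary"
    HUGGINGFACE_FALLBACK_REPOS = ["example-org/mirror"]

    def __init__(self, data_path: str, download: bool = True):
        self.data_path = data_path  # local data directory
        self.download = download    # whether downloading is enabled


dataset = ToyDataset(data_path="./data", download=False)
print(dataset.candidate_repos())
print(dataset.missing_attributes())
```

Keeping the repository IDs as class attributes means a class-level entry point can resolve them without an instance, which fits the description of `ensure_file_download` as a class method.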

__init__()

Methods

__init__()

ensure_file(fpath)

Ensure a file exists locally, downloading if needed.

ensure_file_download(fpath, data_path[, repo_id])

Class method that downloads a file without instantiating the dataset.

prefetch_files(file_paths)

Prefetch multiple files in parallel.
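The behaviour summarised by the methods above (use the local copy if present, otherwise download, and fetch several files in parallel) can be sketched generically. The code below is not the pnpl implementation and makes no claim about how the mixin performs downloads: `ensure_local`, `prefetch`, `fake_fetch`, and the repository IDs are hypothetical names, and the network call is replaced by an injected function so the example runs offline. Trying fallback repositories in order is an assumption based on the `HUGGINGFACE_FALLBACK_REPOS` attribute.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def ensure_local(fpath: str, repos: List[str],
                 fetch: Callable[[str, str], None], download: bool = True) -> str:
    """Return fpath if it exists; otherwise try each repository in order."""
    if os.path.exists(fpath):
        return fpath
    if not download:
        raise FileNotFoundError(fpath)
    last_error = None
    for repo in repos:
        try:
            fetch(repo, fpath)  # assumed to write the file at fpath
            return fpath
        except OSError as error:
            last_error = error  # remember the failure, try the next repository
    raise FileNotFoundError(f"{fpath} not available from {repos}") from last_error


def prefetch(paths: List[str], repos: List[str],
             fetch: Callable[[str, str], None], max_workers: int = 4) -> List[str]:
    """Ensure several files concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda p: ensure_local(p, repos, fetch), paths))


def fake_fetch(repo: str, fpath: str) -> None:
    # Stand-in for a network download: the "primary" repository always fails.
    if repo == "example-org/primary":
        raise OSError("simulated outage")
    os.makedirs(os.path.dirname(fpath), exist_ok=True)
    with open(fpath, "w") as handle:
        handle.write(repo)


root = tempfile.mkdtemp()
paths = [os.path.join(root, "sub", f"file_{i}.txt") for i in range(3)]
fetched = prefetch(paths, ["example-org/primary", "example-org/mirror"], fake_fetch)
print(fetched == paths)
```

A thread pool suits this kind of work because downloads are I/O-bound; whether the real `prefetch_files` uses threads is not stated here.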

Attributes

HUGGINGFACE_FALLBACK_REPOS

HUGGINGFACE_REPO