Audio Datasets

Concrete audio dataset classes.

Classes

  • AudioDataset – labeled audio files (classification or regression)

  • UnlabeledAudioDataset – audio files without labels

class src.audio_dataset.AudioDataset(root: str, lazy: bool = True, labels_file: str | None = None)

Bases: LabeledDataset

Labeled audio dataset supporting two storage layouts.

CSV mode (labels_file provided):

Audio files are stored flat inside root. A CSV file maps each filename to its label (string for classification, numeric string for regression, e.g. BPM values).
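
The CSV mapping described above can be sketched with the standard library. This is an illustration only, not the class's actual parser: the helper names read_labels_csv and as_regression_target are hypothetical, and the sketch assumes a two-column file (filename, label) with no header row, which the documentation does not specify.

```python
import csv
from pathlib import Path


def read_labels_csv(labels_file: str) -> dict[str, str]:
    """Hypothetical helper: map each filename to its raw label string.

    Assumes two columns (filename, label) and no header row.
    """
    mapping: dict[str, str] = {}
    with Path(labels_file).open(newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle):
            if len(row) < 2:
                continue  # skip blank or malformed rows
            filename, label = row[0].strip(), row[1].strip()
            mapping[filename] = label
    return mapping


def as_regression_target(label: str) -> float:
    """Convert a numeric label string (e.g. a BPM value) into a float."""
    return float(label)
```

Labels stay as strings for classification; for regression, each label would be converted with something like as_regression_target before use.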

Folder mode (labels_file is None):

Audio files are stored inside per-class subdirectories of root. The label of each file is the name of its parent subdirectory (e.g. the BallroomData genre folders).
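
The folder-mode rule (label = name of the parent subdirectory) can be sketched as follows. The helper labels_from_folders is hypothetical and only illustrates the layout; how the class itself scans root, and which file extensions it accepts, is not stated here.

```python
from pathlib import Path


def labels_from_folders(root: str) -> dict[str, str]:
    """Hypothetical helper: label each file with its parent directory name."""
    mapping: dict[str, str] = {}
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue  # files directly under root belong to no class folder
        for audio_path in sorted(class_dir.iterdir()):
            if audio_path.is_file():
                # Key by the path relative to root, e.g. "Waltz/track01.wav".
                key = audio_path.relative_to(root).as_posix()
                mapping[key] = class_dir.name
    return mapping
```

With a root containing Waltz/track01.wav and Tango/track02.wav (illustrative names), the result maps each relative path to "Waltz" or "Tango" respectively.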

Parameters:
  • root – Path to the directory containing audio files (or subdirectories in folder mode).

  • lazy – If True (default), audio is loaded on demand; if False, all files are loaded into memory at construction time.

  • labels_file – Path to the CSV file (filename → label). Pass None to use the folder-hierarchy mode.
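
The difference between lazy=True and lazy=False can be illustrated with a small stand-alone container. This is a sketch of the general pattern, not the class's implementation: LazyItems and read_bytes are hypothetical names, and the loader simply reads raw bytes where a real one would decode audio.

```python
from pathlib import Path
from typing import Callable, Optional


class LazyItems:
    """Hypothetical container illustrating lazy versus eager loading."""

    def __init__(
        self,
        paths: list[Path],
        loader: Callable[[Path], bytes],
        lazy: bool = True,
    ) -> None:
        self._paths = paths
        self._loader = loader
        # Eager mode loads every file now; lazy mode leaves the cache empty.
        self._cache: Optional[list[bytes]] = (
            None if lazy else [loader(p) for p in paths]
        )

    def __len__(self) -> int:
        return len(self._paths)

    def __getitem__(self, index: int) -> bytes:
        if self._cache is not None:
            return self._cache[index]  # already in memory
        return self._loader(self._paths[index])  # load on demand


def read_bytes(path: Path) -> bytes:
    """Stand-in loader; a real one would decode the audio file."""
    return path.read_bytes()
```

Lazy loading keeps construction fast and memory use low at the cost of disk access on every item lookup; eager loading pays the full cost once, at construction time.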

Raises:
  • TypeError – If any argument has an unexpected type.

  • FileNotFoundError – If root or labels_file does not exist.
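
The documented exceptions suggest argument checks along the following lines. The function check_arguments is hypothetical, and the order of the checks (types first, then paths) is an assumption; the class may validate differently.

```python
from pathlib import Path
from typing import Optional


def check_arguments(
    root: str, lazy: bool = True, labels_file: Optional[str] = None
) -> None:
    """Hypothetical validator mirroring the documented exceptions."""
    # Type checks come first (assumed order).
    if not isinstance(root, str):
        raise TypeError(f"root must be str, got {type(root).__name__}")
    if not isinstance(lazy, bool):
        raise TypeError(f"lazy must be bool, got {type(lazy).__name__}")
    if labels_file is not None and not isinstance(labels_file, str):
        raise TypeError(
            f"labels_file must be str or None, got {type(labels_file).__name__}"
        )
    # Path checks: root must be a directory, labels_file (if given) a file.
    if not Path(root).is_dir():
        raise FileNotFoundError(f"root directory not found: {root}")
    if labels_file is not None and not Path(labels_file).is_file():
        raise FileNotFoundError(f"labels file not found: {labels_file}")
```

A caller that cannot guarantee the paths exist would wrap construction in a try block and handle FileNotFoundError separately from TypeError, since the first signals a missing resource and the second a programming error.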

class src.audio_dataset.UnlabeledAudioDataset(root: str, lazy: bool = True)

Bases: UnlabeledDataset

Audio dataset without labels (flat folder, no CSV required).

Parameters:
  • root – Path to the directory containing audio files.

  • lazy – If True (default), audio is loaded on demand.

Raises:
  • TypeError – If any argument has an unexpected type.