My Awesome Mind Model
模型简介
模型特点
模型能力
使用案例
library_name: transformers license: apache-2.0 base_model: facebook/wav2vec2-base tags:
- 从训练器生成 datasets:
- minds14 metrics:
- 准确率 model-index:
- name: my_awesome_mind_model
results:
- task:
name: 音频分类
type: audio-classification
dataset:
name: minds14
type: minds14
config: en-US
split: train
args: en-US
metrics:
- name: Accuracy type: accuracy value: 0.061946902654867256
- task:
name: 音频分类
type: audio-classification
dataset:
name: minds14
type: minds14
config: en-US
split: train
args: en-US
metrics:
my_awesome_mind_model
该模型是 facebook/wav2vec2-base 在 minds14 数据集上的微调版本。 在评估集上取得以下结果:
- Loss: 2.6577
- Accuracy: 0.0619
模型描述
用于微调的基础模型:https://huggingface.co/facebook/wav2vec2-base
预期用途与限制
用它来熟悉如何对预训练模型进行微调。这不是一个可用于生产的模型。
训练和评估数据
这是训练数据集的链接: https://huggingface.co/datasets/PolyAI/minds14 您可以使用自己的数据,并对其进行预处理以用于您的数据训练。
训练过程
训练超参数
训练过程中使用了以下超参数:
- learning_rate: 3e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 128
- optimizer: 使用 OptimizerNames.ADAMW_TORCH ,参数为 betas=(0.9,0.999) 和 epsilon=1e-08,optimizer_args=无额外参数
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10
训练结果
Training Loss | Epoch | Step | Validation Loss | Accuracy |
---|---|---|---|---|
No log | 0.8 | 3 | 2.6463 | 0.0619 |
No log | 1.8 | 6 | 2.6525 | 0.0442 |
No log | 2.8 | 9 | 2.6524 | 0.0619 |
3.0286 | 3.8 | 12 | 2.6569 | 0.0619 |
3.0286 | 4.8 | 15 | 2.6572 | 0.0531 |
3.0286 | 5.8 | 18 | 2.6546 | 0.0619 |
3.0109 | 6.8 | 21 | 2.6593 | 0.0708 |
3.0109 | 7.8 | 24 | 2.6585 | 0.0531 |
3.0109 | 8.8 | 27 | 2.6569 | 0.0619 |
3.0047 | 9.8 | 30 | 2.6577 | 0.0619 |
框架版本
- Transformers 4.48.2
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0
推理 - 如何使用
加载一个您想要运行推理的音频文件。如果需要,请记住将音频文件的采样率重采样到与模型匹配的采样率!
[1] 18s
Transformers 安装
! pip install transformers datasets
要从源码安装而不是最新发布版,请注释掉上面的命令并取消下面一行的注释。
! pip install git+https://github.com/huggingface/transformers.git
Requirement already satisfied: transformers in /usr/local/lib/python3.11/dist-packages (4.48.2) Collecting datasets Downloading datasets-3.2.0-py3-none-any.whl.metadata (20 kB) Requirement already satisfied: filelock in /usr/local/lib/python3.11/dist-packages (from transformers) (3.17.0) Requirement already satisfied: huggingface-hub<1.0,>=0.24.0 in /usr/local/lib/python3.11/dist-packages (from transformers) (0.28.1) Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.11/dist-packages (from transformers) (1.26.4) Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.11/dist-packages (from transformers) (24.2) Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.11/dist-packages (from transformers) (6.0.2) Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.11/dist-packages (from transformers) (2024.11.6) Requirement already satisfied: requests in /usr/local/lib/python3.11/dist-packages (from transformers) (2.32.3) Requirement already satisfied: tokenizers<0.22,>=0.21 in /usr/local/lib/python3.11/dist-packages (from transformers) (0.21.0) Requirement already satisfied: safetensors>=0.4.1 in /usr/local/lib/python3.11/dist-packages (from transformers) (0.5.2) Requirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.11/dist-packages (from transformers) (4.67.1) Requirement already satisfied: pyarrow>=15.0.0 in /usr/local/lib/python3.11/dist-packages (from datasets) (17.0.0) Collecting dill<0.3.9,>=0.3.0 (from datasets) Downloading dill-0.3.8-py3-none-any.whl.metadata (10 kB) Requirement already satisfied: pandas in /usr/local/lib/python3.11/dist-packages (from datasets) (2.2.2) Collecting xxhash (from datasets) Downloading xxhash-3.5.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (12 kB) Collecting multiprocess<0.70.17 (from datasets) Downloading multiprocess-0.70.16-py311-none-any.whl.metadata (7.2 kB) Collecting fsspec<=2024.9.0,>=2023.1.0 (from fsspec[http]<=2024.9.0,>=2023.1.0->datasets) Downloading fsspec-2024.9.0-py3-none-any.whl.metadata (11 kB) Requirement already satisfied: aiohttp in /usr/local/lib/python3.11/dist-packages (from datasets) (3.11.11) Requirement already satisfied: aiohappyeyeballs>=2.3.0 in /usr/local/lib/python3.11/dist-packages (from aiohttp->datasets) (2.4.4) Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.11/dist-packages (from aiohttp->datasets) (1.3.2) Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.11/dist-packages (from aiohttp->datasets) (25.1.0) Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.11/dist-packages (from aiohttp->datasets) (1.5.0) Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.11/dist-packages (from aiohttp->datasets) (6.1.0) Requirement already satisfied: propcache>=0.2.0 in /usr/local/lib/python3.11/dist-packages (from aiohttp->datasets) (0.2.1) Requirement already satisfied: yarl<2.0,>=1.17.0 in /usr/local/lib/python3.11/dist-packages (from aiohttp->datasets) (1.18.3) Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.11/dist-packages (from huggingface-hub<1.0,>=0.24.0->transformers) (4.12.2) Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.11/dist-packages (from requests->transformers) (3.4.1) Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.11/dist-packages (from requests->transformers) (3.10) Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.11/dist-packages (from requests->transformers) (2.3.0) Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.11/dist-packages (from requests->transformers) (2025.1.31) Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.11/dist-packages (from pandas->datasets) (2.8.2) Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.11/dist-packages (from pandas->datasets) (2025.1) Requirement already satisfied: tzdata>=2022.7 in /usr/local/lib/python3.11/dist-packages (from pandas->datasets) (2025.1) Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.11/dist-packages (from python-dateutil>=2.8.2->pandas->datasets) (1.17.0) Downloading datasets-3.2.0-py3-none-any.whl (480 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 480.6/480.6 kB 4.0 MB/s eta 0:00:00 Downloading dill-0.3.8-py3-none-any.whl (116 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 116.3/116.3 kB 2.8 MB/s eta 0:00:00 Downloading fsspec-2024.9.0-py3-none-any.whl (179 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 179.3/179.3 kB 11.6 MB/s eta 0:00:00 Downloading multiprocess-0.70.16-py311-none-any.whl (143 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 143.5/143.5 kB 11.1 MB/s eta 0:00:00 Downloading xxhash-3.5.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (194 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 194.8/194.8 kB 10.2 MB/s eta 0:00:00 Installing collected packages: xxhash, fsspec, dill, multiprocess, datasets Attempting uninstall: fsspec Found existing installation: fsspec 2024.10.0 Uninstalling fsspec-2024.10.0: Successfully uninstalled fsspec-2024.10.0 ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. torch 2.5.1+cu124 requires nvidia-cublas-cu12==12.4.5.8; platform_system == "Linux" and platform_machine == "x86_64", but you have nvidia-cublas-cu12 12.5.3.2 which is incompatible. torch 2.5.1+cu124 requires nvidia-cuda-cupti-cu12==12.4.127; platform_system == "Linux" and platform_machine == "x86_64", but you have nvidia-cuda-cupti-cu12 12.5.82 which is incompatible. torch 2.5.1+cu124 requires nvidia-cuda-nvrtc-cu12==12.4.127; platform_system == "Linux" and platform_machine == "x86_64", but you have nvidia-cuda-nvrtc-cu12 12.5.82 which is incompatible. torch 2.5.1+cu124 requires nvidia-cuda-runtime-cu12==12.4.127; platform_system == "Linux" and platform_machine == "x86_64", but you have nvidia-cuda-runtime-cu12 12.5.82 which is incompatible. torch 2.5.1+cu124 requires nvidia-cudnn-cu12==9.1.0.70; platform_system == "Linux" and platform_machine == "x86_64", but you have nvidia-cudnn-cu12 9.3.0.75 which is incompatible. torch 2.5.1+cu124 requires nvidia-cufft-cu12==11.2.1.3; platform_system == "Linux" and platform_machine == "x86_64", but you have nvidia-cufft-cu12 11.2.3.61 which is incompatible. torch 2.5.1+cu124 requires nvidia-curand-cu12==10.3.5.147; platform_system == "Linux" and platform_machine == "x86_64", but you have nvidia-curand-cu12 10.3.6.82 which is incompatible. torch 2.5.1+cu124 requires nvidia-cusolver-cu12==11.6.1.9; platform_system == "Linux" and platform_machine == "x86_64", but you have nvidia-cusolver-cu12 11.6.3.83 which is incompatible. torch 2.5.1+cu124 requires nvidia-cusparse-cu12==12.3.1.170; platform_system == "Linux" and platform_machine == "x86_64", but you have nvidia-cusparse-cu12 12.5.1.3 which is incompatible. torch 2.5.1+cu124 requires nvidia-nvjitlink-cu12==12.4.127; platform_system == "Linux" and platform_machine == "x86_64", but you have nvidia-nvjitlink-cu12 12.5.82 which is incompatible. gcsfs 2024.10.0 requires fsspec==2024.10.0, but you have fsspec 2024.9.0 which is incompatible. Successfully installed datasets-3.2.0 dill-0.3.8 fsspec-2024.9.0 multiprocess-0.70.16 xxhash-3.5.0 音频分类
[ ] #@title from IPython.display import HTML
HTML('')
音频分类 - 就像文本一样 - 会根据输入数据分配一个类别标签输出。唯一的区别是,这里不是文本输入,而是原始音频波形。一些实际应用包括识别说话者意图、语言分类,甚至通过声音识别动物种类。
本指南将向您展示如何:
在 MInDS-14 数据集上微调 Wav2Vec2 以分类说话者意图。 使用您的微调模型进行推理。 本教程中说明的任务受以下模型架构支持: Audio Spectrogram Transformer, Data2VecAudio, Hubert, SEW, SEW-D, UniSpeech, UniSpeechSat, Wav2Vec2, Wav2Vec2-Conformer, WavLM, Whisper
在









