Library name: transformers
License: apache-2.0
Base model: Helsinki-NLP/opus-mt-es-es
Tags:
- generated_from_trainer
Model index:
- Name: aslandmsl
Results: []
aslandmsl
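This model is a fine-tuned version of Helsinki-NLP/opus-mt-es-es on an unspecified dataset (recorded as None).
It achieves the following results on the evaluation set:
- Loss: 0.1788
- Model Preparation Time: 0.0058
- Bleu Msl: 88.0304
- Bleu Asl: 0
- Ter Msl: 7.4110
- Ter Asl: 100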
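Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed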
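Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 32
- eval_batch_size: 64
- seed: 42
- optimizer: adamw_torch with betas=(0.9,0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 30
- mixed_precision_training: Native AMP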
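With a linear scheduler, the learning rate decays from its initial value to zero over the run. The sketch below illustrates that shape for this configuration. It assumes zero warmup steps (the card does not list any) and takes the total of 6750 steps from the training log; `linear_lr` is a hypothetical helper written for illustration, not a Transformers API.

```python
def linear_lr(step, base_lr=1e-05, total_steps=6750, warmup_steps=0):
    """Learning rate at `step` under linear warmup followed by linear decay.

    Assumption: warmup_steps=0, since the card does not report a warmup setting.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Start of training, end of epoch 1, midpoint, and final step.
    for step in (0, 225, 3375, 6750):
        print(step, linear_lr(step))
```

Under these assumptions the rate is halved by the end of epoch 15 and reaches zero at the final step.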
Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Bleu Msl | Bleu Asl | Ter Msl | Ter Asl |
|---|---|---|---|---|---|---|---|---|
| No log | 1.0 | 225 | 1.5653 | 0.0058 | 6.5801 | 55.0209 | 107.8081 | 37.3399 |
| No log | 2.0 | 450 | 0.9988 | 0.0058 | 36.1652 | 80.6836 | 45.5315 | 8.5089 |
| 1.7595 | 3.0 | 675 | 0.6401 | 0.0058 | 50.4479 | 83.7950 | 32.2672 | 7.3110 |
| 1.7595 | 4.0 | 900 | 0.4573 | 0.0058 | 61.3116 | 66.8757 | 25.3057 | 6.1545 |
| 0.6205 | 5.0 | 1125 | 0.3856 | 0.0058 | 66.5991 | 88.5773 | 21.8250 | 5.1219 |
| 0.6205 | 6.0 | 1350 | 0.3448 | 0.0058 | 43.1115 | 89.5128 | 31.3264 | 4.5849 |
| 0.3287 | 7.0 | 1575 | 0.3144 | 0.0058 | 65.9756 | 89.9086 | 20.4139 | 4.5023 |
| 0.3287 | 8.0 | 1800 | 0.2754 | 0.0058 | 45.0564 | 90.8438 | 28.8805 | 4.0479 |
| 0.2225 | 9.0 | 2025 | 0.2410 | 0.0058 | 72.2558 | 90.5190 | 16.5569 | 4.2131 |
| 0.2225 | 10.0 | 2250 | 0.2229 | 0.0058 | 72.6469 | 90.9231 | 15.6162 | 4.0892 |
| 0.2225 | 11.0 | 2475 | 0.2126 | 0.0058 | 73.4167 | 91.5905 | 14.9577 | 3.8827 |
| 0.1448 | 12.0 | 2700 | 0.2049 | 0.0058 | 74.4555 | 70.4375 | 14.8636 | 4.0892 |
| 0.1448 | 13.0 | 2925 | 0.1993 | 0.0058 | 73.3591 | 91.3585 | 15.0517 | 4.0066 |
| 0.11 | 14.0 | 3150 | 0.1958 | 0.0058 | 73.9381 | 91.3182 | 14.0169 | 3.8827 |
| 0.11 | 15.0 | 3375 | 0.1890 | 0.0058 | 75.5526 | 91.6437 | 14.2051 | 3.8001 |
| 0.0882 | 16.0 | 3600 | 0.1881 | 0.0058 | 73.7777 | 91.8284 | 14.4873 | 3.7588 |
| 0.0882 | 17.0 | 3825 | 0.1851 | 0.0058 | 75.4362 | 91.4902 | 14.2051 | 3.7588 |
| 0.0723 | 18.0 | 4050 | 0.1850 | 0.0058 | 75.6099 | 92.0202 | 14.4873 | 3.6349 |
| 0.0723 | 19.0 | 4275 | 0.1822 | 0.0058 | 76.2459 | 91.9730 | 14.0169 | 3.6349 |
| 0.0641 | 20.0 | 4500 | 0.1839 | 0.0058 | 75.0209 | 91.9730 | 14.0169 | 3.6349 |
| 0.0641 | 21.0 | 4725 | 0.1806 | 0.0058 | 75.7669 | 92.0658 | 13.8288 | 3.5936 |
| 0.0641 | 22.0 | 4950 | 0.1809 | 0.0058 | 76.2001 | 92.0484 | 13.2643 | 3.5936 |
| 0.0576 | 23.0 | 5175 | 0.1793 | 0.0058 | 75.9506 | 92.2068 | 13.7347 | 3.5109 |
| 0.0576 | 24.0 | 5400 | 0.1781 | 0.0058 | 76.3576 | 92.3340 | 13.4525 | 3.4696 |
| 0.0515 | 25.0 | 5625 | 0.1789 | 0.0058 | 75.8648 | 92.1142 | 13.3584 | 3.5936 |
| 0.0515 | 26.0 | 5850 | 0.1784 | 0.0058 | 76.3297 | 92.2886 | 12.8881 | 3.5109 |
| 0.0479 | 27.0 | 6075 | 0.1788 | 0.0058 | 76.0603 | 92.5564 | 13.2643 | 3.3870 |
| 0.0479 | 28.0 | 6300 | 0.1778 | 0.0058 | 76.3080 | 92.3287 | 13.0762 | 3.5109 |
| 0.0469 | 29.0 | 6525 | 0.1780 | 0.0058 | 76.3707 | 92.3287 | 13.0762 | 3.5109 |
| 0.0469 | 30.0 | 6750 | 0.1781 | 0.0058 | 76.3707 | 92.3287 | 13.0762 | 3.5109 |
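The step counts in the log bound the size of the training set, which the card does not otherwise document. The sketch below works through that arithmetic. It assumes a single device, no gradient accumulation, and a data loader that keeps the final partial batch, none of which the card confirms; `example_count_range` is a hypothetical helper.

```python
import math
from typing import Tuple


def example_count_range(steps_per_epoch: int, batch_size: int) -> Tuple[int, int]:
    """Smallest and largest dataset sizes that yield `steps_per_epoch` batches.

    Assumption: one optimizer step per batch, and the last partial batch is kept.
    """
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return low, high


total_steps = 6750   # final step in the training log
num_epochs = 30      # from the hyperparameter list
batch_size = 32      # training batch size from the hyperparameter list

steps_per_epoch = total_steps // num_epochs
low, high = example_count_range(steps_per_epoch, batch_size)

# Sanity check: both bounds round up to the same number of batches.
assert math.ceil(low / batch_size) == steps_per_epoch
assert math.ceil(high / batch_size) == steps_per_epoch

print(steps_per_epoch, low, high)
```

Under these assumptions the training set holds between 7169 and 7200 examples. With multiple devices or gradient accumulation, the true count would be proportionally larger.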
Framework versions
- Transformers 4.46.2
- Pytorch 2.5.1+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3