base_model: d0rj/rut5-base-summ
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: rut5-base-summ-dialogsum
results: []
rut5-base-summ-dialogsum
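This model is a fine-tuned version of d0rj/rut5-base-summ. The fine-tuning dataset was not recorded: the auto-generated card lists it as None.
It achieves the following results on the evaluation set:
- Loss: 1.1263
- Rouge1: 33.5111
- Rouge2: 0.1696
- RougeL: 33.4559
- RougeLsum: 33.4934
- Gen Len: 4.1546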
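Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed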
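Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- Learning rate: 5e-05
- Training batch size: 8
- Evaluation batch size: 8
- Random seed: 42
- Optimizer: Adam (β1=0.9, β2=0.999, ε=1e-08)
- Learning rate scheduler type: linear
- Number of epochs: 25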
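To make the linear scheduler setting concrete, here is a minimal sketch of how the learning rate would decay over the run. It assumes zero warmup steps (the card lists none) and takes the total of 19650 optimizer steps (25 epochs of 786 steps) from the training results. The function name `linear_lr` is hypothetical and is not part of any library.

```python
BASE_LR = 5e-5        # learning rate listed in the hyperparameters
STEPS_PER_EPOCH = 786 # step count after epoch 1 in the training results
TOTAL_STEPS = 19650   # step count after epoch 25 in the training results
WARMUP_STEPS = 0      # assumption: the card does not list any warmup


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate after `step` updates: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the learning rate at the start of a few epochs and at the end of training.
for epoch in (0, 5, 19, 25):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:>2} (step {step:>5}): lr = {linear_lr(step):.3e}")
```

Under these assumptions the rate falls to half its initial value at step 9825 and reaches zero at the final step.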
Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
| 2.0946 | 1.0 | 786 | 1.7462 | 45.4252 | 0.0 | 45.4009 | 45.4139 | 4.0464 |
| 1.7182 | 2.0 | 1572 | 1.5005 | 44.9295 | 0.0 | 44.9183 | 44.9108 | 4.1126 |
| 1.5304 | 3.0 | 2358 | 1.3826 | 39.5888 | 0.0 | 39.5811 | 39.5646 | 4.1698 |
| 1.4261 | 4.0 | 3144 | 1.3121 | 30.1735 | 0.0 | 30.1127 | 30.1415 | 4.1520 |
| 1.3252 | 5.0 | 3930 | 1.2641 | 35.7738 | 0.0 | 35.7408 | 35.7858 | 3.8791 |
| 1.2878 | 6.0 | 4716 | 1.2353 | 33.0773 | 0.0 | 32.9682 | 33.0551 | 3.7252 |
| 1.2068 | 7.0 | 5502 | 1.2051 | 34.4094 | 0.0 | 34.3902 | 34.3884 | 3.7729 |
| 1.1763 | 8.0 | 6288 | 1.1952 | 33.0914 | 0.1908 | 33.0267 | 33.0472 | 3.9739 |
| 1.1346 | 9.0 | 7074 | 1.1798 | 33.9606 | 0.0 | 33.9335 | 33.979 | 4.1768 |
| 1.1044 | 10.0 | 7860 | 1.1632 | 32.9529 | 0.0 | 32.9367 | 32.9396 | 4.1673 |
| 1.1073 | 11.0 | 8646 | 1.1499 | 34.0904 | 0.0 | 34.0659 | 34.1317 | 4.1934 |
| 1.0619 | 12.0 | 9432 | 1.1516 | 32.9502 | 0.0 | 32.9056 | 32.9376 | 4.0312 |
| 1.0365 | 13.0 | 10218 | 1.1478 | 31.68 | 0.0 | 31.6488 | 31.7003 | 4.0293 |
| 1.0161 | 14.0 | 11004 | 1.1427 | 32.6651 | 0.0424 | 32.6345 | 32.6538 | 4.1113 |
| 0.9805 | 15.0 | 11790 | 1.1343 | 34.0304 | 0.0636 | 33.9433 | 33.999 | 4.0674 |
| 0.9661 | 16.0 | 12576 | 1.1309 | 34.8704 | 0.0848 | 34.8014 | 34.8501 | 4.0681 |
| 0.9511 | 17.0 | 13362 | 1.1348 | 32.8744 | 0.0 | 32.8277 | 32.8547 | 4.1081 |
| 0.9392 | 18.0 | 14148 | 1.1326 | 32.9349 | 0.1908 | 32.8895 | 32.9376 | 4.2627 |
| 0.9341 | 19.0 | 14934 | 1.1263 | 33.5111 | 0.1696 | 33.4559 | 33.4934 | 4.1546 |
| 0.9396 | 20.0 | 15720 | 1.1349 | 33.9121 | 0.2545 | 33.8438 | 33.8993 | 4.1705 |
| 0.9314 | 21.0 | 16506 | 1.1276 | 33.0779 | 0.106 | 33.0546 | 33.0903 | 4.1399 |
| 0.8987 | 22.0 | 17292 | 1.1333 | 33.8566 | 0.1696 | 33.7943 | 33.843 | 4.1419 |
| 0.8895 | 23.0 | 18078 | 1.1343 | 33.6108 | 0.1484 | 33.5738 | 33.636 | 4.2328 |
| 0.8847 | 24.0 | 18864 | 1.1355 | 33.4257 | 0.2757 | 33.3804 | 33.4495 | 4.1711 |
| 0.8832 | 25.0 | 19650 | 1.1355 | 33.6211 | 0.3393 | 33.5937 | 33.636 | 4.1959 |
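The evaluation results reported at the top of this card match the epoch 19 row (step 14934), which also has the lowest validation loss of the run. The card does not state how the reported checkpoint was chosen, so the following is only a sketch of one plausible selection rule (minimum validation loss). The values are copied from the training results and the variable names are hypothetical.

```python
# (epoch, validation loss) pairs copied from the training results.
val_loss_by_epoch = [
    (1, 1.7462), (2, 1.5005), (3, 1.3826), (4, 1.3121), (5, 1.2641),
    (6, 1.2353), (7, 1.2051), (8, 1.1952), (9, 1.1798), (10, 1.1632),
    (11, 1.1499), (12, 1.1516), (13, 1.1478), (14, 1.1427), (15, 1.1343),
    (16, 1.1309), (17, 1.1348), (18, 1.1326), (19, 1.1263), (20, 1.1349),
    (21, 1.1276), (22, 1.1333), (23, 1.1343), (24, 1.1355), (25, 1.1355),
]

# Pick the epoch whose validation loss is smallest.
best_epoch, best_loss = min(val_loss_by_epoch, key=lambda row: row[1])
print(f"lowest validation loss {best_loss} at epoch {best_epoch}")
```

Selecting by Rouge1 instead would give a different answer: in the same results, Rouge1 is highest after epoch 1 (45.4252).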
Framework versions
- Transformers 4.35.2
- Pytorch 2.0.1+cu117
- Datasets 2.15.0
- Tokenizers 0.15.0