license: apache-2.0
tags:
- generated_from_trainer
- govreport
- long-document
metrics:
- rouge
model-index:
- name: long-t5-base-govreport
  results: []
datasets:
- pszemraj/govreport-summarization-8192
language:
- en
library_name: transformers
pipeline_tag: summarization
long-t5-base-govreport
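This model is a fine-tuned version of google/long-t5-tglobal-base on the pszemraj/govreport-summarization-8192 dataset.
It achieves the following results on the evaluation set:
- Gen Len: 787.34
- Loss: 1.5448
- Rouge1: 57.2303
- Rouge2: 24.9705
- Rougel: 26.8081
- Rougelsum: 54.2747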
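The ROUGE figures above look like F-measures scaled to the 0-100 range. That is an assumption based on common summarization training scripts, not something this card states. Under that assumption, ROUGE-1 can be sketched as clipped unigram overlap between a generated summary and its reference. The function name `rouge1_f1` and the whitespace tokenization are illustrative only; the `rouge` metric applies its own tokenization and optional stemming, so its numbers will differ.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate summary."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: a token counts at most as often as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


score = rouge1_f1("the agency reviewed the program", "the agency reviewed it")
print(f"ROUGE-1 F1 (0-100 scale): {100 * score:.2f}")
```

Rouge2 follows the same idea with bigrams, while Rougel and Rougelsum are based on the longest common subsequence.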
Model description
More information needed
Intended uses & limitations
More information needed
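Since the metadata tags this model for the summarization pipeline, a minimal usage sketch with the `transformers` pipeline API might look like the following. The Hub id in `MODEL_ID` is a hypothetical placeholder (this card does not give the repository namespace), and the helper name `summarize` and the `max_new_tokens` default are assumptions rather than settings taken from the training run.

```python
def summarize(text: str, model_id: str, max_new_tokens: int = 512) -> str:
    """Summarize a long report with a fine-tuned LongT5 checkpoint."""
    # Imported lazily so that defining this helper does not require the library.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_id)
    # truncation=True asks the tokenizer to clip inputs beyond its configured maximum length.
    result = summarizer(text, max_new_tokens=max_new_tokens, truncation=True)
    return result[0]["summary_text"]


# Hypothetical Hub id: replace the namespace with wherever the checkpoint is hosted.
MODEL_ID = "your-namespace/long-t5-base-govreport"
```

Calling `summarize(report_text, MODEL_ID)` downloads the checkpoint on first use. The evaluation summaries average about 787 in generation length, so a larger `max_new_tokens` may be needed to produce summaries of comparable length.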
Training and evaluation data
Refer to the pszemraj/govreport-summarization-8192 dataset.
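Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 3
- eval_batch_size: 1
- seed: 4299
- gradient_accumulation_steps: 128
- total_train_batch_size: 384
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 25.0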
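To make the arithmetic behind these settings concrete, the sketch below derives the total batch size and the warmup length, then evaluates a linear-warmup, cosine-decay schedule. Two values are assumptions not stated in this card: a single training device, and roughly 64 optimizer steps per epoch (inferred from step 850 falling at epoch 13.28 in the training results). The schedule shape follows the usual warmup-plus-cosine form; whether the run used exactly this curve is also an assumption.

```python
import math

# Values from the hyperparameter list above.
LEARNING_RATE = 2e-4
TRAIN_BATCH_SIZE = 3
GRAD_ACCUM_STEPS = 128
WARMUP_RATIO = 0.05
NUM_EPOCHS = 25

# Assumptions, not stated in the card: one device, and about 64 optimizer
# steps per epoch (inferred from step 850 landing at epoch 13.28).
NUM_DEVICES = 1
STEPS_PER_EPOCH = 64

effective_batch = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
warmup_steps = round(WARMUP_RATIO * total_steps)


def lr_at(step: int) -> float:
    """Learning rate with linear warmup followed by cosine decay to zero."""
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch, total_steps, warmup_steps)
for step in (0, warmup_steps, 850, total_steps):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the effective batch is 3 × 128 = 384 examples per optimizer step, matching the total listed above, and warmup covers the first 80 of 1600 planned steps.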
Training results
| Training Loss | Epoch | Step | Gen Len | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---|---|---|---|---|---|---|---|---|
| 2.1198 | 0.39 | 25 | 805.336 | 1.8720 | 29.4332 | 7.3761 | 17.0816 | 25.065 |
| 1.8609 | 0.78 | 50 | 833.404 | 1.7601 | 35.3533 | 10.6624 | 18.643 | 31.6979 |
| 1.7805 | 1.17 | 75 | 866.356 | 1.6833 | 36.5786 | 11.1185 | 20.0358 | 33.2116 |
| 1.7352 | 1.56 | 100 | 822.348 | 1.6524 | 40.5489 | 13.0695 | 20.1256 | 37.1369 |
| 1.7371 | 1.95 | 125 | 765.6 | 1.6294 | 43.8594 | 15.2962 | 20.7807 | 40.3461 |
| 1.6428 | 2.34 | 150 | 844.184 | 1.6055 | 44.5054 | 15.731 | 21.2582 | 40.9775 |
| 1.6567 | 2.73 | 175 | 857.236 | 1.6031 | 47.3641 | 16.9664 | 21.4998 | 43.994 |
| 1.5773 | 3.12 | 200 | 841.86 | 1.5855 | 47.2284 | 17.3099 | 21.6793 | 43.9018 |
| 1.5614 | 3.52 | 225 | 832.8 | 1.5883 | 46.4612 | 17.1368 | 21.5931 | 43.1184 |
| 1.5328 | 3.91 | 250 | 790.056 | 1.5730 | 46.5685 | 17.5423 | 22.2082 | 43.1811 |
| 1.5194 | 4.3 | 275 | 825.868 | 1.5690 | 47.6205 | 18.377 | 22.7639 | 44.3701 |
| 1.571 | 4.69 | 300 | 794.032 | 1.5676 | 49.2203 | 19.1109 | 22.8005 | 46.0679 |
| 1.4275 | 5.08 | 325 | 833.068 | 1.5656 | 50.6982 | 20.0278 | 23.5585 | 47.5036 |
| 1.4912 | 5.47 | 350 | 793.068 | 1.5625 | 50.3371 | 19.8639 | 23.3666 | 47.1898 |
| 1.4764 | 5.86 | 375 | 819.86 | 1.5532 | 50.9702 | 20.7532 | 23.8765 | 47.9915 |
| 1.3972 | 6.25 | 400 | 770.78 | 1.5564 | 49.279 | 19.4781 | 23.1018 | 46.1942 |
| 1.4479 | 6.64 | 425 | 806.244 | 1.5529 | 50.3317 | 20.2888 | 23.4454 | 47.3491 |
| 1.4567 | 7.03 | 450 | 787.48 | 1.5590 | 52.2209 | 21.2868 | 23.9284 | 49.1691 |
| 1.3933 | 7.42 | 475 | 842.664 | 1.5561 | 51.9578 | 20.5806 | 23.7177 | 48.9121 |
| 1.4245 | 7.81 | 500 | 813.772 | 1.5420 | 52.3725 | 21.7787 | 24.5209 | 49.4003 |
| 1.3033 | 8.2 | 525 | 824.66 | 1.5499 | 52.7839 | 21.589 | 24.5617 | 49.8609 |
| 1.3673 | 8.59 | 550 | 807.348 | 1.5530 | 53.2339 | 22.152 | 24.7587 | 50.2502 |
| 1.3634 | 8.98 | 575 | 767.952 | 1.5458 | 53.0293 | 22.3194 | 25.174 | 50.078 |
| 1.3095 | 9.37 | 600 | 856.252 | 1.5412 | 53.7658 | 22.5229 | 25.0448 | 50.708 |
| 1.3492 | 9.76 | 625 | 826.064 | 1.5389 | 51.8662 | 21.6229 | 24.6819 | 48.8648 |
| 1.3007 | 10.16 | 650 | 843.544 | 1.5404 | 53.6692 | 22.154 | 24.6218 | 50.6864 |
| 1.2729 | 10.55 | 675 | 808.764 | 1.5428 | 54.6479 | 23.3029 | 25.5647 | 51.6394 |
| 1.3758 | 10.94 | 700 | 800.152 | 1.5403 | 54.9418 | 23.3323 | 25.6087 | 51.9256 |
| 1.3357 | 11.33 | 725 | 814.496 | 1.5455 | 55.2511 | 23.5606 | 25.8237 | 52.3183 |
| 1.2817 | 11.72 | 750 | 811.144 | 1.5412 | 55.2847 | 23.6632 | 25.9341 | 52.3146 |
| 1.2771 | 12.11 | 775 | 852.704 | 1.5450 | 55.1956 | 23.5545 | 25.677 | 52.1841 |
| 1.2892 | 12.5 | 800 | 805.844 | 1.5369 | 54.9563 | 23.5105 | 25.8876 | 51.9568 |
| 1.2757 | 12.89 | 825 | 813.476 | 1.5467 | 56.4728 | 24.6875 | 26.4415 | 53.4939 |
| 1.2382 | 13.28 | 850 | 787.34 | 1.5448 | 57.2303 | 24.9705 | 26.8081 | 54.2747 |
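Validation loss and ROUGE do not peak at the same checkpoint in these results. The short sketch below illustrates this with the last four evaluations, copied from the training results; the tuple layout and variable names are illustrative, not part of the original training code.

```python
# (step, validation_loss, rouge1) for the last four evaluations in the results.
rows = [
    (775, 1.5450, 55.1956),
    (800, 1.5369, 54.9563),
    (825, 1.5467, 56.4728),
    (850, 1.5448, 57.2303),
]

# Pick a checkpoint under two different selection criteria.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_rouge1 = max(rows, key=lambda r: r[2])

print("lowest validation loss:", best_by_loss)
print("highest ROUGE-1:", best_by_rouge1)
```

Step 800 has the lowest validation loss while step 850 has the highest Rouge1, so the selection criterion matters when deciding which checkpoint to keep.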
Framework versions
- Transformers 4.25.0.dev0
- Pytorch 1.13.0+cu117
- Datasets 2.7.0
- Tokenizers 0.13.2