---
license: apache-2.0
tags:
- generated_from_trainer
model-index:
- name: t5-small-paraphrase-pubmed
  results: []
---
# t5-small-paraphrase-pubmed

This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on an unknown dataset; a minimal usage sketch is given after the results list below.
It achieves the following results on the evaluation set:
- Loss: 0.4032
- Rouge2 Precision: 0.8281
- Rouge2 Recall: 0.6346
- Rouge2 Fmeasure: 0.6996
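
The sketch below shows one way to run the model for paraphrasing with the `transformers` API. It is not taken from the model card: the repository id, the example sentence, and the generation settings are placeholders, and the card does not document whether a task prefix was used during fine-tuning.

```python
# Minimal paraphrasing sketch. Assumptions: the repo id is a placeholder, and no
# task prefix is applied because the card does not specify the training input format.
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_id = "t5-small-paraphrase-pubmed"  # placeholder -- replace with the actual Hub repo id
tokenizer = T5Tokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id)

text = "The patient was administered 500 mg of amoxicillin twice daily."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
# Beam search keeps the paraphrase close to the source sentence.
output_ids = model.generate(**inputs, num_beams=4, max_length=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```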
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 40
- mixed_precision_training: Native AMP
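
The `generated_from_trainer` tag suggests the `transformers` Trainer API was used. The sketch below shows how the hyperparameters listed above could map onto `Seq2SeqTrainingArguments`; the output directory is an assumption, and the Adam betas/epsilon above match the optimizer defaults, so no explicit arguments are needed for them.

```python
# Hypothetical mapping of the hyperparameters above onto Seq2SeqTrainingArguments
# (Transformers 4.12); output_dir is an assumption, not taken from the card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-paraphrase-pubmed",  # assumed output directory
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=40,
    fp16=True,  # native AMP mixed-precision training
)
```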
### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge2 Precision | Rouge2 Recall | Rouge2 Fmeasure |
|:-------------:|:-----:|:-----:|:---------------:|:----------------:|:-------------:|:---------------:|
| 0.5253 | 1.0  | 663   | 0.4895 | 0.8217 | 0.6309 | 0.695  |
| 0.5385 | 2.0  | 1326  | 0.4719 | 0.822  | 0.6307 | 0.6953 |
| 0.5255 | 3.0  | 1989  | 0.4579 | 0.8225 | 0.631  | 0.6954 |
| 0.4927 | 4.0  | 2652  | 0.4510 | 0.824  | 0.6315 | 0.6965 |
| 0.484  | 5.0  | 3315  | 0.4426 | 0.8254 | 0.6323 | 0.6974 |
| 0.4691 | 6.0  | 3978  | 0.4383 | 0.8241 | 0.6311 | 0.6962 |
| 0.4546 | 7.0  | 4641  | 0.4319 | 0.8248 | 0.6322 | 0.6969 |
| 0.4431 | 8.0  | 5304  | 0.4270 | 0.8254 | 0.633  | 0.6977 |
| 0.4548 | 9.0  | 5967  | 0.4257 | 0.8257 | 0.6322 | 0.6976 |
| 0.4335 | 10.0 | 6630  | 0.4241 | 0.8271 | 0.6333 | 0.6986 |
| 0.4234 | 11.0 | 7293  | 0.4203 | 0.827  | 0.6341 | 0.6992 |
| 0.433  | 12.0 | 7956  | 0.4185 | 0.8279 | 0.6347 | 0.6998 |
| 0.4108 | 13.0 | 8619  | 0.4161 | 0.8285 | 0.6352 | 0.7004 |
| 0.4101 | 14.0 | 9282  | 0.4133 | 0.8289 | 0.6356 | 0.7008 |
| 0.4155 | 15.0 | 9945  | 0.4149 | 0.8279 | 0.635  | 0.6998 |
| 0.3991 | 16.0 | 10608 | 0.4124 | 0.8289 | 0.6353 | 0.7005 |
| 0.3962 | 17.0 | 11271 | 0.4113 | 0.829  | 0.6353 | 0.7006 |
| 0.3968 | 18.0 | 11934 | 0.4114 | 0.8285 | 0.6352 | 0.7002 |
| 0.3962 | 19.0 | 12597 | 0.4100 | 0.8282 | 0.6346 | 0.6998 |
| 0.3771 | 20.0 | 13260 | 0.4078 | 0.829  | 0.6352 | 0.7005 |
| 0.3902 | 21.0 | 13923 | 0.4083 | 0.8295 | 0.6351 | 0.7006 |
| 0.3811 | 22.0 | 14586 | 0.4077 | 0.8276 | 0.6346 | 0.6995 |
| 0.38   | 23.0 | 15249 | 0.4076 | 0.8281 | 0.6346 | 0.6997 |
| 0.3695 | 24.0 | 15912 | 0.4059 | 0.8277 | 0.6344 | 0.6993 |
| 0.3665 | 25.0 | 16575 | 0.4043 | 0.8278 | 0.6343 | 0.6992 |
| 0.3728 | 26.0 | 17238 | 0.4059 | 0.8279 | 0.6346 | 0.6994 |
| 0.3669 | 27.0 | 17901 | 0.4048 | 0.8271 | 0.6342 | 0.6991 |
| 0.3702 | 28.0 | 18564 | 0.4058 | 0.8265 | 0.6338 | 0.6985 |
| 0.3674 | 29.0 | 19227 | 0.4049 | 0.8277 | 0.6345 | 0.6993 |
| 0.364  | 30.0 | 19890 | 0.4048 | 0.8273 | 0.6341 | 0.699  |
| 0.3618 | 31.0 | 20553 | 0.4041 | 0.828  | 0.6349 | 0.6997 |
| 0.3609 | 32.0 | 21216 | 0.4040 | 0.8275 | 0.6346 | 0.6994 |
| 0.357  | 33.0 | 21879 | 0.4037 | 0.8278 | 0.6348 | 0.6996 |
| 0.3638 | 34.0 | 22542 | 0.4038 | 0.8275 | 0.634  | 0.6989 |
| 0.3551 | 35.0 | 23205 | 0.4035 | 0.8275 | 0.6344 | 0.6992 |
| 0.358  | 36.0 | 23868 | 0.4035 | 0.8279 | 0.6347 | 0.6995 |
| 0.3519 | 37.0 | 24531 | 0.4034 | 0.8277 | 0.6343 | 0.6992 |
| 0.359  | 38.0 | 25194 | 0.4035 | 0.8281 | 0.6346 | 0.6996 |
| 0.3542 | 39.0 | 25857 | 0.4033 | 0.8281 | 0.6346 | 0.6996 |
| 0.3592 | 40.0 | 26520 | 0.4032 | 0.8281 | 0.6346 | 0.6996 |
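
The Rouge2 Precision/Recall/Fmeasure columns above are the standard quantities exposed by the `rouge_score` package. The snippet below is only an illustrative check on a made-up sentence pair, not the card author's evaluation script.

```python
# Illustrative Rouge2 computation with the rouge_score package; the sentence pair
# is invented and the exact evaluation setup used for this card is an assumption.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge2"], use_stemmer=True)
result = scorer.score(
    "the patient received antibiotics for five days",       # reference
    "antibiotics were given to the patient for five days",  # prediction
)["rouge2"]
print(result.precision, result.recall, result.fmeasure)
```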
### Framework versions
- Transformers 4.12.3
- Pytorch 1.9.0+cu111
- Datasets 1.15.1
- Tokenizers 0.10.3