---
library_name: transformers
base_model:
- answerdotai/ModernBERT-large
license: apache-2.0
language:
- en
pipeline_tag: zero-shot-classification
datasets:
- nyu-mll/glue
- facebook/anli
tags:
- instruct
- natural-language-inference
- nli
---
# Model Card

This model is a ModernBERT fine-tuned with multi-task learning on tasksource natural language inference (NLI) tasks, including MNLI, ANLI, SICK, WANLI, doc-nli, LingNLI, FOLIO, FOL-NLI, LogicNLI, Label-NLI, and all datasets in the table below. It is the equivalent of an "instruct" version.

The model was trained for 200k steps on an Nvidia A30 GPU.

It performs well on reasoning tasks (outperforming Llama 3.1 8B Instruct on ANLI and FOLIO), long-context reasoning, sentiment analysis, and zero-shot classification with novel labels.

The table below reports test-set accuracy. These are the scores of a single shared transformer with a different classification head per task. Further gains can be obtained by single-task fine-tuning (e.g. on SST), but this checkpoint is well suited to zero-shot classification and natural language inference (contradiction/entailment/neutral classification).
| test_name | test_accuracy |
|---|---|
| glue/mnli | 0.89 |
| glue/qnli | 0.96 |
| glue/rte | 0.91 |
| glue/wnli | 0.64 |
| glue/mrpc | 0.81 |
| glue/qqp | 0.87 |
| glue/cola | 0.87 |
| glue/sst2 | 0.96 |
| super_glue/boolq | 0.66 |
| super_glue/cb | 0.86 |
| super_glue/multirc | 0.9 |
| super_glue/wic | 0.71 |
| super_glue/axg | 1 |
| anli/a1 | 0.72 |
| anli/a2 | 0.54 |
| anli/a3 | 0.55 |
| sick/label | 0.91 |
| sick/entailment_AB | 0.93 |
| snli | 0.94 |
| scitail/snli_format | 0.95 |
| hans | 1 |
| WANLI | 0.77 |
| recast/recast_ner | 0.85 |
| recast/recast_sentiment | 0.97 |
| recast/recast_verbnet | 0.89 |
| recast/recast_megaveridicality | 0.87 |
| recast/recast_verbcorner | 0.87 |
| recast/recast_kg_relations | 0.9 |
| recast/recast_factuality | 0.95 |
| recast/recast_puns | 0.98 |
| probability_words_nli/reasoning_1hop | 1 |
| probability_words_nli/usnli | 0.79 |
| probability_words_nli/reasoning_2hop | 0.98 |
| nan-nli | 0.85 |
| nli_fever | 0.78 |
| breaking_nli | 0.99 |
| conj_nli | 0.72 |
| fracas | 0.79 |
| dialogue_nli | 0.94 |
| mpe | 0.75 |
| dnc | 0.91 |
| recast_white/fnplus | 0.76 |
| recast_white/sprl | 0.9 |
| recast_white/dpr | 0.84 |
| add_one_rte | 0.94 |
| paws/labeled_final | 0.96 |
| pragmeval/pdtb | 0.56 |
| lex_glue/scotus | 0.58 |
| lex_glue/ledgar | 0.85 |
| dynasent/dynabench.dynasent.r1.all/r1 | 0.83 |
| dynasent/dynabench.dynasent.r2.all/r2 | 0.76 |
| cycic_classification | 0.96 |
| lingnli | 0.91 |
| monotonicity-entailment | 0.97 |
| scinli | 0.88 |
| naturallogic | 0.93 |
| dynahate | 0.86 |
| syntactic-augmentation-nli | 0.94 |
| autotnli | 0.92 |
| defeasible-nli/atomic | 0.83 |
| defeasible-nli/snli | 0.8 |
| help-nli | 0.96 |
| nli-veridicality-transitivity | 0.99 |
| lonli | 0.99 |
| dadc-limit-nli | 0.79 |
| folio | 0.71 |
| tomi-nli | 0.54 |
| puzzte | 0.59 |
| temporal-nli | 0.93 |
| counterfactually-augmented-snli | 0.81 |
| cnli | 0.9 |
| boolq-natural-perturbations | 0.72 |
| equate | 0.65 |
| logiqa-2.0-nli | 0.58 |
| mindgames | 0.96 |
| ConTRoL-nli | 0.66 |
| logical-fallacy | 0.38 |
| cladder | 0.89 |
| conceptrules_v2 | 1 |
| zero-shot-label-nli | 0.79 |
| scone | 1 |
| monli | 1 |
| SpaceNLI | 1 |
| propsegment/nli | 0.92 |
| FLD.v2/default | 0.91 |
| FLD.v2/star | 0.78 |
| SDOH-NLI | 0.99 |
| scifact_entailment | 0.87 |
| feasibilityQA | 0.79 |
| AdjectiveScaleProbe-nli | 1 |
| resnli | 1 |
| semantic_fragments_nli | 1 |
| dataset_train_nli | 0.95 |
| nlgraph | 0.97 |
| ruletaker | 0.99 |
| PARARULE-Plus | 1 |
| logical-entailment | 0.93 |
| nope | 0.56 |
| LogicNLI | 0.91 |
| contract-nli/contractnli_a/seg | 0.88 |
| contract-nli/contractnli_b/full | 0.84 |
| nli4ct_semeval2024 | 0.72 |
| biosift-nli | 0.92 |
| SIGA-nli | 0.57 |
| FOL-nli | 0.79 |
| doc-nli | 0.81 |
| mctest-nli | 0.92 |
| natural-language-satisfiability | 0.92 |
| idioms-nli | 0.83 |
| lifecycle-entailment | 0.79 |
| MSciNLI | 0.84 |
| hover-3way/nli | 0.92 |
| seahorse_summarization_evaluation | 0.81 |
| missing-item-prediction/contrastive | 0.88 |
| Pol_NLI | 0.93 |
| synthetic-retrieval-NLI/count | 0.72 |
| synthetic-retrieval-NLI/position | 0.9 |
| synthetic-retrieval-NLI/binary | 0.92 |
| babi_nli | 0.98 |
# Usage

## [ZS] Zero-shot classification pipeline

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="tasksource/ModernBERT-large-nli")

text = "one day I will see the world"
candidate_labels = ['travel', 'cooking', 'dancing']
classifier(text, candidate_labels)
```
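Under the hood, the zero-shot pipeline casts each candidate label as an NLI hypothesis and ranks labels by the model's entailment score. A minimal, dependency-free sketch of that framing (the function name is illustrative; the template shown matches the pipeline's default `hypothesis_template`):

```python
# Sketch of how zero-shot classification reduces to NLI: each label is
# slotted into a hypothesis template, and the model scores entailment
# for every (premise, hypothesis) pair.

def build_nli_pairs(text, candidate_labels, hypothesis_template="This example is {}."):
    """Return the (premise, hypothesis) pairs the pipeline would score."""
    return [(text, hypothesis_template.format(label)) for label in candidate_labels]

pairs = build_nli_pairs("one day I will see the world",
                        ["travel", "cooking", "dancing"])
# pairs[0] == ("one day I will see the world", "This example is travel.")
```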
The NLI training data of this model includes label-nli, an NLI dataset specifically constructed to improve this kind of zero-shot classification.
## [NLI] Natural language inference pipeline

```python
from transformers import pipeline

pipe = pipeline("text-classification", model="tasksource/ModernBERT-large-nli")

pipe([dict(text='there is a cat',
           text_pair='there is a black cat')])
```
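The text-classification pipeline turns the head's three logits into probabilities with a softmax and reports the highest-scoring label. A self-contained sketch of that post-processing step (the logit values are made up, and the label order is an assumption; in practice it comes from the checkpoint's `id2label` config):

```python
import math

# Assumed NLI label order; verify against the model's id2label mapping.
ID2LABEL = ["entailment", "neutral", "contradiction"]

def nli_probs(logits):
    """Softmax raw NLI logits into a label -> probability dict."""
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(ID2LABEL, exps)}

probs = nli_probs([3.1, 0.2, -2.4])   # illustrative logits, not real model output
top_label = max(probs, key=probs.get)  # "entailment" dominates for these logits
```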
## Backbone for further fine-tuning

This checkpoint has stronger reasoning and fine-grained abilities than the base version and can be used as a backbone for further fine-tuning.
# Citation

```bibtex
@inproceedings{sileo-2024-tasksource,
    title = "tasksource: A Large Collection of {NLP} Tasks with a Structured Dataset Preprocessing Framework",
    author = "Sileo, Damien",
    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
    month = may,
    year = "2024",
    address = "Torino, Italia",
    publisher = "ELRA and ICCL",
    url = "https://aclanthology.org/2024.lrec-main.1361",
    pages = "15655--15684",
}
```