DeBERTa-v3-large-mnli-fever-anli-ling-wanli: an open-source NLI model

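Developed by MoritzLaurer

An NLI model fine-tuned from DeBERTa-v3-large that reached state-of-the-art performance on several NLI datasets

Text classification

Transformers

English · License: MIT · #zero-shot-classification #natural-language-inference #high-accuracy

Downloads: 312.01k

Released: June 6, 2022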

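Model overview

The model was fine-tuned on the MultiNLI, Fever-NLI, ANLI, LingNLI and WANLI datasets for natural language inference (NLI) and zero-shot classification.

Model features

Multi-dataset training

Trained on several high-quality NLI datasets, 885,242 hypothesis-premise pairs in total

State-of-the-art performance

Significantly outperforms other large models on benchmarks such as ANLI

Zero-shot classification

Can classify text without any domain-specific training

Model capabilities

Natural language inference

Zero-shot classification

Text classification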

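Use cases

Text analysis

News classification

Zero-shot classification of news content into categories such as politics or economy

High-accuracy classification

Sentiment analysis

Determining the sentiment of a text by framing it as an NLI problem

Content moderation

Harmful content detection

Detecting whether a text contains a specific type of harmful content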

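🚀 DeBERTa-v3-large-mnli-fever-anli-ling-wanli

This model can be used for zero-shot classification and performs strongly on natural language inference. It was fine-tuned on several high-quality datasets, which markedly improves its accuracy.

🚀 Quick start

The model is fine-tuned from Microsoft's DeBERTa-v3-large. DeBERTa-v3 combines several innovations and has clear advantages over classical masked language models such as BERT and RoBERTa; see the DeBERTa-v3 paper for details.

How to use the model

Simple zero-shot classification pipeline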

from transformers import pipeline

# Load the model into a zero-shot classification pipeline
classifier = pipeline("zero-shot-classification", model="MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli")
sequence_to_classify = "Angela Merkel is a politician in Germany and leader of the CDU"
candidate_labels = ["politics", "economy", "entertainment", "environment"]
# multi_label=False treats the labels as mutually exclusive (scores sum to 1)
output = classifier(sequence_to_classify, candidate_labels, multi_label=False)
print(output)
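
The zero-shot pipeline works by turning each candidate label into an NLI hypothesis and scoring it against the input text as the premise. The sketch below makes that step explicit and shows how the sentiment use case could be phrased. `build_hypotheses` and `classify_sentiment` are hypothetical helper names, and the sentiment labels and template wording are illustrative assumptions rather than settings from the model card; `hypothesis_template` itself is an argument of the Transformers zero-shot pipeline.

```python
MODEL_NAME = "MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli"


def build_hypotheses(candidate_labels, template="This example is {}."):
    """Turn each label into an NLI hypothesis (hypothetical helper for illustration)."""
    return [template.format(label) for label in candidate_labels]


def classify_sentiment(text, model_name=MODEL_NAME):
    """Zero-shot sentiment classification; downloads the model on first use."""
    # Imported lazily so that build_hypotheses stays dependency-free
    from transformers import pipeline

    classifier = pipeline("zero-shot-classification", model=model_name)
    return classifier(
        text,
        candidate_labels=["positive", "negative"],  # assumed label set
        hypothesis_template="The sentiment of this review is {}.",  # assumed wording
        multi_label=False,
    )


# Show the hypotheses that would be scored against the text
print(build_hypotheses(["politics", "economy", "entertainment", "environment"]))
```

Calling `classify_sentiment("The plot was thin, but the acting saved it.")` returns the same kind of dictionary as the pipeline example above, with `labels` sorted by descending `scores`.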

NLI use case

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Use the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

model_name = "MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# The model has to be on the same device as its inputs
model = AutoModelForSequenceClassification.from_pretrained(model_name).to(device)

premise = "I first thought that I liked the movie, but upon second thought it was actually disappointing."
hypothesis = "The movie was not good."

# Encode premise and hypothesis as one sequence pair and move every tensor
# (input IDs and attention mask) to the device
inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt").to(device)
with torch.no_grad():  # inference only, no gradients needed
    output = model(**inputs)
# Convert the logits of the single example into probabilities
prediction = torch.softmax(output["logits"][0], -1).tolist()
label_names = ["entailment", "neutral", "contradiction"]
prediction = {name: round(float(pred) * 100, 1) for pred, name in zip(prediction, label_names)}
print(prediction)
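
The example above hard-codes the order of `label_names`. A slightly more defensive variant reads the index-to-label mapping from the model configuration (`model.config.id2label`, a standard Transformers config attribute) instead. The sketch below shows the conversion step on plain Python lists so it can be followed without loading the model; `probabilities_by_label` is a hypothetical helper, and the mapping and logits in the example are assumed values for illustration.

```python
import math


def probabilities_by_label(logits, id2label):
    """Softmax over raw logits, returned as {label: percent} (hypothetical helper)."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return {id2label[i]: round(e / total * 100, 1) for i, e in enumerate(exps)}


# Assumed mapping and made-up logits; in practice use model.config.id2label
example_id2label = {0: "entailment", 1: "neutral", 2: "contradiction"}
example = probabilities_by_label([2.0, 0.0, -2.0], example_id2label)
print(example)
```

With the model loaded as in the example above, the call would be `probabilities_by_label(output["logits"][0].tolist(), model.config.id2label)`.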

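✨ Key features

The model was fine-tuned on the MultiNLI, Fever-NLI, Adversarial-NLI (ANLI), LingNLI and WANLI datasets, which together contain 885,242 NLI hypothesis-premise pairs.
As of June 6, 2022, it was the best-performing NLI model on the Hugging Face Hub, and it can be used for zero-shot classification.
On the ANLI benchmark it significantly outperforms all other large models.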

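📚 Detailed documentation

Training data

DeBERTa-v3-large-mnli-fever-anli-ling-wanli was trained on the MultiNLI, Fever-NLI, Adversarial-NLI (ANLI), LingNLI and WANLI datasets, which together contain 885,242 NLI hypothesis-premise pairs. SNLI was deliberately excluded because of quality issues with the dataset. More data does not necessarily make for a better NLI model.

Training procedure

DeBERTa-v3-large-mnli-fever-anli-ling-wanli was trained with the Hugging Face Trainer using the hyperparameters below. Tests showed that longer training with more epochs hurt performance (overfitting).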

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",          # hypothetical path, not given in the source
    num_train_epochs=4,              # total number of training epochs
    learning_rate=5e-06,
    per_device_train_batch_size=16,   # batch size per device during training
    gradient_accumulation_steps=2,    # doubles the effective batch_size to 32, while decreasing memory requirements
    per_device_eval_batch_size=64,    # batch size for evaluation
    warmup_ratio=0.06,                # fraction of total training steps used for learning-rate warmup
    weight_decay=0.01,               # strength of weight decay
    fp16=True                        # mixed precision training (requires a GPU)
)
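
To make the effect of these settings concrete, the following sketch estimates the effective batch size and the approximate number of optimizer and warmup steps for the 885,242 training pairs. It is a rough calculation under stated assumptions: the source does not say how many GPUs were used, so a single device is assumed, and the step counting only approximates what the Hugging Face Trainer does internally. `estimate_schedule` is a hypothetical helper.

```python
import math


def estimate_schedule(num_examples, per_device_batch_size, grad_accum_steps,
                      num_epochs, warmup_ratio, num_devices=1):
    """Rough step counts for a training run (single device assumed by default)."""
    effective_batch = per_device_batch_size * grad_accum_steps * num_devices
    # Number of forward/backward batches needed to see every example once
    batches_per_epoch = math.ceil(num_examples / (per_device_batch_size * num_devices))
    # One optimizer update happens every grad_accum_steps batches
    updates_per_epoch = max(batches_per_epoch // grad_accum_steps, 1)
    total_updates = updates_per_epoch * num_epochs
    warmup_steps = math.ceil(total_updates * warmup_ratio)
    return {
        "effective_batch_size": effective_batch,
        "updates_per_epoch": updates_per_epoch,
        "total_updates": total_updates,
        "warmup_steps": warmup_steps,
    }


schedule = estimate_schedule(
    num_examples=885_242,        # hypothesis-premise pairs stated in the model card
    per_device_batch_size=16,
    grad_accum_steps=2,
    num_epochs=4,
    warmup_ratio=0.06,
)
print(schedule)
```

Passing a different `num_devices` scales the effective batch size and shrinks the step counts accordingly.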

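Evaluation results

The model was evaluated on the test sets of MultiNLI, ANLI, LingNLI and WANLI and on the dev set of Fever-NLI, with accuracy as the metric. It achieved state-of-the-art performance on every dataset. Surprisingly, it outperforms the previous state-of-the-art model on ANLI (ALBERT-XXL) by 8.3%. The presumed reason is that ANLI was created to fool masked language models such as RoBERTa (or ALBERT), whereas DeBERTa-v3 uses a better pre-training objective (RTD) and disentangled attention, and was fine-tuned on higher-quality NLI data.

Dataset	mnli_test_m	mnli_test_mm	anli_test	anli_test_r3	ling_test	wanli_test
Accuracy	0.912	0.908	0.702	0.64	0.87	0.77
Speed (texts/sec, A100 GPU)	696.0	697.0	488.0	425.0	828.0	980.0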
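
The throughput row can be used for a back-of-the-envelope estimate of how long a batch inference job might take. The sketch below copies the speeds from the table and assumes comparable hardware (an A100 GPU) and similar text lengths, which will not hold in general; `estimated_minutes` is a hypothetical helper.

```python
# Throughput figures copied from the table above (texts per second on an A100 GPU)
TEXTS_PER_SECOND = {
    "mnli_test_m": 696.0,
    "mnli_test_mm": 697.0,
    "anli_test": 488.0,
    "anli_test_r3": 425.0,
    "ling_test": 828.0,
    "wanli_test": 980.0,
}


def estimated_minutes(num_texts, texts_per_second):
    """Minutes needed to classify num_texts at a given throughput."""
    if texts_per_second <= 0:
        raise ValueError("texts_per_second must be positive")
    return num_texts / texts_per_second / 60


slowest = min(TEXTS_PER_SECOND.values())
fastest = max(TEXTS_PER_SECOND.values())
print(f"1M texts: roughly {estimated_minutes(1_000_000, fastest):.0f} "
      f"to {estimated_minutes(1_000_000, slowest):.0f} minutes")
```

The spread between datasets likely reflects differences in text length, so the slowest figure is the safer planning number for long inputs.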

📄 License

This model is released under the MIT license.

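⚠️ Important note

DeBERTa-v3 was released on December 6, 2021, and older versions of HF Transformers seem to have issues running the model (for example, tokenizer problems). Using Transformers >= 4.13 may solve some of these issues.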
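
A quick way to act on this note is to compare the installed Transformers version against 4.13 before loading the model. The sketch below does the comparison with the standard library only; `version_tuple` and `meets_minimum` are hypothetical helpers, and the parsing deliberately ignores pre-release suffixes such as `.dev0`.

```python
import re


def version_tuple(version_string):
    """Leading numeric components of a version string as a tuple of ints."""
    parts = []
    for piece in version_string.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break  # stop at the first non-numeric component, e.g. "dev0"
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum="4.13"):
    """True if the installed version is at least the minimum version."""
    return version_tuple(installed) >= version_tuple(minimum)


# Numeric comparison avoids the string-comparison trap where "4.9" > "4.13"
print(meets_minimum("4.9.2"), meets_minimum("4.13.0"))
```

In a real script the check would be `meets_minimum(transformers.__version__)` after `import transformers`.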

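💡 Usage advice

Please consult the original DeBERTa-v3 paper and the literature on the individual NLI datasets for more information on the training data and potential biases. The model will reproduce statistical patterns present in its training data.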

Citation

If you use this model, please cite: Laurer, Moritz, Wouter van Atteveldt, Andreu Salleras Casas, and Kasper Welbers. 2022. ‘Less Annotating, More Classifying – Addressing the Data Scarcity Issue of Supervised Machine Learning with Deep Transfer Learning and BERT-NLI’. Preprint, June. Open Science Framework. https://osf.io/74b8k.