DeBERTa-v3-base-mnli-fever-anli: a free, open-source model for zero-shot classification and natural language inference

Deberta V3 Base Mnli Fever Anli

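Developed by MoritzLaurer

A DeBERTa-v3 model trained on the MultiNLI, Fever-NLI and ANLI datasets, suited to zero-shot classification and natural language inference tasks.

Text classification

Transformers

Language: English | License: MIT | #zero-shot-classification #natural-language-inference #multi-task-training

Downloads: 613.93k

Released: 3/2/2022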

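Model overview

The model performs well on natural language inference (NLI) tasks and is particularly suited to zero-shot text classification. It is built on Microsoft's DeBERTa-v3-base architecture, which improves on earlier DeBERTa versions through a different pre-training objective.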

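Model features

Multi-dataset training

Trained on three combined datasets, MultiNLI, Fever-NLI and ANLI, totaling 763,913 NLI hypothesis-premise pairs.

Strong results on adversarial benchmarks

On the adversarial ANLI benchmark, this base-size model outperforms most large models.

Improved pre-training

Uses the DeBERTa-v3 variant, whose revised pre-training objective substantially improves on earlier DeBERTa versions.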

Model capabilities

Zero-shot text classification

Natural language inference

Textual entailment

Multi-label classification
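
The difference between single-label and multi-label zero-shot classification comes down to how NLI logits are turned into scores. The sketch below illustrates one common approach, which to my understanding matches what the Transformers zero-shot pipeline does: in single-label mode, the entailment logits of all labels compete in one softmax; in multi-label mode, each label gets its own softmax over contradiction versus entailment. The logits are made-up numbers for illustration, and the helper names are hypothetical.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative (made-up) logits per candidate label, in the order
# [entailment, neutral, contradiction]. These are not real model outputs.
logits_per_label = {
    "politics": [4.0, 0.5, -3.0],
    "economy": [1.0, 0.8, 0.2],
    "entertainment": [-2.5, 0.3, 3.1],
}

def single_label_scores(logits_by_label):
    """Softmax over the entailment logits of all labels: scores sum to 1."""
    labels = list(logits_by_label)
    entail = [logits_by_label[label][0] for label in labels]
    return dict(zip(labels, softmax(entail)))

def multi_label_scores(logits_by_label):
    """Per label, softmax over (contradiction, entailment): scores are independent."""
    scores = {}
    for label, (entail, _neutral, contra) in logits_by_label.items():
        scores[label] = softmax([contra, entail])[1]
    return scores

single = single_label_scores(logits_per_label)
multi = multi_label_scores(logits_per_label)
print(single)
print(multi)
```

In multi-label mode several labels can score high at once, so the scores generally do not sum to 1.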

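Use cases

Content classification

News classification

Assigns news articles to predefined categories such as politics or economy without any task-specific training.

Reference figure: about 49.5% accuracy on the ANLI test set. This is an NLI benchmark score, not a measured news-classification accuracy.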
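
An NLI model can classify news without training because each candidate category is recast as a hypothesis and tested against the article text as the premise. The following is a minimal sketch of that recasting step only. The function name `build_nli_pairs` and the template wording are assumptions for illustration; the Transformers pipeline exposes a comparable `hypothesis_template` argument.

```python
def build_nli_pairs(text, candidate_labels, template="This example is about {}."):
    """Return one (premise, hypothesis) pair per candidate label."""
    return [(text, template.format(label)) for label in candidate_labels]

headline = "Angela Merkel is a politician in Germany and leader of the CDU"
labels = ["politics", "economy", "entertainment", "environment"]

# Each pair would be scored by the NLI model; the label whose hypothesis
# is most strongly entailed is taken as the predicted category.
pairs = build_nli_pairs(headline, labels)
for premise, hypothesis in pairs:
    print(f"{hypothesis!r} <- {premise!r}")
```

The template wording can affect results, so it is worth trying a few phrasings that fit the domain.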

Semantic analysis

Contradiction detection

Identifies whether two statements in a text contradict each other.
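
One way to apply this is to score pairs of statements with the NLI model and flag those with a high contradiction probability. The sketch below shows only the pairing and thresholding logic. `find_contradictions`, `toy_nli` and the 0.5 threshold are hypothetical; `toy_nli` is a keyword-based stand-in so the example runs without downloading the model, and it would be replaced by real model output.

```python
from itertools import combinations

def find_contradictions(statements, nli_fn, threshold=0.5):
    """Return (premise, hypothesis, probability) for pairs flagged as contradictory.

    nli_fn(premise, hypothesis) must return a dict with a "contradiction"
    probability in [0, 1]. The threshold is an illustrative cut-off, not a tuned value.
    """
    flagged = []
    for premise, hypothesis in combinations(statements, 2):
        prob = nli_fn(premise, hypothesis)["contradiction"]
        if prob >= threshold:
            flagged.append((premise, hypothesis, prob))
    return flagged

def toy_nli(premise, hypothesis):
    """Stand-in scorer for demonstration only; replace with real model output."""
    clash = "disappointing" in premise and "good" in hypothesis
    p = 0.9 if clash else 0.1
    return {"entailment": 0.05, "neutral": 0.95 - p, "contradiction": p}

statements = [
    "The movie was actually disappointing.",
    "The movie was good.",
    "The cinema was crowded.",
]
result = find_contradictions(statements, toy_nli)
print(result)
```

NLI is directional, so a fuller version might also score each pair with premise and hypothesis swapped.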

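🚀 DeBERTa-v3-base-mnli-fever-anli

The model is trained on NLI datasets and performs well on text classification and zero-shot classification, making it a practical base for research and applications that involve natural language inference.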

🚀 Quick start

Simple zero-shot classification pipeline

# Install first: pip install "transformers[sentencepiece]"
from transformers import pipeline

# Load the model into a zero-shot classification pipeline
classifier = pipeline("zero-shot-classification", model="MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli")
sequence_to_classify = "Angela Merkel is a politician in Germany and leader of the CDU"
candidate_labels = ["politics", "economy", "entertainment", "environment"]
# multi_label=False makes the label scores sum to 1
output = classifier(sequence_to_classify, candidate_labels, multi_label=False)
print(output)
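
The pipeline returns a dictionary with the keys `sequence`, `labels` and `scores`, with labels ordered from highest to lowest score. As a rough sketch of how that result might be consumed, the snippet below picks the best label and rejects low-confidence predictions. The dictionary is hand-written with illustrative scores rather than real model output, and `top_label` and `min_score` are hypothetical names.

```python
# Hand-written example of the result shape; the scores are illustrative,
# not actual model output.
output = {
    "sequence": "Angela Merkel is a politician in Germany and leader of the CDU",
    "labels": ["politics", "economy", "environment", "entertainment"],
    "scores": [0.97, 0.01, 0.01, 0.01],
}

def top_label(result, min_score=0.5):
    """Return (label, score) for the best label, or None if it scores below min_score."""
    label, score = max(zip(result["labels"], result["scores"]), key=lambda pair: pair[1])
    return (label, score) if score >= min_score else None

best = top_label(output)
print(best)
```

Returning `None` below the cut-off gives the caller a way to route uncertain texts to a fallback instead of forcing a category.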

NLI example

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Use the GPU when available, otherwise fall back to the CPU
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

model_name = "MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# The model must be on the same device as its inputs
model = AutoModelForSequenceClassification.from_pretrained(model_name).to(device)
model.eval()

premise = "I first thought that I liked the movie, but upon second thought it was actually disappointing."
hypothesis = "The movie was good."

# Encode premise and hypothesis as one sequence pair and move the tensors to the device
inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt").to(device)
with torch.no_grad():  # inference only, no gradients needed
    output = model(**inputs)
# Convert the logits of the single example into class probabilities
prediction = torch.softmax(output["logits"][0], -1).tolist()
label_names = ["entailment", "neutral", "contradiction"]
# Report each class probability as a percentage
prediction = {name: round(float(pred) * 100, 1) for pred, name in zip(prediction, label_names)}
print(prediction)

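✨ Key features

The model was trained on the MultiNLI, Fever-NLI and Adversarial-NLI (ANLI) datasets, which together comprise 763,913 NLI hypothesis-premise pairs.
This base model outperforms almost all large models on the ANLI benchmark.
The base model is Microsoft's DeBERTa-v3-base. The v3 variant of DeBERTa substantially outperforms previous versions of the model by using a different pre-training objective.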

📦 Installation

Install the transformers library before using the model:

pip install "transformers[sentencepiece]"

📚 Detailed documentation

Training data

DeBERTa-v3-base-mnli-fever-anli was trained on the MultiNLI, Fever-NLI and Adversarial-NLI (ANLI) datasets, which together comprise 763,913 NLI hypothesis-premise pairs.

Training procedure

DeBERTa-v3-base-mnli-fever-anli was trained with the Hugging Face Trainer using the following hyperparameters:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",          # placeholder path, not given in the source
    num_train_epochs=3,              # total number of training epochs
    learning_rate=2e-05,             # peak learning rate
    per_device_train_batch_size=32,  # batch size per device during training
    per_device_eval_batch_size=32,   # batch size for evaluation
    warmup_ratio=0.1,                # fraction of training steps used for learning-rate warmup
    weight_decay=0.06,               # strength of weight decay
    fp16=True                        # mixed precision training (requires a CUDA GPU)
)
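
To make the hyperparameters more concrete, the sketch below estimates how many optimizer steps and warmup steps they imply. It rests on assumptions the source does not state: that all 763,913 pairs are used for training, that training runs on a single device without gradient accumulation, and that the learning rate follows a linear warmup and linear decay schedule. The actual run may have differed on any of these.

```python
import math

NUM_PAIRS = 763_913      # training pairs stated in the model card
BATCH_SIZE = 32          # per_device_train_batch_size
EPOCHS = 3               # num_train_epochs
WARMUP_RATIO = 0.1       # warmup_ratio
PEAK_LR = 2e-5           # learning_rate

# Assumption: one device and no gradient accumulation, so one batch = one step.
steps_per_epoch = math.ceil(NUM_PAIRS / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)

def lr_at(step):
    """Linear warmup to PEAK_LR, then linear decay to zero (assumed schedule)."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    return PEAK_LR * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

print(steps_per_epoch, total_steps, warmup_steps)
for step in (0, warmup_steps // 2, warmup_steps, total_steps):
    print(step, lr_at(step))
```

Under these assumptions, training takes 71,619 steps, of which the first 7,162 are warmup. With more devices or gradient accumulation, the step counts shrink proportionally.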