roberta_toxicity_classifier: an open-source toxic comment classifier for English text

Roberta Toxicity Classifier

Developed by s-nlp

A toxic comment classifier fine-tuned from RoBERTa-large on the Jigsaw competition datasets, used to identify toxic content in English text.

Text classification

Transformers

English #toxic-comment-detection #high-accuracy-classification #multi-competition-training-data

Downloads: 80.61k

Released: 3/2/2022

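Model overview

This model classifies English comments as toxic or neutral. It was trained on about 2 million examples and reaches an AUC-ROC of 0.98 and an F1 score of 0.76 on the test set of the first Jigsaw competition.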

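Model features

Strong classification performance

Reaches an AUC-ROC of 0.98 and an F1 score of 0.76 on the test set of the first Jigsaw competition

Large-scale training data

Trained on about 2 million English examples merged from three Jigsaw competitions

Built on RoBERTa

Fine-tuned from the pretrained RoBERTa-large model (RoBERTa: A Robustly Optimized BERT Pretraining Approach)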

Model capabilities

Text toxicity classification

Harmful content detection

Natural language processing

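Use cases

Content moderation

Social media comment filtering

Automatically identify and filter harmful comments on social media platforms

Reduces the amount of toxic content on the platform

Online community management

Help forum and community moderators quickly spot inappropriate posts

Improves the quality of community content

Academic research

Language toxicity research

Study the features and patterns of toxicity in online language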

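🚀 Toxicity classification model

This model is trained for the toxicity classification task. The training dataset is a merge of the English parts of three datasets released by Jigsaw (Jigsaw 2018, Jigsaw 2019, Jigsaw 2020) and contains about 2 million examples. We split it into two parts and fine-tuned a RoBERTa model (RoBERTa: A Robustly Optimized BERT Pretraining Approach) on it. The classifier performs well on the test set of the first Jigsaw competition, reaching an AUC-ROC of 0.98 and an F1 score of 0.76.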
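The two reported metrics measure different things: AUC-ROC scores how well the model ranks toxic examples above neutral ones at every possible threshold, while F1 depends on the hard decisions made at one chosen threshold, which is why a high AUC-ROC and a more moderate F1 can coexist. A minimal sketch of both computations follows. The function names and the toy labels and scores are invented for illustration; this is not the authors' evaluation script.

```python
def f1_score(y_true, y_pred):
    """F1 for the positive (toxic = 1) class from hard 0/1 predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc_roc(y_true, scores):
    """Probability that a random toxic example outscores a random
    neutral one; ties count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (illustrative only): 1 = toxic, 0 = neutral
y_true = [1, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.6, 0.1]

# F1 needs hard decisions, here taken at an assumed threshold of 0.5
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("F1:", f1_score(y_true, y_pred))
print("AUC-ROC:", auc_roc(y_true, scores))
```

The pairwise AUC-ROC loop is quadratic in the number of examples, so it suits small checks like this one; a sort-based implementation is the usual choice for a full test set.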

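🚀 Quick start

Model information

Property	Details
Model type	Toxicity classification model
Base model	FacebookAI/roberta-large
Training data	A merge of the English parts of three Jigsaw datasets, about 2 million examples
License	OpenRAIL++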

How to use

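import torch
from transformers import RobertaTokenizer, RobertaForSequenceClassification

# Load the tokenizer and the fine-tuned classifier weights
tokenizer = RobertaTokenizer.from_pretrained('s-nlp/roberta_toxicity_classifier')
model = RobertaForSequenceClassification.from_pretrained('s-nlp/roberta_toxicity_classifier')
model.eval()

# Encode a single sentence as a batch of token ids
batch = tokenizer.encode("You are amazing!", return_tensors="pt")

# Inference only, so gradient tracking is not needed
with torch.no_grad():
    output = model(batch)
# output.logits has shape (1, 2): idx 0 for neutral, idx 1 for toxic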
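
The model call above returns two raw logits per input rather than a label. A minimal sketch of turning them into a toxicity probability and a decision is shown below, assuming the index order given in the comment (0 neutral, 1 toxic). The helper names `toxic_probability` and `is_toxic` and the 0.5 threshold are illustrative choices, not part of the model card, and the logits are made-up values. Plain Python is used so the arithmetic is visible; on tensors, `torch.softmax(output.logits, dim=-1)` performs the same normalization.

```python
import math


def toxic_probability(logits):
    """Softmax over [neutral_logit, toxic_logit]; returns P(toxic)."""
    neutral, toxic = logits
    # Subtract the max before exponentiating for numerical stability
    m = max(neutral, toxic)
    e_neutral = math.exp(neutral - m)
    e_toxic = math.exp(toxic - m)
    return e_toxic / (e_neutral + e_toxic)


def is_toxic(logits, threshold=0.5):
    """Flag the input as toxic when P(toxic) reaches the threshold."""
    return toxic_probability(logits) >= threshold


# Made-up logits for illustration only, not real model output
print(is_toxic([3.2, -2.9]))   # False
print(is_toxic([-1.0, 2.0]))   # True
```

Raising the threshold trades recall for precision, so a moderation pipeline would normally tune it on held-out data rather than keep the 0.5 default.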

📚 Detailed documentation

Citation

To cite our work, please use the following entry:

@inproceedings{logacheva-etal-2022-paradetox,
    title = "{P}ara{D}etox: Detoxification with Parallel Data",
    author = "Logacheva, Varvara  and
      Dementieva, Daryna  and
      Ustyantsev, Sergey  and
      Moskovskiy, Daniil  and
      Dale, David  and
      Krotova, Irina  and
      Semenov, Nikita  and
      Panchenko, Alexander",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.acl-long.469",
    pages = "6804--6818",
    abstract = "We present a novel pipeline for the collection of parallel data for the detoxification task. We collect non-toxic paraphrases for over 10,000 English toxic sentences. We also show that this pipeline can be used to distill a large existing corpus of paraphrases to get toxic-neutral sentence pairs. We release two parallel corpora which can be used for the training of detoxification models. To the best of our knowledge, these are the first parallel datasets for this task. We describe our pipeline in detail to make it fast to set up for a new language or domain, thus contributing to faster and easier development of new parallel resources. We train several detoxification models on the collected data and compare them with several baselines and state-of-the-art unsupervised approaches. We conduct both automatic and manual evaluations. All models trained on parallel data outperform the state-of-the-art unsupervised models by a large margin. This suggests that our novel datasets can boost the performance of detoxification systems.",
}