---
license: cc-by-sa-4.0
datasets:
- Kostya165/ru_emotion_dvach
language:
- ru
metrics:
- accuracy
base_model:
- cointegrated/rubert-tiny2
pipeline_tag: text-classification
tags:
- russian
- emotion
- sentiment
- sentiment-analysis
- emotion-analysis
- emotion-classification
- emotion-detection
- rubert
- rubert-tiny
---

# rubert_tiny2_russian_emotion_sentiment
## Model Description

rubert_tiny2_russian_emotion_sentiment is an emotion classification model fine-tuned from the lightweight cointegrated/rubert-tiny2. It recognizes five emotions in Russian text:

- 0: aggression (агрессия)
- 1: anxiety (тревожность)
- 2: neutral (нейтральное состояние)
- 3: positive (позитив)
- 4: sarcasm (сарказм)
## Validation Results

| Metric   | Value  |
|----------|--------|
| Accuracy | 0.8911 |
| Macro F1 | 0.8910 |
| Micro F1 | 0.8911 |
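Macro F1 averages the per-class F1 scores with equal weight per class, while micro F1 pools all true/false positives across classes; for single-label multiclass tasks like this one, micro F1 equals accuracy, which is why the table shows the same value for both. A small from-scratch illustration in pure Python (the labels below are hypothetical, not the model's outputs):

```python
def f1_scores(y_true, y_pred, n_classes):
    """Return (macro_f1, micro_f1) for single-label multiclass predictions."""
    per_class = []
    tp_total = fp_total = fn_total = 0
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        tp_total += tp
        fp_total += fp
        fn_total += fn
        denom = 2 * tp + fp + fn
        per_class.append(2 * tp / denom if denom else 0.0)
    macro = sum(per_class) / n_classes           # unweighted mean over classes
    micro = 2 * tp_total / (2 * tp_total + fp_total + fn_total)  # pooled
    return macro, micro

# Toy example with 3 classes: one of five predictions is wrong.
y_true = [0, 0, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2]
macro, micro = f1_scores(y_true, y_pred, n_classes=3)
print(round(macro, 4), round(micro, 4))  # 0.8222 0.8 — micro F1 == accuracy (4/5)
```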
Per-class accuracy:

- aggression (0): 0.9120
- anxiety (1): 0.9462
- neutral (2): 0.8663
- positive (3): 0.8884
- sarcasm (4): 0.8426
## Usage

```bash
pip install transformers torch
```

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

MODEL_ID = "Kostya165/rubert_tiny2_russian_emotion_sentiment"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

# The model expects Russian input.
texts = [
    "Сегодня чудесный день!",            # "What a wonderful day!"
    "Всё это приводит меня в ярость.",   # "All of this makes me furious."
]
enc = tokenizer(texts, padding=True, truncation=True, max_length=128, return_tensors="pt")
with torch.no_grad():
    logits = model(**enc).logits
preds = logits.argmax(dim=-1).tolist()
id2label = model.config.id2label
labels = [id2label[p] for p in preds]
print(labels)
```
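If you need confidence scores rather than only the top label, apply a softmax over the logits. A minimal sketch on dummy logits (no model download required); the 5-class shape matches the label set above, and the values are invented for illustration:

```python
import torch

# Dummy logits for two texts over the five classes
# (aggression, anxiety, neutral, positive, sarcasm).
logits = torch.tensor([
    [0.2, 0.1, 0.3, 2.5, 0.1],   # clearly "positive"
    [2.0, 0.4, 0.2, 0.1, 1.8],   # "aggression" narrowly beats "sarcasm"
])

probs = torch.softmax(logits, dim=-1)           # each row sums to 1
preds = probs.argmax(dim=-1).tolist()           # index of the top class
confidences = probs.max(dim=-1).values.tolist() # probability of the top class

print(preds)  # [3, 0]
```

A near-tie between classes (as in the second row) is a useful signal to treat the prediction as uncertain.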
## Training Details

- Base model: cointegrated/rubert-tiny2
- Training data: Kostya165/ru_emotion_dvach
- Epochs: 2
- Batch size: 32
- Learning rate: 1e-5
- Mixed precision: FP16
- Regularization: dropout 0.1, weight_decay 0.01, warmup_ratio 0.1
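The hyperparameters above map directly onto a `transformers.TrainingArguments` configuration. A sketch of that setup, assuming the standard Trainer API was used (the output directory name is a placeholder; the card does not state the exact training script):

```python
import torch
from transformers import TrainingArguments

# Hyperparameters as reported in this card.
args = TrainingArguments(
    output_dir="rubert_tiny2_emotion_ft",  # placeholder path
    num_train_epochs=2,
    per_device_train_batch_size=32,
    learning_rate=1e-5,
    weight_decay=0.01,
    warmup_ratio=0.1,                      # 10% of steps as LR warmup
    fp16=torch.cuda.is_available(),        # FP16 per the card; needs a GPU
)
```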
## Requirements

```
transformers>=4.30.0
torch>=1.10.0
datasets
evaluate
```
## License

CC-BY-SA 4.0
## Citation

```bibtex
@misc{rubert_tiny2_russian_emotion_sentiment,
  title = {Russian Emotion Sentiment Classification with RuBERT-tiny2},
  author = {Kostya165},
  year = {2024},
  howpublished = {\url{https://huggingface.co/Kostya165/rubert_tiny2_russian_emotion_sentiment}}
}
```