---
tags:
- text-classification
- multi-label
- go_emotions
- transformers
- huggingface
license: apache-2.0
library_name: transformers
language:
- en
metrics:
- accuracy
- f1
base_model:
- google-bert/bert-base-uncased
pipeline_tag: text-classification
---
# 🔥 BERT Fine-Tuned on the GoEmotions Dataset
## 📖 Model Overview

This model is a fine-tuned version of BERT (`bert-base-uncased`) on the GoEmotions dataset. It is designed for multi-label emotion classification: for a given input text, it can predict several emotion labels at once.
## 📊 Performance

| Metric       | Score  |
|--------------|--------|
| Accuracy     | 46.57% |
| F1 Score     | 56.41% |
| Hamming Loss | 3.39%  |
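
For reference, these multi-label metrics can be computed with scikit-learn. Below is a minimal sketch, assuming `y_true` and `y_pred` are binary indicator arrays of shape `(num_samples, 28)` obtained by thresholding the sigmoid outputs; the 0.5 cutoff and micro averaging are assumptions, not documented choices of this model:

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, hamming_loss

# Hypothetical binary indicator arrays: one row per sample, one column per
# GoEmotions label (28 total). In practice these come from thresholding
# torch.sigmoid(logits) at an assumed cutoff of 0.5.
y_true = np.array([[1, 0, 1] + [0] * 25,
                   [0, 1, 0] + [0] * 25])
y_pred = np.array([[1, 0, 0] + [0] * 25,
                   [0, 1, 0] + [0] * 25])

# Subset accuracy: a sample counts as correct only if all 28 labels match.
print("Accuracy:", accuracy_score(y_true, y_pred))
# Micro-averaged F1 is a common multi-label choice; whether the reported
# 56.41% used micro or macro averaging is an assumption here.
print("F1:", f1_score(y_true, y_pred, average="micro"))
# Hamming loss: fraction of individual label slots that are wrong.
print("Hamming loss:", hamming_loss(y_true, y_pred))
```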
## 📂 Usage Example
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "codewithdark/bert-Gomotions"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# The 28 GoEmotions labels, in the order used by the classification head
emotion_labels = [
    "Admiration", "Amusement", "Anger", "Annoyance", "Approval", "Caring", "Confusion",
    "Curiosity", "Desire", "Disappointment", "Disapproval", "Disgust", "Embarrassment",
    "Excitement", "Fear", "Gratitude", "Grief", "Joy", "Love", "Nervousness", "Optimism",
    "Pride", "Realization", "Relief", "Remorse", "Sadness", "Surprise", "Neutral"
]

text = "I'm so happy today!"
inputs = tokenizer(text, return_tensors="pt")

# Multi-label classification: apply a sigmoid to each logit independently
with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.sigmoid(outputs.logits).squeeze(0)

# Report the five most probable emotions
top5_indices = torch.argsort(probs, descending=True)[:5]
top5_labels = [emotion_labels[i] for i in top5_indices]
top5_probs = [probs[i].item() for i in top5_indices]

print("Top 5 Predicted Emotions:")
for label, prob in zip(top5_labels, top5_probs):
    print(f"{label}: {prob:.4f}")
```
Example output:

```
Top 5 Predicted Emotions:
Joy: 0.9478
Love: 0.7854
Optimism: 0.6342
Admiration: 0.5678
Excitement: 0.5231
```
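
To turn these probabilities into a set of predicted labels rather than a top-5 ranking, a common approach is to threshold each sigmoid probability. A minimal sketch, continuing from the code above, with 0.5 as an assumed cutoff (the model card does not specify one, and the best threshold may differ per label):

```python
# Assumed cutoff of 0.5; ideally tuned per label on a validation set
threshold = 0.5
predicted = [emotion_labels[i] for i, p in enumerate(probs) if p > threshold]
print("Predicted labels:", predicted)  # e.g. ['Joy', 'Love', 'Optimism']
```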
## 🏋️‍♂️ Training Details

- **Base Model:** `bert-base-uncased`
- **Dataset:** GoEmotions
- **Optimizer:** AdamW
- **Loss Function:** BCEWithLogitsLoss (binary cross-entropy for multi-label classification; see the sketch after this list)
- **Batch Size:** 16
- **Epochs:** 3
- **Evaluation Metrics:** Accuracy, F1 Score, Hamming Loss
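
The exact training script is not provided; the following is a minimal sketch of a loop matching the hyperparameters above. It assumes a tokenized GoEmotions `DataLoader` named `train_loader` (a hypothetical name) that yields `input_ids`, `attention_mask`, and multi-hot `labels` tensors, and the learning rate is an assumption:

```python
import torch
from torch.optim import AdamW
from transformers import AutoModelForSequenceClassification

# problem_type="multi_label_classification" makes the model apply
# BCEWithLogitsLoss internally when float labels are supplied
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=28,
    problem_type="multi_label_classification",
)
optimizer = AdamW(model.parameters(), lr=2e-5)  # learning rate is an assumption

model.train()
for epoch in range(3):  # 3 epochs, as listed above
    for batch in train_loader:  # hypothetical DataLoader with batch_size=16
        optimizer.zero_grad()
        outputs = model(
            input_ids=batch["input_ids"],
            attention_mask=batch["attention_mask"],
            labels=batch["labels"].float(),  # multi-hot vectors of length 28
        )
        outputs.loss.backward()
        optimizer.step()
```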
## 📌 Quick Start with Hugging Face
```python
from transformers import pipeline

# top_k=None returns scores for all 28 labels instead of only the best one
classifier = pipeline("text-classification", model="codewithdark/bert-Gomotions", top_k=None)
classifier("I'm so excited about this trip!")
```
## 🛠️ Citation

If you use this model, please cite:

```bibtex
@misc{your_model,
  author = {codewithdark},
  title  = {Fine-Tuned BERT on GoEmotions},
  year   = {2025},
  url    = {https://huggingface.co/codewithdark/bert-Gomotions}
}
```