library_name: transformers
tags:
- climate
- salience
- politics
- manifesto
metrics:
- accuracy
- f1
base_model:
- FacebookAI/xlm-roberta-base
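Model Card for policlim
Model Description
This model detects the salience of climate change in (political) text. It is based on XLM-RoBERTa and was fine-tuned on 3,434 hand-annotated quasi-sentences from political manifestos, drawn from the Manifesto Project database. The final model achieves an F1 score of 0.935 and an accuracy of 0.957 on the validation set.
We have applied the model to classify climate change salience in political manifestos; preliminary results are reported in the working paper cited below. The paper documents the construction of the training set, the model training procedure, the evaluation method, and the final dataset in full.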
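For readers less familiar with the two reported metrics, the following is a minimal sketch of how accuracy and F1 are computed for a binary labelling. The labels and predictions are made up for illustration and do not reproduce the figures above. The helper `binary_metrics` is hypothetical, and the sketch assumes label 1 is the positive class; the card does not state the label encoding or how F1 was averaged across classes.

```python
def binary_metrics(labels, preds, positive=1):
    """Return accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(labels, preds))
    if not pairs:
        raise ValueError("labels and preds must not be empty")
    # Count true positives, false positives and false negatives
    tp = sum(1 for y, p in pairs if y == positive and p == positive)
    fp = sum(1 for y, p in pairs if y != positive and p == positive)
    fn = sum(1 for y, p in pairs if y == positive and p != positive)
    correct = sum(1 for y, p in pairs if y == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels and predictions, for illustration only
labels = [1, 0, 1, 1, 0, 0, 1, 0]
preds = [1, 0, 0, 1, 0, 1, 1, 0]
scores = binary_metrics(labels, preds)
print(scores)
```

Accuracy counts every correct prediction, while F1 looks only at the positive class, so the two can diverge when climate-related sentences are rare in the data.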
Citation
@techreport{sanford2024policlim,
title={Policlim: A Dataset of Climate Change Discourse in the Political Manifestos of 45 Countries from 1990-2022},
author={Sanford, Mary and Pianta, Silvia and Schmid, Nicolas and Musto, Giorgio},
type={Working paper},
doi={https://osf.io/preprints/osf/bq356_v4},
year={2025}
}
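How to Use
This model can be used for text classification, or as a base model for fine-tuning on downstream tasks. We recommend the simpletransformers package for quick deployment: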
import pandas as pd
from simpletransformers.classification import ClassificationModel

# Load the sentences to classify; the CSV needs a 'text' column
data = pd.read_csv('your_data.csv')

# Load the fine-tuned policlim model
model = ClassificationModel(
    model_type="xlmroberta",
    model_name="policlim"
)

# preds holds the predicted labels, output the raw model outputs
preds, output = model.predict(data['text'].tolist())
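# Fine-tune policlim on new labelled data ('text' and 'labels' columns)
import pandas as pd
from simpletransformers.classification import ClassificationModel
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

new_train = pd.read_csv('your_new_train_data.csv')
new_test = pd.read_csv('your_new_test_data.csv')
new_eval = pd.read_csv('your_new_eval_data.csv')

model = ClassificationModel(
    model_type="xlmroberta",
    model_name="policlim",
    num_labels=2,
    ignore_mismatched_sizes=True,
    use_cuda=True  # set to False if no GPU is available
)

# Extra metrics are passed as callables that take (labels, preds);
# average=None reports one score per class
def f1_per_class(labels, preds):
    return f1_score(labels, preds, average=None)

def precision_per_class(labels, preds):
    return precision_score(labels, preds, average=None)

def recall_per_class(labels, preds):
    return recall_score(labels, preds, average=None)

model.train_model(
    train_df=new_train,
    eval_df=new_eval,
    f1_train=f1_per_class
)

result, model_outputs, wrong_predictions = model.eval_model(
    new_test,
    f1_eval=f1_per_class,
    precision=precision_per_class,
    recall=recall_per_class,
    acc=accuracy_score
)
print('\nEvaluation on test set:\n')
print(result)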
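The second value returned by `model.predict` holds raw per-class scores rather than probabilities. If a confidence value per sentence is useful, a softmax can be applied to each row. The following is a minimal sketch under stated assumptions: the helper names `softmax` and `to_probabilities` are hypothetical, the example scores are made up, and the card does not say which column corresponds to the climate class, so check the label order before interpreting the columns.

```python
import math


def softmax(scores):
    """Convert one row of raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def to_probabilities(raw_outputs):
    """Apply softmax to every row of raw outputs (one row per sentence)."""
    return [softmax(list(row)) for row in raw_outputs]


# Made-up raw outputs for three sentences, two scores per sentence
example_outputs = [[2.0, -1.0], [-0.5, 0.5], [0.0, 0.0]]
probs = to_probabilities(example_outputs)
for row in probs:
    print([round(p, 3) for p in row])
```

In practice, `example_outputs` would be replaced by the `output` array from `model.predict`, and rows whose largest probability is close to 0.5 could be flagged for manual review.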
Model Sources
- Repository: https://github.com/marysanford/policlim/tree/main
- Paper: https://osf.io/preprints/osf/bq356
- Data source: https://manifesto-project.wzb.eu/
Model Card Author
Mary Sanford, mary.sanford@cmcc.it