License: apache-2.0
Language:
- English
Metrics:
- F1
- Recall
- Precision
Tags:
- Named entity recognition (NER)
Pipeline tag: token-classification
Library name: gliner
Datasets:
- knowledgator/GLINER-multi-task-synthetic-data
- EmergentMethods/AskNews-NER-v0
- urchade/pile-mistral-v0.1
- MultiCoNER/multiconer_v2
- DFKI-SLT/few-nerd
Base model: knowledgator/gliner-multitask-large-v0.5
xomad/gliner-model-merge-large-v1.0
This model is built on the pretrained model knowledgator/gliner-multitask-large-v0.5. By applying model merging techniques, it improves the average F1 score by 3.25 points, from 0.6276 to 0.6601.
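The model is trained only on datasets with commercially friendly licenses, so that it can be used broadly under the Apache-2.0 license. The following datasets were used for training: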
- knowledgator/GLINER-multi-task-synthetic-data
- EmergentMethods/AskNews-NER-v0
- urchade/pile-mistral-v0.1
- MultiCoNER/multiconer_v2
- DFKI-SLT/few-nerd
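⚙️ Fine-tuning process
The process for xomad/gliner-model-merge-large-v1.0 starts from the base model knowledgator/gliner-multitask-large-v0.5. The model is fine-tuned separately on each of the datasets listed above, and multiple checkpoints are saved during each run. These checkpoints are then pooled, and model merging techniques are applied to the pool to produce several merged models:
- uniform_merged
- greedy_on_random
- greedy_on_sorted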
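The three merged models can be read as variants of the "model soups" recipe cited in the references. The following is a minimal sketch under stated assumptions, not the author's training code: checkpoints are represented as plain dictionaries of float lists instead of real GLiNER weights, `toy_score` is a hypothetical stand-in for a held-out F1 evaluation, and the reading of `greedy_on_sorted` (checkpoints visited best-first) and `greedy_on_random` (checkpoints visited in random order) is an interpretation of the names.

```python
import random
from typing import Callable, Dict, List

# A "state dict" here is a plain mapping from parameter name to a flat list of
# floats; a real checkpoint would hold tensors instead (assumption).
StateDict = Dict[str, List[float]]


def uniform_soup(checkpoints: List[StateDict]) -> StateDict:
    """Average every parameter element-wise across all checkpoints."""
    n = len(checkpoints)
    return {
        name: [sum(values) / n for values in zip(*(ckpt[name] for ckpt in checkpoints))]
        for name in checkpoints[0]
    }


def greedy_soup(
    checkpoints: List[StateDict],
    evaluate: Callable[[StateDict], float],
) -> StateDict:
    """Visit checkpoints in the given order and keep each one only if the
    averaged model scores at least as well as the current best."""
    kept = [checkpoints[0]]
    best_score = evaluate(uniform_soup(kept))
    for candidate in checkpoints[1:]:
        score = evaluate(uniform_soup(kept + [candidate]))
        if score >= best_score:
            kept.append(candidate)
            best_score = score
    return uniform_soup(kept)


def toy_score(state: StateDict) -> float:
    # Hypothetical stand-in for a held-out F1: higher when "w" is close to 1.0.
    return -sum((x - 1.0) ** 2 for x in state["w"])


# Hypothetical checkpoint pool; the third checkpoint is deliberately poor.
pool = [{"w": [0.8, 1.2]}, {"w": [1.2, 0.8]}, {"w": [3.0, 3.0]}]

uniform_merged = uniform_soup(pool)

# Greedy soup over checkpoints sorted by individual score, best first.
greedy_on_sorted = greedy_soup(sorted(pool, key=toy_score, reverse=True), toy_score)

# Greedy soup over the same checkpoints in a random order.
shuffled = pool[:]
random.Random(0).shuffle(shuffled)
greedy_on_random = greedy_soup(shuffled, toy_score)
```

In this toy pool the uniform soup is dragged away from the optimum by the poor checkpoint, while the sorted greedy soup rejects it. The result of the random-order variant depends on the visiting order.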
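Next, the WiSE-FT merging technique is applied to pairs of models selected from the three merged models above and the original model, producing the wise_ft_merged model. This completes the first stage of fine-tuning.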
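WiSE-FT, as described in the cited CVPR paper, interpolates linearly between the weights of two models. A minimal sketch of that interpolation follows. The toy weights, the function name `wise_ft`, and the mixing coefficient `alpha=0.5` are illustrative assumptions; the model card does not state which pairs or coefficients were actually used.

```python
from typing import Dict, List

# Plain-Python stand-in for a checkpoint: parameter name -> list of floats.
StateDict = Dict[str, List[float]]


def wise_ft(first: StateDict, second: StateDict, alpha: float) -> StateDict:
    """Linearly interpolate two models in weight space.

    alpha = 0.0 returns the first model, alpha = 1.0 returns the second.
    """
    return {
        name: [
            (1.0 - alpha) * w0 + alpha * w1
            for w0, w1 in zip(first[name], second[name])
        ]
        for name in first
    }


# Hypothetical toy weights standing in for the original model and one merged model.
original = {"w": [0.0, 2.0]}
merged = {"w": [1.0, 4.0]}

wise_ft_merged = wise_ft(original, merged, alpha=0.5)
print(wise_ft_merged)  # {'w': [0.5, 3.0]}
```

In practice the coefficient would typically be chosen by evaluating several values on held-out data.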
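The second stage of fine-tuning starts from wise_ft_merged and repeats the process above to produce the final model. The overall fine-tuning process is shown in the figure below:
The fine-tuned model pool and the merged models are evaluated on the CrossNER and TwitterNER benchmarks. The figures below report these results as crossner_f1 and other_f1, respectively.
Stage 1 fine-tuning results:
Stage 2 fine-tuning results: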
🛠️ Installation
To use this model, install the GLiNER Python library:
pip install gliner
Once the library is installed, import the GLiNER class and load the model with GLiNER.from_pretrained.
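💻 Usage example
from gliner import GLiNER

# Load the merged model from the Hugging Face Hub.
model = GLiNER.from_pretrained("xomad/gliner-model-merge-large-v1.0")

text = """
Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975 to develop and sell BASIC interpreters for the Altair 8800. During his career at Microsoft, Gates held the positions of chairman, chief executive officer, president and chief software architect, while also being the largest individual shareholder until May 2014.
"""

# Entity types to extract; GLiNER accepts arbitrary label names at inference time.
labels = ["founder", "computer", "software", "position", "date", "company"]

entities = model.predict_entities(text, labels)

for entity in entities:
    print(entity["text"], "=>", entity["label"])
Output:
Microsoft => company
Bill Gates => founder
Paul Allen => founder
April 4, 1975 => date
BASIC => software
Altair 8800 => computer
Microsoft => company
chairman => position
chief executive officer => position
president => position
chief software architect => position
May 2014 => date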
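The output above lists the same company name twice because it occurs twice in the text. If one unique list of mentions per label is more convenient, the predictions can be post-processed. The sketch below is illustrative only: `group_by_label` is a hypothetical helper, and it runs on a hand-written sample that has the same "text" and "label" keys used in the loop above, so it does not need the model.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_label(entities: List[dict]) -> Dict[str, List[str]]:
    """Collect unique entity texts per label, preserving first-seen order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for entity in entities:
        if entity["text"] not in grouped[entity["label"]]:
            grouped[entity["label"]].append(entity["text"])
    return dict(grouped)


# Hand-written sample in the shape consumed by the usage example
# (a list of dicts with "text" and "label" keys).
sample = [
    {"text": "Microsoft", "label": "company"},
    {"text": "Bill Gates", "label": "founder"},
    {"text": "Paul Allen", "label": "founder"},
    {"text": "Microsoft", "label": "company"},
]

grouped = group_by_label(sample)
print(grouped)
# {'company': ['Microsoft'], 'founder': ['Bill Gates', 'Paul Allen']}
```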
📊 Benchmarks
Performance on several zero-shot NER benchmarks (CrossNER, mit-movie, and mit-restaurant). Benchmark data is taken from https://huggingface.co/knowledgator/gliner-multitask-large-v0.5:
Model | F1 score |
---|---|
xomad/gliner-model-merge-large-v1.0 | 0.6601 |
knowledgator/gliner-multitask-v0.5 | 0.6276 |
numind/NuNER_Zero-span | 0.6196 |
gliner-community/gliner_large-v2.5 | 0.6154 |
EmergentMethods/gliner_large_news-v2.1 | 0.5876 |
urchade/gliner_large-v2.1 | 0.5754 |
Detailed performance per dataset:
| Model | Dataset | Precision | Recall | F1 score | F1 score (decimal) |
|---|---|---|---|---|---|
| xomad/gliner-model-merge-large-v1.0 | CrossNER_AI | 62.66% | 57.48% | 59.96% | 0.5996 |
| | CrossNER_literature | 73.28% | 66.42% | 69.68% | 0.6968 |
| | CrossNER_music | 74.89% | 70.67% | 72.72% | 0.7272 |
| | CrossNER_politics | 79.46% | 77.57% | 78.51% | 0.7851 |
| | CrossNER_science | 74.72% | 70.24% | 72.41% | 0.7241 |
| | mit-movie | 67.33% | 57.89% | 62.25% | 0.6225 |
| | mit-restaurant | 54.94% | 40.41% | 46.57% | 0.4657 |
| | Average | | | | 0.6601 |
| numind/NuNER_Zero-span | CrossNER_AI | 63.82% | 56.82% | 60.12% | 0.6012 |
| | CrossNER_literature | 73.53% | 58.06% | 64.89% | 0.6489 |
| | CrossNER_music | 72.69% | 67.40% | 69.95% | 0.6995 |
| | CrossNER_politics | 77.28% | 68.69% | 72.73% | 0.7273 |
| | CrossNER_science | 70.08% | 63.12% | 66.42% | 0.6642 |
| | mit-movie | 63.00% | 48.88% | 55.05% | 0.5505 |
| | mit-restaurant | 54.81% | 37.62% | 44.62% | 0.4462 |
| | Average | | | | 0.6196 |
| knowledgator/gliner-multitask-v0.5 | CrossNER_AI | 51.00% | 51.11% | 51.05% | 0.5105 |
| | CrossNER_literature | 72.65% | 65.62% | 68.96% | 0.6896 |
| | CrossNER_music | 74.91% | 73.70% | 74.30% | 0.7430 |
| | CrossNER_politics | 78.84% | 77.71% | 78.27% | 0.7827 |
| | CrossNER_science | 69.20% | 65.48% | 67.29% | 0.6729 |
| | mit-movie | 61.29% | 52.59% | 56.60% | 0.5660 |
| | mit-restaurant | 50.65% | 38.13% | 43.51% | 0.4351 |
| | Average | | | | 0.6276 |
| gliner-community/gliner_large-v2.5 | CrossNER_AI | 50.85% | 63.03% | 56.29% | 0.5629 |
| | CrossNER_literature | 64.92% | 67.21% | 66.04% | 0.6604 |
| | CrossNER_music | 70.88% | 73.10% | 71.97% | 0.7197 |
| | CrossNER_politics | 72.67% | 72.93% | 72.80% | 0.7280 |
| | CrossNER_science | 61.71% | 68.85% | 65.08% | 0.6508 |
| | mit-movie | 54.63% | 52.83% | 53.71% | 0.5371 |
| | mit-restaurant | 47.99% | 42.13% | 44.87% | 0.4487 |
| | Average | | | | 0.6154 |
| urchade/gliner_large-v2.1 | CrossNER_AI | 54.98% | 52.00% | 53.45% | 0.5345 |
| | CrossNER_literature | 59.33% | 56.47% | 57.87% | 0.5787 |
| | CrossNER_music | 67.39% | 66.77% | 67.08% | 0.6708 |
| | CrossNER_politics | 66.07% | 63.76% | 64.90% | 0.6490 |
| | CrossNER_science | 61.45% | 62.56% | 62.00% | 0.6200 |
| | mit-movie | 55.94% | 47.36% | 51.29% | 0.5129 |
| | mit-restaurant | 53.34% | 40.83% | 46.25% | 0.4625 |
| | Average | | | | 0.5754 |
| EmergentMethods/gliner_large_news-v2.1 | CrossNER_AI | 59.60% | 54.55% | 56.96% | 0.5696 |
| | CrossNER_literature | 65.41% | 56.16% | 60.44% | 0.6044 |
| | CrossNER_music | 67.47% | 63.08% | 65.20% | 0.6520 |
| | CrossNER_politics | 66.05% | 60.07% | 62.92% | 0.6292 |
| | CrossNER_science | 68.44% | 63.57% | 65.92% | 0.6592 |
| | mit-movie | 65.85% | 49.59% | 56.57% | 0.5657 |
| | mit-restaurant | 54.71% | 35.94% | 43.38% | 0.4338 |
| | Average | | | | 0.5876 |
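The F1 column in the table is the harmonic mean of precision and recall, and each average is the unweighted mean of the seven per-dataset F1 scores. As a quick sanity check, the snippet below recomputes both for xomad/gliner-model-merge-large-v1.0 from the numbers in the table. The helper name `f1_score` is only illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# CrossNER_AI row for xomad/gliner-model-merge-large-v1.0 (values in percent).
crossner_ai_f1 = f1_score(62.66, 57.48)

# Per-dataset F1 (decimal) for the same model, in table order.
per_dataset_f1 = [0.5996, 0.6968, 0.7272, 0.7851, 0.7241, 0.6225, 0.4657]
average_f1 = sum(per_dataset_f1) / len(per_dataset_f1)

print(round(crossner_ai_f1, 2))  # 59.96
print(round(average_f1, 4))      # 0.6601
```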
Author
Hoan Nguyen, from xomad.com
Citations
@misc{wortsman2022modelsoupsaveragingweights,
title={Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time},
author={Mitchell Wortsman and Gabriel Ilharco and Samir Yitzhak Gadre and Rebecca Roelofs and Raphael Gontijo-Lopes and Ari S. Morcos and Hongseok Namkoong and Ali Farhadi and Yair Carmon and Simon Kornblith and Ludwig Schmidt},
year={2022},
eprint={2203.05482},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2203.05482},
}
@InProceedings{Wortsman_2022_CVPR,
author = {Wortsman, Mitchell and Ilharco, Gabriel and Kim, Jong Wook and Li, Mike and Kornblith, Simon and Roelofs, Rebecca and Lopes, Raphael Gontijo and Hajishirzi, Hannaneh and Farhadi, Ali and Namkoong, Hongseok and Schmidt, Ludwig},
title = {Robust Fine-Tuning of Zero-Shot Models},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2022},
pages = {7959-7971}
}
@misc{stepanov2024gliner,
title={GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks},
author={Ihor Stepanov and Mykhailo Shtopko},
year={2024},
eprint={2406.12925},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@misc{zaratiana2023gliner,
title={GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer},
author={Urchade Zaratiana and Nadi Tomeh and Pierre Holat and Thierry Charnois},
year={2023},
eprint={2311.08526},
archivePrefix={arXiv},
primaryClass={cs.CL}
}








