gliner-biomed-bi-base-v1.0: an open-source model, free to use, for entity recognition in the biomedical domain

Gliner Biomed Bi Base V1.0

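Developed by Ihor

GLiNER-BioMed is a suite of efficient open biomedical named entity recognition (NER) models built on the GLiNER framework. It is designed for the biomedical domain and can recognize many entity types.

Sequence labeling

PyTorch

Language: English · License: Apache-2.0 · #Biomedical NER #Zero-shot learning #Multi-label recognition

Downloads: 25

Released: 2/19/2025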

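Model overview

The model uses synthetic annotations distilled from large generative biomedical language models and achieves state-of-the-art zero-shot and few-shot performance on biomedical entity recognition tasks.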

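Model features

Efficient open biomedical NER

Designed for the biomedical domain, the model recognizes many entity types and provides efficient named entity recognition.

Strong zero-shot and few-shot performance

Achieves state-of-the-art zero-shot and few-shot performance on biomedical entity recognition tasks.

Built on the GLiNER framework

Uses a bidirectional Transformer encoder (similar to BERT) to recognize arbitrary entity types, offering a practical alternative to traditional NER models.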

Model capabilities

Named entity recognition

Information extraction

Biomedical text analysis

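Use cases

Healthcare

Analysis of diagnostic records

Identify entities such as diseases, drugs, and lab tests in medical records.

Accurately identifies a range of biomedical entities, such as diseases, drugs, and drug dosages.

Prescription analysis

Extract drug names, dosages, and dosing frequency from prescriptions.

Efficiently identifies drug-related entities, supporting automated processing in medication management systems.

Biomedical research

Literature entity extraction

Extract key entities from biomedical literature.

Helps researchers quickly retrieve the key entities in a publication.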

🚀 GLiNER-BioMed

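GLiNER-BioMed is a suite of efficient open biomedical named entity recognition models. Built on the GLiNER framework, it uses synthetic annotations distilled from large generative biomedical language models to achieve state-of-the-art zero-shot and few-shot performance on biomedical entity recognition tasks. It offers a practical alternative to both traditional NER models and large language models.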

🚀 Quick Start

GLiNER-BioMed is a model built specifically for biomedical named entity recognition. The sections below show how to get started with it.

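✨ Key Features

Broad entity recognition: GLiNER can recognize any entity type, removing the restriction of traditional NER models to a predefined set of entities.
Efficient performance: Built on the GLiNER framework, GLiNER-biomed introduces a suite of efficient open biomedical NER models that perform well on zero-shot and few-shot biomedical entity recognition tasks.
Use of synthetic annotations: The models use synthetic annotations distilled from large generative biomedical language models, which improves their performance.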

📦 Installation

Install the official GLiNER library with pip:

pip install gliner -U

💻 Usage Examples

Basic usage

Once the GLiNER library is installed, you can load a GLiNER-biomed model and run named entity recognition:

from gliner import GLiNER

# Load the pretrained GLiNER-biomed model by its model ID.
model = GLiNER.from_pretrained("Ihor/gliner-biomed-bi-base-v1.0")

text = """
The patient, a 45-year-old male, was diagnosed with type 2 diabetes mellitus and hypertension.
He was prescribed Metformin 500mg twice daily and Lisinopril 10mg once daily.
A recent lab test showed elevated HbA1c levels at 8.2%.
"""

# Entity types to look for; labels are free-form strings.
labels = ["Disease", "Drug", "Drug dosage", "Drug frequency", "Lab test", "Lab test value", "Demographic information"]

# Keep only predictions whose confidence score passes the threshold.
entities = model.predict_entities(text, labels, threshold=0.5)

# Each entity is a dict that includes the matched text and its label.
for entity in entities:
    print(entity["text"], "=>", entity["label"])
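The entity dictionaries printed above can be post-processed with ordinary Python. The following is a minimal sketch that groups mentions by label and drops low-confidence ones. It assumes each entity is a dict with `text`, `label`, and `score` keys; the helper name `group_by_label` is introduced here for illustration, and the sample list is hand-written in that assumed shape, not real model output.

```python
from collections import defaultdict


def group_by_label(entities, min_score=0.0):
    """Group entity texts by label, keeping entities with score >= min_score."""
    grouped = defaultdict(list)
    for entity in entities:
        # Entities without a "score" key are treated as fully confident.
        if entity.get("score", 1.0) >= min_score:
            grouped[entity["label"]].append(entity["text"])
    return dict(grouped)


# Hand-written sample in the assumed output shape (NOT real model output).
sample_entities = [
    {"text": "type 2 diabetes mellitus", "label": "Disease", "score": 0.93},
    {"text": "hypertension", "label": "Disease", "score": 0.88},
    {"text": "Metformin", "label": "Drug", "score": 0.95},
    {"text": "500mg", "label": "Drug dosage", "score": 0.41},
]

print(group_by_label(sample_entities, min_score=0.5))
```

In practice, `sample_entities` would be replaced by the list returned from `model.predict_entities(...)`.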

Advanced usage

If you have a large number of entity labels and want to embed them in advance, see the snippet below:

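labels = ["your entities"]
texts = ["your texts"]

# Embed the entity labels once so the embeddings can be reused.
entity_embeddings = model.encode_labels(labels, batch_size=8)

# Run prediction on a batch of texts with the precomputed label embeddings.
outputs = model.batch_predict_with_embeds(texts, entity_embeddings, labels)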
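When the same label set is applied to many texts, the label embeddings only need to be computed once. The sketch below wraps the two calls from the snippet above in a helper that processes texts chunk by chunk. The names `predict_in_chunks`, `chunk_size`, and `embed_batch_size` are hypothetical and introduced here for illustration. `model` is assumed to be a loaded GLiNER model exposing `encode_labels` and `batch_predict_with_embeds` as used above, and `batch_predict_with_embeds` is assumed to return one result per input text.

```python
def predict_in_chunks(model, texts, labels, chunk_size=32, embed_batch_size=8):
    """Encode the labels once, then run prediction over texts chunk by chunk."""
    # Label embeddings are computed a single time and reused for every chunk.
    entity_embeddings = model.encode_labels(labels, batch_size=embed_batch_size)
    results = []
    for start in range(0, len(texts), chunk_size):
        chunk = texts[start:start + chunk_size]
        # Assumption: the call returns a list with one entry per text in the chunk.
        results.extend(
            model.batch_predict_with_embeds(chunk, entity_embeddings, labels)
        )
    return results
```

A call such as `predict_in_chunks(model, texts, labels, chunk_size=16)` would then return the predictions for all texts in input order, under the assumptions stated above.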

📚 Documentation

Model benchmarks

We tested the models on 8 complex real-world datasets and compared them with other GLiNER models.

Model	F1 score	Macro-average F1	Macro-median F1	Weighted F1
Large models
NuNER Zero	40.87	21.79	13.94	33.67
NuNER Zero span	40.26	22.51	14.27	32.52
GLiNER bio v0.1	42.34	27.10	24.44	38.32
GLiNER bio v0.2	38.66	25.36	17.02	32.42
GLiNER v1.0	47.77	29.60	21.13	40.78
GLiNER v2.0	37.38	21.42	15.44	33.11
GLiNER v2.1	48.04	29.75	28.20	43.43
GLiNER news v2.1	48.99	31.79	33.77	45.13
GLiNER v2.5	53.81	35.22	35.65	51.57
GLiNER-biomed	59.77	40.67	42.65	58.40
GLiNER-biomed-bi	54.90	35.78	31.66	50.46
Base models
GLiNER v1.0	41.61	24.98	10.27	31.59
GLiNER v2.0	34.33	24.48	22.01	30.58
GLiNER v2.1	40.25	25.26	14.41	32.64
GLiNER news v2.1	41.59	27.16	17.74	34.44
GLiNER v2.5	46.49	30.93	25.26	44.68
GLiNER-biomed	54.37	36.20	41.61	53.05
GLiNER-biomed-bi	58.31	35.22	32.39	54.91
Small models
GLiNER v1.0	40.99	22.81	7.86	31.15
GLiNER v2.0	33.55	21.12	15.76	28.78
GLiNER v2.1	38.45	23.25	10.92	30.67
GLiNER news v2.1	39.15	24.96	14.48	33.10
GLiNER v2.5	38.21	28.53	18.01	36.88
GLiNER-biomed	52.53	34.49	38.17	50.87
GLiNER-biomed-bi	56.93	33.88	33.61	53.12
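To make the comparison in the table concrete, the sketch below copies the base-model rows and reports, for each metric, the best-scoring model and its margin over the runner-up. The metric names are English renderings of the table's column headers, and the helper name `best_per_metric` is introduced here for illustration.

```python
METRICS = ["F1 score", "Macro-average F1", "Macro-median F1", "Weighted F1"]

# Base-model rows copied from the benchmark table above.
BASE_ROWS = {
    "GLiNER v1.0": [41.61, 24.98, 10.27, 31.59],
    "GLiNER v2.0": [34.33, 24.48, 22.01, 30.58],
    "GLiNER v2.1": [40.25, 25.26, 14.41, 32.64],
    "GLiNER news v2.1": [41.59, 27.16, 17.74, 34.44],
    "GLiNER v2.5": [46.49, 30.93, 25.26, 44.68],
    "GLiNER-biomed": [54.37, 36.20, 41.61, 53.05],
    "GLiNER-biomed-bi": [58.31, 35.22, 32.39, 54.91],
}


def best_per_metric(rows, metrics):
    """Return {metric: (best_model, best_score, margin_over_runner_up)}."""
    summary = {}
    for i, metric in enumerate(metrics):
        # Rank models by the i-th metric, highest first.
        ranked = sorted(rows.items(), key=lambda item: item[1][i], reverse=True)
        (best_name, best_scores), (_, second_scores) = ranked[0], ranked[1]
        margin = round(best_scores[i] - second_scores[i], 2)
        summary[metric] = (best_name, best_scores[i], margin)
    return summary


for metric, (name, score, margin) in best_per_metric(BASE_ROWS, METRICS).items():
    print(f"{metric}: {name} ({score:.2f}, +{margin:.2f} over runner-up)")
```

On the base-size rows, GLiNER-biomed-bi leads on the first and last columns, while GLiNER-biomed leads on the two macro columns.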

Join our Discord community

Connect with our community on Discord for the latest news, support, and discussion about our models.

📄 License

This project is released under the Apache-2.0 license.

📚 Citation

This work

If you use GLiNER-biomed models in your work, please cite:

@misc{yazdani2025glinerbiomedsuiteefficientmodels,
      title={GLiNER-biomed: A Suite of Efficient Models for Open Biomedical Named Entity Recognition},
      author={Anthony Yazdani and Ihor Stepanov and Douglas Teodoro},
      year={2025},
      eprint={2504.00676},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2504.00676},
}

Previous work

@misc{zaratiana2023gliner,
      title={GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer},
      author={Urchade Zaratiana and Nadi Tomeh and Pierre Holat and Thierry Charnois},
      year={2023},
      eprint={2311.08526},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

@misc{stepanov2024gliner,
      title={GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks},
      author={Ihor Stepanov and Mykhailo Shtopko},
      year={2024},
      eprint={2406.12925},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}