Open-source lettucedect-large-modernbert-en-v1 model: effective hallucination detection for RAG applications, with long-context support

Lettucedect Large Modernbert En V1

Developed by KRLabsOrg

LettuceDetect is a ModernBERT-based hallucination detection model designed for RAG applications, with support for long contexts.

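Large language model

Transformers

English · License: MIT · #Long-context hallucination detection #RAG application optimization #Token-level precision

Downloads: 438

Released: 2/10/2025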

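Model overview

The model detects hallucinations in context-answer pairs by identifying answer tokens that are not supported by the given context. It is intended for retrieval-augmented generation (RAG) applications.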

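Model features

Long-context support

Handles contexts of up to 8192 tokens, which suits tasks that require processing detailed documents.

Token-level detection

Identifies the tokens in the answer text that are not supported by the context, giving fine-grained hallucination detection.

High performance

Performs strongly on the RAGTruth dataset, outperforming prompt-based GPT-4 and fine-tuned LLaMA-2-13B, among other models.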

Model capabilities

Hallucination detection

Token classification

Long-context processing

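Use cases

Retrieval-augmented generation (RAG)

Answer verification

Verifies whether a generated answer is grounded in the given context, so that hallucinated content can be caught.

Achieves an overall example-level F1 score of 79.22% on the RAGTruth test set.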

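🚀 LettuceDetect: Hallucination Detection Model

LettuceDetect is a transformer-based model for hallucination detection on context-answer pairs, designed for retrieval-augmented generation (RAG) applications. It is built on ModernBERT, which was specifically chosen and trained for its extended context support (up to 8192 tokens). This long-context capability is critical for tasks where detailed and extensive documents must be processed to determine accurately whether an answer is supported by the given context.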

Model name: lettucedect-large-modernbert-en-v1
Organization: KRLabsOrg
GitHub: https://github.com/KRLabsOrg/LettuceDetect

🚀 Quick start

Installation

Install the 'lettucedetect' package:

pip install lettucedetect

Using the model

from lettucedetect.models.inference import HallucinationDetector

# Transformer-based method. This example loads the base checkpoint; to use the
# model described on this page, pass
# model_path="KRLabsOrg/lettucedect-large-modernbert-en-v1" instead
# (the confidence value shown below may then differ).
detector = HallucinationDetector(
    method="transformer", model_path="KRLabsOrg/lettucedect-base-modernbert-en-v1"
)

contexts = ["France is a country in Europe. The capital of France is Paris. The population of France is 67 million."]
question = "What is the capital of France? What is the population of France?"
answer = "The capital of France is Paris. The population of France is 69 million."

# Get span-level predictions indicating which parts of the answer are considered hallucinated.
predictions = detector.predict(context=contexts, question=question, answer=answer, output_format="spans")
print("Predictions:", predictions)

# Predictions: [{'start': 31, 'end': 71, 'confidence': 0.9944414496421814, 'text': ' The population of France is 69 million.'}]
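The span output can be applied directly to the answer string. The following is a minimal sketch, not part of the lettucedetect package: the helper name annotate_answer, the [[...]] markers, and the min_confidence threshold are all hypothetical and chosen only for illustration. It assumes the span dictionaries have the shape printed in the example above.

```python
def annotate_answer(answer, spans, min_confidence=0.5):
    """Wrap each flagged span of `answer` in [[...]] markers.

    `spans` is a list of dicts with 'start', 'end' and 'confidence' keys,
    in the shape printed by the usage example. `min_confidence` is a
    hypothetical threshold, not a library parameter.
    """
    kept = sorted(
        (s for s in spans if s["confidence"] >= min_confidence),
        key=lambda s: s["start"],
    )
    pieces, cursor = [], 0
    for span in kept:
        start = max(span["start"], cursor)
        if start >= span["end"]:
            continue  # span is fully covered by an earlier one
        pieces.append(answer[cursor:start])
        pieces.append("[[" + answer[start:span["end"]] + "]]")
        cursor = span["end"]
    pieces.append(answer[cursor:])
    return "".join(pieces)


answer = "The capital of France is Paris. The population of France is 69 million."
predictions = [
    {"start": 31, "end": 71, "confidence": 0.9944414496421814,
     "text": " The population of France is 69 million."}
]
print(annotate_answer(answer, predictions))
# The capital of France is Paris.[[ The population of France is 69 million.]]
```

Raising min_confidence trades recall for precision: a stricter threshold marks fewer spans, which may suit applications where false alarms are costly.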

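✨ Key features

Built on the ModernBERT architecture: extended context support (up to 8192 tokens) lets it handle detailed and extensive documents.
Accurate hallucination detection: the model is trained to identify tokens in the answer text that are not supported by the given context, and results are presented as spans.
High performance: on the RAGTruth test set, the large model lettucedetect-large-v1 achieves an overall F1 score of 79.22%, outperforming several other methods.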
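Because the model has a fixed 8192-token limit, it can be useful to check the size of an input before running detection. The sketch below is illustrative only: fits_in_context, the reserved allowance for special tokens, and the whitespace word counter are hypothetical stand-ins. For real use, a callable backed by the model's own tokenizer would replace word_count.

```python
MAX_TOKENS = 8192  # context limit stated for this model


def fits_in_context(count_tokens, contexts, question, answer,
                    max_tokens=MAX_TOKENS, reserved=8):
    """Return True if the combined input is within the token budget.

    `count_tokens` is any callable mapping a string to a token count.
    `reserved` is a hypothetical allowance for special tokens.
    """
    total = sum(count_tokens(text) for text in [*contexts, question, answer])
    return total + reserved <= max_tokens


def word_count(text):
    # Crude stand-in for a tokenizer: one token per whitespace-separated word.
    return len(text.split())


contexts = ["France is a country in Europe. The capital of France is Paris."]
question = "What is the capital of France?"
answer = "The capital of France is Paris."

print(fits_in_context(word_count, contexts, question, answer))         # True
print(fits_in_context(word_count, ["word " * 9000], question, answer))  # False
```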

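📚 Detailed documentation

Model details

Attribute	Details
Architecture	ModernBERT (large) with extended context support (up to 8192 tokens)
Task	Token classification / hallucination detection
Training dataset	RAGTruth
Language	English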

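How it works

The model is trained to identify tokens in the answer text that are not supported by the given context. During inference, it returns token-level predictions, which are then aggregated into spans. This lets users see exactly which parts of the answer are considered hallucinated.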
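The aggregation step can be pictured as merging runs of adjacent flagged tokens. The sketch below is an assumption about how such a step could work, not the library's actual implementation: the tuple layout (start, end, is_hallucinated, prob), the function name, and the choice of the maximum token probability as the span confidence are all hypothetical.

```python
def merge_tokens_into_spans(answer, token_predictions):
    """Merge runs of consecutive hallucinated tokens into character spans.

    `token_predictions` is a list of (start, end, is_hallucinated, prob)
    tuples in answer order; this layout is an assumption of the sketch.
    """
    spans = []
    current = None
    for start, end, is_hallucinated, prob in token_predictions:
        if is_hallucinated:
            if current is None:
                # Open a new span at the first flagged token of a run.
                current = {"start": start, "end": end, "confidence": prob}
            else:
                # Extend the open span; keep the highest token probability.
                current["end"] = end
                current["confidence"] = max(current["confidence"], prob)
        elif current is not None:
            # A supported token closes the open span.
            spans.append(current)
            current = None
    if current is not None:
        spans.append(current)
    for span in spans:
        span["text"] = answer[span["start"]:span["end"]]
    return spans


answer = "Paris has 69 million people."
tokens = [
    (0, 5, False, 0.02),    # "Paris"
    (5, 9, False, 0.05),    # " has"
    (9, 12, True, 0.97),    # " 69"
    (12, 20, True, 0.91),   # " million"
    (20, 27, False, 0.10),  # " people"
    (27, 28, False, 0.01),  # "."
]
print(merge_tokens_into_spans(answer, tokens))
# [{'start': 9, 'end': 20, 'confidence': 0.97, 'text': ' 69 million'}]
```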

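Performance

Example-level results

We evaluated our models on the test set of the RAGTruth dataset. Our large model, lettucedetect-large-v1, achieves an overall F1 score of 79.22%, outperforming prompt-based methods (such as GPT-4, 63.4%) and encoder-based models (such as Luna, 65.4%). It also surpasses fine-tuned LLaMA-2-13B (78.7%) and is competitive with the state-of-the-art fine-tuned LLaMA-3-8B (83.9%). Overall, lettucedetect-large-v1 and lettucedect-base-v1 are strong performers while remaining very effective in inference settings.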

[Figure: Example-level results]
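For readers unfamiliar with example-level scoring, the sketch below shows the standard precision, recall, and F1 computation when each answer is reduced to a single yes/no label (does it contain any hallucinated span?). It uses toy data and is not the authors' evaluation script. The function name and the rule "any predicted span means hallucinated" are assumptions.

```python
def example_level_scores(gold, predicted):
    """Precision, recall and F1 for the 'contains a hallucination' class.

    `gold` and `predicted` are equal-length lists of booleans, one per answer.
    """
    tp = sum(g and p for g, p in zip(gold, predicted))
    fp = sum((not g) and p for g, p in zip(gold, predicted))
    fn = sum(g and (not p) for g, p in zip(gold, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy data: one list of predicted spans per answer.
span_predictions = [[{"start": 31, "end": 71}], [], [{"start": 0, "end": 12}], []]
# Assumption: an answer counts as hallucinated if any span was predicted.
predicted = [len(spans) > 0 for spans in span_predictions]
gold = [True, False, False, True]

print(example_level_scores(gold, predicted))
# (0.5, 0.5, 0.5)
```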

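Span-level results

At the span level, our model achieves the best scores across all data types, significantly outperforming previous models. Note that we do not compare against models such as RAG-HAT here, because they do not report span-level evaluation.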

[Figure: Span-level results]
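One common way to score span predictions is character-level overlap between predicted and gold spans. The sketch below illustrates that idea on the sample answer from the usage example. Whether it matches the authors' evaluation exactly is an assumption, and the function names and the gold span are hypothetical.

```python
def span_chars(spans):
    """Set of character indices covered by a list of {'start', 'end'} spans."""
    covered = set()
    for span in spans:
        covered.update(range(span["start"], span["end"]))
    return covered


def span_level_scores(gold_spans, predicted_spans):
    """Character-overlap precision, recall and F1 for one answer."""
    gold, pred = span_chars(gold_spans), span_chars(predicted_spans)
    overlap = len(gold & pred)
    precision = overlap / len(pred) if pred else 0.0
    recall = overlap / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


answer = "The capital of France is Paris. The population of France is 69 million."
gold_spans = [{"start": 60, "end": 70}]       # hypothetical gold label: "69 million"
predicted_spans = [{"start": 31, "end": 71}]  # span from the usage example

precision, recall, f1 = span_level_scores(gold_spans, predicted_spans)
print(answer[60:70], precision, recall)
# 69 million 0.25 1.0
```

In this toy case the prediction covers the whole gold span (recall 1.0) but also 30 extra characters, so precision drops to 0.25.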

📄 License

This project is released under the MIT License.

🔖 Citation

If you use the model or the tool, please cite the following paper:

@misc{Kovacs:2025,
      title={LettuceDetect: A Hallucination Detection Framework for RAG Applications}, 
      author={Ádám Kovács and Gábor Recski},
      year={2025},
      eprint={2502.17125},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.17125}, 
}