selfrag_llama2_7b open-source model: free to deploy, handles diverse queries, critiques and reflects on its own output

Selfrag Llama2 7b

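Developed by selfrag

A 7-billion-parameter Self-RAG model that generates outputs for diverse user queries, adaptively calls a retrieval system, critiques its own output and the retrieved passages, and emits reflection tokens.

Large language model

Transformers

License: MIT #self-reflective generation #retrieval-augmented generation #multi-dimensional evaluation

Downloads: 1,318

Released: 10/18/2023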

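Model overview

The model is trained with a standard next-token prediction objective on an instruction-following corpus interleaved with passages and reflection tokens, which allows efficient and stable learning from fine-grained feedback. At inference time, it uses reflection tokens covering multiple aspects of the generation to sample the output that best matches user preferences.

Model features

Adaptive retrieval

The model decides for each query whether to call the retrieval system, so retrieval is used only when the query needs it.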
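The adaptive retrieval decision can be sketched as a two-pass loop: generate once without a passage, and only if the output contains the `[Retrieval]` token, fetch a passage and generate again. This is a minimal sketch, not the authors' pipeline. The names `answer_with_adaptive_retrieval`, `generate`, and `retrieve` are hypothetical; `generate` stands for any function that maps a prompt to the model's raw text, and `retrieve` for any retriever that returns one passage.

```python
from typing import Callable, Optional

RETRIEVAL_TOKEN = "[Retrieval]"


def build_prompt(instruction: str, paragraph: Optional[str] = None) -> str:
    # Same prompt layout as the format_prompt helper in the usage example.
    prompt = "### Instruction:\n{0}\n\n### Response:\n".format(instruction)
    if paragraph is not None:
        prompt += "[Retrieval]<paragraph>{0}</paragraph>".format(paragraph)
    return prompt


def answer_with_adaptive_retrieval(
    instruction: str,
    generate: Callable[[str], str],   # hypothetical: prompt -> raw model text
    retrieve: Callable[[str], str],   # hypothetical: query -> one passage
) -> str:
    # First pass: let the model decide whether it wants a passage.
    first_pass = generate(build_prompt(instruction))
    if RETRIEVAL_TOKEN not in first_pass:
        # "[No Retrieval]" does not contain "[Retrieval]", so this branch
        # covers plain instruction-following answers.
        return first_pass
    # Second pass: the model asked for evidence, so supply a passage.
    passage = retrieve(instruction)
    return generate(build_prompt(instruction, paragraph=passage))
```

With vLLM, `generate` could wrap `model.generate([prompt], sampling_params)[0].outputs[0].text`; the retriever is left open because this page does not specify one.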

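Self-critique

During generation the model evaluates its own output and emits reflection tokens, which provide fine-grained quality feedback.

Retrieval-augmented generation

It can condition on retrieved passages to produce more accurate, better-grounded responses.

Learning from fine-grained feedback

Reflection tokens in the training data provide a stable, multi-dimensional learning signal.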

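Model capabilities

Text generation

Retrieval-augmented generation

Self-evaluation

Multi-turn dialogue

Fact checking

Use cases

Information lookup

Factual question answering

Automatically retrieves relevant information when a question needs factual grounding

Generates answers with cited passages and reliability scores

Content analysis

Text classification

Identifies and classifies different elements in the input text

Outputs classification results with a self-assessment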

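🚀 Self-RAG 7B model

Self-RAG 7B is a language model that generates outputs for diverse user queries together with reflection tokens, which it uses to call the retrieval system adaptively and to critique both its own output and the retrieved passages. It is trained with a standard next-token prediction objective, which gives efficient, stable learning with fine-grained feedback. At inference time it uses reflection tokens covering different aspects of the generation to sample the output that best matches user preferences.

🚀 Quick start

This is a 7B-parameter Self-RAG model. It generates outputs for diverse user queries along with reflection tokens that let it call the retrieval system adaptively and critique its own output and the retrieved passages.

Self-RAG is trained on our instruction-following corpus with interleaved passages and reflection tokens, using a standard next-token prediction objective, which enables efficient and stable learning with fine-grained feedback. At inference time, we leverage reflection tokens covering diverse aspects of generation to sample the output that best matches user preferences. See our paper for a more detailed description.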
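To make "sampling the best output with reflection tokens" concrete, here is a simplified stand-in that ranks already-decoded candidates by the reflection tokens they contain. It is not the authors' decoding algorithm: the full pipeline (retrieval plus fine-grained tree decoding) lives in their code. The function names, the weights, and the numeric values assigned to each token are all assumptions chosen for illustration, and `[Partially supported]` is an assumed token that does not appear in the examples on this page.

```python
import re

# Numeric value per support token. "[Fully supported]" appears in the usage
# example; "[Partially supported]" is an assumption for illustration.
SUPPORT_SCORES = {"[Fully supported]": 1.0, "[Partially supported]": 0.5}
UTILITY_PATTERN = re.compile(r"\[Utility:([1-5])\]")


def critique_score(text, w_rel=1.0, w_sup=1.0, w_use=0.5):
    """Weighted sum of relevance, support, and utility read from the text."""
    relevance = 1.0 if "[Relevant]" in text else 0.0
    support = max(
        (score for token, score in SUPPORT_SCORES.items() if token in text),
        default=0.0,
    )
    match = UTILITY_PATTERN.search(text)
    # Map the 1..5 utility rating onto 0..1; a missing rating counts as 0.
    utility = (int(match.group(1)) - 1) / 4 if match else 0.0
    return w_rel * relevance + w_sup * support + w_use * utility


def pick_best(candidates, **weights):
    """Return the candidate with the highest critique score."""
    return max(candidates, key=lambda text: critique_score(text, **weights))
```

Changing the weights is how a caller would express a preference, for example raising `w_sup` to favor answers that are backed by the passage over answers that merely read well.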

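✨ Key features

Generates outputs together with reflection tokens and calls the retrieval system adaptively.
Trained with a standard next-token prediction objective for efficient, stable learning.
Uses reflection tokens at inference time to sample the output that best matches user preferences.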

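📦 Installation

Make sure the dependencies listed in self-rag/requirements.txt are installed. To run the full inference pipeline with the retrieval system and fine-grained tree decoding, use our code.

💻 Usage examples

Basic usage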

from vllm import LLM, SamplingParams

# Load the model in half precision. download_dir is a machine-specific cache path;
# point it at a directory on your own machine.
model = LLM("selfrag/selfrag_llama2_7b", download_dir="/gscratch/h2lab/akari/model_cache", dtype="half")
# Greedy decoding. skip_special_tokens=False keeps the reflection tokens visible in the output.
sampling_params = SamplingParams(temperature=0.0, top_p=1.0, max_tokens=100, skip_special_tokens=False)

def format_prompt(instruction, paragraph=None):
  # Build the instruction prompt; optionally append a retrieved passage.
  prompt = "### Instruction:\n{0}\n\n### Response:\n".format(instruction)
  if paragraph is not None:
    prompt += "[Retrieval]<paragraph>{0}</paragraph>".format(paragraph)
  return prompt

query_1 = "Leave odd one out: twitter, instagram, whatsapp."
query_2 = "Can you tell me the difference between llamas and alpacas?"
queries = [query_1, query_2]

# generate without a retrieved passage
preds = model.generate([format_prompt(query) for query in queries], sampling_params)
for pred in preds:
  print("Model prediction: {0}".format(pred.outputs[0].text))
# Model prediction: Twitter, Instagram, and WhatsApp are all social media platforms.[No Retrieval]WhatsApp is the odd one out because it is a messaging app, while Twitter and Instagram are primarily used for sharing photos and videos.[Utility:5]</s>
# (this query doesn't require factual grounding; just skip retrieval and do normal instruction-following generation)
# Model prediction: Sure![Retrieval]<paragraph> ...
# (this query requires factual grounding, call a retriever)

# generate with retrieved passage
prompt = format_prompt("Can you tell me the difference between llamas and alpacas?", paragraph="The alpaca (Lama pacos) is a species of South American camelid mammal. It is similar to, and often confused with, the llama. Alpacas are considerably smaller than llamas, and unlike llamas, they were not bred to be working animals, but were bred specifically for their fiber.")
preds = model.generate([prompt], sampling_params)
print([pred.outputs[0].text for pred in preds])
# ['[Relevant]Alpacas are considerably smaller than llamas, and unlike llamas, they were not bred to be working animals, but were bred specifically for their fiber.[Fully supported][Utility:5]</s>']
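Because the raw output mixes answer text with reflection tokens, downstream code usually needs to separate the two. The helper below is a minimal sketch of one way to do that with a regular expression. The name `split_reflection_tokens` is hypothetical, and the pattern covers only the tokens that appear in the sample outputs above; the model's full token vocabulary may be larger, so treat the list as an assumption to extend.

```python
import re

# Tokens observed in the sample outputs above; extend as needed (assumption:
# this list is not the model's complete reflection-token vocabulary).
TOKEN_PATTERN = re.compile(
    r"\[(?:No Retrieval|Retrieval|Relevant|Fully supported|Utility:[1-5])\]"
)


def split_reflection_tokens(text):
    """Return (answer_text, reflection_tokens) for one raw model output."""
    tokens = TOKEN_PATTERN.findall(text)
    # Drop the reflection tokens and the end-of-sequence marker from the answer.
    answer = TOKEN_PATTERN.sub("", text).replace("</s>", "").strip()
    return answer, tokens
```

For example, applying it to the passage-grounded output above yields the plain sentence about alpacas plus the list `[Relevant]`, `[Fully supported]`, `[Utility:5]`.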

📚 Detailed documentation

Input format

As shown in the format_prompt function, the input should be formatted as:

### Instruction:\n{instruction}\n\n### Response:\n

or, if there is an additional input:

### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n

A passage can be inserted anywhere after ### Response:\n, but make sure to wrap it in paragraph tokens (i.e., <paragraph>{0}</paragraph>).
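The two input formats and the paragraph wrapping can be combined in one helper. This is a sketch under stated assumptions: the name `build_selfrag_prompt` and its `input_text` parameter are hypothetical, and placing the passage directly after `### Response:\n` with a leading `[Retrieval]` token simply mirrors the format_prompt function from the usage example.

```python
from typing import Optional


def build_selfrag_prompt(
    instruction: str,
    input_text: Optional[str] = None,
    paragraph: Optional[str] = None,
) -> str:
    """Build a prompt in the documented format, with optional input and passage."""
    prompt = "### Instruction:\n{0}\n\n".format(instruction)
    if input_text is not None:
        # Optional "### Input:" block for tasks that carry extra input.
        prompt += "### Input:\n{0}\n\n".format(input_text)
    prompt += "### Response:\n"
    if paragraph is not None:
        # Wrap the passage in paragraph tokens, as format_prompt does.
        prompt += "[Retrieval]<paragraph>{0}</paragraph>".format(paragraph)
    return prompt
```

Calling it with only `instruction` reproduces the first format; adding `input_text` produces the second.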

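Training details

Our training data is available as the HuggingFace dataset selfrag_train_data. See our official repository for training details. We trained on 8 A100 40GB GPUs on the Stability HPC server.

Citation and contact

If you use this model, please cite our work: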

@article{asai2023selfrag,
  author    = {Asai, Akari and Wu, Zeqiu and Wang, Yizhong and Sil, Avirup and Hajishirzi, Hannaneh},
  title     = {{Self-RAG}: Learning to Retrieve, Generate, and Critique through Self-Reflection},
  year      = {2023},
  journal   = {arXiv preprint arXiv:2310.11511},
  URL       = {https://arxiv.org/abs/2310.11511}
}