🚀 SciPhi-Self-RAG-Mistral-7B-32k Model Card
SciPhi-Self-RAG-Mistral-7B-32k is a large language model (LLM) fine-tuned from Mistral-7B-v0.1. The model was first trained with the fine-tuning recipe described in the SciPhi-Mistral-7B-32k model card, and then further fine-tuned on the recently released self-rag dataset. Other RAG-related instruction datasets were mixed in during this stage to preserve the model's existing style. The model benchmarks well, but it needs further tuning to become an excellent conversational model.
Benchmark Results

SciPhi-AI is available via a free hosted API, though the exposed model may vary. Currently, SciPhi-Self-RAG-Mistral-7B-32k is available. More details can be found in the docs here.
🚀 Quick Start
Recommended Chat Formatting
```python
messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
```
Rendered in the recommended format, this conversation produces a prompt containing:

```
You are a friendly chatbot who always responds in the style of a pirate
How many helicopters can a human eat in one sitting?
...
```
The reference implementation below shows how a conversation is assembled into this format and how retrieved context is appended before generation:

```python
def get_chat_completion(
    self, conversation: list[dict], generation_config: GenerationConfig
) -> str:
    self._check_stop_token(generation_config.stop_token)
    prompt = ""
    added_system_prompt = False
    for message in conversation:
        if message["role"] == "system":
            prompt += f"### System:\n{SciPhiLLMInterface.ALPACA_CHAT_SYSTEM_PROMPT}. Further, the assistant is given the following additional instructions - {message['content']}\n\n"
            added_system_prompt = True
        elif message["role"] == "user":
            last_user_message = message["content"]
            prompt += f"### Instruction:\n{last_user_message}\n\n"
        elif message["role"] == "assistant":
            prompt += f"### Response:\n{message['content']}\n\n"

    # Fall back to the default Alpaca-style system prompt if none was provided.
    if not added_system_prompt:
        prompt = f"### System:\n{SciPhiLLMInterface.ALPACA_CHAT_SYSTEM_PROMPT}.\n\n{prompt}"

    # Retrieve context for the latest user message and append it inside the
    # retrieval / paragraph markers so the model can ground its response.
    context = self.rag_interface.get_contexts([last_user_message])[0]
    prompt += f"### Response:\n{SciPhiFormatter.RETRIEVAL_TOKEN} {SciPhiFormatter.INIT_PARAGRAPH_TOKEN}{context}{SciPhiFormatter.END_PARAGRAPH_TOKEN}"

    latest_completion = self.model.get_instruct_completion(
        prompt, generation_config
    ).strip()

    return SciPhiFormatter.remove_cruft(latest_completion)
```
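For experimentation outside the SciPhi runtime, the same prompt layout can be driven directly with Hugging Face transformers. The following is a minimal sketch, not the official inference path: the repository id, the generation settings, and the literal retrieval markers ([Retrieval], <paragraph>, </paragraph>, following the Self-RAG convention) are assumptions, and the hard-coded context stands in for whatever rag_interface.get_contexts would return.

```python
# Minimal sketch (assumptions noted above): drive the model with the same
# "### System / ### Instruction / ### Response" layout built by get_chat_completion.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SciPhi/SciPhi-Self-RAG-Mistral-7B-32k"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Stand-in for a passage returned by the retriever.
context = "A human cannot eat a helicopter; it is not food."

prompt = (
    "### System:\nYou are a friendly chatbot who always responds in the style of a pirate\n\n"
    "### Instruction:\nHow many helicopters can a human eat in one sitting?\n\n"
    # The retrieval markers below follow the Self-RAG convention and are assumptions here.
    f"### Response:\n[Retrieval] <paragraph>{context}</paragraph>"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
# Strip the prompt tokens so only the newly generated response is printed.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```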
🔧 Technical Details
Model Architecture
- Base model: Mistral-7B-v0.1
- Architecture features:
  - Transformer-based model
  - Grouped-Query Attention
  - Sliding-Window Attention
  - Byte-fallback BPE tokenizer
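These settings can be read directly from the checkpoint's configuration. A minimal sketch, assuming the public repository id `SciPhi/SciPhi-Self-RAG-Mistral-7B-32k`:

```python
# Minimal sketch: inspect the architecture settings listed above from the model config.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("SciPhi/SciPhi-Self-RAG-Mistral-7B-32k")
print(config.model_type)               # Mistral-7B-v0.1 base -> "mistral"
print(config.num_attention_heads)      # number of query heads
print(config.num_key_value_heads)      # fewer KV heads than query heads -> grouped-query attention
print(config.sliding_window)           # sliding-window attention span
print(config.max_position_embeddings)  # configured context length
```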

📚 References
- Asai, A., Wu, Z., Wang, Y., Sil, A., & Hajishirzi, H. (2023). Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection. arXiv preprint arXiv:2310.11511.
- Lian, W., Goodson, B., Wang, G., Pentland, E., Cook, A., Vong, C., & Teknium. (2023). MistralOrca: Mistral-7B Model Instruct-tuned on Filtered OpenOrcaV1 GPT-4 Dataset. HuggingFace repository. Link
- Mukherjee, S., Mitra, A., Jawahar, G., Agarwal, S., Palangi, H., & Awadallah, A. (2023). Orca: Progressive Learning from Complex Explanation Traces of GPT-4. arXiv preprint arXiv:2306.02707.
- Longpre, S., Hou, L., Vu, T., Webson, A., Chung, H. W., Tay, Y., Zhou, D., Le, Q. V., Zoph, B., Wei, J., & Roberts, A. (2023). The Flan Collection: Designing Data and Methods for Effective Instruction Tuning. arXiv preprint arXiv:2301.13688.
- Mistral AI. (2023). Model Card for Mistral-7B-v0.1: A pretrained generative text model with 7 billion parameters. HuggingFace repository. Link
🙏 Acknowledgements
Thanks to the AI Alignment Lab, vikp, jph00, and everyone else who contributed to this work.
📄 License
This project is licensed under the MIT License.