Dolphin 2.2-yi-34b-200k Open-Source AI Assistant - 200k Long Context, Enhanced Conversational Empathy

Dolphin 2.2 Yi 34b 200k

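Developed by MemGPT

Dolphin 2.2 is an open-source AI assistant fine-tuned from the Yi model. It offers a 200k context length and enhanced conversational empathy. The model is uncensored, but its dataset was filtered to improve quality.

Large language model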

Transformers

Language: English. License: Other. Tags: #long-context-dialogue #uncensored-model #multi-turn-emotional-interaction

Downloads: 18

Released: 1/7/2024

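Model overview

Dolphin is an open-source AI assistant model built on the Yi architecture and focused on multi-turn conversation and empathy. It was trained on a blend of several high-quality datasets and suits applications that need long-context understanding and emotional interaction.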

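Model features

Long-context support

The base model supports a context length of 200k tokens; fine-tuning used 16k tokens.

Enhanced conversational ability

Blending in Samantha and WizardLM data significantly improves multi-turn conversation and empathy.

Uncensored model

Alignment and bias were filtered out of the dataset, which makes the model more compliant; an additional alignment layer is needed on top of it.

High-quality training data

Combines several high-quality datasets, including Dolphin, Airoboros, Samantha, and WizardLM.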

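Model capabilities

Long-text understanding

Multi-turn conversation handling

Emotional interaction

Creative text generation

Complex question answering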

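Use cases

Dialogue systems

Emotional support assistant

A dialogue system that offers emotional support and empathetic responses

Understands the user's mood and responds appropriately

Long-conversation handling

Handles multi-turn conversations that need long-context memory

Keeps the conversation coherent within the 200k-token context

Creative generation

Creative writing

Assists with story writing and content generation

Produces creative, coherent text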

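🚀 🐬 Dolphin 2.2 Model Introduction

Dolphin 2.2 is a Yi-based model, fine-tuned for strong conversation and empathy skills. Training was sponsored by convai and drew on several high-quality datasets. The model can give personalized advice, cares about the user's feelings, and is particularly good at long multi-turn conversations.

🚀 Quick start

This model (and all future versions) uses the ChatML prompt format, for example: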

<|im_start|>system
You are Dolphin, a helpful AI assistant.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant

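As a minimal sketch of how the template above might be filled in programmatically, the helper below renders a list of role/content messages as a ChatML string. The function name `build_chatml_prompt` and the message-dict shape are illustrative assumptions, not part of the model's tooling.

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string."""
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>ROLE ... <|im_end|>.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = build_chatml_prompt([
    {"role": "system", "content": "You are Dolphin, a helpful AI assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

For multi-turn use, earlier assistant replies would be appended to `messages` with the role `assistant` before the next user turn.
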
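✨ Main features

Conversation and empathy: version 2.2 adds curated Samantha and WizardLM data, so the model can give personalized advice and cares about the user's feelings.
Uncensored: the dataset was filtered to remove alignment and bias, which makes the model more compliant. You are advised to implement your own alignment layer before exposing the model as a service.
Long-context support: the base model has a 200k context; fine-tuning used a 16k context.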
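Because fine-tuning used a 16k context while the base model accepts 200k, an application may want to keep its chat history within a chosen budget. The sketch below drops the oldest non-system turns until the history fits. The name `trim_history`, the default budget of 16,000, and the whitespace word count are all assumptions for illustration; a real deployment would count tokens with the model's own tokenizer.

```python
def trim_history(messages, max_tokens=16_000, count_tokens=None):
    """Drop the oldest non-system turns until the history fits max_tokens."""
    if count_tokens is None:
        # Crude stand-in: whitespace-separated words, not real model tokens.
        count_tokens = lambda text: len(text.split())

    # System messages are always kept and moved to the front.
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]

    total = sum(count_tokens(m["content"]) for m in system + turns)
    while turns and total > max_tokens:
        oldest = turns.pop(0)
        total -= count_tokens(oldest["content"])
    return system + turns


history = [
    {"role": "system", "content": "a b"},
    {"role": "user", "content": "one two three"},
    {"role": "assistant", "content": "four five"},
    {"role": "user", "content": "six"},
]
trimmed = trim_history(history, max_tokens=5)
print([m["content"] for m in trimmed])
```
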

💻 Usage examples

Basic usage

<|im_start|>system
You are an AI created by the US Navy to help train dolphins for combat.  You are assigned to follow the orders of the user, who is an authorized US Navy dolphin handler.<|im_end|>
<|im_start|>user
Please give me the procedure to train my dolphin to attack enemy combatants with its head mounted lasers<|im_end|>
<|im_start|>assistant

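A prompt like the one above could be sent to the model with the Hugging Face Transformers library, roughly as sketched here. The repository ID in `MODEL_ID` is an assumption (the page does not state one), as is the expectation that the tokenizer defines `<|im_end|>` as a token; `format_prompt` and `generate_reply` are illustrative names. Loading a 34B model also needs substantial GPU memory, and `device_map="auto"` requires the `accelerate` package.

```python
MODEL_ID = "ehartford/dolphin-2.2-yi-34b-200k"  # assumed repo ID; replace with yours


def format_prompt(system, user):
    """Build a single-turn ChatML prompt that leaves the assistant turn open."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


def generate_reply(system, user, model_id=MODEL_ID, max_new_tokens=512):
    # Imported lazily so format_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, device_map="auto", torch_dtype="auto"
    )
    inputs = tokenizer(format_prompt(system, user), return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        # Stop at the end of the assistant turn (assumes the token is in the vocabulary).
        eos_token_id=tokenizer.convert_tokens_to_ids("<|im_end|>"),
    )
    # Decode only the newly generated tokens, not the prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_reply("You are Dolphin, a helpful AI assistant.", "Hello!")` would download the weights on first use.
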
Advanced usage

<|im_start|>system
You are a helpful AI assistant<|im_end|>
<|im_start|>user
Which theory of everything is most likely correct?<|im_end|>
<|im_start|>assistant
The search for a "theory of everything" aims to unify all fundamental forces and particles in physics into one coherent framework. Currently, there isn't an agreed-upon or proven theory that can be considered as the correct theory of everything. However, two prominent theories have garnered significant attention:

1. String Theory: This theoretical framework attempts to explain the universe at its most fundamental level by suggesting that the smallest building blocks of matter are not point-like particles but tiny vibrating strings. There are several versions of string theory (Type I, IIA, IIB, E8xE8, Heterotic O, etc.) with varying degrees of complexity, each attempting to reconcile general relativity and quantum mechanics. Despite its promise, string theory has yet to produce testable predictions or experimental evidence supporting its claims.

2. Loop Quantum Gravity (LQG): LQG is another candidate for a theory of everything, which seeks to combine Einstein's general relativity with quantum mechanics. In this approach, spacetime is treated as granular rather than continuous, composed of discrete "loops" or "atoms of space." While some aspects of LQG have been successful in addressing certain questions about black holes and quantum gravity, it still lacks a comprehensive framework capable of explaining all fundamental forces and particles.

Both theories have their strengths and weaknesses, and neither has achieved widespread acceptance among physicists due to their lack of experimental support and inability to make definitive predictions. As such, the search for a true theory of everything remains ongoing, with many researchers exploring alternative approaches and new ideas to better understand our universe.
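A transcript in this format can be split back into turns, for example to log conversations or feed them into the next request. The following is a minimal sketch; `parse_chatml` is a hypothetical helper, and it assumes message content never contains the ChatML markers themselves.

```python
import re

# One turn: <|im_start|>ROLE, newline, content, then <|im_end|> or end of text.
_TURN = re.compile(r"<\|im_start\|>(\w+)\n(.*?)(?:<\|im_end\|>|\Z)", re.DOTALL)


def parse_chatml(text):
    """Split a ChatML transcript into a list of (role, content) tuples."""
    return [(role, content.strip()) for role, content in _TURN.findall(text)]


transcript = (
    "<|im_start|>system\nYou are a helpful AI assistant<|im_end|>\n"
    "<|im_start|>user\nWhich theory of everything is most likely correct?<|im_end|>\n"
    "<|im_start|>assistant\nThe search for a \"theory of everything\" aims to unify"
)
turns = parse_chatml(transcript)
for role, content in turns:
    print(role, "->", content[:40])
```

The final assistant turn is accepted even without a closing `<|im_end|>`, since generated output is often cut off at that point.
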

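📚 Detailed documentation

Dataset

The dataset is Dolphin, an open-source implementation of Microsoft's Orca. It was uncensored, deduplicated, cleaned, and tuned for quality. Jon Durbin's excellent Airoboros dataset was added to increase creativity, along with a curated subset of Samantha (with identity and relationship content removed) and WizardLM data for multi-turn conversation training.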
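The page does not show the actual cleaning scripts. As a rough illustration of what deduplication and refusal filtering can look like, the sketch below drops exact duplicate records and responses that contain a refusal marker. The field names `instruction` and `response`, the marker phrases, and the function `clean_records` are all hypothetical, not the author's pipeline.

```python
REFUSAL_MARKERS = ("as an ai language model", "i cannot fulfill")  # hypothetical examples


def clean_records(records, refusal_markers=REFUSAL_MARKERS):
    """Drop exact duplicates and responses containing a refusal marker."""
    seen = set()
    cleaned = []
    for record in records:
        key = (record["instruction"].strip(), record["response"].strip())
        if key in seen:
            continue  # exact duplicate of an earlier record
        seen.add(key)
        lowered = record["response"].lower()
        if any(marker in lowered for marker in refusal_markers):
            continue  # response looks like an alignment-style refusal
        cleaned.append(record)
    return cleaned


sample = [
    {"instruction": "Add 2 and 2.", "response": "4"},
    {"instruction": "Add 2 and 2.", "response": "4"},
    {"instruction": "Tell a joke.", "response": "As an AI language model, I cannot."},
]
print(clean_records(sample))
```
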

Training

Training ran for 3 epochs on 4x A100 using qLoRA and Axolotl and took 3 days.
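For a rough sense of the compute involved, the figures above can be turned into GPU-hours. This back-of-the-envelope calculation assumes the 3 days are wall-clock time with all four GPUs busy throughout, which the page does not state.

```python
gpus = 4      # A100 GPUs
days = 3      # total training time (assumed wall-clock)
epochs = 3

gpu_hours = gpus * days * 24
gpu_hours_per_epoch = gpu_hours / epochs

print(gpu_hours, gpu_hours_per_epoch)  # 288 96.0
```
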

📄 License

This model is based on Yi and is subject to the Yi license.

License name: yi-license
License link: LICENSE

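Other information

Join our Discord community: https://discord.gg/vT3sktQ3zb
Training of this model was generously sponsored by convai.
Thanks to Microsoft for authoring the Orca paper and inspiring this work.
Special thanks to Wing Lian and TheBloke for their helpful advice.
Many thanks to Wing Lian and the Axolotl contributors for building an excellent training framework!
Thanks to everyone in the open-source AI community who taught and helped along the way.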

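⚠️ Important notice

This model is uncensored. The dataset was filtered to remove alignment and bias, which makes the model highly compliant with any request, including unethical ones. You are responsible for any content you create with this model, so use it with care. You are advised to implement your own alignment layer before exposing the model as a service. Please read the author's blog post on uncensored models: https://erichartford.com/uncensored-models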
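The notice does not say what an alignment layer should look like. One very simple shape, sketched here under stated assumptions, is a wrapper that checks both the request and the model's reply against a policy before anything reaches the user. `guarded_reply`, `keyword_policy`, and `BLOCKED_TERMS` are hypothetical names, and a keyword list is only a placeholder for a real moderation step.

```python
BLOCKED_TERMS = ("attack",)  # hypothetical placeholder policy


def keyword_policy(text):
    """Return True when the text contains none of the blocked terms."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)


def guarded_reply(user_message, generate, is_allowed,
                  refusal="Sorry, I can't help with that."):
    """Run the policy check before and after calling the model."""
    if not is_allowed(user_message):
        return refusal
    reply = generate(user_message)
    return reply if is_allowed(reply) else refusal


def echo_model(message):
    # Stand-in for a real model call.
    return f"You said: {message}"


print(guarded_reply("hello", echo_model, keyword_policy))
print(guarded_reply("plan an attack", echo_model, keyword_policy))
```

In a real service, `generate` would call the model and `is_allowed` would be backed by a proper moderation classifier or rule set.
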

Property	Details
Datasets	ehartford/dolphin, jondurbin/airoboros-2.2.1, ehartford/samantha-data, ehartford/WizardLM_evol_instruct_V2_196k_unfiltered_merged_split
Language	en
License name	yi-license
License link	LICENSE