🚀 orca_mini_3b
orca_mini_3b is an OpenLLaMa-3B model trained on explain-tuned datasets, built from the instructions and inputs of the WizardLM, Alpaca, and Dolly-V2 datasets using the dataset construction approaches from the Orca research paper.
🚀 Quick Start
You can try orca-mini-3b for free on Google Colab with a T4 GPU.
✨ Key Features
- Based on the OpenLLaMa-3B model, trained on explain-tuned datasets.
- Built from the instructions and inputs of the WizardLM, Alpaca, and Dolly-V2 datasets, using the dataset construction approaches from the Orca research paper.
- Helps the student model (i.e., this model) learn the thought process of the teacher model (ChatGPT).
📦 Installation
The original documentation does not mention installation steps.
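The original card lists no install command, so the following is an assumption based on what the usage example below imports: PyTorch, Transformers, and SentencePiece (required by `LlamaTokenizer`).

```bash
pip install torch transformers sentencepiece
```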
💻 Usage Examples
Basic Usage
```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

# Hugging Face model path
model_path = 'psmathur/orca_mini_3b'

tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map='auto',
)

# Generate text function
def generate_text(system, instruction, input=None):
    # Build the prompt in the "### System / ### User / ### Input / ### Response" format
    if input:
        prompt = f"### System:\n{system}\n\n### User:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
    else:
        prompt = f"### System:\n{system}\n\n### User:\n{instruction}\n\n### Response:\n"

    tokens = tokenizer.encode(prompt)
    tokens = torch.LongTensor(tokens).unsqueeze(0)
    tokens = tokens.to('cuda')

    instance = {'input_ids': tokens, 'top_p': 1.0, 'temperature': 0.7, 'generate_len': 1024, 'top_k': 50}

    length = len(tokens[0])
    with torch.no_grad():
        rest = model.generate(
            input_ids=tokens,
            max_length=length + instance['generate_len'],
            use_cache=True,
            do_sample=True,
            top_p=instance['top_p'],
            temperature=instance['temperature'],
            top_k=instance['top_k'],
        )
    # Strip the prompt tokens and decode only the newly generated text
    output = rest[0][length:]
    string = tokenizer.decode(output, skip_special_tokens=True)
    return f'[!] Response: {string}'

# Sample test instruction used by YouTuber Sam Witteveen https://www.youtube.com/@samwitteveenai
system = 'You are an AI assistant that follows instruction extremely well. Help as much as you can.'
instruction = 'Write a letter to Sam Altman, CEO of OpenAI, requesting him to convert GPT4 a private model by OpenAI to an open source project'
print(generate_text(system, instruction))
```
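The `generate_text` helper also accepts an optional `input` argument for tasks that separate the instruction from the data it operates on (filling the `### Input:` slot of the prompt). A small usage sketch; the system message is the one above, but the task and review text are made up for illustration:

```python
# Hypothetical example: pass task data through the optional `input` field.
system = 'You are an AI assistant that follows instruction extremely well. Help as much as you can.'
instruction = 'Classify the sentiment of the following review as positive, negative, or neutral.'
review = 'The battery lasts two days, but the screen scratches far too easily.'
print(generate_text(system, instruction, input=review))
```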
📚 Documentation
Dataset
We built an explain-tuned WizardLM dataset (~70K), an Alpaca dataset (~52K), and a Dolly-V2 dataset (~15K) using the approaches from the Orca research paper.
In contrast to the vanilla instruction-tuning approaches used by the original datasets, we leveraged all 15 system instructions provided in the Orca research paper to generate the custom datasets.
This helps the student model (i.e., this model) learn the thought process of the teacher model, ChatGPT (version gpt-3.5-turbo-0301).
See the usage example above for how the system prompt is added before each instruction.
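To make the construction concrete, the sketch below shows the core idea: each instruction record is wrapped with one of the Orca-style system messages before being sent to the teacher model, and the teacher's answer becomes the training target. The record layout and the two system messages are illustrative paraphrases, not the exact strings used to build the dataset:

```python
import random

# Illustrative paraphrases of Orca-style system instructions (the real dataset
# uses the 15 system messages listed in the Orca research paper).
SYSTEM_MESSAGES = [
    "You are an AI assistant. While performing the task, think step-by-step and justify your steps.",
    "You are a helpful assistant who always provides an explanation. Answer as if explaining to a five-year-old.",
]

def build_prompt(record):
    """Wrap one Alpaca-style record in the '### System/User/Input/Response' format."""
    system = random.choice(SYSTEM_MESSAGES)
    if record.get("input"):
        return (f"### System:\n{system}\n\n### User:\n{record['instruction']}\n\n"
                f"### Input:\n{record['input']}\n\n### Response:\n")
    return f"### System:\n{system}\n\n### User:\n{record['instruction']}\n\n### Response:\n"

# Hypothetical record; the teacher model's response to this prompt becomes the target.
print(build_prompt({"instruction": "Explain why the sky is blue.", "input": ""}))
```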
Training
Training took around 4 hours on 8x A100 (80 GB) GPUs and cost $48 using Lambda Labs.
We used DeepSpeed with fully sharded data parallelism (i.e., ZeRO stage 3), writing our own fine-tuning scripts and reusing some of the model-training code from the excellent OpenAlpaca repo.
Here are some of the parameters used during training:
| Parameter | Value |
|---|---|
| batch_size | 64 |
| train_micro_batch_size_per_gpu | 4 |
| gradient_accumulation_steps | 2 |
| Learning rate | 2e-5 |
| Max length | 1024 |
| Epochs | 3 |
| Optimizer | AdamW |
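As a sanity check, the numbers are consistent: 8 GPUs x a micro-batch of 4 x 2 gradient-accumulation steps gives the effective batch size of 64. The authors' fine-tuning script is not included in this card, so purely as an illustration, these values could map onto a DeepSpeed ZeRO stage 3 config roughly like this (field names follow the standard DeepSpeed config schema):

```python
# Hypothetical DeepSpeed configuration mirroring the table above; the actual
# fine-tuning script used for orca_mini_3b is not published in this card.
ds_config = {
    "train_batch_size": 64,                  # 8 GPUs * 4 micro-batch * 2 accumulation steps
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 2,
    "fp16": {"enabled": True},               # consistent with the float16 weights used at inference
    "zero_optimization": {"stage": 3},       # fully sharded data parallelism (ZeRO stage 3)
    "optimizer": {
        "type": "AdamW",
        "params": {"lr": 2e-5},
    },
}
```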
Example Output
[!] Response:
Dear Sam Altman,
I am writing to request that you convert the GPT4 private model developed by OpenAI to an open source project. As a user of OpenAI, I have been waiting for the day when I can use the advanced natural language processing capabilities of GPT4 in a more open and accessible way.
While OpenAI has made significant progress in developing AI applications, it has primarily focused on building private models that are not accessible to the general public. However, with the recent release of GPT-3, there is a growing demand for more open and accessible AI tools.
Converting GPT4 to an open source project would allow for greater transparency, collaboration, and innovation. It would also help to build trust in the technology and ensure that it is used ethically and responsibly.
I urge you to consider converting GPT4 to an open source project. This would be a significant contribution to the AI community and would help to create a more open and accessible future.
Thank you for your consideration.
Sincerely,
[Your Name]
Next Goals
- Try more data, such as actually using FLAN-v2 as in the Orca research paper (suggestions are welcome).
- Provide more options for text-generation UIs (perhaps via https://github.com/oobabooga/text-generation-webui).
- Provide 4-bit GGML/GPTQ quantized models (perhaps TheBloke can help there).
Limitations and Biases
This model can produce factually incorrect output and should not be relied on to produce factually accurate information. It was trained on various public datasets. While great efforts have been taken to clean the pretraining data, it is possible that this model could generate lewd, biased, or otherwise offensive outputs.
Disclaimer
The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. Please consult an attorney before using this model for commercial purposes.
Citation
If you found wizardlm_alpaca_dolly_orca_open_llama_3b useful in your research or applications, please cite using the following BibTeX:
```bibtex
@misc{orca_mini_3b,
  author = {Pankaj Mathur},
  title = {wizardlm_alpaca_dolly_orca_open_llama_3b: An explain tuned OpenLLaMA-3b model on custom wizardlm, alpaca, & dolly datasets},
  year = {2023},
  publisher = {GitHub, HuggingFace},
  journal = {GitHub repository, HuggingFace repository},
  howpublished = {\url{https://github.com/pankajarm/wizardlm_alpaca_dolly_orca_open_llama_3b}, \url{https://huggingface.co/psmathur/wizardlm_alpaca_dolly_orca_open_llama_3b}},
}

@misc{mukherjee2023orca,
  title = {Orca: Progressive Learning from Complex Explanation Traces of GPT-4},
  author = {Subhabrata Mukherjee and Arindam Mitra and Ganesh Jawahar and Sahaj Agarwal and Hamid Palangi and Ahmed Awadallah},
  year = {2023},
  eprint = {2306.02707},
  archivePrefix = {arXiv},
  primaryClass = {cs.CL}
}

@software{openlm2023openllama,
  author = {Xinyang Geng and Hao Liu},
  title = {OpenLLaMA: An Open Reproduction of LLaMA},
  month = {May},
  year = {2023},
  url = {https://github.com/openlm-research/open_llama}
}

@misc{openalpaca,
  author = {Yixuan Su and Tian Lan and Deng Cai},
  title = {OpenAlpaca: A Fully Open-Source Instruction-Following Model Based On OpenLLaMA},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/yxuansu/OpenAlpaca}},
}

@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}
```
Open LLM Leaderboard Evaluation Results
Detailed results can be found here.

| Metric | Value |
|---|---|
| Average | 35.5 |
| ARC (25-shot) | 41.55 |
| HellaSwag (10-shot) | 61.52 |
| MMLU (5-shot) | 26.79 |
| TruthfulQA (0-shot) | 42.42 |
| Winogrande (5-shot) | 61.8 |
| GSM8K (5-shot) | 0.08 |
| DROP (3-shot) | 14.33 |
Open LLM Leaderboard Evaluation Results
Detailed results can be found here.

| Metric | Value |
|---|---|
| Average | 39.03 |
| AI2 Reasoning Challenge (25-Shot) | 41.55 |
| HellaSwag (10-Shot) | 61.52 |
| MMLU (5-Shot) | 26.79 |
| TruthfulQA (0-shot) | 42.42 |
| Winogrande (5-shot) | 61.80 |
| GSM8k (5-shot) | 0.08 |
📄 License
This model is licensed under CC BY-NC-SA 4.0.