Orca Alpaca 3b

由 pankajmathur 开发

基于Open_LLaMA-3B模型训练的解释性调优模型，采用Alpaca数据集的指令和输入，并应用了Orca研究论文的数据集构建方法。

大型语言模型

Transformers

英语

#解释性指令调优 #思维链学习 #轻量级LLM

下载量 85

发布时间 : 6/16/2023

模型介绍

内容详情

替代品

模型简介

该模型是一个经过解释性调优的语言模型，通过结合Alpaca数据集的指令和Orca研究论文的方法，学习从教师模型（ChatGPT）中获取思维过程。

模型特点

解释性调优

采用Orca研究论文的方法，学习教师模型的思维过程，而不仅仅是输出结果。

多系统指令支持

使用15条系统指令生成自定义数据集，增强模型的多样性和适应性。

高效训练

在4块A600(50G) GPU上仅用20小时完成训练，成本效益高。

模型能力

指令跟随

逐步推理

文本生成

任务解答

使用案例

教育

数学问题解答

逐步解释如何解决数学问题

提供详细的解题步骤和推理过程

研究辅助

数据分析解释

解释数据分析结果和统计方法

清晰展示分析过程和结论推导

语言:

英语库名称: transformers 数据集:
psmathur/alpaca_orca 许可证: cc-by-nc-sa-4.0

Orca_alpaca_3b

一个基于Open_LLaMA-3B模型训练的解释性调优数据集，采用Alpaca数据集的指令和输入，并应用了Orca研究论文的数据集构建方法。

数据集

我们构建了解释性调优的Alpaca数据集约52K，使用了Orca研究论文中的方法。

我们利用了Orca研究论文中提供的全部15条系统指令来生成自定义数据集，这与原始数据集使用的普通指令调优方法不同。

这帮助学生模型（即本模型）从教师模型（ChatGPT，gpt-3.5-turbo-0301版本）学习思维过程。

请参见下面的示例用法，了解如何在每条指令前添加系统提示。

训练

训练配置如下表所示。

训练在4块A600(50G) GPU上进行，耗时约20小时，成本为66美元，使用了Lambda Labs。

我们通过编写自己的微调脚本，结合了OpenAlpaca仓库提供的部分模型训练代码，使用DeepSpeed和Zero-3方法进行并行GPU训练。

以下是训练中使用的一些参数：


批量大小	16
每GPU训练微批量大小	2
梯度累积步数	2
学习率	2e-5
最大长度	1024
训练轮数	3

示例用法

以下展示如何使用alpaca_orca_open_llama_3b：

import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

# 在3b、7b或13b之间切换模型路径
model_path = 'psmathur/alpaca_orca_open_llama_3b'
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map='auto',
)

# 生成文本函数
def generate_text(system, instruction, input=None):
    
    if input:
        prompt = f"### 系统:\n{system}\n\n#\n\n### 用户:\n{instruction}\n\n### 输入:\n{input}\n\n### 响应:\n"
    else:
        prompt = f"### 系统:\n{system}\n\n#\n\n### 用户:\n{instruction}\n\n### 响应:\n"
    
    tokens = tokenizer.encode(prompt)
    tokens = torch.LongTensor(tokens).unsqueeze(0)
    tokens = tokens.to('cuda')

    instance = {'input_ids': tokens,'top_p': 1.0, 'temperature':0.7, 'generate_len': 1024}

    length = len(tokens[0])
    with torch.no_grad():
        rest = model.generate(
            input_ids=tokens, 
            max_length=length+instance['generate_len'], 
            use_cache=True, 
            do_sample=True, 
            top_p=instance['top_p'],
            temperature=instance['temperature']
        )    
    output = rest[0][length:]
    string = tokenizer.decode(output, skip_special_tokens=True)
    print(f'[!] 响应: {string}')

# 与Orca研究论文相同的提示
system = '你是一个AI助手。用户会给你一个任务。你的目标是尽可能忠实地完成任务。在执行任务时，逐步思考并证明你的步骤。'
instruction = '使用给定数据计算中位数。'
input = '[5,2,3,4,1]'
generate_text(system, instruction, input)

附注：我目前#开放工作机会和#寻求合作，如需帮助，请通过www.linkedin.com/in/pankajam联系我。

下一步目标：

尝试更多数据，如Dolly V2、WizardLM等（欢迎建议）
尝试更大的OpenLLaMA模型7B和13B
尝试更好的GPU进行训练，目前无法获取8xA100（40GB），可能需求旺盛
提供更多文本生成UI选项（或许可以使用https://github.com/oobabooga/text-generation-webui）
提供4位GGML/GPTQ量化模型（或许TheBloke可以帮忙）

参考文献：如果您在研究中或应用中发现alpaca_orca_open_llama_3b有用，请使用以下BibTeX引用：

@misc{alpaca_orca_open_llama_3b,
  author = {Pankaj Mathur},
  title = {alpaca_orca_open_llama_3b: 基于OpenLLaMA的自定义解释性调优Alpaca模型},
  year = {2023},
  publisher = {GitHub, HuggingFace},
  journal = {GitHub仓库, HuggingFace仓库},
  howpublished = {\url{https://github.com/pankajarm/alpaca_orca_open_llama_3b}, \url{https://huggingface.co/psmathur/alpaca_orca_open_llama_3b}},
}

@software{openlm2023openllama,
  author = {Xinyang Geng and Hao Liu},
  title = {OpenLLaMA: LLaMA的开源复现},
  month = 5月,
  year = 2023,
  url = {https://github.com/openlm-research/open_llama}
}

@misc{openalpaca,
  author = {Yixuan Su and Tian Lan and Deng Cai},
  title = {OpenAlpaca: 基于OpenLLaMA的全开源指令跟随模型},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub仓库},
  howpublished = {\url{https://github.com/yxuansu/OpenAlpaca}},
}

@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto },
  title = {Stanford Alpaca: 一个指令跟随的LLaMA模型},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub仓库},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}