h2ogpt-gm-oasst1-en-2048-open-llama-7b-preview-700bt: Open-Source Large Language Model

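Developed by h2oai

A large language model fine-tuned from the OpenLlama 7B pretrained model on the OpenAssistant dataset. It supports English text generation tasks.

Large language model

Transformers

Language: English | License: Apache-2.0 | #English dialogue generation #7B parameters #Instruction-tuned model

Downloads: 58

Released: 5/24/2023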

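Model Overview

This is a text generation model trained with H2O LLM Studio. It is based on the OpenLlama architecture and is suited to dialogue and question answering.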

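Model Features

Based on the OpenLlama architecture

Uses the OpenLlama 7B model pretrained on 700B tokens as the base model

Fine-tuned on the OpenAssistant dataset

Fine-tuned on the high-quality OpenAssistant conversation dataset to improve dialogue ability

2048-token context length

Supports a context of up to 2048 tokens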
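
The 2048-token limit can be turned into a simple budget check before generation. The following is a minimal sketch, assuming that prompt tokens and newly generated tokens share the same 2048-token window; the function name `max_new_tokens_budget` is hypothetical and not part of the model card or of transformers.

```python
CONTEXT_WINDOW = 2048  # context length stated in the model card


def max_new_tokens_budget(prompt_tokens, requested_new_tokens,
                          context_window=CONTEXT_WINDOW):
    """Clamp the requested generation length to what still fits in the window.

    Assumption: the prompt and the generated tokens share one window.
    """
    if prompt_tokens >= context_window:
        raise ValueError("prompt already fills the context window")
    return min(requested_new_tokens, context_window - prompt_tokens)


# A 1500-token prompt leaves room for 548 new tokens, not the requested 1024.
print(max_new_tokens_budget(1500, 1024))
```

In practice `prompt_tokens` would come from the tokenizer, for example the length of `input_ids` for the formatted prompt.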

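Model Capabilities

Text generation

Dialogue systems

Question answering systems

Use Cases

Dialogue Systems

Intelligent assistant

Build an intelligent conversational assistant that can understand and respond to user questions

Content Generation

Text creation

Generate many kinds of text content, such as articles and stories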

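🚀 H2O GPT Model

This model was trained with H2O LLM Studio. It can be used for natural language processing tasks and generates text relevant to the input, which makes it usable for conversational interaction.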

🚀 Quick Start

To use the model with the transformers library on a machine with a GPU, first make sure the transformers, accelerate, and torch libraries are installed.

pip install transformers==4.28.1
pip install accelerate==0.18.0
pip install torch==2.0.0
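
To confirm that the environment matches the pinned versions above, a small check can be run before loading the model. This is a minimal sketch using only the standard library; the helper name `check_pins` is hypothetical, and ignoring a local build suffix such as `+cu117` is an assumption about how strict the comparison needs to be.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions pinned by the pip commands in this section.
PINNED = {"transformers": "4.28.1", "accelerate": "0.18.0", "torch": "2.0.0"}


def check_pins(pins):
    """Return {package: (expected, installed_or_None, matches)}."""
    report = {}
    for name, expected in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        # Assumption: a local build suffix such as "+cu117" is ignored.
        matches = installed is not None and installed.split("+")[0] == expected
        report[name] = (expected, installed, matches)
    return report


for name, (expected, installed, ok) in check_pins(PINNED).items():
    status = "ok" if ok else "MISMATCH"
    print(f"{name}: expected {expected}, found {installed} [{status}]")
```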

✨ Key Features

Base model: [openlm-research/open_llama_7b_700bt_preview](https://huggingface.co/openlm-research/open_llama_7b_700bt_preview).
Dataset: trained on [OpenAssistant/oasst1](https://github.com/h2oai/h2o-llmstudio/blob/1935d84d9caafed3ee686ad2733eb02d2abfce57/app_utils/utils.py#LL1896C5-L1896C28).

💻 Usage Examples

Basic Usage

import torch
from transformers import pipeline

generate_text = pipeline(
    model="h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b-preview-700bt",
    torch_dtype=torch.float16,
    trust_remote_code=True,
    use_fast=False,
    device_map={"": "cuda:0"},
)

res = generate_text(
    "Why is drinking water so healthy?",
    min_new_tokens=2,
    max_new_tokens=1024,
    do_sample=False,
    num_beams=1,
    temperature=float(0.3),
    repetition_penalty=float(1.2),
    renormalize_logits=True
)
print(res[0]["generated_text"])
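
With `do_sample=False` and `num_beams=1`, generation is greedy: the highest-scoring token is chosen at every step, so `temperature` is not expected to change the result, while `repetition_penalty` still does. The following is a pure-Python sketch of that interaction on a toy three-token vocabulary. It follows the usual repetition-penalty rule (divide positive scores, multiply negative ones) and is an illustration, not the transformers implementation; the function names are hypothetical.

```python
def apply_repetition_penalty(logits, seen_ids, penalty=1.2):
    """Lower the score of every token id that already appears in the sequence."""
    out = list(logits)
    for tok in set(seen_ids):
        score = out[tok]
        # Negative scores are multiplied and positive scores divided,
        # so the penalty always pushes the score down.
        out[tok] = score * penalty if score < 0 else score / penalty
    return out


def greedy_pick(logits):
    """Greedy decoding step: index of the highest score."""
    return max(range(len(logits)), key=lambda i: logits[i])


logits = [2.0, 1.9, -1.0]   # toy scores for a 3-token vocabulary
seen = [0, 2]               # token ids already present in the sequence
penalized = apply_repetition_penalty(logits, seen)
next_token = greedy_pick(penalized)
print(penalized, next_token)
```

Without the penalty, token 0 would win. With a penalty of 1.2 its score drops below that of token 1, which has not appeared yet.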

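Advanced Usage

If you prefer not to use trust_remote_code=True, you can download h2oai_pipeline.py, place it in the same directory as your notebook, and construct the pipeline yourself from the loaded model and tokenizer: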

import torch
from h2oai_pipeline import H2OTextGenerationPipeline
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b-preview-700bt",
    use_fast=False,
    padding_side="left"
)
model = AutoModelForCausalLM.from_pretrained(
    "h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b-preview-700bt",
    torch_dtype=torch.float16,
    device_map={"": "cuda:0"}
)
generate_text = H2OTextGenerationPipeline(model=model, tokenizer=tokenizer)

res = generate_text(
    "Why is drinking water so healthy?",
    min_new_tokens=2,
    max_new_tokens=1024,
    do_sample=False,
    num_beams=1,
    temperature=float(0.3),
    repetition_penalty=float(1.2),
    renormalize_logits=True
)
print(res[0]["generated_text"])

You can also construct the pipeline yourself from the loaded model and tokenizer and handle the preprocessing steps explicitly:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b-preview-700bt"  # either a local folder or a Hugging Face model name
# Important: the prompt must use the same format the model was trained with.
# You can find an example prompt in the experiment logs.
prompt = "<|prompt|>How are you?</s><|answer|>"

tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.cuda().eval()
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to("cuda")

# The generation configuration can be modified to suit your needs
tokens = model.generate(
    **inputs,
    min_new_tokens=2,
    max_new_tokens=1024,
    do_sample=False,
    num_beams=1,
    temperature=float(0.3),
    repetition_penalty=float(1.2),
    renormalize_logits=True
)[0]

tokens = tokens[inputs["input_ids"].shape[1]:]
answer = tokenizer.decode(tokens, skip_special_tokens=True)
print(answer)
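
Because the prompt must match the training format, it helps to build it in one place rather than by hand. The following is a minimal sketch derived from the single example prompt above; the names `PROMPT_TEMPLATE` and `build_prompt` are hypothetical, and multi-turn formatting is not covered because the source shows only a single-turn example.

```python
# Template inferred from the example prompt "<|prompt|>How are you?</s><|answer|>".
PROMPT_TEMPLATE = "<|prompt|>{question}</s><|answer|>"


def build_prompt(question: str) -> str:
    """Wrap a single user question in the prompt and answer markers."""
    return PROMPT_TEMPLATE.format(question=question.strip())


print(build_prompt("Why is drinking water so healthy?"))
```

The returned string can be passed to the tokenizer with `add_special_tokens=False`, as in the example above, so that the markers are not duplicated.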

🔧 Technical Details

Model Architecture

LlamaForCausalLM(
  (model): LlamaModel(
    (embed_tokens): Embedding(32000, 4096, padding_idx=0)
    (layers): ModuleList(
      (0-31): 32 x LlamaDecoderLayer(
        (self_attn): LlamaAttention(
          (q_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (k_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (v_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (o_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (rotary_emb): LlamaRotaryEmbedding()
        )
        (mlp): LlamaMLP(
          (gate_proj): Linear(in_features=4096, out_features=11008, bias=False)
          (down_proj): Linear(in_features=11008, out_features=4096, bias=False)
          (up_proj): Linear(in_features=4096, out_features=11008, bias=False)
          (act_fn): SiLUActivation()
        )
        (input_layernorm): LlamaRMSNorm()
        (post_attention_layernorm): LlamaRMSNorm()
      )
    )
    (norm): LlamaRMSNorm()
  )
  (lm_head): Linear(in_features=4096, out_features=32000, bias=False)
)
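
The layer shapes printed above are enough to estimate the parameter count. The following is a back-of-the-envelope sketch, assuming each LlamaRMSNorm holds one weight vector of size 4096, the rotary embedding has no trainable parameters, and the input embedding and lm_head are separate (untied) matrices.

```python
VOCAB, HIDDEN, INTERMEDIATE, LAYERS = 32000, 4096, 11008, 32


def linear(in_features, out_features):
    """Parameter count of a Linear layer with bias=False."""
    return in_features * out_features


attention = 4 * linear(HIDDEN, HIDDEN)                 # q_proj, k_proj, v_proj, o_proj
mlp = 2 * linear(HIDDEN, INTERMEDIATE) + linear(INTERMEDIATE, HIDDEN)  # gate, up, down
norms = 2 * HIDDEN                                     # assumption: one weight vector per RMSNorm
per_layer = attention + mlp + norms

total_params = (
    VOCAB * HIDDEN            # embed_tokens
    + LAYERS * per_layer      # 32 decoder layers
    + HIDDEN                  # final norm
    + linear(HIDDEN, VOCAB)   # lm_head
)
print(f"{total_params:,}")  # 6,738,415,616
```

Under these assumptions the total is about 6.7 billion parameters, consistent with the "7B" in the model name.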

Model Configuration

This model was trained with H2O LLM Studio using the configuration in cfg.yaml. Visit [H2O LLM Studio](https://github.com/h2oai/h2o-llmstudio) to learn how to train your own large language models.

Model Validation

The model is validated with the [EleutherAI lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness):

CUDA_VISIBLE_DEVICES=0 python main.py --model hf-causal-experimental --model_args pretrained=h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b-preview-700bt --tasks openbookqa,arc_easy,winogrande,hellaswag,arc_challenge,piqa,boolq --device cuda &> eval.log

📄 License

This project is licensed under the Apache-2.0 license.

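⚠️ Important Notice

Biases and offensiveness: Large language models are trained on a wide range of internet text, which may contain biased, racist, offensive, or otherwise inappropriate content. By using this model, you acknowledge and accept that the generated content may sometimes exhibit bias or be offensive or inappropriate. The developers of this repository do not endorse, support, or promote any such content or viewpoints.
Limitations: A large language model is an AI-based tool, not a human. It may produce incorrect, nonsensical, or irrelevant responses. Users are responsible for critically evaluating the generated content and deciding whether to use it.
Use at your own risk: Users of this large language model must take full responsibility for any consequences that may arise from using it. The developers and contributors of this repository are not liable for any damage, loss, or harm resulting from the use or misuse of the provided model.
Ethical considerations: Users are encouraged to use the large language model responsibly and ethically. By using this model, you agree not to use it for purposes that promote hate speech, discrimination, harassment, or any form of illegal or harmful activity.
Reporting issues: If you encounter biased, offensive, or otherwise inappropriate content generated by the large language model, please report it to the repository maintainers through the provided channels. Your feedback helps improve the model and mitigate potential issues.
Changes to this disclaimer: The developers of this repository reserve the right to modify or update this disclaimer at any time without prior notice. Users are responsible for reviewing the disclaimer periodically to stay informed of any changes.

By using the large language model provided in this repository, you agree to accept and comply with the terms and conditions set out in this disclaimer. If you do not agree with any part of this disclaimer, you should not use the model or any content it generates.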