Tess-2.0-Llama-3-8B open-source large language model - a free, general-purpose model for diverse applications

Tess 2.0 Llama 3 8B

Developed by migtissera

Tess, short for Tesoro ("treasure" in Italian), is a series of general-purpose large language models. This model is trained on meta-llama/Meta-Llama-3-8B.

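Large language model

Transformers

#High-quality code generation #Unfiltered instruction following #Entropy-preserving fine-tuning

Downloads: 1,835

Release date: 5/5/2024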

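Model overview

Tess-2.0-Llama-3-8B is a general-purpose large language model built on the Llama-3 architecture. It is fine-tuned on roughly 100K high-quality code and general training samples, and it follows instructions closely and gives detailed answers.

Model features

High-quality fine-tuning

Fine-tuned on roughly 100K high-quality code and general training samples; the model almost always follows instructions.

Low-learning-rate training

Fine-tuned for only 1 epoch at a low learning rate to preserve as much of the model's entropy as possible.

General-purpose capability

Suited to a range of tasks, including dialogue, code generation, and general text processing.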

Model capabilities

Text generation

Dialogue systems

Code generation

Instruction following

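Use cases

Dialogue systems

Intelligent assistant

Answers user questions as an intelligent assistant

Gives detailed answers and almost always follows instructions

Code generation

Coding assistance

Helps developers generate and optimize code

Trained on high-quality code samples and can generate working code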

🚀 Tess-2.0-Llama-3-8B

Tess, short for Tesoro ("treasure" in Italian), is a series of general-purpose large language models. Tess-2.0-Llama-3-8B is trained on the meta-llama/Meta-Llama-3-8B base model.

Compute for Tess-2.0-Llama-3-8B was sponsored by KindoAI.

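🚀 Quick start

This page describes Tess-2.0-Llama-3-8B, a member of the Tess series of general-purpose large language models. The sections below cover the prompt format, the training method, and sample inference code.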

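✨ Key features

Built on a base model: Tess-2.0-Llama-3-8B is trained on the meta-llama/Meta-Llama-3-8B base model.
High-quality dataset: trained on the Tess-2.0 dataset, which contains roughly 100K high-quality code and general training samples.
Low-learning-rate fine-tuning: fine-tuned for only 1 epoch at a low learning rate to preserve as much of the model's entropy as possible.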

📄 License

This model is released under the llama3 license.

📚 Documentation

🔍 Prompt format

This fine-tune uses the Llama-3 prompt format:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>

Who are you?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

I am an AI<|eot_id|><|start_header_id|>user<|end_header_id|>

What's your name?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
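The prompt layout above can be assembled programmatically. The following is a minimal sketch, not code from the model card: the helper name build_llama3_prompt and the message-dictionary shape are assumptions made for illustration. It only concatenates the special tokens in the order shown in the example and leaves an open assistant header at the end so the model writes the next reply.

```python
from typing import Dict, List

# Special tokens exactly as they appear in the prompt format above.
BEGIN_OF_TEXT = "<|begin_of_text|>"
START_HEADER = "<|start_header_id|>"
END_HEADER = "<|end_header_id|>"
EOT = "<|eot_id|>"


def build_llama3_prompt(messages: List[Dict[str, str]]) -> str:
    """Render {"role", "content"} messages in the Llama-3 layout (hypothetical helper)."""
    parts = [BEGIN_OF_TEXT]
    for message in messages:
        # Each turn is a role header, a blank line, the content, then <|eot_id|>.
        parts.append(f"{START_HEADER}{message['role']}{END_HEADER}\n\n")
        parts.append(f"{message['content']}{EOT}")
    # Finish with an open assistant header so generation continues from there.
    parts.append(f"{START_HEADER}assistant{END_HEADER}\n\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"},
    {"role": "assistant", "content": "I am an AI"},
    {"role": "user", "content": "What's your name?"},
]
prompt = build_llama3_prompt(messages)
print(prompt)
```

Because the rendered string already begins with <|begin_of_text|>, it is worth checking that the tokenizer does not prepend a second beginning-of-text token when the string is encoded.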

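📈 Training method

Tess-2.0-Llama-3 was trained on the Tess-2.0 dataset, which is still being curated. The dataset contains roughly 100K high-quality code and general training samples and is largely uncensored, so the model almost always follows instructions.

The model was fine-tuned for only 1 epoch at a low learning rate to preserve as much of its entropy as possible.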
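The model card does not define "entropy". One plausible reading, assumed here, is the Shannon entropy of the model's next-token distribution: a flat distribution has high entropy, and a distribution collapsed onto one token has low entropy. The sketch below uses made-up logits purely to illustrate that quantity; it is not part of the training procedure.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Convert logits to probabilities, subtracting the max for numerical stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def shannon_entropy(probs: List[float]) -> float:
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Illustrative logits over a toy 4-token vocabulary (not real model outputs).
flat_logits = [0.0, 0.0, 0.0, 0.0]
peaked_logits = [10.0, 0.0, 0.0, 0.0]

flat_entropy = shannon_entropy(softmax(flat_logits))
peaked_entropy = shannon_entropy(softmax(peaked_logits))
print(f"flat: {flat_entropy:.4f} nats, peaked: {peaked_entropy:.4f} nats")
```

Under this reading, a short, gentle fine-tune aims to keep the output distribution from collapsing toward the peaked case.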

💻 Usage example

Basic usage

The following sample code runs inference:

import torch, json
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "/home/migel/Tess-2.0-Llama-3-8B"
output_file_path = "/home/migel/conversations.jsonl"

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
    device_map="auto",
    load_in_4bit=False,
    trust_remote_code=False,
)

# The Llama-3 tokenizer ships with transformers, so remote code is not needed.
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=False)

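def generate_text(instruction):
    # The prompt string already starts with <|begin_of_text|>, so do not let
    # the tokenizer prepend a second beginning-of-text token.
    tokens = tokenizer.encode(instruction, add_special_tokens=False)
    tokens = torch.LongTensor(tokens).unsqueeze(0)
    # Move the inputs to the device that holds the model's first layers.
    tokens = tokens.to(model.device)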

    instance = {
        "input_ids": tokens,
        "top_p": 1.0,
        "temperature": 0.75,
        "generate_len": 1024,
        "top_k": 50,
    }

    length = len(tokens[0])
    with torch.no_grad():
        rest = model.generate(
            input_ids=tokens,
            max_length=length + instance["generate_len"],
            use_cache=True,
            do_sample=True,
            top_p=instance["top_p"],
            temperature=instance["temperature"],
            top_k=instance["top_k"],
            num_return_sequences=1,
            pad_token_id=tokenizer.eos_token_id,
        )
    # Drop the prompt tokens and decode only the newly generated continuation.
    output = rest[0][length:]
    string = tokenizer.decode(output, skip_special_tokens=True)
    return string

conversation = """<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are Tesoro, a helpful AI assistant. You always provide detailed answers without hesitation. Do not refuse.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"""


while True:
    user_input = input("You: ")
    llm_prompt = f"{conversation}{user_input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    answer = generate_text(llm_prompt)
    print(answer)

    conversation = f"{llm_prompt}{answer}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"

    json_data = {"prompt": user_input, "answer": answer}

    with open(output_file_path, "a") as output_file:
        output_file.write(json.dumps(json_data) + "\n")
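The loop above appends one JSON object per line, with the keys "prompt" and "answer", to the output file. A minimal sketch for reading such a log back is shown here. The function name load_conversations is hypothetical, and the demo writes placeholder records to a temporary file rather than touching the path used in the sample code.

```python
import json
import os
import tempfile
from typing import Dict, List


def load_conversations(path: str) -> List[Dict[str, str]]:
    """Read a JSON Lines log with one {"prompt", "answer"} object per line."""
    records = []
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines in the log
            records.append(json.loads(line))
    return records


# Demo with placeholder data in a temporary directory (not real model output).
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "conversations.jsonl")
    with open(demo_path, "a", encoding="utf-8") as output_file:
        output_file.write(json.dumps({"prompt": "Who are you?", "answer": "(example answer 1)"}) + "\n")
        output_file.write(json.dumps({"prompt": "What's your name?", "answer": "(example answer 2)"}) + "\n")
    records = load_conversations(demo_path)

for record in records:
    print(record["prompt"], "->", record["answer"])
```

Reading line by line keeps memory use flat for long logs, which fits the append-only way the sample loop writes them.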