Meta-Llama-3-120B-Instruct: a free, open-source large language model for creative writing tasks

Meta Llama 3 120B Instruct

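Developed by mlabonne

A 120B-parameter large language model created by self-merging Meta-Llama-3-70B-Instruct with MergeKit; well suited to creative writing tasks

Large language model

Transformers

License: Other #creative-writing #70B-self-merge #Llama-3-chat-template

Downloads: 17

Released: 5/1/2024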

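Model overview

A 120B-parameter instruction-tuned model on the Llama 3 architecture, built by stacking seven overlapping layer slices of a single 70B model (Meta-Llama-3-70B-Instruct). It supports an 8K context window by default, extendable via rope theta, and is recommended primarily for creative writing.

Model features

Creative writing

The model performs well on creative writing tasks, with imaginative output and a strong prose style.

Extended context support

Supports an 8K context window by default, which can be extended further via the rope theta parameter.

Layer-stacked self-merge architecture

Stacks seven overlapping 20-layer slices of the same 70B model into a deeper network, with the aim of increasing expressive capacity.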

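Model capabilities

Creative text generation

Coherence over long texts

Instruction following

Multi-turn dialogue

Use cases

Content creation

Fiction writing

Generates fictional stories with coherent plots and rich detail.

Evaluations report imaginative output with a strong literary style.

Poetry generation

Writes poems with rhythm and imagery.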

🚀 Meta-Llama-3-120B-Instruct

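Meta-Llama-3-120B-Instruct is a self-merge of meta-llama/Meta-Llama-3-70B-Instruct made with MergeKit. It was inspired by earlier large merged models.

Special thanks to Eric Hartford for both inspiring and evaluating this model, and to Charles Goddard for creating MergeKit.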

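🚀 Quick start

This model is suited to creative writing. It uses the Llama 3 chat template and has a default context window of 8K (extendable via rope theta). See the examples in the Evaluation section for a sense of its performance. The model is fairly unrestrained overall but has a good writing style; it sometimes outputs typos and is fond of uppercase letters.

✨ Key features

Self-merge of Meta-Llama-3-70B-Instruct, inspired by several large merged models.
Suited to creative writing.
Several quantized versions are available.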

📦 Installation

No separate installation steps are provided; the usage example installs its dependencies with pip.

💻 Usage example

Basic usage

# In a notebook, install the dependencies first:
# !pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "mlabonne/Meta-Llama-3-120B-Instruct"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Render the conversation with the Llama 3 chat template
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the model in float16, spread across the available devices
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample a completion of up to 256 new tokens
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
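
The `apply_chat_template` call in the example renders the message list into the Llama 3 prompt format. As a rough illustration of what that string looks like, here is a minimal hand-written sketch. The function name `render_llama3_prompt` is hypothetical, and the special tokens are written out from the publicly documented Llama 3 format; the tokenizer's own template remains the authoritative source.

```python
def render_llama3_prompt(messages, add_generation_prompt=True):
    """Hand-rolled rendering of the Llama 3 chat format (illustrative only)."""
    parts = ["<|begin_of_text|>"]
    for message in messages:
        # Each turn: a role header, a blank line, the content, an end-of-turn token.
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # An open assistant header cues the model to write the next turn.
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


messages = [{"role": "user", "content": "What is a large language model?"}]
prompt = render_llama3_prompt(messages)
print(prompt)
```

In practice, prefer `tokenizer.apply_chat_template`, which stays in sync with whatever template ships with the model.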

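📚 Detailed documentation

🔍 Applications

This model is recommended for creative writing. It uses the Llama 3 chat template and has a default context window of 8K (extendable via rope theta). See the examples in the Evaluation section for a sense of its performance. The model is fairly unrestrained overall but has a good writing style; it sometimes outputs typos and is fond of uppercase letters.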
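
To give some intuition for why rope theta relates to context length, the sketch below computes the wavelengths of the rotary position embedding (RoPE) frequencies for two base values. It assumes a head dimension of 128 and a default base of 500,000 (the values published for Llama 3 70B); the larger base of 2,000,000 is purely hypothetical. In Hugging Face Transformers the base is the `rope_theta` field of the model config. Raising it without further tuning may degrade output quality, so treat this as an illustration rather than a recipe.

```python
import math


def rope_wavelengths(theta, head_dim=128):
    """Wavelength, in tokens, of each rotary frequency pair for a given base.

    RoPE uses inverse frequencies theta ** (-2i / head_dim), so the
    wavelength of pair i is 2 * pi * theta ** (2i / head_dim).
    """
    return [2 * math.pi * theta ** (2 * i / head_dim) for i in range(head_dim // 2)]


default_theta = 500_000.0    # assumed Llama 3 default
scaled_theta = 2_000_000.0   # hypothetical larger base, for comparison only

for theta in (default_theta, scaled_theta):
    longest = rope_wavelengths(theta)[-1]
    print(f"rope_theta={theta:>11,.0f}  longest wavelength ~ {longest:,.0f} tokens")
```

A larger base stretches the slow-rotating dimensions, so positions far apart remain distinguishable over a longer span.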

⚡ Quantized models

Thanks to Bartowski, elinas, mlx-community and others for providing the following quantized models:

GGUF: https://huggingface.co/lmstudio-community/Meta-Llama-3-120B-Instruct-GGUF
EXL2: https://huggingface.co/elinas/Meta-Llama-3-120B-Instruct-4.0bpw-exl2
mlx: https://huggingface.co/mlx-community/Meta-Llama-3-120B-Instruct-4bit
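
For a rough sense of why quantized versions matter at this scale, the sketch below estimates the memory needed for the weights alone at a few precisions. It assumes the nominal 120B parameter count from the model name and a uniform number of bits per weight; real quantization formats mix precisions and add metadata, and inference also needs memory for the KV cache and activations, so actual requirements will be higher.

```python
def weight_size_gb(num_params, bits_per_weight):
    """Approximate size of the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


NOMINAL_PARAMS = 120e9  # nominal count taken from the model name (assumption)

for label, bits in [("float16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label:>8}: ~{weight_size_gb(NOMINAL_PARAMS, bits):.0f} GB")
```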

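🏆 Evaluation

This model is strong at creative writing but performs poorly on other tasks. Use it with caution, and do not expect it to outperform GPT-4 outside a few specific use cases.

Eric Hartford's X thread (creative writing): https://twitter.com/erhartford/status/1787050962114207886
Daniel Kaiser's X thread (creative writing): https://twitter.com/spectate_or/status/1787257261309518101
Simon's X thread (reasoning): https://twitter.com/NewDigitalEdu/status/1787403266894020893
r/LocalLLaMA: https://www.reddit.com/r/LocalLLaMA/comments/1cl525q/goliath_lovers_where_is_the_feedback_about/

Creative writing

Thanks to Sam Paech for evaluating this model and sharing its outputs!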

🧩 Configuration

slices:
- sources:
  - layer_range: [0, 20]
    model: meta-llama/Meta-Llama-3-70B-Instruct
- sources:
  - layer_range: [10, 30]
    model: meta-llama/Meta-Llama-3-70B-Instruct
- sources:
  - layer_range: [20, 40]
    model: meta-llama/Meta-Llama-3-70B-Instruct
- sources:
  - layer_range: [30, 50]
    model: meta-llama/Meta-Llama-3-70B-Instruct
- sources:
  - layer_range: [40, 60]
    model: meta-llama/Meta-Llama-3-70B-Instruct
- sources:
  - layer_range: [50, 70]
    model: meta-llama/Meta-Llama-3-70B-Instruct
- sources:
  - layer_range: [60, 80]
    model: meta-llama/Meta-Llama-3-70B-Instruct
merge_method: passthrough
dtype: float16
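
The configuration above can be read as a recipe for stacking layers. The sketch below expands the seven slices into the resulting sequence of source layer indices and estimates the parameter count. It assumes that each `layer_range` is half-open (start inclusive, end exclusive) and uses the published Llama 3 70B dimensions (hidden size 8192, MLP size 28672, 64 attention heads, 8 key/value heads, vocabulary 128256). The helper names are hypothetical, and this is not MergeKit's own code.

```python
# Slices copied from the merge configuration; ranges assumed half-open.
SLICES = [(0, 20), (10, 30), (20, 40), (30, 50), (40, 60), (50, 70), (60, 80)]

# Assumed Llama 3 70B dimensions.
HIDDEN, INTERMEDIATE, HEADS, KV_HEADS, VOCAB = 8192, 28672, 64, 8, 128256


def stacked_layers(slices):
    """Source-layer index for every layer of the merged model, in order."""
    layers = []
    for start, end in slices:
        layers.extend(range(start, end))
    return layers


def approx_params(num_layers):
    """Rough parameter count for a Llama-style model with num_layers layers."""
    head_dim = HIDDEN // HEADS
    attention = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * KV_HEADS * head_dim  # q, o + k, v
    mlp = 3 * HIDDEN * INTERMEDIATE                                     # gate, up, down
    norms = 2 * HIDDEN                                                  # two norms per layer
    embeddings = 2 * VOCAB * HIDDEN                                     # input + output head
    return num_layers * (attention + mlp + norms) + embeddings + HIDDEN


layers = stacked_layers(SLICES)
print(len(layers))  # 140 layers, versus 80 in the base model
print(f"base:   {approx_params(80) / 1e9:.1f}B parameters")
print(f"merged: {approx_params(len(layers)) / 1e9:.1f}B parameters")
```

Under these assumptions, source layers 10 through 69 each appear twice in the merged stack, while the first ten and last ten layers appear once. That is how a single 70B model yields a model of roughly 120B parameters without any new weights.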