Mixtral-8x22B-Instruct-v0.1 Open-Source Large Language Model - Multilingual Support and Function Calling


Mixtral 8x22B Instruct V0.1

Developed by mistralai

Mixtral-8x22B-Instruct-v0.1 is a large language model instruction-tuned from Mixtral-8x22B-v0.1, with support for multiple languages and function calling.

Large language model

Transformers

Multilingual · License: Apache-2.0 · #Mixture-of-experts model #Multilingual instruction tuning #141B total parameters (about 39B active per token)

Downloads: 12.80k

Released: 4/16/2024

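Model overview

An instruction-tuned large language model built on the Mixtral-8x22B architecture. It is optimized for dialogue and instruction following, and supports multiple languages as well as tool calling.

Model features

Mixture-of-experts architecture

Uses a mixture of 8 expert networks; each input token is dynamically routed to 2 of them, which improves efficiency

Multilingual support

Native support for English, Spanish, Italian, German, and French

Function calling

Supports tool calling and function execution, and can be integrated with external APIs and tools

Efficient inference

Although the model is large, only 2 of the 8 experts run for each token, so inference is cheaper than for a dense model of the same total size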
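
The top-2 expert routing described under the mixture-of-experts feature can be illustrated with a small sketch. This is a toy, pure-Python illustration and not the model's actual implementation: the scalar "experts", the router logits, and the names `top2_route` and `moe_layer` are all assumptions made for the example. It picks the two highest-scoring experts for a token, normalizes their scores with a softmax, and mixes the two expert outputs accordingly.

```python
import math

NUM_EXPERTS = 8  # experts per layer, as stated above
TOP_K = 2        # experts selected per token, as stated above


def top2_route(router_logits):
    """Pick the TOP_K highest-scoring experts and softmax-normalize their scores."""
    ranked = sorted(range(len(router_logits)), key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:TOP_K]
    max_logit = max(router_logits[i] for i in chosen)
    exps = [math.exp(router_logits[i] - max_logit) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


def moe_layer(token, router_logits, experts):
    """Combine the outputs of the selected experts, weighted by their gate values."""
    return sum(weight * experts[i](token) for i, weight in top2_route(router_logits))


# Toy experts (hypothetical): expert i scales a scalar input by (i + 1).
experts = [lambda x, scale=i + 1: scale * x for i in range(NUM_EXPERTS)]

# Hypothetical router scores for one token; experts 1 and 4 score highest.
logits = [0.1, 2.0, -1.0, 0.3, 2.0, 0.0, -0.5, 1.0]

routes = top2_route(logits)
output = moe_layer(1.0, logits, experts)
print(routes, output)
```

Only the two selected experts are evaluated for the token; the other six are skipped, which is where the compute saving relative to a dense model comes from.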

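Model capabilities

Text generation

Dialogue systems

Instruction following

Multilingual processing

Function calling

Tool integration

Use cases

Dialogue systems

Intelligent assistants

Build multilingual assistants that handle user queries and tasks

Understands complex instructions and responds accurately

Developer tools

API integration

Integrate external APIs and services through function calling

Fetch and process data dynamically

Education

Multilingual learning assistant

Helps students learn concepts and expressions in multiple languages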

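🚀 Mixtral-8x22B-Instruct-v0.1 Model Card

The Mixtral-8x22B-Instruct-v0.1 large language model (LLM) is an instruction fine-tuned version of Mixtral-8x22B-v0.1.

Property	Details
Supported languages	English, Spanish, Italian, German, French
License	Apache-2.0
Base model	mistralai/Mixtral-8x22B-v0.1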


🚀 Quick start

💻 Usage examples

Basic usage

The following example uses mistral_common to encode a chat request:

from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

# placeholder: replace with the folder that holds the downloaded model weights
mistral_models_path = "MISTRAL_MODELS_PATH"

# load the v3 tokenizer
tokenizer = MistralTokenizer.v3()

completion_request = ChatCompletionRequest(messages=[UserMessage(content="Explain Machine Learning to me in a nutshell.")])

# encode the chat request into a list of token IDs
tokens = tokenizer.encode_chat_completion(completion_request).tokens

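Advanced usage

The following example runs inference with mistral_inference, reusing mistral_models_path, tokenizer, and tokens from the basic example:

from mistral_inference.transformer import Transformer
from mistral_inference.generate import generate

# load the model weights from the local folder
model = Transformer.from_folder(mistral_models_path)
# temperature=0.0 selects greedy decoding; generation stops at the end-of-sequence token
out_tokens, _ = generate([tokens], model, max_tokens=64, temperature=0.0, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)

result = tokenizer.decode(out_tokens[0])

print(result)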

Preparing inputs with Hugging Face `transformers`

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mixtral-8x22B-Instruct-v0.1")

chat = [{"role": "user", "content": "Explain Machine Learning to me in a nutshell."}]

tokens = tokenizer.apply_chat_template(chat, return_dict=True, return_tensors="pt", add_generation_prompt=True)

Inference with Hugging Face `transformers`

from transformers import AutoModelForCausalLM
import torch

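# You can also use 8-bit or 4-bit quantization here
# device_map="auto" already places the weights on the available devices, so no model.to("cuda") is needed
model = AutoModelForCausalLM.from_pretrained("mistralai/Mixtral-8x22B-Instruct-v0.1", torch_dtype=torch.bfloat16, device_map="auto")

# move the tokenized inputs (from the previous snippet) to the model's device
tokens = tokens.to(model.device)

generated_ids = model.generate(**tokens, max_new_tokens=1000, do_sample=True)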

# decode with HF tokenizer
result = tokenizer.decode(generated_ids[0])
print(result)
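
To see why the comment above mentions 8-bit and 4-bit quantization, a back-of-the-envelope estimate of the weight memory helps. The sketch below is plain arithmetic under stated assumptions: a total of roughly 141 billion parameters, and weights only (no activations, KV cache, or quantization overhead). The name `weight_memory_gb` is made up for this example.

```python
# Assumption: approximate total parameter count of Mixtral-8x22B.
TOTAL_PARAMS = 141e9

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"bfloat16": 2.0, "int8": 1.0, "4-bit": 0.5}


def weight_memory_gb(num_params, bytes_per_param):
    """Weights-only footprint in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


estimates = {name: weight_memory_gb(TOTAL_PARAMS, size) for name, size in BYTES_PER_PARAM.items()}

for name, gb in estimates.items():
    print(f"{name}: ~{gb:.1f} GB")
```

Under these assumptions the bfloat16 weights alone need several large GPUs, which is why `device_map="auto"` or a quantized load is usually required in practice.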

Function calling example

import torch
from transformers import AutoModelForCausalLM
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.protocol.instruct.tool_calls import (
    Tool,
    Function,
)
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer

tokenizer_v3 = MistralTokenizer.v3()

# describe the available tool as a JSON schema and attach it to the request
mistral_query = ChatCompletionRequest(
    tools=[
        Tool(
            function=Function(
                name="get_current_weather",
                description="Get the current weather",
                parameters={
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA",
                        },
                        "format": {
                            "type": "string",
                            "enum": ["celsius", "fahrenheit"],
                            "description": "The temperature unit to use. Infer this from the users location.",
                        },
                    },
                    "required": ["location", "format"],
                },
            )
        )
    ],
    messages=[
        UserMessage(content="What's the weather like today in Paris"),
    ],
    model="test",
)

# encode_chat_completion returns a plain Python list of token IDs
encodeds = tokenizer_v3.encode_chat_completion(mistral_query).tokens

# load in bfloat16 and let device_map="auto" place the weights on the available devices
model = AutoModelForCausalLM.from_pretrained("mistralai/Mixtral-8x22B-Instruct-v0.1", torch_dtype=torch.bfloat16, device_map="auto")

# wrap the token IDs in a batch dimension and move them to the model's device
model_inputs = torch.tensor([encodeds]).to(model.device)

generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)

# decode with the mistral_common tokenizer, which expects a list of ints
sp_tokenizer = tokenizer_v3.instruct_tokenizer.tokenizer
decoded = sp_tokenizer.decode(generated_ids[0].tolist())
print(decoded)

Function calling with `transformers`

This example requires transformers version 4.42.0 or higher. For more information, see the function calling guide in the transformers documentation.

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "mistralai/Mixtral-8x22B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)

def get_current_weather(location: str, format: str):
    """
    Get the current weather

    Args:
        location: The city and state, e.g. San Francisco, CA
        format: The temperature unit to use. Infer this from the users location. (choices: ["celsius", "fahrenheit"])
    """
    pass

conversation = [{"role": "user", "content": "What's the weather like in Paris?"}]
tools = [get_current_weather]

# format and tokenize the tool use prompt
inputs = tokenizer.apply_chat_template(
    conversation,
    tools=tools,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
)

model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

inputs.to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1000)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

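⚠️ Important note

For brevity, this example does not show the full loop of calling a tool and adding the tool call and the tool result to the chat history so that the model can use them in its next generation. For a complete tool calling example, see the function calling guide. Note that Mixtral does use tool call IDs, so they must be included in both your tool calls and your tool results. Each ID should be exactly 9 alphanumeric characters.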
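
As a rough illustration of the note above, the sketch below generates a 9-character alphanumeric ID and attaches it to both a tool call and its result in the chat history. It is a sketch under assumptions, not the reference loop: the message layout follows the general pattern shown in the transformers chat-template documentation and should be checked against the function calling guide, while `make_tool_call_id`, the stand-in `get_current_weather` body, and the call arguments are hypothetical.

```python
import json
import random
import string


def make_tool_call_id(length=9):
    """Random ID made of ASCII letters and digits (9 characters by default)."""
    alphabet = string.ascii_letters + string.digits
    return "".join(random.choices(alphabet, k=length))


def get_current_weather(location, format):
    # Hypothetical stand-in: a real implementation would query a weather service.
    return {"location": location, "temperature": 22, "unit": format}


conversation = [{"role": "user", "content": "What's the weather like in Paris?"}]

# Suppose the model requested this call (hypothetical arguments).
tool_call_id = make_tool_call_id()
arguments = {"location": "Paris, France", "format": "celsius"}

# 1) Record the model's tool call in the history.
conversation.append({
    "role": "assistant",
    "tool_calls": [{
        "type": "function",
        "id": tool_call_id,
        "function": {"name": "get_current_weather", "arguments": arguments},
    }],
})

# 2) Run the tool and record its result under the same ID.
result = get_current_weather(**arguments)
conversation.append({
    "role": "tool",
    "tool_call_id": tool_call_id,
    "name": "get_current_weather",
    "content": json.dumps(result),
})

print(json.dumps(conversation, indent=2))
```

The extended `conversation` would then be passed through the chat template again so the model can produce its final answer from the tool result.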

Instruct tokenizer

The HuggingFace tokenizer included in this release should match our own. To compare the two, first install mistral-common:

pip install mistral-common

Then run:

from mistral_common.protocol.instruct.messages import (
    AssistantMessage,
    UserMessage,
)
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.request import ChatCompletionRequest

from transformers import AutoTokenizer

tokenizer_v3 = MistralTokenizer.v3()

mistral_query = ChatCompletionRequest(
    messages=[
        UserMessage(content="How many experts ?"),
        AssistantMessage(content="8"),
        UserMessage(content="How big ?"),
        AssistantMessage(content="22B"),
        UserMessage(content="Noice 🎉 !"),
    ],
    model="test",
)
hf_messages = mistral_query.model_dump()['messages']

tokenized_mistral = tokenizer_v3.encode_chat_completion(mistral_query).tokens

tokenizer_hf = AutoTokenizer.from_pretrained('mistralai/Mixtral-8x22B-Instruct-v0.1')
tokenized_hf = tokenizer_hf.apply_chat_template(hf_messages, tokenize=True)

assert tokenized_hf == tokenized_mistral
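
If the assertion above ever fails, it helps to know where the two token lists diverge. The helper below is a small, hypothetical debugging aid (the name `first_mismatch` and the sample IDs are made up for illustration); it works on any two lists of token IDs.

```python
def first_mismatch(left, right):
    """Return the first index where the sequences differ, or None if they are identical."""
    for index, (a, b) in enumerate(zip(left, right)):
        if a != b:
            return index
    # One list is a strict prefix of the other.
    if len(left) != len(right):
        return min(len(left), len(right))
    return None


# Stand-in token lists (hypothetical IDs, not real tokenizer output).
reference = [1, 3, 1782, 2055, 12796, 4]
candidate = [1, 3, 1782, 2056, 12796, 4]

position = first_mismatch(reference, candidate)
print(position)
```

With real data, one would call it as `first_mismatch(tokenized_hf, tokenized_mistral)` and then decode a few tokens around the reported index with each tokenizer.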

Function calling and special tokens

This tokenizer includes additional special tokens related to function calling:

[TOOL_CALLS]
[AVAILABLE_TOOLS]
[/AVAILABLE_TOOLS]
[TOOL_RESULTS]
[/TOOL_RESULTS]

If you want to use this model with function calling, make sure to apply these tokens in the same way as our SentencePieceTokenizerV3 does.
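
On the output side, a generation that requests a tool typically contains the [TOOL_CALLS] token followed by the call details. The sketch below shows one way to pull those calls out of decoded text. It assumes the marker is followed by a JSON list of calls and that special tokens were kept when decoding; the exact output layout should be verified against real generations. The `extract_tool_calls` helper and the sample string are hypothetical.

```python
import json

TOOL_CALLS_MARKER = "[TOOL_CALLS]"


def extract_tool_calls(decoded):
    """Return the tool calls parsed after the last marker, or [] if none are found."""
    marker_pos = decoded.rfind(TOOL_CALLS_MARKER)
    if marker_pos == -1:
        return []
    payload = decoded[marker_pos + len(TOOL_CALLS_MARKER):].lstrip()
    try:
        # raw_decode parses the first JSON value and ignores trailing text such as "</s>".
        calls, _ = json.JSONDecoder().raw_decode(payload)
    except json.JSONDecodeError:
        return []
    return calls if isinstance(calls, list) else [calls]


# Hypothetical decoded output, written by hand for illustration.
sample = (
    "[INST] What's the weather like today in Paris[/INST]"
    '[TOOL_CALLS] [{"name": "get_current_weather", '
    '"arguments": {"location": "Paris", "format": "celsius"}}]</s>'
)

calls = extract_tool_calls(sample)
print(calls)
```

Returning an empty list when the marker or the JSON is missing lets the caller treat the generation as an ordinary text reply.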

The Mistral AI Team

Albert Jiang, Alexandre Sablayrolles, Alexis Tacnet, Antoine Roux, Arthur Mensch, Audrey Herblin-Stoop, Baptiste Bout, Baudouin de Monicault, Blanche Savary, Bam4d, Caroline Feldman, Devendra Singh Chaplot, Diego de las Casas, Eleonore Arcelin, Emma Bou Hanna, Etienne Metzger, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Harizo Rajaona, Jean-Malo Delignon, Jia Li, Justus Murke, Louis Martin, Louis Ternon, Lucile Saulnier, Lélio Renard Lavaud, Margaret Jennings, Marie Pellat, Marie Torelli, Marie-Anne Lachaux, Nicolas Schuhl, Patrick von Platen, Pierre Stock, Sandeep Subramanian, Sophia Yang, Szymon Antoniak, Teven Le Scao, Thibaut Lavril, Timothée Lacroix, Théophile Gervet, Thomas Wang, Valera Nemychnikova, William El Sayed, William Marshall