---
license: apache-2.0
language:
- en
- ja
base_model:
- Rakuten/RakutenAI-7B
---

# RakutenAI-7B-chat
## Model Description
RakutenAI-7B is a systematic initiative that brings the latest technologies to the world of Japanese LLMs. The model achieves the best scores on Japanese language understanding benchmarks while maintaining competitive performance on English test sets against similar models such as OpenCalm, Elyza, Youri, Nekomata, and Swallow. RakutenAI-7B uses the Mistral model architecture and builds on the Mistral-7B-v0.1 pre-trained checkpoint, exemplifying a successful retrofitting of pre-trained model weights. Moreover, we extend Mistral's vocabulary from 32k to 48k, significantly improving the character-per-token rate for Japanese.
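As a quick illustration of the character-per-token rate, the snippet below (our own sketch, not from the technical report; the sample sentence is arbitrary) counts how many characters of Japanese text each token covers on average:

```python
# A minimal sketch: measure the characters-per-token rate of the extended
# tokenizer on an arbitrary Japanese sentence. Higher means fewer tokens
# are needed per character of Japanese text.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Rakuten/RakutenAI-7B-chat")
text = "楽天グループはインターネットサービスを提供しています。"
tokens = tokenizer.tokenize(text)
print(f"{len(text)} characters -> {len(tokens)} tokens "
      f"({len(text) / len(tokens):.2f} characters per token)")
```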
The technical report is available on arXiv.

If you are looking for the foundation model, see RakutenAI-7B.

If you are looking for the instruction-tuned model, see RakutenAI-7B-instruct.
According to the independent evaluation by Kamata et al. on the Nejumi LLMリーダーボード Neo (a weighted average of llm-jp-eval and Japanese MT-bench), the chat/instruct versions of RakutenAI-7B achieve the highest scores among comparable open models as of 22 March 2024, with scores of 0.393/0.331, respectively.
## Usage
The first example uses the model's built-in chat template:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "Rakuten/RakutenAI-7B-chat"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype="auto", device_map="auto")
model.eval()

chat = [
    {"role": "system", "content": "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions."},
    {"role": "user", "content": "How to make an authentic Spanish Omelette?"},
]

# Format the conversation with the chat template and generate a reply.
input_ids = tokenizer.apply_chat_template(chat, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(device=model.device)
tokens = model.generate(
    input_ids,
    max_length=4096,
    do_sample=False,
    num_beams=1,
    pad_token_id=tokenizer.eos_token_id,
)
# Decode only the newly generated tokens, skipping the prompt.
out = tokenizer.decode(tokens[0][len(input_ids[0]):], skip_special_tokens=True)
print("ASSISTANT:\n" + out)
print()
```
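To keep a multi-turn conversation going, you can append the assistant's reply and the next user message to `chat` and re-apply the template. A minimal continuation of the example above (the follow-up question is our own, hypothetical):

```python
# Continue the conversation from the example above
# (reuses tokenizer, model, chat, and out).
chat.append({"role": "assistant", "content": out})
chat.append({"role": "user", "content": "Can it be made without potatoes?"})
input_ids = tokenizer.apply_chat_template(chat, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(device=model.device)
tokens = model.generate(input_ids, max_length=4096, do_sample=False, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(tokens[0][len(input_ids[0]):], skip_special_tokens=True))
```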
The prompt can also be constructed manually, without the chat template:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "Rakuten/RakutenAI-7B-chat"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype="auto", device_map="auto")
model.eval()

requests = [
    "What does 「馬が合う」 mean?",
    "How to make an authentic Spanish Omelette?",
]

# Build the system prompt and USER/ASSISTANT turns by hand.
system_message = "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: {user_input} ASSISTANT:"

for req in requests:
    input_req = system_message.format(user_input=req)
    input_ids = tokenizer.encode(input_req, return_tensors="pt").to(device=model.device)
    tokens = model.generate(
        input_ids,
        max_new_tokens=1024,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    out = tokenizer.decode(tokens[0][len(input_ids[0]):], skip_special_tokens=True)
    print("USER:\n" + req)
    print("ASSISTANT:\n" + out)
    print()
    print()
```
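The two examples should be equivalent in spirit: the first relies on the tokenizer's built-in chat template, while the second writes out the same system prompt and the USER/ASSISTANT turns by hand, which is convenient when you need full control over the raw prompt string.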
## Model Details
- Developed by: Rakuten Group, Inc.
- Language(s): Japanese, English
- License: This model is licensed under the Apache License, Version 2.0.
- Instruction-Tuning Dataset: We fine-tuned the foundation model on a mix of open-source and internally hand-crafted datasets to create RakutenAI-7B-instruct and RakutenAI-7B-chat. The instruction- and chat-tuned models use the `train` part of datasets released under a CC BY-SA license.
## Limitations and Bias

The suite of RakutenAI-7B models is capable of generating human-like text on a wide range of topics. However, like all LLMs, they have limitations and can produce biased, inaccurate, or unsafe outputs. Please exercise caution and judgement when interacting with them.
## Citation

To cite the suite of RakutenAI-7B models, please use:
```bibtex
@misc{rakutengroup2024rakutenai7b,
      title={RakutenAI-7B: Extending Large Language Models for Japanese},
      author={{Rakuten Group, Inc.} and Aaron Levine and Connie Huang and Chenguang Wang and Eduardo Batista and Ewa Szymanska and Hongyi Ding and Hou Wei Chou and Jean-François Pessiot and Johanes Effendi and Justin Chiu and Kai Torben Ohlhus and Karan Chopra and Keiji Shinzato and Koji Murakami and Lee Xiong and Lei Chen and Maki Kubota and Maksim Tkachenko and Miroku Lee and Naoki Takahashi and Prathyusha Jwalapuram and Ryutaro Tatsushima and Saurabh Jain and Sunil Kumar Yadav and Ting Cai and Wei-Te Chen and Yandi Xia and Yuki Nakayama and Yutaka Higashiyama},
      year={2024},
      eprint={2403.15484},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```