---
license: apache-2.0
language:
- zh
- en
pipeline_tag: text-generation
library_name: transformers
---
MiniCPM Repo | MiniCPM Paper | MiniCPM-V Repo | Join our Discord community and WeChat group
## Introduction
MiniCPM3-4B is the third generation of the MiniCPM series. Its overall performance surpasses Phi-3.5-mini-Instruct and GPT-3.5-Turbo-0125, and it is comparable to several recent models in the 7B to 9B range.

Compared with MiniCPM 1.0 and MiniCPM 2.0, MiniCPM3-4B has a more powerful and versatile skill set, enabling more general usage. It supports function calling and comes with a code interpreter; please refer to Advanced Features for detailed usage guidelines (a hedged function-calling sketch also follows the Transformers example below).

MiniCPM3-4B has a 32k context window. Combined with LLMxMapReduce, it can in theory handle text of unlimited length without requiring large amounts of memory.
## Usage
### Inference with Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

path = "openbmb/MiniCPM3-4B"
device = "cuda"

# trust_remote_code is required because MiniCPM3 ships custom modeling code
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.bfloat16, device_map=device, trust_remote_code=True)

messages = [
    {"role": "user", "content": "推荐5个北京的景点。"},  # "Recommend 5 attractions in Beijing."
]
# Render the chat template into input ids and move them to the target device
model_inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(device)

model_outputs = model.generate(
    model_inputs,
    max_new_tokens=1024,
    do_sample=True,  # enable sampling so top_p/temperature take effect
    top_p=0.7,
    temperature=0.7
)

# Keep only the newly generated tokens (drop the prompt portion)
output_token_ids = [
    model_outputs[i][len(model_inputs[i]):] for i in range(len(model_inputs))
]

responses = tokenizer.batch_decode(output_token_ids, skip_special_tokens=True)[0]
print(responses)
```
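As mentioned in the introduction, MiniCPM3-4B supports function calling. The snippet below is a minimal, hedged sketch that passes a tool schema through the generic `tools` argument of `tokenizer.apply_chat_template` (available in recent `transformers` releases); whether MiniCPM3's bundled chat template consumes this argument in exactly this way, and the exact format of the emitted tool call, are assumptions. The `get_weather` function is purely illustrative.

```python
# Hedged sketch: tool calling via the generic `tools` argument of apply_chat_template
# (transformers >= 4.42). Whether MiniCPM3's chat template uses this argument,
# and the exact tool-call output format, are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "openbmb/MiniCPM3-4B"
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    path, torch_dtype=torch.bfloat16, device_map="cuda", trust_remote_code=True
)

def get_weather(city: str) -> str:
    """
    Get the current weather for a city. (Illustrative placeholder tool.)

    Args:
        city: Name of the city to query.
    """
    return "sunny, 25°C"  # placeholder implementation

messages = [{"role": "user", "content": "What is the weather in Beijing today?"}]

# transformers converts the function signature and docstring into a JSON tool schema
# and injects it into the prompt through the model's chat template.
inputs = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
```

If MiniCPM3's template ignores the `tools` argument, the tool schema would instead need to be injected in the format described in the Advanced Features guide; the call pattern above only illustrates the generic `transformers` hook.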
### Inference with vLLM

For now, you need to install our maintained fork of vLLM:

```bash
pip install git+https://github.com/OpenBMB/vllm.git@minicpm3
```
```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_name = "openbmb/MiniCPM3-4B"
prompt = [{"role": "user", "content": "推荐5个北京的景点。"}]  # "Recommend 5 attractions in Beijing."

# vLLM's generate() takes raw text here, so render the chat template without tokenizing
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
input_text = tokenizer.apply_chat_template(prompt, tokenize=False, add_generation_prompt=True)

llm = LLM(
    model=model_name,
    trust_remote_code=True,
    tensor_parallel_size=1
)
sampling_params = SamplingParams(top_p=0.7, temperature=0.7, max_tokens=1024, repetition_penalty=1.02)

outputs = llm.generate(prompts=input_text, sampling_params=sampling_params)
print(outputs[0].outputs[0].text)
```
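vLLM also ships an OpenAI-compatible HTTP server, which can be a convenient way to serve MiniCPM3-4B behind a standard chat-completions API. The sketch below is hedged: it assumes the `minicpm3` fork keeps upstream vLLM's `vllm.entrypoints.openai.api_server` entrypoint and the standard `/v1/chat/completions` route (not verified against the fork here), and the client side uses the third-party `requests` library.

```python
# Hedged sketch: querying a vLLM OpenAI-compatible server for MiniCPM3-4B.
# Assumes the server was started (upstream-vLLM style) with:
#   python -m vllm.entrypoints.openai.api_server --model openbmb/MiniCPM3-4B --trust-remote-code
# and is listening on the default port 8000.
import requests

payload = {
    "model": "openbmb/MiniCPM3-4B",
    "messages": [{"role": "user", "content": "推荐5个北京的景点。"}],  # "Recommend 5 attractions in Beijing."
    "temperature": 0.7,
    "top_p": 0.7,
    "max_tokens": 1024,
}

# Standard OpenAI chat-completions route exposed by vLLM's api_server.
resp = requests.post("http://localhost:8000/v1/chat/completions", json=payload, timeout=300)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

Any OpenAI-compatible client (for example the official `openai` Python SDK pointed at `http://localhost:8000/v1`) should work the same way.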
## Evaluation Results
| Benchmark | Qwen2-7B-Instruct | GLM-4-9B-Chat | Gemma2-9B-it | Llama3.1-8B-Instruct | GPT-3.5-Turbo-0125 | Phi-3.5-mini-Instruct (3.8B) | MiniCPM3-4B |
|---|---|---|---|---|---|---|---|
| **English** | | | | | | | |
| MMLU | 70.5 | 72.4 | 72.6 | 69.4 | 69.2 | 68.4 | 67.2 |
| BBH | 64.9 | 76.3 | 65.2 | 67.8 | 70.3 | 68.6 | 70.2 |
| MT-Bench | 8.41 | 8.35 | 7.88 | 8.28 | 8.17 | 8.60 | 8.41 |
| IFEVAL (strict acc.) | 51.0 | 64.5 | 71.9 | 71.5 | 58.8 | 49.4 | 68.4 |
| **Chinese** | | | | | | | |
| CMMLU | 80.9 | 71.5 | 59.5 | 55.8 | 54.5 | 46.9 | 73.3 |
| CEVAL | 77.2 | 75.6 | 56.7 | 55.2 | 52.8 | 46.1 | 73.6 |
| AlignBench v1.1 | 7.10 | 6.61 | 7.10 | 5.68 | 5.82 | 5.73 | 6.74 |
| FollowBench-zh (SSR) | 63.0 | 56.4 | 57.0 | 50.6 | 64.6 | 58.1 | 66.8 |
| **Mathematics** | | | | | | | |
| MATH | 49.6 | 50.6 | 46.0 | 51.9 | 41.8 | 46.4 | 46.6 |
| GSM8K | 82.3 | 79.6 | 79.7 | 84.5 | 76.4 | 82.7 | 81.1 |
| MathBench | 63.4 | 59.4 | 45.8 | 54.3 | 48.9 | 54.9 | 65.6 |
| **Coding** | | | | | | | |
| HumanEval+ | 70.1 | 67.1 | 61.6 | 62.8 | 66.5 | 68.9 | 68.3 |
| MBPP+ | 57.1 | 62.2 | 64.3 | 55.3 | 71.4 | 55.8 | 63.2 |
| LiveCodeBench v3 | 22.2 | 20.2 | 19.2 | 20.4 | 24.0 | 19.6 | 22.6 |
| **Function Calling** | | | | | | | |
| BFCL v2 | 71.6 | 70.1 | 19.2 | 73.3 | 75.4 | 48.4 | 76.0 |
| **Overall** | | | | | | | |
| Average | 65.3 | 65.0 | 57.9 | 60.8 | 61.0 | 57.2 | 66.3 |
## Statement
- As a language model, MiniCPM3-4B generates its output by learning from a massive amount of text.
- The model does not possess the ability to understand or express personal opinions or value judgments.
- Content generated by the model does not represent the views or positions of its developers.
- Users are responsible for verifying and evaluating any content generated by the model before using it.
## License

This model is released under the Apache-2.0 license.
## Citation
```bibtex
@article{hu2024minicpm,
  title={MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies},
  author={Hu, Shengding and Tu, Yuge and Han, Xu and He, Chaoqun and Cui, Ganqu and Long, Xiang and Zheng, Zhi and Fang, Yewei and Huang, Yuxiang and Zhao, Weilin and others},
  journal={arXiv preprint arXiv:2404.06395},
  year={2024}
}
```