WizardMath-7B-V1.1: Open-Source Math Large Language Model, Free to Deploy for Solving Complex Math Problems


Wizardmath 7B V1.1

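Developed by WizardLMTeam

WizardMath-7B-V1.1 is a 7B math large language model trained from Mistral-7B. At release it was the state-of-the-art 7B math model, reaching 83.2 pass@1 on GSM8k and 33.0 pass@1 on MATH.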

Large Language Model

Transformers

English #Math reasoning #Reinforced Evol-Instruct #Leading 7B model

Downloads: 175.35k

Release date: 12/19/2023

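Model Overview

WizardMath strengthens the mathematical reasoning of large language models through Reinforced Evol-Instruct (RLEIF) and focuses on solving math problems.

Model Features

Reinforced Evol-Instruct

The RLEIF method is used to improve the model's mathematical reasoning.

High performance

State-of-the-art results among 7B models on the GSM8k and MATH datasets at the time of release.

Open source

The model and code are publicly available for research and application.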

Model Capabilities

Math problem solving

Mathematical reasoning

Text generation

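Use Cases

Education

Math problem solving

Helps students work through complex math problems.

Achieves 83.2 pass@1 on GSM8k.

Research

Mathematical reasoning research

Used to study the mathematical reasoning ability of large language models.

Achieves 33.0 pass@1 on MATH.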

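🚀 WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct (RLEIF)

WizardMath is a project focused on improving the mathematical reasoning of large language models. Models trained with its Reinforced Evol-Instruct (RLEIF) method perform strongly on math problem solving, which supports the use of large language models in mathematics.

🔗 Project Links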

📢 News

[12/19/2023] 🔥 We released WizardMath-7B-V1.1, trained from Mistral-7B. It is the state-of-the-art 7B math LLM, reaching 83.2 pass@1 on GSM8k and 33.0 pass@1 on MATH. You can chat with it through this [Demo].
[12/19/2023] 🔥 WizardMath-7B-V1.1 outperforms ChatGPT 3.5, Gemini Pro, Mixtral MoE, and Claude Instant on GSM8k pass@1.
[12/19/2023] 🔥 WizardMath-7B-V1.1 is comparable with ChatGPT 3.5 and Gemini Pro, and surpasses Mixtral MoE, on MATH pass@1.

🌟 Model Performance Comparison

Model	Checkpoint	Paper	GSM8k pass@1	MATH pass@1	Demo
WizardMath-7B-V1.1	🤗 HF Link	📃 WizardMath	83.2	33.0	[Demo]
WizardMath-70B-V1.0	🤗 HF Link	📃 WizardMath	81.6	22.7
WizardMath-13B-V1.0	🤗 HF Link	📃 WizardMath	63.9	14.0
WizardMath-7B-V1.0	🤗 HF Link	📃 WizardMath	54.9	10.7

[12/19/2023] WizardMath-7B-V1.1 compared with other open-source 7B-scale math LLMs

Model	GSM8k pass@1	MATH pass@1
MPT-7B	6.8	3.0
Llama 1-7B	11.0	2.9
Llama 2-7B	12.3	2.8
Yi-6B	32.6	5.8
Mistral-7B	37.8	9.1
Qwen-7B	47.8	9.3
RFT-7B	50.3	--
MAmmoTH-7B (CoT)	50.5	10.4
WizardMath-7B-V1.0	54.9	10.7
Abel-7B-001	59.7	13
MetaMath-7B	66.5	19.8
Arithmo-Mistral-7B	74.7	25.3
MetaMath-Mistral-7B	77.7	28.2
Abel-7B-002	80.4	29.5
WizardMath-7B-V1.1	83.2	33.0

[12/19/2023] WizardMath-7B-V1.1 compared with large open-source (30B to 70B) LLMs

Model	GSM8k pass@1	MATH pass@1
Llemma-34B	51.5	25.0
Minerva-62B	52.4	27.6
Llama 2-70B	56.8	13.5
DeepSeek 67B	63.4	--
Grok 33B	62.9	23.9
MAmmoTH-70B	72.4	21.1
Yi-34B	67.9	15.9
Mixtral 8x7B	74.4	28.4
MetaMath-70B	82.3	26.6
WizardMath-7B-V1.1	83.2	33.0
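
As commonly defined, a pass@1 score is the percentage of test problems answered correctly when the model gets a single attempt per problem. The sketch below shows that calculation. The function name is hypothetical, and plain string equality stands in for the answer-equivalence check a real GSM8k or MATH evaluation would need (for example, treating "0.5" and "1/2" as equal).

```python
from typing import Optional, Sequence


def pass_at_1(predictions: Sequence[Optional[str]], references: Sequence[str]) -> float:
    """Return the percentage of problems whose single prediction matches the reference.

    A prediction of None (no answer could be extracted) counts as wrong.
    Assumption: answers are compared as whitespace-stripped strings.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one problem is required")

    correct = sum(
        1
        for pred, ref in zip(predictions, references)
        if pred is not None and pred.strip() == ref.strip()
    )
    return 100.0 * correct / len(references)


# Two of four made-up problems answered correctly.
score = pass_at_1(["18", "5", None, "7"], ["18", "4", "3", "7"])
print(f"pass@1 = {score:.1f}")
```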

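⚠️ Data Contamination Check

Before training, we carefully and rigorously checked all training data and used multiple deduplication methods to verify that no GSM8k or MATH test data leaked into it.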
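
The page does not say which deduplication methods were used. As an illustration only, one common approach is to flag any training sample that shares a long word n-gram with a test problem. The sketch below shows that approach. The function names, the n-gram length of 8, and the sample texts are all assumptions and do not describe the authors' actual pipeline.

```python
import re
from typing import List, Sequence, Set, Tuple


def word_ngrams(text: str, n: int) -> Set[Tuple[str, ...]]:
    """Return the set of lowercase word n-grams in text (empty if text is too short)."""
    tokens = re.findall(r"\w+", text.lower())
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def find_contaminated(train_samples: Sequence[str], test_samples: Sequence[str], n: int = 8) -> List[int]:
    """Return indices of training samples that share at least one n-gram with any test sample."""
    test_grams: Set[Tuple[str, ...]] = set()
    for sample in test_samples:
        test_grams |= word_ngrams(sample, n)
    return [i for i, sample in enumerate(train_samples) if word_ngrams(sample, n) & test_grams]


# Made-up texts: the second training sample embeds the test problem verbatim.
test_set = ["A farmer has 12 cows and buys 7 more cows at the market today"]
train_set = [
    "Tom has 3 apples and eats one",
    "Question: a farmer has 12 cows and buys 7 more cows at the market today. How many?",
]
print(find_contaminated(train_set, test_set))
```

Flagged samples would then be dropped or reviewed by hand. A shorter n catches more paraphrases at the cost of more false positives.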

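⚠️ Note on System Prompt Usage

Please use exactly the same system prompts as we do. We do not guarantee the accuracy of quantized versions.

Default version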

"Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response:"

CoT version (❗ we do not recommend the CoT prompt for simple math questions)

"Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response: Let's think step by step."

💻 WizardMath Inference Demo Script

We provide the WizardMath inference demo code here.
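
For orientation, the following is a minimal sketch of what such an inference run could look like with the Hugging Face Transformers library. It is not the official demo script. The two prompt templates are copied from the system prompt section above. The repository id `WizardLMTeam/WizardMath-7B-V1.1`, the generation settings, and the assumption that responses end with a line starting `The answer is: ` are illustrative and should be checked against the official code.

```python
import re
from typing import Optional

# Prompt templates copied from the system prompt section of this page.
DEFAULT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:"
)
COT_TEMPLATE = DEFAULT_TEMPLATE + " Let's think step by step."

# Assumption: Hugging Face repository id for this checkpoint.
MODEL_ID = "WizardLMTeam/WizardMath-7B-V1.1"


def build_prompt(instruction: str, use_cot: bool = False) -> str:
    """Fill the default or the CoT template with the user's question."""
    template = COT_TEMPLATE if use_cot else DEFAULT_TEMPLATE
    return template.format(instruction=instruction)


def generate_response(prompt: str, model_id: str = MODEL_ID, max_new_tokens: int = 512) -> str:
    """Generate a completion with greedy decoding (downloads the model on first use)."""
    # Imported lazily so the helpers in this file work without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # float16 weights, not a quantized variant; device_map="auto" needs `accelerate`.
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


def extract_final_answer(response: str, marker: str = "The answer is: ") -> Optional[str]:
    """Pull the final number out of a response.

    Assumption: the response contains `marker` followed by a numeric answer.
    Returns None when the marker or a number is missing.
    """
    if marker not in response:
        return None
    tail = response.split(marker)[-1].strip()
    match = re.search(r"-?\d[\d,]*(?:\.\d+)?", tail)
    return match.group(0).replace(",", "") if match else None
```

A typical call chain would be `extract_final_answer(generate_response(build_prompt(question)))`, passing `use_cot=True` to `build_prompt` only for harder problems, in line with the recommendation above.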

📚 Citation

If you use the data, methods, or code from this repository, please cite it.

@article{luo2023wizardmath,
  title={WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct},
  author={Luo, Haipeng and Sun, Qingfeng and Xu, Can and Zhao, Pu and Lou, Jianguang and Tao, Chongyang and Geng, Xiubo and Lin, Qingwei and Chen, Shifeng and Zhang, Dongmei},
  journal={arXiv preprint arXiv:2308.09583},
  year={2023}
}