library_name: transformers
license: other
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
tags:
- llama-factory
- full
- generated_from_trainer
model-index:
- name: ReasonFlux-F1-32B
  results: []
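ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates
A template-augmented reasoning paradigm that enables a 32B model to outperform o1-mini and DeepSeek-R1 distilled models on reasoning tasks.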
| Task / Pass@1 | ReasonFlux-F1-32B | ReasonFlux-Zero-32B | R1-Distill-32B | o1-mini | LIMO-32B | s1-32B |
| --- | --- | --- | --- | --- | --- | --- |
| MATH500 | 96.0 | 91.2 | 94.3 | 90.0 | 90.6 | 93.0 |
| AIME 2024 | 76.7 | 56.7 | 72.6 | 56.7 | 50.0 | 56.7 |
| AIME 2025 | 53.3 | 37.2 | 46.67 | 50.8 | 37.2 | 49.3 |
| GPQA-Diamond | 67.2 | 61.2 | 62.1 | 60.0 | 65.2 | 59.6 |
ReasonFlux-F1-32B
ReasonFlux-F1-32B is our state-of-the-art reasoning LLM, obtained by fine-tuning on the template-augmented reasoning trajectories produced by ReasonFlux-Zero.
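Evaluation
We report evaluation results for ReasonFlux-F1-32B on four challenging reasoning benchmarks: AIME2024, AIME2025, MATH500, and GPQA-Diamond. To ensure a fair comparison, the results for all models are those obtained with our evaluation scripts in ReasonFlux-F1.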
| Model | AIME2024@pass1 | AIME2025@pass1 | MATH500@pass1 | GPQA@pass1 |
| --- | --- | --- | --- | --- |
| QwQ-32B-Preview | 46.7 | 37.2 | 90.6 | 65.2 |
| LIMO-32B | 56.3 | 44.5 | 94.8 | 58.1 |
| s1-32B | 56.7 | 49.3 | 93.0 | 59.6 |
| OpenThinker-32B | 66.0 | 53.3 | 94.8 | 60.1 |
| R1-Distill-32B | 70.0 | 46.7 | 92.0 | 59.6 |
| ReasonFlux-Zero-32B | 56.7 | 37.2 | 91.2 | 61.2 |
| ReasonFlux-F1-32B | 76.7 | 53.3 | 96.0 | 67.2 |
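For readers reproducing these numbers, the following is a minimal sketch of how a Pass@1 percentage can be computed from per-problem grading results. It assumes Pass@1 is the fraction of problems whose single sampled answer is correct. The function name `pass_at_1` and the toy data are illustrative and are not taken from the ReasonFlux evaluation scripts.

```python
from typing import Sequence


def pass_at_1(is_correct: Sequence[bool]) -> float:
    """Return Pass@1 as a percentage, given one graded sample per problem."""
    if not is_correct:
        raise ValueError("need at least one graded problem")
    return 100.0 * sum(is_correct) / len(is_correct)


# Toy example (hypothetical grading results): 23 of 30 problems solved.
graded = [True] * 23 + [False] * 7
print(f"{pass_at_1(graded):.1f}")  # prints 76.7
```

On a 30-problem set such as one year of AIME, each solved problem contributes about 3.3 points under this single-sample assumption.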
Quick start with vLLM
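from vllm import LLM, SamplingParams
from transformers import AutoTokenizer

model_id = 'Gen-Verse/ReasonFlux-F1'

# Load the model with vLLM, sharded across 8 GPUs; adjust tensor_parallel_size to your hardware.
model = LLM(
    model_id,
    tensor_parallel_size=8,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Allow long reasoning traces.
sampling_params = SamplingParams(
    max_tokens=32768,
)

# Raw string, so LaTeX backslashes (e.g. \b in \begin, \f in \frac) are not read as Python escapes.
question = r"""Let \(x, y\), and \(z\) be positive real numbers satisfying the system of equations:
\[
\begin{array}{c}
\sqrt{2 x-x y}+\sqrt{2 y-x y}=1 \\
\sqrt{2 y-y z}+\sqrt{2 z-y z}=\sqrt{2} \\
\sqrt{2 z-z x}+\sqrt{2 x-z x}=\sqrt{3} .
\end{array}
\]
Then \(\left[(1-x)(1-y)(1-z)\right]^{2}\) can be written as \(\frac{m}{n}\), where \(m\) and \(n\) are relatively prime positive integers. Find \(m+n\)."""

ds_prompt = "\n" + question + "\n"
output = model.generate(ds_prompt, sampling_params=sampling_params)
print(output[0].outputs[0].text)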
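The generated text contains the full reasoning trace, so a small post-processing step can help pull out the final answer. The sketch below assumes the model wraps its final answer in `\boxed{...}`, a common convention for math-reasoning models that this card does not state. `extract_boxed` is a hypothetical helper, not part of vLLM or transformers.

```python
from typing import Optional


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None."""
    marker = r"\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    i = start + len(marker)
    depth = 1  # we are already inside the opening brace of \boxed{
    chars = []
    while i < len(text):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)
        chars.append(ch)
        i += 1
    return None  # unbalanced braces: no complete answer found


# Hypothetical model outputs, used only to exercise the helper.
print(extract_boxed(r"Therefore the final answer is \boxed{42}."))
print(extract_boxed(r"Hence the ratio is \boxed{\frac{1}{2}}."))
```

With the quick-start script, the helper would be called as `extract_boxed(output[0].outputs[0].text)`. It returns `None` when no complete boxed answer is found, for example when generation stops at `max_tokens`.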
Citation
@article{yang2025reasonflux,
title={ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates},
author={Yang, Ling and Yu, Zhaochen and Cui, Bin and Wang, Mengdi},
journal={arXiv preprint arXiv:2502.06772},
year={2025}
}