MathCoder-VL-8B: an open-source multimodal model for general mathematical problem solving and stronger reasoning

Mathcoder VL 8B

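Developed by MathLLMs

The MathCoder-VL series is a family of open-source large multimodal models designed for general mathematical problem solving. The models combine vision and code to strengthen mathematical reasoning.

Image-to-text

Transformers

Language: English | License: Apache-2.0 | Tags: #multimodal-math-reasoning #image-to-code #geometry-diagram-parsing

Downloads: 17

Released: May 15, 2025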

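Model overview

MathCoder-VL is a large multimodal model focused on general mathematical problem solving. It strengthens mathematical reasoning by bridging vision and code.

Model features

Multimodal mathematical reasoning

Combines visual and textual information to solve math problems, and handles mathematical content presented as charts, geometric figures, and similar forms.

Code-augmented reasoning

Strengthens mathematical reasoning by generating and executing code, so that problems can be solved programmatically.

Broad coverage of mathematical domains

Supports reasoning tasks across geometry, algebra, function plots, scientific charts, and other areas of mathematics.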
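
The "code-augmented reasoning" feature above can be pictured as a two-step loop: the model writes a short program, and the program is executed to produce the answer. The sketch below is only an illustration of that idea under stated assumptions. The helper `run_generated_code` and the hard-coded `generated_code` string are hypothetical stand-ins and are not part of MathCoder-VL or its released code.

```python
import contextlib
import io


def run_generated_code(code: str) -> str:
    """Execute a code string and return whatever it printed.

    Hypothetical helper for illustration only. exec() is unsafe for
    untrusted input; a real pipeline would run generated code in a
    sandboxed subprocess instead.
    """
    buffer = io.StringIO()
    namespace = {"__name__": "__generated__"}
    with contextlib.redirect_stdout(buffer):
        exec(code, namespace)
    return buffer.getvalue().strip()


# Stand-in for a model response to the question:
# "What is the sum of the integers from 1 to 100?"
generated_code = """
total = sum(range(1, 101))
print(total)
"""

answer = run_generated_code(generated_code)
print(answer)  # 5050
```

Capturing standard output keeps the generated program free to print its result in whatever form it likes, at the cost of returning a string that still has to be parsed.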

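Model capabilities

Multimodal mathematical reasoning

Image-to-text conversion

Mathematical problem solving

Chart understanding

Geometric reasoning

Code generation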

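Use cases

Education

Answering textbook math problems

Helps students understand and solve math problems from textbooks, including problems that mix figures with text.

Improves study efficiency and deepens mathematical understanding.

Reasoning over geometric figures

Reasons over geometric figures to solve problems such as computing angles and areas.

Answers geometry problems accurately and supports geometry learning.

Research

Scientific chart analysis

Analyzes chart data from scientific experiments, extracts key information, and reasons over it.

Assists researchers with data analysis and interpretation.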
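
As a worked example of the angle and area problems mentioned under geometric reasoning, the sketch below shows the kind of short program such a problem reduces to once the coordinates have been read off a figure. The function names and the sample triangle are assumptions chosen for illustration; they are not output from MathCoder-VL.

```python
import math


def triangle_area(a, b, c):
    """Area of the triangle with vertices a, b, c (shoelace formula)."""
    (x1, y1), (x2, y2), (x3, y3) = a, b, c
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2


def angle_at(vertex, p, q):
    """Angle in degrees at `vertex` between the rays toward p and q."""
    v1 = (p[0] - vertex[0], p[1] - vertex[1])
    v2 = (q[0] - vertex[0], q[1] - vertex[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cos_theta = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to [-1, 1] to guard against floating-point drift.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


# Illustrative right triangle with legs 4 and 3.
A, B, C = (0, 0), (4, 0), (0, 3)
print(triangle_area(A, B, C))  # 6.0
print(angle_at(A, B, C))       # 90.0
```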

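🚀 MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning

MathCoder-VL is a series of open-source large multimodal models (LMMs) built specifically for general mathematical problem solving. The project also releases FigCodifier-8B, an image-to-code model.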

🚀 Quick start

Model information

Property	Details
Model type	image-text-to-text
Evaluation metric	accuracy
Tags	mathematics, reasoning, multi-modal-qa, math-qa, figure-qa, geometry-qa, math-word-problem, textbook-qa, vqa, geometry-diagram, synthetic-scene, chart, plot, scientific-figure, table, function-plot, abstract-scene, puzzle-test, document-image, science
Library	transformers
Base model	OpenGVLab/InternVL2-8B
Dataset	MathLLMs/MM-MathInstruct
License	apache-2.0

Model comparison

Base model	Model in this project
Mini-InternVL-Chat-2B-V1-5	MathCoder-VL-2B
InternVL2-8B	MathCoder-VL-8B
InternVL2-8B	FigCodifier-8B

Usage examples

For training and inference code, refer to InternVL.

Basic usage: loading the MM-MathInstruct dataset

from datasets import load_dataset
from PIL import Image
from io import BytesIO

mm_mathinstruct = load_dataset("MathLLMs/MM-MathInstruct")
print(mm_mathinstruct)

# show the last image
img = Image.open(BytesIO(mm_mathinstruct['train'][-1]['image']))
img.show()

Running the code above should print:

DatasetDict({
    train: Dataset({
        features: ['id', 'image', 'question', 'solution', 'image_path'],
        num_rows: 2871988
    })
})
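
Given the five features listed in that output, a quick sanity check on a single record can be useful before iterating over millions of rows. The sketch below is a minimal example that assumes the `image` field holds raw encoded image bytes, as the `BytesIO` call in the basic usage snippet suggests. The helpers `sniff_image_format` and `summarize_record` and the synthetic record are hypothetical and not part of the dataset's tooling.

```python
def sniff_image_format(data: bytes) -> str:
    """Guess the image format from the leading magic bytes."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    return "unknown"


def summarize_record(record: dict, preview_chars: int = 60) -> dict:
    """Return a compact, printable summary of one dataset record."""
    image_bytes = record["image"]
    return {
        "id": record["id"],
        "image_format": sniff_image_format(image_bytes),
        "image_size_bytes": len(image_bytes),
        "question_preview": record["question"][:preview_chars],
    }


# Synthetic record with made-up values, shaped like the features above.
fake_record = {
    "id": "example-0",
    "image": b"\x89PNG\r\n\x1a\n" + b"\x00" * 16,
    "question": "What is the area of the shaded region?",
    "solution": "placeholder solution text",
    "image_path": "images/example-0.png",
}

print(summarize_record(fake_record))
```

With the real dataset, the same call would be `summarize_record(mm_mathinstruct['train'][0])`, provided the assumption about the `image` field holds.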

📚 Detailed documentation

Motivation

Construction of FigCodifier

Construction of MathCoder-VL

Performance

📄 License

This project is released under the Apache-2.0 license.

📖 Citation

If you use our data, models, or code, please cite the following papers:

@inproceedings{
wang2025mathcodervl,
title={MathCoder-{VL}: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning},
author={Ke Wang and Junting Pan and Linda Wei and Aojun Zhou and Weikang Shi and Zimu Lu and Han Xiao and Yunqiao Yang and Houxing Ren and Mingjie Zhan and Hongsheng Li},
booktitle={The 63rd Annual Meeting of the Association for Computational Linguistics},
year={2025},
url={https://openreview.net/forum?id=nuvtX1imAb}
}

@inproceedings{
lu2025mathcoder2,
title={MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code},
author={Zimu Lu and Aojun Zhou and Ke Wang and Houxing Ren and Weikang Shi and Junting Pan and Mingjie Zhan and Hongsheng Li},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://openreview.net/forum?id=1Iuw1jcIrf}
}

@inproceedings{
wang2024mathcoder,
title={MathCoder: Seamless Code Integration in {LLM}s for Enhanced Mathematical Reasoning},
author={Ke Wang and Houxing Ren and Aojun Zhou and Zimu Lu and Sichun Luo and Weikang Shi and Renrui Zhang and Linqi Song and Mingjie Zhan and Hongsheng Li},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=z8TW0ttBPp}
}