🚀 MathCoder-VL:连接视觉与代码,提升多模态数学推理能力
MathCoder-VL 是一系列专门为解决通用数学问题而设计的开源大型多模态模型(LMMs)。同时,还推出了图像转代码模型 FigCodifier-8B。
仓库链接
论文链接
🚀 快速开始
模型信息
属性 |
详情 |
模型类型 |
image-text-to-text |
评估指标 |
accuracy |
标签 |
mathematics、reasoning、multi-modal-qa、math-qa、figure-qa、geometry-qa、math-word-problem、textbook-qa、vqa、geometry-diagram、synthetic-scene、chart、plot、scientific-figure、table、function-plot、abstract-scene、puzzle-test、document-image、science |
库名称 |
transformers |
基础模型 |
OpenGVLab/InternVL2-8B |
数据集 |
MathLLMs/MM-MathInstruct |
许可证 |
apache-2.0 |
模型对比
使用示例
训练和推理代码请参考 InternVL。
基础用法
from datasets import load_dataset
from PIL import Image
from io import BytesIO
mm_mathinstruct = load_dataset("MathLLMs/MM-MathInstruct")
print(mm_mathinstruct)
img = Image.open(BytesIO(mm_mathinstruct['train'][-1]['image']))
img.show()
运行上述代码后,应该会输出:
DatasetDict({
train: Dataset({
features: ['id', 'image', 'question', 'solution', 'image_path'],
num_rows: 2871988
})
})
📚 详细文档
动机
FigCodifier 的构建
MathCoder-VL 的构建
性能表现
📄 许可证
本项目采用 apache-2.0 许可证。
📖 引用
如果您使用了我们的数据、模型或代码,请引用以下论文:
@inproceedings{
wang2025mathcodervl,
title={MathCoder-{VL}: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning},
author={Ke Wang and Junting Pan and Linda Wei and Aojun Zhou and Weikang Shi and Zimu Lu and Han Xiao and Yunqiao Yang and Houxing Ren and Mingjie Zhan and Hongsheng Li},
booktitle={The 63rd Annual Meeting of the Association for Computational Linguistics},
year={2025},
url={https://openreview.net/forum?id=nuvtX1imAb}
}
@inproceedings{
lu2025mathcoder2,
title={MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code},
author={Zimu Lu and Aojun Zhou and Ke Wang and Houxing Ren and Weikang Shi and Junting Pan and Mingjie Zhan and Hongsheng Li},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://openreview.net/forum?id=1Iuw1jcIrf}
}
@inproceedings{
wang2024mathcoder,
title={MathCoder: Seamless Code Integration in {LLM}s for Enhanced Mathematical Reasoning},
author={Ke Wang and Houxing Ren and Aojun Zhou and Zimu Lu and Sichun Luo and Weikang Shi and Renrui Zhang and Linqi Song and Mingjie Zhan and Hongsheng Li},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=z8TW0ttBPp}
}