🚀 Bespoke-MiniChart-7B
Bespoke-MiniChart-7B is an open-source vision-language model (VLM) for chart understanding, developed by Bespoke Labs and maintained by Liyan Tang and Bespoke Labs. It sets a new state of the art in chart question answering (Chart-QA) among 7B-parameter models, outperforming much larger closed-source models such as Gemini-1.5-Pro and Claude-3.5 across seven public benchmarks.
🚀 Quick Start
- Blog post: https://www.bespokelabs.ai/blog/bespoke-minichart-7b
- Live demo: https://playground.bespokelabs.ai/minichart
✨ Key Features
Outstanding Performance
Bespoke-MiniChart-7B achieves state-of-the-art chart-understanding performance among models of its size, and even surpasses closed-source models such as Gemini-1.5-Pro and Claude-3.5.
Multimodal Capability
The following examples show how Bespoke-MiniChart-7B combines visual perception with textual reasoning.
Model Performance Comparison
Bespoke-MiniChart-7B excels at chart understanding. We also compared the performance of a model fine-tuned with SFT + DPO against one fine-tuned with SFT alone.
In the table below, M1 and M2 are models fine-tuned with 270K and 1M SFT examples respectively, while Bespoke-MiniChart-7B is fine-tuned with SFT + DPO.
💻 Usage Examples
Basic Usage
You can try the model in the live demo: https://playground.bespokelabs.ai/minichart

The following code example runs the model with vLLM:
```python
import base64
from io import BytesIO

import matplotlib.pyplot as plt
import requests
from PIL import Image
from vllm import LLM, SamplingParams

QA_PROMPT = """Please answer the question using the chart image.
Question: [QUESTION]
Please first generate your reasoning process and then provide the user with the answer. Use the following format:
<think>
... your thinking process here ...
</think>
<answer>
... your final answer (entity(s) or number) ...
</answer>"""

def get_image_from_url(image_url):
    """Download the chart image; return a PIL Image, or None on failure."""
    try:
        response = requests.get(image_url, stream=True)
        response.raise_for_status()
        return Image.open(BytesIO(response.content))
    except Exception as e:
        print(f"Error with image: {e}")
        return None

def get_answer(image_url, question, display=True):
    image = get_image_from_url(image_url)
    # Bail out before trying to display or encode a missing image.
    if not image:
        return "Error downloading image"
    if display:
        plt.figure(figsize=(10, 8))
        plt.imshow(image)
        plt.axis('off')
        plt.show()
    # Re-encode the image as base64 for the chat message payload.
    buffered = BytesIO()
    image.save(buffered, format=image.format or 'JPEG')
    encoded_image = base64.b64encode(buffered.getvalue()).decode('utf-8')
    messages = [{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{encoded_image}"}},
            {"type": "text", "text": QA_PROMPT.replace("[QUESTION]", question)},
        ],
    }]
    response = llm.chat([messages], sampling_params=SamplingParams(temperature=0, max_tokens=500))
    return response[0].outputs[0].text

llm = LLM(
    model="bespokelabs/Bespoke-MiniChart-7B",
    tokenizer_mode="auto",
    max_model_len=15000,
    tensor_parallel_size=1,
    gpu_memory_utilization=0.9,
    mm_processor_kwargs={"max_pixels": 1600 * 28 * 28},
    seed=2025,
    trust_remote_code=True,
)

image_url = "https://github.com/bespokelabsai/minichart-playground-examples/blob/main/images/ilyc9wk4jf8b1.png?raw=true"
question = "How many global regions maintained their startup funding losses below 30% in 2022?"
print("\n\n=================Model Output:===============\n\n", get_answer(image_url, question))
```
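As QA_PROMPT specifies, the model wraps its reasoning in `<think>` tags and its final answer in `<answer>` tags. A minimal sketch for pulling the final answer out of the raw output (the helper name `extract_answer` is our own, not part of the model's or vLLM's API):

```python
import re

def extract_answer(model_output: str) -> str:
    """Return the text inside <answer>...</answer>, or the raw output if the tags are absent."""
    match = re.search(r"<answer>\s*(.*?)\s*</answer>", model_output, re.DOTALL)
    return match.group(1) if match else model_output.strip()

raw = "<think>\nTwo regions stay under 30%.\n</think>\n<answer>\n2\n</answer>"
print(extract_answer(raw))  # → 2
```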
📄 License
This work is licensed under CC BY-NC 4.0. For commercial licensing, please contact company@bespokelabs.ai.
📚 Documentation
Citation
@misc{bespoke_minichart_7b,
  title = {Bespoke-MiniChart-7B: pushing the frontiers of open VLMs for chart understanding},
  author = {Liyan Tang and Shreyas Pimpalgaonkar and Kartik Sharma and Alexandros G. Dimakis and Mahesh Sathiamoorthy and Greg Durrett},
  howpublished = {blog post},
  year = {2025},
  url = {https://huggingface.co/bespokelabs/Bespoke-MiniChart-7B},
}
Acknowledgments
The Bespoke Labs team:
- Liyan Tang
- Shreyas Pimpalgaonkar
- Kartik Sharma
- Alex Dimakis
- Mahesh Sathiamoorthy
- Greg Durrett
Model perfected at Bespoke Labs, where careful curation meets cutting-edge modeling.