license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen2.5-VL-3B-Instruct
pipeline_tag: image-text-to-text
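TBAC-VLR1-3B-preview
Overview
This is a multimodal language model fine-tuned by the Tencent PCG Basic Algorithm Center. Built on Qwen2.5-VL-3B-Instruct, TBAC-VLR1-3B-preview uses Group Relative Policy Optimization (GRPO) to strengthen multimodal reasoning, and it achieves state-of-the-art performance on several multimodal reasoning benchmarks among models of the same size.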
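The "group relative" part of GRPO can be illustrated with a small sketch. This is a generic illustration of the advantage normalization commonly described for GRPO, not the training code used for this model; the function name, the reward values, and the epsilon term are assumptions made for the example.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Score each sampled response relative to the other responses for the same prompt."""
    mu = mean(rewards)
    # Population standard deviation; some implementations use the sample std instead.
    sigma = pstdev(rewards)
    # eps (an assumed constant) avoids division by zero when all rewards are equal.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one question
# (1.0 = correct final answer, 0.0 = incorrect).
rewards = [1.0, 0.0, 0.0, 1.0]
print(group_relative_advantages(rewards))
```

Responses that score above the group mean receive a positive advantage and those below receive a negative one, so no separate value model is needed to provide a baseline.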
Performance
| Model | Average | MathVista | MathVision | MathVerse | DynaMath | WeMath | LogicVista |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Qwen2-VL-2B | 20.5 | 48.0 | 16.1 | 17.5 | 3.8 | 10.8 | 26.6 |
| InternVL2.5-2B | 21.2 | 51.1 | 14.0 | 22.3 | 4.4 | 8.0 | 27.3 |
| InternVL3-2B | 29.1 | 57.6 | 20.2 | 24.5 | 14.8 | 22.9 | 40.3 |
| Qwen2.5-VL-3B | 31.8 | 61.2 | 21.9 | 31.2 | 13.2 | 22.9 | 40.3 |
| VLM-R1-3B-Math-0305 | 33.4 | 62.7 | 21.9 | 32.2 | 13.0 | 30.0 | 40.5 |
| Taichu-VLR-3B | 33.6 | 64.9 | 23.1 | 32.1 | 12.6 | 30.4 | 38.7 |
| VLAA-Thinker-Qwen2.5VL-3B | 35.4 | 61.0 | 24.4 | 36.4 | 18.2 | 33.8 | 38.5 |
| TBAC-VLR1-3B-preview | 35.7 | 64.8 | 25.0 | 33.2 | 17.7 | 32.4 | 40.8 |

Results for the compared models are taken from https://opencompass.org.cn.
Results for our model are self-reported, obtained by running each benchmark offline.
Usage
from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info

# Placeholders: replace with your own image and question
image_path = "path/to/image.jpg"
question = "Your question here"

# Load the model and its processor
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "TencentBAC/TBAC-VLR1-3B-preview", torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained("TencentBAC/TBAC-VLR1-3B-preview")

# The system prompt asks the model to reason first, then box the final answer
messages = [
    {
        "role": "system",
        "content": "You are a helpful assistant. The user asks a question, and you first think about the reasoning process in your mind, then provide the user with the answer wrapped in \\boxed{} tags, i.e., reasoning process here \\boxed{ answer here }.",
    },
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": image_path,
            },
            {"type": "text", "text": question},
        ],
    },
]

# Build the chat prompt and collect the vision inputs
text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text],
    images=image_inputs,
    videos=video_inputs,
    padding=True,
    return_tensors="pt",
)
# Move the inputs to the device the model was placed on
inputs = inputs.to(model.device)

# Generate greedily, then strip the prompt tokens from each output
generated_ids = model.generate(**inputs, max_new_tokens=128, do_sample=False)
generated_ids_trimmed = [
    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)
print(output_text)
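Because the system prompt asks for the final answer inside \boxed{}, it can be convenient to pull that answer out of the decoded text. The helper below is a minimal sketch, not part of the model's released code; the name extract_boxed and the choice to return the last boxed expression are assumptions.

```python
from typing import Optional


def extract_boxed(text: str) -> Optional[str]:
    r"""Return the contents of the last \boxed{...} in text, or None if absent."""
    marker = r"\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    content_start = start + len(marker)
    depth = 1  # we are already inside the opening brace of \boxed{
    for i in range(content_start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                # Matching close brace found; nested braces are preserved
                return text[content_start:i].strip()
    return None  # braces never balanced, e.g. the output was truncated


# Hypothetical model output used only for illustration
sample = r"The ratio simplifies to three halves. \boxed{ \frac{3}{2} }"
print(extract_boxed(sample))  # → \frac{3}{2}
```

With the usage code above, the same helper could be applied to each decoded string, for example extract_boxed(output_text[0]). A None result may indicate that generation stopped before the answer was closed, in which case a larger max_new_tokens may help.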
Citation
If you use this model in your research, please consider giving it a ❤️ and citing it. Thanks!
@misc{Xu2025tbacvlr1,
title={TBAC-VLR1-3B-preview},
author={Junzhe Xu and Yuyang Yin},
url={https://huggingface.co/TencentBAC/TBAC-VLR1-3B-preview},
year={2025},
}
About
Created by the Tencent PCG Basic Algorithm Center. All rights reserved.