license: other
pipeline_tag: visual-question-answering
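书生·浦语 (InternLM) 2.5 Multimodal Chat Model
InternLM-XComposer2.5-Chat is a chat model trained on top of internlm/internlm-xcomposer2d5-7b, with notable improvements in multimodal instruction following and open-ended dialogue.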
Usage with Transformers
Load the model with Transformers:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
ckpt_path = "internlm/internlm-xcomposer2d5-7b-chat"
tokenizer = AutoTokenizer.from_pretrained(ckpt_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(ckpt_path, torch_dtype=torch.bfloat16, trust_remote_code=True).cuda()
model = model.eval()
Quickstart
The following examples show how to use the InternLM-XComposer2.5 model with 🤗 Transformers.
Video Understanding
import torch
from transformers import AutoModel, AutoTokenizer
torch.set_grad_enabled(False)
model = AutoModel.from_pretrained('internlm/internlm-xcomposer2d5-7b-chat', torch_dtype=torch.bfloat16, trust_remote_code=True).cuda().eval()
tokenizer = AutoTokenizer.from_pretrained('internlm/internlm-xcomposer2d5-7b-chat', trust_remote_code=True)
model.tokenizer = tokenizer
# First turn: keep the returned history for the follow-up question
query = 'Here are some frames of a video. Describe this video in detail'
image = ['./examples/liuxiang.mp4',]
with torch.autocast(device_type='cuda', dtype=torch.float16):
    response, his = model.chat(tokenizer, query, image, do_sample=False, num_beams=3, use_meta=True)
print(response)
# Second turn: pass the history from the first turn
query = 'Tell me the athlete number of Liu Xiang'
image = ['./examples/liuxiang.mp4',]
with torch.autocast(device_type='cuda', dtype=torch.float16):
    response, _ = model.chat(tokenizer, query, image, history=his, do_sample=False, num_beams=3, use_meta=True)
print(response)
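The two-turn pattern above (call `model.chat`, keep the returned history, pass it back via `history=`) can be wrapped in a small loop. The following is a minimal sketch, not part of the model's API: `run_dialogue` is a hypothetical helper, and it only assumes that the chat callable returns a `(response, history)` pair and accepts a `history` keyword, as in the calls shown above.

```python
def run_dialogue(chat_fn, tokenizer, queries, image, **gen_kwargs):
    """Ask `queries` in order, threading the history between turns.

    `chat_fn` is assumed to behave like `model.chat` in the example above:
    it returns (response, history) and accepts a `history` keyword.
    """
    history = None
    responses = []
    for query in queries:
        kwargs = dict(gen_kwargs)
        if history is not None:
            # Only follow-up turns pass history, mirroring the example above.
            kwargs["history"] = history
        response, history = chat_fn(tokenizer, query, image, **kwargs)
        responses.append(response)
    return responses, history
```

With the model loaded as above, a call might look like `run_dialogue(model.chat, tokenizer, [first_query, second_query], image, do_sample=False, num_beams=3, use_meta=True)`, wrapped in the same `torch.autocast` context.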
Multi-Image Multi-Turn Dialogue
import torch
from transformers import AutoModel, AutoTokenizer
torch.set_grad_enabled(False)
model = AutoModel.from_pretrained('internlm/internlm-xcomposer2d5-7b-chat', torch_dtype=torch.bfloat16, trust_remote_code=True).cuda().eval()
tokenizer = AutoTokenizer.from_pretrained('internlm/internlm-xcomposer2d5-7b-chat', trust_remote_code=True)
model.tokenizer = tokenizer
# First turn: one <ImageHere> placeholder per entry in the image list
query = 'Image1 <ImageHere>; Image2 <ImageHere>; Image3 <ImageHere>; I want to buy one of these three cars. Analyze their strengths and weaknesses one by one'
image = ['./examples/cars1.jpg',
         './examples/cars2.jpg',
         './examples/cars3.jpg',]
with torch.autocast(device_type='cuda', dtype=torch.float16):
    response, his = model.chat(tokenizer, query, image, do_sample=False, num_beams=3, use_meta=True)
print(response)
# Second turn: append the new image and pass the history from the first turn
query = 'Image4 <ImageHere>; Analyze the car in Image4'
image.append('./examples/cars4.jpg')
with torch.autocast(device_type='cuda', dtype=torch.float16):
    response, _ = model.chat(tokenizer, query, image, do_sample=False, num_beams=3, history=his, use_meta=True)
print(response)
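The queries above prefix the instruction with one numbered `<ImageHere>` placeholder per image. A minimal sketch of building such a query string programmatically follows; `build_multi_image_query` and its parameters are hypothetical names introduced for illustration, and the label text is a free choice rather than something the model is known to require.

```python
def build_multi_image_query(instruction, num_images, start_index=1, label="Image"):
    """Prefix `instruction` with numbered <ImageHere> placeholders.

    Produces the same layout as the example queries above, e.g.
    "Image1 <ImageHere>; Image2 <ImageHere>; <instruction>".
    """
    if num_images < 1:
        raise ValueError("num_images must be at least 1")
    indices = range(start_index, start_index + num_images)
    placeholders = [f"{label}{i} <ImageHere>;" for i in indices]
    return " ".join(placeholders) + " " + instruction
```

For a follow-up turn that adds a fourth image, `build_multi_image_query('Analyze the car in Image4', 1, start_index=4)` yields a query with a single placeholder numbered 4, matching the second turn above.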
High-Resolution Image Understanding
import torch
from transformers import AutoModel, AutoTokenizer
torch.set_grad_enabled(False)
model = AutoModel.from_pretrained('internlm/internlm-xcomposer2d5-7b-chat', torch_dtype=torch.bfloat16, trust_remote_code=True).cuda().eval()
tokenizer = AutoTokenizer.from_pretrained('internlm/internlm-xcomposer2d5-7b-chat', trust_remote_code=True)
model.tokenizer = tokenizer
query = 'Analyze the given infographic in detail'
image = ['./examples/dubai.png']
with torch.autocast(device_type='cuda', dtype=torch.float16):
    response, _ = model.chat(tokenizer, query, image, do_sample=False, num_beams=3, use_meta=True)
print(response)
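The examples in this section reference files under `./examples/`, which exist only if the example assets have been downloaded. As an optional pre-flight step, a small check like the one below can report missing inputs before the model is loaded. This is a sketch using only the standard library; `missing_inputs` is a hypothetical helper, not part of the model's code.

```python
from pathlib import Path


def missing_inputs(paths):
    """Return the entries of `paths` that do not point to an existing file."""
    return [p for p in paths if not Path(p).is_file()]
```

For instance, calling `missing_inputs(image)` before `model.chat` and stopping when the result is non-empty avoids spending time on model loading for a path that is simply wrong.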
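Open Source License
The code is licensed under Apache-2.0. The model weights are fully open for academic research and also allow free commercial use. To apply for a commercial license, fill in the [application form (English)]/[application form (Chinese)]. For other questions or collaborations, contact internlm@pjlab.org.cn.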