Llama-3.2-11B-Vision-Radiology-mini Open-Source Model - Radiology Image Interpretation Support with a Reported 2x Speedup

Llama 3.2 11B Vision Radiology Mini

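Developed by 0llheaven

A radiology image interpretation assistant fine-tuned from unsloth/Llama-3.2-11B-Vision-Instruct and optimized for a reported 2x speedup

Image-to-Text

Transformers

Language: English | License: Apache-2.0 | #radiology-assisted-diagnosis #efficient-fine-tuning #multimodal-medical-analysis

Downloads: 885

Released: 12/3/2024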

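Model Summary

The model is designed to assist in interpreting radiology images such as X-rays, CT scans, and MRI, and can offer preliminary disease identification to support medical professionals.

Model Features

Efficient fine-tuning

Optimized with Unsloth, with a reported 2x speedup over the previous version

Medical focus

Tuned specifically for radiology image interpretation, covering X-ray, CT, MRI, and other image types

Lightweight dataset

Uses the reduced Radiology_mini dataset (0.33% of the original dataset)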

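Model Capabilities

Radiology image analysis

Medical abnormality detection

Image feature description generation

Multimodal understanding (image + text)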

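Use Cases

Diagnostic assistance

X-ray interpretation

Automatically identifies signs of pneumonia on chest X-rays

Can provide preliminary localization and description of abnormal regions

CT scan analysis

Detects hemorrhage or tumor lesions on brain CT

Generates a structured report skeleton for physician review

Medical education

Imaging teaching aid

Gives medical students real-time feedback on image interpretation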

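🚀 Uploaded Model

This model was developed by Anukul and fine-tuned from unsloth/Llama-3.2-11B-Vision-Instruct on the unsloth/Radiology_mini dataset. It is intended to assist in interpreting radiology images such as X-rays, CT scans, and MRI, and can offer preliminary disease identification to support medical professionals.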

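🚀 Quick Start

Model Overview

The model is a fine-tune of unsloth/Llama-3.2-11B-Vision-Instruct for radiology image captioning: given an X-ray, CT, or MRI image, it generates a description and can offer preliminary disease identification to support medical professionals. According to the model card, Unsloth optimization made it about twice as fast as the previous version, which supports efficient fine-tuning.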

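Dataset Description

The project uses unsloth/Radiology_mini, a small dataset extracted from the ROCOv2-radiology dataset. It includes training and test splits and amounts to 0.33% of the original ROCOv2-radiology dataset on Hugging Face.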
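
Fine-tuning a vision-language model for captioning usually means turning each dataset record into a chat-style exchange, with the image and an instruction as the user turn and the caption as the assistant turn. The sketch below illustrates that shape only. It assumes each record exposes an "image" field and a "caption" field, and the helper name `convert_to_conversation` and the stand-in record are hypothetical; none of this is confirmed by the model card.

```python
from typing import Any, Dict

# Instruction text reused from the usage example on this page.
INSTRUCTION = (
    "You are an expert radiographer. "
    "Describe accurately what you see in this image."
)


def convert_to_conversation(sample: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap one record as a user/assistant exchange (hypothetical helper).

    Assumes the record has "image" and "caption" keys.
    """
    return {
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": INSTRUCTION},
                    {"type": "image", "image": sample["image"]},
                ],
            },
            {
                "role": "assistant",
                "content": [{"type": "text", "text": sample["caption"]}],
            },
        ]
    }


# Stand-in record for illustration; a real record would hold a PIL image.
example = {"image": "<image placeholder>", "caption": "placeholder caption text"}
conversation = convert_to_conversation(example)
print(conversation["messages"][1]["content"][0]["text"])
```

Keeping the conversion in a small pure function makes it easy to check the message layout on a dummy record before applying it to the real training split.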

License

This model is released under the Apache-2.0 license.

Tag Information

Attribute	Details
Tags	text-generation-inference, transformers, unsloth, mllama
Model type	Radiology image interpretation model fine-tuned from `unsloth/Llama-3.2-11B-Vision-Instruct`
Training data	unsloth/Radiology_mini
Base model	unsloth/Llama-3.2-11B-Vision-Instruct
Library	transformers

💻 Usage Example

Basic Usage

from unsloth import FastVisionModel
from PIL import Image
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load model and tokenizer
model, tokenizer = FastVisionModel.from_pretrained(
    "0llheaven/Llama-3.2-11B-Vision-Radiology-mini",
    load_in_4bit=True,
    use_gradient_checkpointing="unsloth",
)
FastVisionModel.for_inference(model)

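# The 4-bit model is already placed on the GPU when it is loaded. Calling
# model.to(device) on a bitsandbytes-quantized model can raise an error in
# some transformers versions, so only the inputs are moved to `device` below.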

def predict_radiology_description(image, instruction):
    try:
        messages = [{"role": "user", "content": [
            {"type": "image"},
            {"type": "text", "text": instruction}
        ]}]
        input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

        inputs = tokenizer(
            image,
            input_text,
            add_special_tokens=False,
            return_tensors="pt",
        ).to(device)

        output_ids = model.generate(
            **inputs,
            max_new_tokens=256,
            temperature=1.5,
            min_p=0.1
        )

        generated_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
        return generated_text.replace("assistant", "\n\nassistant").strip()
    except Exception as e:
        return f"Error: {str(e)}"

# Example usage
image_path = 'example_image.jpeg'
instruction = 'You are an expert radiographer. Describe accurately what you see in this image.'

image = Image.open(image_path).convert("RGB")
output = predict_radiology_description(image, instruction)
print(output)
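
The function above returns the whole decoded sequence, so the prompt is printed along with the answer. If only the generated description is needed, the text can be split at the role marker. The sketch below assumes the decoded string contains the literal word "assistant" between prompt and reply, as the `replace("assistant", ...)` call above implies. The helper name `extract_assistant_reply` and the sample string are hypothetical.

```python
def extract_assistant_reply(decoded: str, marker: str = "assistant") -> str:
    """Return the text after the first role marker (hypothetical helper).

    Falls back to the full text when the marker is absent. Splitting on the
    first occurrence assumes the instruction itself does not contain the
    marker word.
    """
    _, found, reply = decoded.partition(marker)
    return reply.strip() if found else decoded.strip()


# Made-up decoded string for illustration, not real model output.
sample_decoded = "user\n\nDescribe this image.assistant\n\nPlaceholder description."
print(extract_assistant_reply(sample_decoded))
```

Splitting on the first marker keeps a reply intact even if the reply mentions the word "assistant" later on. The trade-off is that the instruction must not contain that word.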