Open-Source Image Instance Segmentation Model: Mask2Former Identifies and Segments Individual Object Instances in Images

Finetune Instance Segmentation Ade20k Mini Mask2former No Trainer

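Developed by qubvel-hf

A Mask2Former instance segmentation model fine-tuned on the ADE20K-mini dataset. It identifies and segments the individual object instances in an image.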

Image Segmentation

Transformers

#instance-segmentation #small-image-processing #ADE20K-dataset

Downloads: 24

Released: 5/26/2024

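Model Overview

The model is based on Facebook's Mask2Former architecture and is fine-tuned from the facebook/mask2former-swin-tiny-coco-instance checkpoint. It targets instance segmentation: it identifies the individual object instances in an image and predicts a mask for each one.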

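Model Features

Efficient instance segmentation

Identifies and segments multiple object instances in a single image

Transformer-based architecture

Combines a Swin Transformer backbone with the Mask2Former architecture for strong feature extraction

Small input size support

Supports a 256x256-pixel input size, which suits resource-constrained environments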

Model Capabilities

Image segmentation

Object instance recognition

Pixel-level annotation

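Use Cases

Computer vision

Scene understanding

Analyzes the individual objects in a complex scene and their spatial relationships

Outputs a precise boundary and a class label for each object

Autonomous driving

Identifies key objects in road scenes, such as vehicles and pedestrians

Provides precise environment perception for autonomous driving systems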

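🚀 Instance Segmentation Example

This project provides an instance segmentation example. Using the model and script described here, it covers both training and inference and can run in a variety of environments.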

🚀 Quick Start

The project covers the training and inference workflow for instance segmentation. The steps are described below.

📦 Installation

First, configure your environment so that training can run smoothly.

Configure the environment

accelerate config

Answer the prompts about your training environment.

Test the environment

accelerate test

This command checks that everything is set up correctly before you start training.

Launch training

accelerate launch run_instance_segmentation_no_trainer.py \
    --model_name_or_path facebook/mask2former-swin-tiny-coco-instance \
    --output_dir finetune-instance-segmentation-ade20k-mini-mask2former-no-trainer \
    --dataset_name qubvel-hf/ade20k-mini \
    --do_reduce_labels \
    --image_height 256 \
    --image_width 256 \
    --num_train_epochs 40 \
    --learning_rate 1e-5 \
    --lr_scheduler_type constant \
    --per_device_train_batch_size 8 \
    --gradient_accumulation_steps 2 \
    --dataloader_num_workers 8 \
    --push_to_hub
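
The launch command combines a per-device batch size with gradient accumulation, so the number of samples behind each optimizer step is larger than 8. The following is a minimal sketch of that arithmetic; the helper name `effective_batch_size` and the 4-process setup are hypothetical, and the actual process count depends on your answers to `accelerate config`.

```python
def effective_batch_size(per_device_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_processes: int = 1) -> int:
    """Number of samples that contribute to one optimizer step."""
    return per_device_batch_size * gradient_accumulation_steps * num_processes


# Values from the launch command: --per_device_train_batch_size 8,
# --gradient_accumulation_steps 2.
print(effective_batch_size(8, 2))                   # single process: 16
print(effective_batch_size(8, 2, num_processes=4))  # hypothetical 4 processes: 64
```

If you change the number of devices, you may want to adjust one of the two flags so that this product stays comparable between runs.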

💻 Usage Examples

Basic usage

The following code loads the fine-tuned model and runs inference:

import torch
import requests
import matplotlib.pyplot as plt

from PIL import Image
from transformers import Mask2FormerForUniversalSegmentation, Mask2FormerImageProcessor

# Load the image
image = Image.open(requests.get("http://farm4.staticflickr.com/3017/3071497290_31f0393363_z.jpg", stream=True).raw)

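# Load the model and image processor (fall back to CPU when no GPU is available)
device = "cuda" if torch.cuda.is_available() else "cpu"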
checkpoint = "qubvel-hf/finetune-instance-segmentation-ade20k-mini-mask2former-no-trainer"

model = Mask2FormerForUniversalSegmentation.from_pretrained(checkpoint, device_map=device)
image_processor = Mask2FormerImageProcessor.from_pretrained(checkpoint)

# Run inference on the image
inputs = image_processor(images=[image], return_tensors="pt").to(device)
with torch.no_grad():
    outputs = model(**inputs)

# Post-process the outputs
# PIL's image.size is (width, height); target_sizes expects (height, width)
outputs = image_processor.post_process_instance_segmentation(outputs, target_sizes=[image.size[::-1]])

print("Mask shape: ", outputs[0]["segmentation"].shape)
print("Mask values: ", outputs[0]["segmentation"].unique())
for segment in outputs[0]["segments_info"]:
    print("Segment: ", segment)

Running the code above prints output like the following:

Mask shape:  torch.Size([427, 640])
Mask values:  tensor([-1.,  0.,  1.,  2.,  3.,  4.,  5.,  6.])
Segment:  {'id': 0, 'label_id': 0, 'was_fused': False, 'score': 0.946127}
Segment:  {'id': 1, 'label_id': 1, 'was_fused': False, 'score': 0.961582}
Segment:  {'id': 2, 'label_id': 1, 'was_fused': False, 'score': 0.968367}
Segment:  {'id': 3, 'label_id': 1, 'was_fused': False, 'score': 0.819527}
Segment:  {'id': 4, 'label_id': 1, 'was_fused': False, 'score': 0.655761}
Segment:  {'id': 5, 'label_id': 1, 'was_fused': False, 'score': 0.531299}
Segment:  {'id': 6, 'label_id': 1, 'was_fused': False, 'score': 0.929477}
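
The scores in this output vary widely (from about 0.53 to 0.97), so it can be useful to keep only the more confident segments before further processing. The sketch below uses the segment records printed above as sample data; the helper name `filter_segments` and the 0.8 cutoff are illustrative assumptions, not part of the original example.

```python
from collections import Counter

# Segment records copied from the sample output above.
segments_info = [
    {"id": 0, "label_id": 0, "was_fused": False, "score": 0.946127},
    {"id": 1, "label_id": 1, "was_fused": False, "score": 0.961582},
    {"id": 2, "label_id": 1, "was_fused": False, "score": 0.968367},
    {"id": 3, "label_id": 1, "was_fused": False, "score": 0.819527},
    {"id": 4, "label_id": 1, "was_fused": False, "score": 0.655761},
    {"id": 5, "label_id": 1, "was_fused": False, "score": 0.531299},
    {"id": 6, "label_id": 1, "was_fused": False, "score": 0.929477},
]


def filter_segments(segments, min_score):
    """Keep only the segments whose score is at least min_score."""
    return [s for s in segments if s["score"] >= min_score]


confident = filter_segments(segments_info, min_score=0.8)  # 0.8 is an arbitrary cutoff
print([s["id"] for s in confident])               # [0, 1, 2, 3, 6]
print(Counter(s["label_id"] for s in confident))  # Counter({1: 4, 0: 1})
```

With a loaded model, `model.config.id2label` should map each `label_id` to a class name. The `threshold` argument of `post_process_instance_segmentation` offers a similar cutoff at post-processing time.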

Advanced usage

Use the following code to visualize the inference results:

import numpy as np
import matplotlib.pyplot as plt

# Move the tensor to the CPU first; .numpy() fails on a CUDA tensor
segmentation = outputs[0]["segmentation"].cpu().numpy()

plt.figure(figsize=(10, 10))
plt.subplot(1, 2, 1)
plt.imshow(np.array(image))
plt.axis("off")
plt.subplot(1, 2, 2)
plt.imshow(segmentation)
plt.axis("off")
plt.show()
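
Beyond plotting, the segmentation map can be split into one binary mask per instance by comparing it against each segment id. The sketch below derives a bounding box and pixel area for every instance. It runs on a small synthetic array rather than real model output, the helper name `instance_boxes` is hypothetical, and it assumes that -1 marks pixels not assigned to any instance, as the sample mask values suggest.

```python
import numpy as np


def instance_boxes(segmentation: np.ndarray) -> dict:
    """Map each segment id to (x_min, y_min, x_max, y_max, area_in_pixels)."""
    boxes = {}
    for segment_id in np.unique(segmentation):
        if segment_id < 0:  # assumption: -1 means "no instance"
            continue
        # Binary mask of this instance, reduced to its pixel coordinates
        ys, xs = np.nonzero(segmentation == segment_id)
        boxes[int(segment_id)] = (
            int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()), int(ys.size)
        )
    return boxes


# Toy 4x5 map with two instances on an unassigned (-1) background
toy = np.full((4, 5), -1.0)
toy[0:2, 0:2] = 0
toy[2:4, 3:5] = 1
print(instance_boxes(toy))  # {0: (0, 0, 1, 1, 4), 1: (3, 2, 4, 3, 4)}
```

On real output, you would pass the `segmentation` array from the visualization code in place of `toy`.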