Open-source sam2-hiera-tiny model: an efficient tool for promptable visual segmentation in images and videos


Sam2 Hiera Tiny

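Developed by facebook

SAM 2 is a foundation model from FAIR for promptable visual segmentation in images and videos.

Image segmentation | License: Apache-2.0 | #promptable-segmentation #video-object-tracking #zero-shot-learning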

Downloads: 41.88k

Released: 8/2/2024

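Model overview

SAM 2 is a visual segmentation model that takes user-provided prompts, such as points or boxes, and generates segmentation masks for the indicated objects in images and videos.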

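Model features

Multiple prompt types

Supports interactive segmentation driven by points, boxes, and other prompts

Images and videos in one model

The same model architecture handles both image and video segmentation

Efficient inference

Supports bfloat16 precision and CUDA acceleration for fast inference

Real-time propagation

During video processing, prompts are propagated across frames in real time to track objects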
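The point and box prompts listed among the features above are passed to the predictor as pixel coordinates. The following is a minimal sketch of assembling them, using plain Python lists in place of NumPy arrays. The helper name `build_prompts` is hypothetical; the keys `point_coords`, `point_labels`, and `box` are assumed to match the keyword arguments of `SAM2ImagePredictor.predict` in the official repository, with label 1 meaning foreground and 0 meaning background.

```python
def build_prompts(points=None, labels=None, box=None):
    """Assemble keyword arguments for a promptable-segmentation call.

    points: list of (x, y) pixel coordinates
    labels: list of 1 (foreground) or 0 (background), one per point
    box:    (x_min, y_min, x_max, y_max) in pixels
    """
    prompts = {}
    if points is not None:
        if labels is None or len(labels) != len(points):
            raise ValueError("each point needs exactly one label")
        if any(label not in (0, 1) for label in labels):
            raise ValueError("labels must be 0 (background) or 1 (foreground)")
        prompts["point_coords"] = [[float(x), float(y)] for x, y in points]
        prompts["point_labels"] = list(labels)
    if box is not None:
        x0, y0, x1, y1 = box
        if x1 <= x0 or y1 <= y0:
            raise ValueError("box must satisfy x_min < x_max and y_min < y_max")
        prompts["box"] = [float(x0), float(y0), float(x1), float(y1)]
    if not prompts:
        raise ValueError("provide at least one point or a box")
    return prompts


# One foreground click, one background click, and a rough bounding box.
prompts = build_prompts(
    points=[(120, 80), (30, 200)],
    labels=[1, 0],
    box=(10, 20, 300, 240),
)
```

In real use the lists would typically be converted with `numpy.array` before being unpacked into the predictor call; that step is left out here so the sketch has no dependencies.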

Model capabilities

Image segmentation

Video object segmentation

Interactive segmentation

Mask generation

Use cases

Computer vision

Image editing

Quickly isolate objects in an image for editing

High-quality object segmentation masks
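Once a mask is available, isolating an object for editing is a per-pixel selection. The sketch below illustrates the idea on a tiny hand-written image; the function name `cut_out` and the nested-list image format are assumptions made for illustration, not part of SAM 2.

```python
def cut_out(image, mask, background=(0, 0, 0)):
    """Keep pixels where the mask is truthy; replace the rest with `background`."""
    if len(image) != len(mask) or any(
        len(img_row) != len(mask_row) for img_row, mask_row in zip(image, mask)
    ):
        raise ValueError("image and mask must have the same height and width")
    return [
        [pixel if keep else background for pixel, keep in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


# A 2x2 RGB image and a binary mask selecting the diagonal pixels.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (9, 9, 9)],
]
mask = [
    [1, 0],
    [0, 1],
]
result = cut_out(image, mask)
```

With array libraries the same operation is usually a single masked assignment; the loop form is shown only to make the logic explicit.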

Video analysis

Track specific objects in a video

Object segmentation that stays consistent across frames
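One common way to check how consistent an object's masks are between neighbouring frames is intersection over union (IoU). This is a generic metric rather than something the source describes SAM 2 as reporting; the function name `mask_iou` and the toy masks are illustrative assumptions.

```python
def mask_iou(mask_a, mask_b):
    """Intersection over union of two binary masks given as nested lists."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1
    # Two empty masks are treated as identical.
    return intersection / union if union else 1.0


# Masks of the same object in two consecutive frames (toy data).
frame_0 = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]
frame_1 = [
    [0, 0, 1],
    [0, 1, 1],
    [0, 0, 1],
]
iou = mask_iou(frame_0, frame_1)  # 3 shared pixels out of 5 covered in total
```

A sharp drop in IoU between adjacent frames can flag a frame where an extra corrective prompt might help.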

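Augmented reality

AR content overlay

Segment objects in real-world scenes in real time

Provides precise object boundaries for AR applications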

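🚀 SAM 2: Segment Anything in Images and Videos

SAM 2 is a foundation model developed by FAIR for promptable visual segmentation in images and videos. For more information, see the SAM 2 paper.

The official code is publicly released in the project repository.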

🚀 Quick start

✨ Key features

Supports promptable visual segmentation for both images and videos.
Provides pretrained models for quick use.

📦 Installation

This page does not give specific installation steps; follow the instructions in the official repository.

💻 Usage examples

Basic usage

Image prediction

import torch
from sam2.sam2_image_predictor import SAM2ImagePredictor

predictor = SAM2ImagePredictor.from_pretrained("facebook/sam2-hiera-tiny")

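# Disable autograd and run under bfloat16 autocast on CUDA
with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    # Replace <your_image> with your input image; its embedding is computed once here
    predictor.set_image(<your_image>)
    # Replace <input_prompts> with point and/or box prompts; only the masks are kept
    masks, _, _ = predictor.predict(<input_prompts>)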
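The image prediction call above keeps only the first of three return values. Assuming, as in the official repository, that the second value holds one predicted quality score per candidate mask, the best candidate can be chosen as sketched below. The helper `pick_best_mask` is hypothetical, and the strings stand in for real mask arrays.

```python
def pick_best_mask(masks, scores):
    """Return (index, mask) of the candidate with the highest score."""
    if len(masks) == 0 or len(masks) != len(scores):
        raise ValueError("masks and scores must be non-empty and equally long")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return best_index, masks[best_index]


# Stand-ins for three candidate masks and their predicted quality scores.
masks = ["mask_a", "mask_b", "mask_c"]
scores = [0.71, 0.93, 0.88]
best_index, best_mask = pick_best_mask(masks, scores)
```

Keeping all candidates and letting the user choose is an equally valid design when a prompt is ambiguous, for example a single click that could mean a part or the whole object.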

Video prediction

import torch
from sam2.sam2_video_predictor import SAM2VideoPredictor

predictor = SAM2VideoPredictor.from_pretrained("facebook/sam2-hiera-tiny")

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    state = predictor.init_state(<your_video>)

    # add new prompts and instantly get the output on the same frame
    frame_idx, object_ids, masks = predictor.add_new_points_or_box(state, <your_prompts>)

    # propagate the prompts to get masklets throughout the video
    for frame_idx, object_ids, masks in predictor.propagate_in_video(state):
        ...
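The body of the propagation loop is left open above. One plausible way to fill it is to collect a binary mask per object for every frame, as sketched here. This assumes each yielded mask is a 2D map of logits where values above 0 mean foreground; the function `collect_segments` and the `fake_outputs` list, which stands in for `predictor.propagate_in_video(state)`, are hypothetical.

```python
def collect_segments(outputs, threshold=0.0):
    """Build {frame_idx: {object_id: binary_mask}} from propagation outputs."""
    segments = {}
    for frame_idx, object_ids, masks in outputs:
        segments[frame_idx] = {
            obj_id: [[value > threshold for value in row] for row in mask]
            for obj_id, mask in zip(object_ids, masks)
        }
    return segments


# Stand-in for predictor.propagate_in_video(state): two frames, one object,
# each mask a 2x2 grid of logits (assumed format).
fake_outputs = [
    (0, [1], [[[2.5, -1.0], [0.3, -4.2]]]),
    (1, [1], [[[1.1, 0.7], [-0.2, -3.0]]]),
]
segments = collect_segments(fake_outputs)
```

Storing results keyed by frame index makes it straightforward to render or export masks after propagation has finished.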

For more details, see the demo notebooks.

Citation

To cite the paper, model, or software, use the following BibTeX:

@article{ravi2024sam2,
  title={SAM 2: Segment Anything in Images and Videos},
  author={Ravi, Nikhila and Gabeur, Valentin and Hu, Yuan-Ting and Hu, Ronghang and Ryali, Chaitanya and Ma, Tengyu and Khedr, Haitham and R{\"a}dle, Roman and Rolland, Chloe and Gustafson, Laura and Mintun, Eric and Pan, Junting and Alwala, Kalyan Vasudev and Carion, Nicolas and Wu, Chao-Yuan and Girshick, Ross and Doll{\'a}r, Piotr and Feichtenhofer, Christoph},
  journal={arXiv preprint arXiv:2408.00714},
  url={https://arxiv.org/abs/2408.00714},
  year={2024}
}