license: apache-2.0
library_name: timm
tags:
Model card for samvit_base_patch16.sa1b
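A Segment-Anything Vision Transformer (SAM ViT) image feature model. Note: it supports feature extraction and fine-tuning only; the segmentation head is not included. The paper authors pretrained it on the SA-1B dataset for segmentation, initializing from MAE weights.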
Model Details
- Model Type: Image classification / feature backbone
- Model Stats:
  - Params (M): 89.7
  - GMACs: 486.4
  - Activations (M): 1343.3
  - Image size: 1024 x 1024
- Papers:
  - Segment Anything: https://arxiv.org/abs/2304.02643
  - An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale: https://arxiv.org/abs/2010.11929v2
- Original: https://github.com/facebookresearch/segment-anything
- Pretrain Dataset: SA-1B
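The image size listed above, combined with the patch size implied by the `patch16` part of the model name, fixes how many patch tokens the backbone processes per image. The sketch below works through that arithmetic. Note that `patch_grid` is a hypothetical helper written for illustration, not part of timm, and the patch size of 16 is an assumption read off the model name.

```python
from typing import Tuple


def patch_grid(image_size: int, patch_size: int) -> Tuple[int, int]:
    """Return (patches per side, total patches) for a square image split into square patches.

    Hypothetical helper for illustration only; it is not a timm function.
    """
    if image_size <= 0 or patch_size <= 0:
        raise ValueError("image_size and patch_size must be positive")
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side, per_side * per_side


# 1024 is the image size from the model stats; 16 is assumed from 'patch16' in the model name.
per_side, total = patch_grid(1024, 16)
print(per_side, total)  # 64 4096
```

A 64 x 64 grid (4096 patches) is far larger than the 14 x 14 grid of a 224 x 224 ViT, which is consistent with the high GMACs and activation counts listed above.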
Model Usage
Image Classification
from urllib.request import urlopen
from PIL import Image
import timm
import torch  # required for torch.topk below

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model('samvit_base_patch16.sa1b', pretrained=True)
model = model.eval()

# get model-specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # unsqueeze the single image into a batch of 1

top5_probabilities, top5_class_indices = torch.topk(output.softmax(dim=1) * 100, k=5)
Image Embeddings
from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'samvit_base_patch16.sa1b',
    pretrained=True,
    num_classes=0,  # remove the classifier nn.Linear
)
model = model.eval()

# get model-specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

# pooled embedding, shape (batch_size, num_features)
output = model(transforms(img).unsqueeze(0))

# or equivalently, in two steps (without needing to set num_classes=0):
output = model.forward_features(transforms(img).unsqueeze(0))  # unpooled feature map
output = model.forward_head(output, pre_logits=True)  # pooled embedding
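
A common use of pooled embeddings like the one produced above is comparing two images by cosine similarity. The following is a minimal, dependency-free sketch of that comparison. It assumes each embedding has been converted to a plain list of floats (for example with `output[0].tolist()`), the vectors shown are made-up toy values rather than real model outputs, and `cosine_similarity` here is an illustrative helper, not a timm function.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy stand-ins for embeddings; real ones would have num_features entries.
emb_a = [1.0, 0.0, 2.0]
emb_b = [2.0, 0.0, 4.0]  # same direction as emb_a
emb_c = [0.0, 3.0, 0.0]  # orthogonal to emb_a

print(cosine_similarity(emb_a, emb_b))  # approximately 1.0 (parallel vectors)
print(cosine_similarity(emb_a, emb_c))  # 0.0 (orthogonal vectors)
```

When working directly with tensors, `torch.nn.functional.cosine_similarity` computes the same quantity without leaving PyTorch.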
Model Comparison
Explore the dataset and runtime metrics of this model in the timm model results.
Citation
@article{kirillov2023segany,
  title={Segment Anything},
  author={Kirillov, Alexander and others},
  journal={arXiv:2304.02643},
  year={2023}
}
@article{dosovitskiy2020vit,
  title={An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},
  author={Dosovitskiy, Alexey and others},
  journal={ICLR},
  year={2021}
}
@misc{rw2019timm,
  author = {Ross Wightman},
  title = {PyTorch Image Models},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  doi = {10.5281/zenodo.4414861},
  howpublished = {\url{https://github.com/huggingface/pytorch-image-models}}
}