license: cc-by-nc-4.0
library_name: timm
tags:
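Model card for convnextv2_tiny.fcmae
A ConvNeXt-V2 self-supervised feature representation model, pretrained with a fully convolutional masked autoencoder (FCMAE) framework. This model ships without a pretrained head and is only suitable for fine-tuning or feature extraction.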
Model Details
- Model Type: Image classification / feature backbone
- Model Stats:
  - Params (M): 27.9
  - GMACs: 4.5
  - Activations (M): 13.4
  - Image size: 224 x 224
- Papers:
  - ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders: https://arxiv.org/abs/2301.00808
- Original: https://github.com/facebookresearch/ConvNeXt-V2
- Pretrain Dataset: ImageNet-1k
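As a rough sanity check on the statistics above, the parameter count translates directly into weight memory, and the GMACs figure into an approximate FLOP count. The sketch below assumes dense weight storage with no framework overhead, and the common convention that one multiply-accumulate counts as two floating-point operations. The helper names are illustrative and are not part of timm.

```python
# Figures copied from the model statistics listed above.
PARAMS_MILLIONS = 27.9
GMACS = 4.5


def weight_memory_mb(params_millions: float, bytes_per_param: int) -> float:
    """Approximate weight storage in megabytes (1 MB = 10**6 bytes)."""
    # (params_millions * 10**6 params) * bytes / 10**6 bytes-per-MB
    return params_millions * bytes_per_param


def approx_gflops(gmacs: float) -> float:
    """Approximate GFLOPs, assuming 1 MAC = 2 floating-point operations."""
    return 2.0 * gmacs


fp32_mb = weight_memory_mb(PARAMS_MILLIONS, 4)  # 4 bytes per float32 weight
fp16_mb = weight_memory_mb(PARAMS_MILLIONS, 2)  # 2 bytes per float16 weight

print(f"fp32 weights: ~{fp32_mb:.1f} MB")
print(f"fp16 weights: ~{fp16_mb:.1f} MB")
print(f"forward pass: ~{approx_gflops(GMACS):.1f} GFLOPs per 224 x 224 image")
```

These numbers cover weights only. Activations, optimizer state, and gradients add to the footprint during fine-tuning.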
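Model Usage
Image Classification
from urllib.request import urlopen
from PIL import Image
import timm
import torch  # required for torch.topk below
img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))
model = timm.create_model('convnextv2_tiny.fcmae', pretrained=True)
model = model.eval()
# get model-specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)
output = model(transforms(img).unsqueeze(0))  # unsqueeze single image into batch of 1
# NOTE: this checkpoint has no pretrained head, so these top-5 values are not
# meaningful class predictions until the model has been fine-tuned
top5_probabilities, top5_class_indices = torch.topk(output.softmax(dim=1) * 100, k=5)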
Feature Map Extraction
from urllib.request import urlopen
from PIL import Image
import timm
img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))
model = timm.create_model(
    'convnextv2_tiny.fcmae',
    pretrained=True,
    features_only=True,  # return intermediate feature maps instead of logits
)
model = model.eval()
# get model-specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)
output = model(transforms(img).unsqueeze(0))  # unsqueeze single image into batch of 1
for o in output:
    # print the shape of each feature map in the output list
    print(o.shape)
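To anticipate the shapes printed above, the spatial size of each feature map can be estimated from the input size and the per-stage downsampling factor. The sketch below assumes the four stages reduce resolution by 4, 8, 16, and 32, which is the usual ConvNeXt layout. `feature_map_sizes` is a hypothetical helper, not a timm function. A model created with `features_only=True` reports its actual factors through `model.feature_info.reduction()`, which is the safer source.

```python
from typing import List


def feature_map_sizes(image_size: int, reductions: List[int]) -> List[int]:
    """Spatial side length of each feature map for a square input image."""
    return [image_size // r for r in reductions]


# Assumption: the four stages downsample by 4, 8, 16 and 32.
ASSUMED_REDUCTIONS = [4, 8, 16, 32]

sizes = feature_map_sizes(224, ASSUMED_REDUCTIONS)
for reduction, side in zip(ASSUMED_REDUCTIONS, sizes):
    print(f"reduction {reduction:>2}: {side} x {side}")
```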
Image Embeddings
from urllib.request import urlopen
from PIL import Image
import timm
img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))
model = timm.create_model(
    'convnextv2_tiny.fcmae',
    pretrained=True,
    num_classes=0,  # remove classifier nn.Linear
)
model = model.eval()
# get model-specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)
output = model(transforms(img).unsqueeze(0))  # output is a (batch_size, num_features) shaped tensor
# or equivalently (without needing to set num_classes=0)
output = model.forward_features(transforms(img).unsqueeze(0))  # unpooled feature map
output = model.forward_head(output, pre_logits=True)  # (batch_size, num_features) shaped tensor
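One common way to use embeddings like these is to compare two images by cosine similarity. The plain-Python sketch below is only an illustration: the vectors are toy stand-ins rather than real model outputs, and `cosine_similarity` here is a hypothetical helper. With real tensors, `torch.nn.functional.cosine_similarity` computes the same quantity.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 4-dimensional stand-ins; a real embedding could be obtained
# from the code above with output[0].tolist().
emb_a = [0.2, -0.1, 0.7, 0.4]
emb_b = [0.1, -0.2, 0.6, 0.5]
print(f"cosine similarity: {cosine_similarity(emb_a, emb_b):.3f}")
```

Values near 1 indicate embeddings pointing in nearly the same direction, while values near 0 indicate unrelated directions.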
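Model Comparison
Explore the dataset and runtime metrics of this model in the timm model results.
All timing numbers are from eager-mode PyTorch 1.13 on an RTX 3090 with AMP enabled.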