License: other
Library name: timm
Tags:
Model card for regnety_640.seer
A RegNetY-64GF feature / backbone model. Pretrained with the SEER method: SwAV-based self-supervised learning on "2 billion random internet images".
SEER is licensed under the SEER license, Copyright (c) Meta Platforms, Inc. All Rights Reserved. It is a non-commercial license with use and distribution restrictions.
The RegNet implementation in timm includes a number of unique enhancements (a configuration sketch follows this list):
- stochastic depth
- gradient checkpointing
- layer-wise learning-rate decay
- configurable output stride (dilation)
- configurable activation and normalization layers
- option for pre-activation bottleneck blocks, used in the RegNetV variant
- the only known RegNetZ model definitions with pretrained weights
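A minimal sketch of how some of these options can be enabled through timm. The drop_path_rate, output_stride, and layer_decay values below are illustrative choices, not part of this checkpoint's training recipe; layer_decay in the optimizer factory is available in recent timm versions.

from timm.optim import create_optimizer_v2
import timm

# Illustrative settings, not the values used for the SEER checkpoint:
model = timm.create_model(
    'regnety_640.seer',
    pretrained=True,
    drop_path_rate=0.1,   # stochastic depth
    output_stride=16,     # dilated stages instead of stride-32 downsampling
)
model.set_grad_checkpointing(True)  # trade compute for memory during training

# Layer-wise LR decay via the timm optimizer factory (hypothetical hyperparameters):
optimizer = create_optimizer_v2(
    model, opt='adamw', lr=1e-3, weight_decay=0.05, layer_decay=0.75,
)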
Model Details
- Model Type: Image classification / feature backbone
- Model Stats:
  - Params (M): 276.5
  - GMACs: 64.2
  - Activations (M): 42.5
  - Image size: 224 x 224
- Papers:
  - Self-supervised Pretraining of Visual Features in the Wild: https://arxiv.org/abs/2103.01988v2
  - Designing Network Design Spaces: https://arxiv.org/abs/2003.13678
- Original: https://github.com/facebookresearch/vissl
- Pretrain Dataset: RandomInternetImages-2B
Model Usage
Image Classification
from urllib.request import urlopen
from PIL import Image
import torch
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model('regnety_640.seer', pretrained=True)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # unsqueeze single image into batch of 1

top5_probabilities, top5_class_indices = torch.topk(output.softmax(dim=1) * 100, k=5)
Feature Map Extraction
from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'regnety_640.seer',
    pretrained=True,
    features_only=True,
)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # unsqueeze single image into batch of 1

for o in output:
    # print shape of each feature map in the returned list
    print(o.shape)
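If only a subset of feature levels is needed, out_indices can be passed alongside features_only, and the channel counts and reductions of the returned maps can be inspected via feature_info. A minimal sketch; the chosen indices are arbitrary, not a recommendation for this model:

import timm

# Select only the last three feature stages (illustrative choice)
model = timm.create_model(
    'regnety_640.seer',
    pretrained=True,
    features_only=True,
    out_indices=(2, 3, 4),
)
print(model.feature_info.channels())   # channels of each returned feature map
print(model.feature_info.reduction())  # downsampling factor of each map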
Image Embeddings
from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'regnety_640.seer',
    pretrained=True,
    num_classes=0,  # remove classifier head
)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # output is (batch_size, num_features) shaped tensor

# or equivalently (without needing to set num_classes=0)

output = model.forward_features(transforms(img).unsqueeze(0))
# output is unpooled, a (1, num_features, H, W) shaped tensor

output = model.forward_head(output, pre_logits=True)
# output is a (1, num_features) shaped tensor
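A common use of such embeddings is measuring image similarity. A minimal sketch continuing from the snippet above; the second embedding here is just a copy of the first, standing in for a second image:

import torch.nn.functional as F

# Pooled, pre-logits embedding for the sample image (shape: (1, num_features))
emb_a = model.forward_head(model.forward_features(transforms(img).unsqueeze(0)), pre_logits=True)
emb_b = emb_a.clone()  # stand-in for a second image's embedding

similarity = F.cosine_similarity(emb_a, emb_b)  # -> tensor([1.]) for identical inputs
print(similarity)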
Model Comparison
Explore this model's dataset and runtime metrics in the timm model results.
In the comparison table below, weights tagged ra_in1k, ra3_in1k, ch_in1k, sw_*, and lion_* were trained in timm.
| model | img_size | top1 | top5 | param_count (M) | GMACs | activations (M) |
|---|---|---|---|---|---|---|
| regnety_1280.swag_ft_in1k | 384 | 88.228 | 98.684 | 644.81 | 374.99 | 210.2 |

... (remaining model comparison rows unchanged) ...
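To see which other RegNetY checkpoints are available for such a comparison, the timm registry can be queried directly. A small sketch:

import timm

# List all pretrained RegNetY checkpoints known to the installed timm version
for name in timm.list_models('regnety_*', pretrained=True):
    print(name)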
Citation
@article{goyal2022vision,
  title={Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision},
  author={Priya Goyal and Quentin Duval and Isaac Seessel and Mathilde Caron and Ishan Misra and Levent Sagun and Armand Joulin and Piotr Bojanowski},
  year={2022},
  eprint={2202.08360},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
@InProceedings{Radosavovic2020,
  title = {Designing Network Design Spaces},
  author = {Ilija Radosavovic and Raj Prateek Kosaraju and Ross Girshick and Kaiming He and Piotr Doll{\'a}r},
  booktitle = {CVPR},
  year = {2020}
}
@misc{rw2019timm,
  author = {Ross Wightman},
  title = {PyTorch Image Models},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  doi = {10.5281/zenodo.4414861},
  howpublished = {\url{https://github.com/huggingface/pytorch-image-models}}
}