license: other
library_name: timm
tags:
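Model card for regnety_320.seer
A RegNetY-32GF feature / backbone model, pretrained with the SEER approach: SwAV self-supervised learning on "2 billion random internet images".
SEER is released under the SEER license, copyright Meta Platforms, Inc. It is a non-commercial license that restricts use and distribution.
The timm RegNet implementation includes several enhancements not found in other implementations: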
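- stochastic depth
- gradient checkpointing
- layer-wise learning rate decay
- configurable output stride (dilation)
- configurable activation and normalization layers
- an option for the pre-activation bottleneck block used in the RegNetV variant
- the only known RegNetZ model definitions with pretrained weights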
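To make the layer-wise learning rate decay item more concrete, here is a minimal sketch of the general idea in plain Python. It is not timm's implementation: the helper name `layer_decay_scales`, the number of groups, and the decay value are all hypothetical and chosen only for illustration.

```python
def layer_decay_scales(num_layers: int, decay: float) -> list:
    """Return one LR multiplier per layer group, index 0 = earliest group.

    The last group keeps scale 1.0; each earlier group is multiplied by
    `decay` one more time, so early layers train with smaller learning rates.
    """
    return [decay ** (num_layers - 1 - i) for i in range(num_layers)]


base_lr = 1e-3  # illustrative base learning rate
scales = layer_decay_scales(num_layers=4, decay=0.5)
for i, scale in enumerate(scales):
    print(f"group {i}: lr = {base_lr * scale:.6f}")
```

With four groups and a decay of 0.5, the earliest group trains at one eighth of the base learning rate while the last group uses the base rate unchanged.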
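Model Details
- Model type: image classification / feature backbone
- Model stats:
  - Params (M): 141.3
  - GMACs: 32.3
  - Activations (M): 30.3
  - Image size: 224 x 224
- Papers:
  - Self-supervised Pretraining of Visual Features in the Wild: https://arxiv.org/abs/2103.01988v2
  - Designing Network Design Spaces: https://arxiv.org/abs/2003.13678
- Original: https://github.com/facebookresearch/vissl
- Pretrain dataset: RandomInternetImages-2B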
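As a rough worked example of what the parameter count implies, the sketch below estimates weight storage from the 141.3M figure above. It assumes 4 bytes per parameter for fp32 and 2 bytes for fp16, and it ignores activations, buffers, and optimizer state, so treat the result as a lower bound rather than a measured footprint. The helper name `weight_memory_mb` is hypothetical.

```python
PARAMS_MILLIONS = 141.3  # parameter count from the model stats above


def weight_memory_mb(params_millions: float, bytes_per_param: int) -> float:
    """Approximate weight storage in megabytes (1 MB = 10**6 bytes)."""
    return params_millions * bytes_per_param


fp32_mb = weight_memory_mb(PARAMS_MILLIONS, 4)  # assumes 4-byte floats
fp16_mb = weight_memory_mb(PARAMS_MILLIONS, 2)  # assumes 2-byte floats
print(f"fp32 weights: ~{fp32_mb:.1f} MB")
print(f"fp16 weights: ~{fp16_mb:.1f} MB")
```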
Model Usage
Image Classification
from urllib.request import urlopen
from PIL import Image
import timm
import torch  # required for torch.topk below

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model('regnety_320.seer', pretrained=True)
model = model.eval()

# get model-specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

# unsqueeze the single image into a batch of 1
output = model(transforms(img).unsqueeze(0))

# top-5 class probabilities (in percent) and their indices
top5_probabilities, top5_class_indices = torch.topk(output.softmax(dim=1) * 100, k=5)
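To show what the final softmax and top-k step computes, here is a small plain-Python sketch that does not depend on torch. The logits are made-up values and the helpers `softmax` and `topk` are hypothetical stand-ins for illustration, not the torch functions themselves.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def topk(values, k):
    """Return the k largest values and their indices, largest first."""
    idx = sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]
    return [values[i] for i in idx], idx


logits = [2.0, 0.5, -1.0, 3.0, 0.0, 1.0]  # made-up values for illustration
probs_pct = [p * 100 for p in softmax(logits)]  # probabilities in percent
top5_probs, top5_idx = topk(probs_pct, k=5)
print(top5_idx)
```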
Feature Map Extraction
from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'regnety_320.seer',
    pretrained=True,
    features_only=True,  # return intermediate feature maps instead of logits
)
model = model.eval()

# get model-specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

# unsqueeze the single image into a batch of 1
output = model(transforms(img).unsqueeze(0))

# print the shape of each feature map in the output list
for o in output:
    print(o.shape)
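The spatial size of each feature map follows from the input size and the reduction (stride) of each stage. The sketch below works this out for a 224 x 224 input, assuming reductions of 2, 4, 8, 16, and 32. That list is an assumption, not something the card states, and the helper name `feature_map_sizes` is hypothetical.

```python
def feature_map_sizes(input_size: int, reductions: list) -> list:
    """Spatial side length of each feature map for a square input."""
    return [input_size // r for r in reductions]


reductions = [2, 4, 8, 16, 32]  # assumed per-stage reductions, not from the card
sizes = feature_map_sizes(224, reductions)
for r, s in zip(reductions, sizes):
    print(f"reduction {r:>2}: {s} x {s}")
```

On a loaded features_only model, the actual values can be checked with model.feature_info.reduction() and model.feature_info.channels() rather than assumed.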
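Image Embeddings
from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'regnety_320.seer',
    pretrained=True,
    num_classes=0,  # remove the classifier nn.Linear
)
model = model.eval()

# get model-specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

# output is a (batch_size, num_features) shaped tensor
output = model(transforms(img).unsqueeze(0))

# or equivalently (without needing to set num_classes=0):
# forward_features returns the unpooled feature map
output = model.forward_features(transforms(img).unsqueeze(0))
# forward_head with pre_logits=True pools it to (batch_size, num_features)
output = model.forward_head(output, pre_logits=True)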
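A common use of such embeddings is comparing two images by cosine similarity. The following is a minimal plain-Python sketch of that comparison on flat lists of floats. The short vectors are made-up stand-ins for real embeddings and the helper name `cosine_similarity` is hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


emb_a = [1.0, 0.0, 2.0]  # made-up stand-in for an image embedding
emb_b = [2.0, 0.0, 4.0]  # same direction as emb_a, different scale
print(cosine_similarity(emb_a, emb_b))
```

With real model outputs, each embedding tensor would first be converted to a flat list, for example with output[0].tolist().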
Model Comparison
For complete model metrics, see the timm model results.
In the comparison below, weights tagged ra_in1k, ra3_in1k, ch_in1k, sw_*, or lion_* were trained in timm.
| Model | Image size | Top-1 (%) | Top-5 (%) | Params (M) | GMACs | Activations (M) |
|---|---|---|---|---|---|---|
| regnety_1280.swag_ft_in1k | 384 | 88.228 | 98.684 | 644.81 | 374.99 | 210.2 |
| ... (remaining model comparison rows unchanged) ... | | | | | | |
Citation
@article{goyal2022vision,
title={Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision},
author={Priya Goyal and others},
year={2022},
eprint={2202.08360},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
@InProceedings{Radosavovic2020,
title = {Designing Network Design Spaces},
author = {Ilija Radosavovic and others},
booktitle = {CVPR},
year = {2020}
}
@misc{rw2019timm,
author = {Ross Wightman},
title = {PyTorch Image Models},
year = {2019},
publisher = {GitHub},
journal = {GitHub repository},
doi = {10.5281/zenodo.4414861},
howpublished = {\url{https://github.com/huggingface/pytorch-image-models}}
}