license: apache-2.0
inference: false
pipeline_tag: zero-shot-image-classification
pipeline_tag: feature-extraction
inference:
parameters:
tags:
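Taiyi-CLIP-Roberta-large-326M-Chinese
Brief Introduction
The first open-source Chinese CLIP model. Its text encoder uses the RoBERTa-large architecture, and it was pre-trained on 123M image-text pairs.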
Model Taxonomy
| Demand | Task | Series | Model | Parameters | Extra |
| --- | --- | --- | --- | --- | --- |
| Special | Multimodal | Taiyi | CLIP (RoBERTa) | 326M | Chinese |
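Model Information
We closely follow the CLIP experimental setup to build strong image-text representations for Chinese.
This is the first open-source Chinese CLIP implementation in the Hugging Face community.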
Performance
Zero-Shot Classification
| Model | Dataset | Top-1 accuracy | Top-5 accuracy |
| --- | --- | --- | --- |
| This model | ImageNet1k (Chinese version) | 53.05% | 79.55% |
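The card does not include its evaluation script, so the following is only a minimal sketch of how Top-1 and Top-5 accuracy are conventionally computed from per-class similarity scores. The function name `topk_accuracy` and the toy scores are hypothetical and are not taken from the Taiyi code base.

```python
from typing import Sequence


def topk_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int], k: int) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores[i][c] is the image-text similarity of sample i to class prompt c.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k classes with the highest similarity for this image.
        topk = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in topk
    return hits / len(labels)


# Toy data (hypothetical): 3 images scored against 4 class prompts.
toy_scores = [
    [0.9, 0.1, 0.3, 0.2],  # true class 0, ranked 1st
    [0.2, 0.4, 0.8, 0.1],  # true class 1, ranked 2nd
    [0.5, 0.3, 0.2, 0.1],  # true class 3, ranked 4th
]
toy_labels = [0, 1, 3]

top1 = topk_accuracy(toy_scores, toy_labels, k=1)
top2 = topk_accuracy(toy_scores, toy_labels, k=2)
print(f"top-1: {top1:.3f}, top-2: {top2:.3f}")
```

In a real evaluation, `scores` would hold the similarity of each test image to every class prompt, and `k` would be 1 and 5.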
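Zero-Shot Image-Text Retrieval
| Model | Dataset | Top1 | Top5 | Top10 |
| --- | --- | --- | --- | --- |
| This model | Flickr30k Chinese test set | 54.36% | 80.56% | 87.90% |
| This model | COCO Chinese test set | 51.47% | 81.00% | 90.40% |
| This model | Wukong50k | 61.18% | 90.46% | 95.74% |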
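Top1, Top5 and Top10 in retrieval are usually recall-at-K scores. The card states neither the retrieval direction nor the evaluation code, so the sketch below assumes text-to-image retrieval in which every caption has exactly one ground-truth image. The names `gt_ranks` and `recall_at_k` and the toy similarity matrix are illustrative only.

```python
from typing import Dict, List, Sequence


def gt_ranks(sim: Sequence[Sequence[float]], gt_image: Sequence[int]) -> List[int]:
    """Return the 1-based rank of the ground-truth image for every caption.

    sim[t][i] is the similarity between caption t and image i;
    gt_image[t] is the index of the image that caption t describes.
    """
    ranks = []
    for row, gt in zip(sim, gt_image):
        # Rank = 1 + number of images scored strictly higher than the ground truth.
        ranks.append(1 + sum(score > row[gt] for score in row))
    return ranks


def recall_at_k(ranks: Sequence[int], ks: Sequence[int] = (1, 5, 10)) -> Dict[int, float]:
    """Share of captions whose ground-truth image appears within the top k results."""
    return {k: sum(r <= k for r in ranks) / len(ranks) for k in ks}


# Toy similarity matrix (hypothetical): 4 captions x 3 images.
toy_sim = [
    [0.8, 0.1, 0.3],
    [0.2, 0.7, 0.6],
    [0.4, 0.9, 0.5],
    [0.6, 0.2, 0.1],
]
toy_gt = [0, 2, 2, 1]  # captions 1 and 2 both describe image 2

ranks = gt_ranks(toy_sim, toy_gt)
recalls = recall_at_k(ranks, ks=(1, 2, 3))
print(ranks, recalls)
```

Computing the ranks once and deriving every K from them avoids re-sorting the similarity matrix for each cutoff.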
Usage
from PIL import Image
import requests
import torch
import numpy as np
from transformers import BertForSequenceClassification, BertTokenizer
from transformers import CLIPProcessor, CLIPModel

# Candidate captions: "a cat", "a dog", "two cats", "two tigers", "a tiger".
query_texts = ["一只猫", "一只狗", "两只猫", "两只老虎", "一只老虎"]

# Load the Taiyi text encoder and tokenize the captions.
text_tokenizer = BertTokenizer.from_pretrained("IDEA-CCNL/Taiyi-CLIP-Roberta-large-326M-Chinese")
text_encoder = BertForSequenceClassification.from_pretrained("IDEA-CCNL/Taiyi-CLIP-Roberta-large-326M-Chinese").eval()
text = text_tokenizer(query_texts, return_tensors='pt', padding=True)['input_ids']

# Load the image encoder and preprocess the example image.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
clip_model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")
image = processor(images=Image.open(requests.get(url, stream=True).raw), return_tensors="pt")

with torch.no_grad():
    image_features = clip_model.get_image_features(**image)
    # The logits of the text encoder serve as the text embedding.
    text_features = text_encoder(text).logits
    # L2-normalize both embeddings so the dot product is a cosine similarity.
    image_features = image_features / image_features.norm(dim=1, keepdim=True)
    text_features = text_features / text_features.norm(dim=1, keepdim=True)
    # Scale the cosine similarities by the learned logit scale, then apply softmax.
    logit_scale = clip_model.logit_scale.exp()
    logits_per_image = logit_scale * image_features @ text_features.t()
    probs = logits_per_image.softmax(dim=-1).cpu().numpy()
    print(np.around(probs, 3))
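To make the final scoring step concrete, the sketch below reproduces the same arithmetic (L2-normalize, scale the cosine similarities, softmax) on toy 2-D vectors using only the standard library. The vectors and the scale value of 100.0 are made up for illustration; in the example above the scale comes from `clip_model.logit_scale.exp()`, and `l2_normalize` and `match_probabilities` are hypothetical helper names.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def match_probabilities(
    image_vec: Sequence[float],
    text_vecs: Sequence[Sequence[float]],
    logit_scale: float,
) -> List[float]:
    """Softmax over scaled cosine similarities between one image and several texts."""
    img = l2_normalize(image_vec)
    logits = []
    for text_vec in text_vecs:
        txt = l2_normalize(text_vec)
        cosine = sum(a * b for a, b in zip(img, txt))
        logits.append(logit_scale * cosine)
    # Subtract the max logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy embeddings (hypothetical values, not real model outputs).
image_embedding = [3.0, 4.0]
text_embeddings = [[6.0, 8.0], [4.0, -3.0], [-3.0, -4.0]]

probs = match_probabilities(image_embedding, text_embeddings, logit_scale=100.0)
best = max(range(len(probs)), key=probs.__getitem__)
print([round(p, 3) for p in probs], "best match:", best)
```

Because the scale multiplies cosine values that lie in [-1, 1], a large scale makes the softmax sharply favor the closest caption.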
Citation
If you use this model in your work, please cite our paper:
@article{fengshenbang,
author = {Jiaxing Zhang and others},
title = {Fengshenbang 1.0: Being the Foundation of Chinese Cognitive Intelligence},
journal = {CoRR},
volume = {abs/2209.02970},
year = {2022}
}
You can also cite the project homepage:
@misc{Fengshenbang-LM,
title={Fengshenbang-LM},
author={IDEA-CCNL},
year={2021},
howpublished={\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
}