---
base_model: openai/clip-vit-base-patch16
library_name: transformers.js
---

https://huggingface.co/openai/clip-vit-base-patch16 with ONNX weights to be compatible with Transformers.js.
## Usage (Transformers.js)

If you haven't already, you can install Transformers.js from NPM:

```bash
npm i @xenova/transformers
```
**Example:** Zero-shot image classification with the `pipeline` API.

```javascript
import { pipeline } from '@xenova/transformers';

// Create a zero-shot image classification pipeline.
const classifier = await pipeline('zero-shot-image-classification', 'Xenova/clip-vit-base-patch16');

// Classify an image against a list of candidate labels.
const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/tiger.jpg';
const output = await classifier(url, ['tiger', 'horse', 'dog']);
```
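The pipeline result is a list of label/score pairs, one per candidate label. As a sketch of how you might pick the winning label (the values below are hypothetical stand-ins for a real pipeline output, not actual model scores):

```javascript
// Hypothetical pipeline output: an array of { label, score } objects.
const output = [
  { label: 'tiger', score: 0.98 },
  { label: 'horse', score: 0.01 },
  { label: 'dog', score: 0.01 },
];

// Pick the entry with the highest score.
const best = output.reduce((a, b) => (b.score > a.score ? b : a));
console.log(best.label); // → 'tiger'
```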
**Example:** Zero-shot image classification with `CLIPModel`.

```javascript
import { AutoTokenizer, AutoProcessor, CLIPModel, RawImage } from '@xenova/transformers';

// Load the tokenizer, processor, and model.
const tokenizer = await AutoTokenizer.from_pretrained('Xenova/clip-vit-base-patch16');
const processor = await AutoProcessor.from_pretrained('Xenova/clip-vit-base-patch16');
const model = await CLIPModel.from_pretrained('Xenova/clip-vit-base-patch16');

// Run tokenization.
const texts = ['a photo of a car', 'a photo of a football match'];
const text_inputs = tokenizer(texts, { padding: true, truncation: true });

// Read and preprocess the image.
const image = await RawImage.read('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/football-match.jpg');
const image_inputs = await processor(image);

// Run the model with both text and image inputs.
const output = await model({ ...text_inputs, ...image_inputs });
```
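The model output includes raw image-to-text similarity logits (e.g. `logits_per_image`), which can be converted to probabilities over the candidate texts with a softmax. A minimal, library-free sketch, using hypothetical logit values in place of a real model output:

```javascript
// Hypothetical logits for one image against the candidate texts,
// standing in for e.g. output.logits_per_image.data.
const logits = [24.1, 18.9, 21.4];

// Numerically stable softmax: subtract the max before exponentiating.
function softmax(arr) {
  const max = Math.max(...arr);
  const exps = arr.map(x => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map(e => e / sum);
}

const probs = softmax(logits); // probabilities summing to 1
```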
**Example:** Compute text embeddings with `CLIPTextModelWithProjection`.

```javascript
import { AutoTokenizer, CLIPTextModelWithProjection } from '@xenova/transformers';

// Load the tokenizer and text model.
const tokenizer = await AutoTokenizer.from_pretrained('Xenova/clip-vit-base-patch16');
const text_model = await CLIPTextModelWithProjection.from_pretrained('Xenova/clip-vit-base-patch16');

// Compute embeddings for a batch of texts.
const texts = ['a photo of a car', 'a photo of a football match'];
const { text_embeds } = await text_model(tokenizer(texts, { padding: true, truncation: true }));
```
**Example:** Compute vision embeddings with `CLIPVisionModelWithProjection`.

```javascript
import { AutoProcessor, CLIPVisionModelWithProjection, RawImage } from '@xenova/transformers';

// Load the processor and vision model.
const processor = await AutoProcessor.from_pretrained('Xenova/clip-vit-base-patch16');
const vision_model = await CLIPVisionModelWithProjection.from_pretrained('Xenova/clip-vit-base-patch16');

// Read the image, preprocess it, and compute its embedding.
const image = await RawImage.read('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/football-match.jpg');
const { image_embeds } = await vision_model(await processor(image));
```
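Text and image embeddings produced by the two projection models live in the same space and are typically compared with cosine similarity. A minimal, library-free sketch (the toy vectors below stand in for real CLIP embeddings, which are 512-dimensional for this model):

```javascript
// Cosine similarity between two equal-length vectors:
// dot(a, b) / (||a|| * ||b||), in the range [-1, 1].
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy stand-ins for a text embedding and an image embedding.
const textVec = [0.1, 0.3, 0.5];
const imageVec = [0.2, 0.6, 1.0];
const sim = cosineSimilarity(textVec, imageVec);
```

In practice, you would pass the `.data` of each embedding tensor (one row per input) to a function like this and rank images or texts by the resulting score.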
Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).