Base model: google/siglip-base-patch16-224
Library name: transformers.js
Pipeline tag: zero-shot-image-classification
https://huggingface.co/google/siglip-base-patch16-224 with ONNX weights to be compatible with Transformers.js.
Usage (Transformers.js)
If you haven't already, you can install the Transformers.js library from NPM using:
npm i @xenova/transformers
Example: Zero-shot image classification with Xenova/siglip-base-patch16-224:
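import { pipeline } from '@xenova/transformers';
// Load the zero-shot image classification pipeline with the ONNX weights.
const classifier = await pipeline('zero-shot-image-classification', 'Xenova/siglip-base-patch16-224');
const url = 'http://images.cocodataset.org/val2017/000000039769.jpg';
// Each candidate label is inserted into the template before scoring.
const output = await classifier(url, ['2 cats', '2 dogs'], {
    hypothesis_template: 'a photo of {}',
});
console.log(output); // array of { score, label } objects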
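As a minimal sketch of how the result might be consumed, the helper below picks the highest-scoring label from an array of `{ score, label }` objects. The name `topLabel`, the `minScore` threshold, and the sample scores are hypothetical and not part of Transformers.js. A threshold can be useful because SigLIP is trained with a sigmoid loss, so scores are not expected to sum to 1 and every label may score low.

```javascript
// Hypothetical helper (not part of Transformers.js): return the best entry
// from an array of { score, label } objects, or null if none reaches minScore.
function topLabel(output, minScore = 0) {
    let best = null;
    for (const item of output) {
        if (item.score >= minScore && (best === null || item.score > best.score)) {
            best = item;
        }
    }
    return best;
}

// Made-up scores for illustration only; these are not real model output.
const sample = [
    { score: 0.2, label: '2 cats' },
    { score: 0.001, label: '2 dogs' },
];
console.log(topLabel(sample));      // the '2 cats' entry
console.log(topLabel(sample, 0.5)); // null, nothing reaches the threshold
```

In practice `sample` would be replaced by the `output` value returned from the classifier call.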
Example: Compute text embeddings with SiglipTextModel.
import { AutoTokenizer, SiglipTextModel } from '@xenova/transformers';
// Load the tokenizer and the text model.
const tokenizer = await AutoTokenizer.from_pretrained('Xenova/siglip-base-patch16-224');
const text_model = await SiglipTextModel.from_pretrained('Xenova/siglip-base-patch16-224');
// Tokenize the texts, padding each one to the maximum length.
const texts = ['a photo of 2 cats', 'a photo of 2 dogs'];
const text_inputs = tokenizer(texts, { padding: 'max_length', truncation: true });
// Compute the text embeddings.
const { pooler_output } = await text_model(text_inputs);
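One common use of such embeddings is comparing two texts by cosine similarity. The sketch below assumes the embeddings have already been converted to plain number arrays (for example via the tensor's `tolist()` method, which is an assumption about how you extract the data). The function names and the toy vectors are hypothetical.

```javascript
// Hypothetical helpers operating on plain number arrays.

// Scale a vector to unit length (L2 norm of 1).
function l2Normalize(vec) {
    const norm = Math.sqrt(vec.reduce((sum, x) => sum + x * x, 0));
    return vec.map((x) => x / norm);
}

// Cosine similarity: dot product of the two unit-length vectors.
function cosineSimilarity(a, b) {
    const na = l2Normalize(a);
    const nb = l2Normalize(b);
    return na.reduce((sum, x, i) => sum + x * nb[i], 0);
}

// Toy 2-dimensional vectors for illustration; real embeddings are much longer.
console.log(cosineSimilarity([1, 0], [0, 1])); // orthogonal vectors
console.log(cosineSimilarity([1, 2], [2, 4])); // parallel vectors
```

Normalizing first makes the score depend only on direction, not on vector magnitude.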
Example: Compute vision embeddings with SiglipVisionModel.
import { AutoProcessor, SiglipVisionModel, RawImage } from '@xenova/transformers';
// Load the image processor and the vision model.
const processor = await AutoProcessor.from_pretrained('Xenova/siglip-base-patch16-224');
const vision_model = await SiglipVisionModel.from_pretrained('Xenova/siglip-base-patch16-224');
// Read the image and run the processor on it.
const image = await RawImage.read('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/football-match.jpg');
const image_inputs = await processor(image);
// Compute the vision embeddings.
const { pooler_output } = await vision_model(image_inputs);
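To relate an image embedding to a text embedding, SigLIP-style models score each image-text pair independently with a sigmoid over a scaled and shifted similarity. The sketch below illustrates that idea only. It assumes both embeddings are already L2-normalized plain arrays, and `logitScale` and `logitBias` are hypothetical stand-ins for the model's learned values, which are not given here.

```javascript
// Hypothetical helper: sigmoid match score for one image-text pair.
// Assumes imageEmb and textEmb are L2-normalized arrays of equal length.
// logitScale and logitBias are placeholders for learned model parameters.
function matchProbability(imageEmb, textEmb, logitScale = 1, logitBias = 0) {
    const dot = imageEmb.reduce((sum, x, i) => sum + x * textEmb[i], 0);
    const logit = logitScale * dot + logitBias;
    return 1 / (1 + Math.exp(-logit));
}

// Toy unit vectors for illustration; real embeddings are much longer.
console.log(matchProbability([1, 0], [1, 0])); // identical directions
console.log(matchProbability([1, 0], [0, 1])); // orthogonal directions
```

Because each pair is scored on its own, the probabilities across several texts need not sum to 1.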
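Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with ONNX weights located in a subfolder named onnx).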