Food 101 93M
基于google/siglip2-base-patch16-224微调的食物图像分类模型,可识别101种流行菜肴
下载量 158
发布时间 : 4/4/2025
模型简介
该模型使用SiglipForImageClassification架构,在Food-101数据集上训练,专门用于食物图像分类任务。
模型特点
高精度食物识别
在101类食物分类任务上达到89.73%的准确率
基于SigLIP2架构
采用谷歌先进的SigLIP2视觉架构,具有强大的特征提取能力
广泛的食物类别覆盖
支持101种国际流行菜肴的分类识别
模型能力
食物图像分类
多类别图像识别
菜品识别
使用案例
餐饮行业
智能菜单系统
自动识别顾客拍摄的菜品照片并匹配菜单
提高点餐效率和准确性
营养分析
通过识别食物图片估算餐食营养成分
辅助健康饮食管理
社交媒体
美食内容自动标记
为社交媒体上的食物图片自动添加标签
提升内容分类和搜索体验
🚀 Food-101-93M
Food-101-93M是一个基于图像分类的模型,它基于google/siglip2-base-patch16-224模型微调而来,使用了SiglipForImageClassification架构。该模型经过训练,能够将食物图像分类为101种常见菜肴中的一种,其训练数据来源于Food-101数据集。
🚀 快速开始
安装依赖
!pip install -q transformers torch pillow gradio
运行代码示例
import gradio as gr
from transformers import AutoImageProcessor, SiglipForImageClassification
from PIL import Image
import torch
# Load model and processor
model_name = "prithivMLmods/Food-101-93M"
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)
# Food-101 labels
labels = {
"0": "apple_pie", "1": "baby_back_ribs", "2": "baklava", "3": "beef_carpaccio", "4": "beef_tartare",
"5": "beet_salad", "6": "beignets", "7": "bibimbap", "8": "bread_pudding", "9": "breakfast_burrito",
"10": "bruschetta", "11": "caesar_salad", "12": "cannoli", "13": "caprese_salad", "14": "carrot_cake",
"15": "ceviche", "16": "cheesecake", "17": "cheese_plate", "18": "chicken_curry", "19": "chicken_quesadilla",
"20": "chicken_wings", "21": "chocolate_cake", "22": "chocolate_mousse", "23": "churros", "24": "clam_chowder",
"25": "club_sandwich", "26": "crab_cakes", "27": "creme_brulee", "28": "croque_madame", "29": "cup_cakes",
"30": "deviled_eggs", "31": "donuts", "32": "dumplings", "33": "edamame", "34": "eggs_benedict",
"35": "escargots", "36": "falafel", "37": "filet_mignon", "38": "fish_and_chips", "39": "foie_gras",
"40": "french_fries", "41": "french_onion_soup", "42": "french_toast", "43": "fried_calamari", "44": "fried_rice",
"45": "frozen_yogurt", "46": "garlic_bread", "47": "gnocchi", "48": "greek_salad", "49": "grilled_cheese_sandwich",
"50": "grilled_salmon", "51": "guacamole", "52": "gyoza", "53": "hamburger", "54": "hot_and_sour_soup",
"55": "hot_dog", "56": "huevos_rancheros", "57": "hummus", "58": "ice_cream", "59": "lasagna",
"60": "lobster_bisque", "61": "lobster_roll_sandwich", "62": "macaroni_and_cheese", "63": "macarons", "64": "miso_soup",
"65": "mussels", "66": "nachos", "67": "omelette", "68": "onion_rings", "69": "oysters",
"70": "pad_thai", "71": "paella", "72": "pancakes", "73": "panna_cotta", "74": "peking_duck",
"75": "pho", "76": "pizza", "77": "pork_chop", "78": "poutine", "79": "prime_rib",
"80": "pulled_pork_sandwich", "81": "ramen", "82": "ravioli", "83": "red_velvet_cake", "84": "risotto",
"85": "samosa", "86": "sashimi", "87": "scallops", "88": "seaweed_salad", "89": "shrimp_and_grits",
"90": "spaghetti_bolognese", "91": "spaghetti_carbonara", "92": "spring_rolls", "93": "steak", "94": "strawberry_shortcake",
"95": "sushi", "96": "tacos", "97": "takoyaki", "98": "tiramisu", "99": "tuna_tartare", "100": "waffles"
}
def classify_food(image):
"""Predicts the type of food in the image."""
image = Image.fromarray(image).convert("RGB")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
outputs = model(**inputs)
logits = outputs.logits
probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()
predictions = {labels[str(i)]: round(probs[i], 3) for i in range(len(probs))}
# Sort by descending probability
predictions = dict(sorted(predictions.items(), key=lambda item: item[1], reverse=True)[:5])
return predictions
# Gradio Interface
iface = gr.Interface(
fn=classify_food,
inputs=gr.Image(type="numpy"),
outputs=gr.Label(num_top_classes=5, label="Top 5 Prediction Scores"),
title="Food-101-93M 🍽️",
description="Upload an image of food to classify it into one of 101 dish categories based on the Food-101 dataset."
)
# Launch app
if __name__ == "__main__":
iface.launch()
✨ 主要特性
- 精准分类:该模型能够将图像分类为101种食物类别,如
sushi
(寿司)、hamburger
(汉堡)、waffles
(华夫饼)、pad_thai
(泰式炒河粉)等。 - 多场景应用:可用于食谱推荐引擎、食物记录与卡路里追踪应用、智能厨房以及餐厅菜单数字化等多个场景。
💻 使用示例
基础用法
# 安装依赖
!pip install -q transformers torch pillow gradio
# 运行代码示例
import gradio as gr
from transformers import AutoImageProcessor, SiglipForImageClassification
from PIL import Image
import torch
# Load model and processor
model_name = "prithivMLmods/Food-101-93M"
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)
# Food-101 labels
labels = {
"0": "apple_pie", "1": "baby_back_ribs", "2": "baklava", "3": "beef_carpaccio", "4": "beef_tartare",
"5": "beet_salad", "6": "beignets", "7": "bibimbap", "8": "bread_pudding", "9": "breakfast_burrito",
"10": "bruschetta", "11": "caesar_salad", "12": "cannoli", "13": "caprese_salad", "14": "carrot_cake",
"15": "ceviche", "16": "cheesecake", "17": "cheese_plate", "18": "chicken_curry", "19": "chicken_quesadilla",
"20": "chicken_wings", "21": "chocolate_cake", "22": "chocolate_mousse", "23": "churros", "24": "clam_chowder",
"25": "club_sandwich", "26": "crab_cakes", "27": "creme_brulee", "28": "croque_madame", "29": "cup_cakes",
"30": "deviled_eggs", "31": "donuts", "32": "dumplings", "33": "edamame", "34": "eggs_benedict",
"35": "escargots", "36": "falafel", "37": "filet_mignon", "38": "fish_and_chips", "39": "foie_gras",
"40": "french_fries", "41": "french_onion_soup", "42": "french_toast", "43": "fried_calamari", "44": "fried_rice",
"45": "frozen_yogurt", "46": "garlic_bread", "47": "gnocchi", "48": "greek_salad", "49": "grilled_cheese_sandwich",
"50": "grilled_salmon", "51": "guacamole", "52": "gyoza", "53": "hamburger", "54": "hot_and_sour_soup",
"55": "hot_dog", "56": "huevos_rancheros", "57": "hummus", "58": "ice_cream", "59": "lasagna",
"60": "lobster_bisque", "61": "lobster_roll_sandwich", "62": "macaroni_and_cheese", "63": "macarons", "64": "miso_soup",
"65": "mussels", "66": "nachos", "67": "omelette", "68": "onion_rings", "69": "oysters",
"70": "pad_thai", "71": "paella", "72": "pancakes", "73": "panna_cotta", "74": "peking_duck",
"75": "pho", "76": "pizza", "77": "pork_chop", "78": "poutine", "79": "prime_rib",
"80": "pulled_pork_sandwich", "81": "ramen", "82": "ravioli", "83": "red_velvet_cake", "84": "risotto",
"85": "samosa", "86": "sashimi", "87": "scallops", "88": "seaweed_salad", "89": "shrimp_and_grits",
"90": "spaghetti_bolognese", "91": "spaghetti_carbonara", "92": "spring_rolls", "93": "steak", "94": "strawberry_shortcake",
"95": "sushi", "96": "tacos", "97": "takoyaki", "98": "tiramisu", "99": "tuna_tartare", "100": "waffles"
}
def classify_food(image):
"""Predicts the type of food in the image."""
image = Image.fromarray(image).convert("RGB")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
outputs = model(**inputs)
logits = outputs.logits
probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()
predictions = {labels[str(i)]: round(probs[i], 3) for i in range(len(probs))}
# Sort by descending probability
predictions = dict(sorted(predictions.items(), key=lambda item: item[1], reverse=True)[:5])
return predictions
# Gradio Interface
iface = gr.Interface(
fn=classify_food,
inputs=gr.Image(type="numpy"),
outputs=gr.Label(num_top_classes=5, label="Top 5 Prediction Scores"),
title="Food-101-93M 🍽️",
description="Upload an image of food to classify it into one of 101 dish categories based on the Food-101 dataset."
)
# Launch app
if __name__ == "__main__":
iface.launch()
📚 详细文档
分类报告
Classification Report:
precision recall f1-score support
apple_pie 0.8399 0.8253 0.8325 750
baby_back_ribs 0.9445 0.8853 0.9140 750
baklava 0.9736 0.9347 0.9537 750
beef_carpaccio 0.9079 0.9200 0.9139 750
beef_tartare 0.8486 0.8293 0.8388 750
beet_salad 0.8649 0.8707 0.8678 750
beignets 0.8961 0.9080 0.9020 750
bibimbap 0.9361 0.9373 0.9367 750
bread_pudding 0.7979 0.8000 0.7989 750
breakfast_burrito 0.8784 0.9053 0.8917 750
bruschetta 0.8672 0.8533 0.8602 750
caesar_salad 0.9444 0.9293 0.9368 750
cannoli 0.9263 0.9547 0.9402 750
caprese_salad 0.9110 0.9280 0.9194 750
carrot_cake 0.9068 0.8040 0.8523 750
ceviche 0.8375 0.8453 0.8414 750
cheesecake 0.8225 0.8093 0.8159 750
cheese_plate 0.9627 0.9627 0.9627 750
chicken_curry 0.8970 0.8827 0.8898 750
chicken_quesadilla 0.9254 0.9093 0.9173 750
chicken_wings 0.9512 0.9360 0.9435 750
chocolate_cake 0.7958 0.8107 0.8032 750
chocolate_mousse 0.6947 0.7827 0.7361 750
churros 0.9440 0.9440 0.9440 750
clam_chowder 0.8883 0.9120 0.9000 750
club_sandwich 0.9396 0.9133 0.9263 750
crab_cakes 0.9185 0.8720 0.8947 750
creme_brulee 0.9141 0.9227 0.9184 750
croque_madame 0.9106 0.8960 0.9032 750
cup_cakes 0.8986 0.9333 0.9156 750
deviled_eggs 0.9787 0.9813 0.9800 750
donuts 0.8893 0.8787 0.8840 750
dumplings 0.9212 0.8880 0.9043 750
edamame 0.9960 0.9920 0.9940 750
eggs_benedict 0.9207 0.9440 0.9322 750
escargots 0.8709 0.8907 0.8807 750
falafel 0.8945 0.8933 0.8939 750
filet_mignon 0.7598 0.7467 0.7532 750
fish_and_chips 0.9454 0.9467 0.9460 750
foie_gras 0.6659 0.8027 0.7279 750
french_fries 0.9447 0.9333 0.9390 750
french_onion_soup 0.8667 0.9187 0.8919 750
french_toast 0.8890 0.8760 0.8825 750
fried_calamari 0.9448 0.9133 0.9288 750
fried_rice 0.9325 0.9213 0.9269 750
frozen_yogurt 0.8716 0.9507 0.9094 750
garlic_bread 0.9103 0.8800 0.8949 750
gnocchi 0.8554 0.8280 0.8415 750
greek_salad 0.9203 0.9240 0.9222 750
grilled_cheese_sandwich 0.8523 0.8773 0.8647 750
grilled_salmon 0.8463 0.8960 0.8705 750
guacamole 0.9537 0.9347 0.9441 750
gyoza 0.8970 0.9173 0.9071 750
hamburger 0.8899 0.8947 0.8923 750
hot_and_sour_soup 0.9439 0.9413 0.9426 750
hot_dog 0.8859 0.9320 0.9084 750
huevos_rancheros 0.8465 0.8827 0.8642 750
hummus 0.9394 0.9093 0.9241 750
ice_cream 0.8633 0.8507 0.8570 750
lasagna 0.8780 0.8733 0.8757 750
lobster_bisque 0.8952 0.9107 0.9028 750
lobster_roll_sandwich 0.9664 0.9573 0.9618 750
macaroni_and_cheese 0.9273 0.9013 0.9141 750
macarons 0.9892 0.9747 0.9819 750
miso_soup 0.9565 0.9667 0.9615 750
mussels 0.9602 0.9640 0.9621 750
nachos 0.9337 0.9387 0.9362 750
omelette 0.8889 0.8960 0.8924 750
onion_rings 0.9493 0.9493 0.9493 750
oysters 0.9808 0.9533 0.9669 750
pad_thai 0.9188 0.9507 0.9345 750
paella 0.9352 0.9240 0.9296 750
pancakes 0.9277 0.9067 0.9171 750
panna_cotta 0.8056 0.8507 0.8275 750
peking_duck 0.8529 0.9120 0.8814 750
pho 0.9746 0.9227 0.9479 750
pizza 0.9512 0.9360 0.9435 750
pork_chop 0.8085 0.7373 0.7713 750
poutine 0.9424 0.9387 0.9405 750
prime_rib 0.9106 0.8147 0.8600 750
pulled_pork_sandwich 0.8887 0.9053 0.8970 750
ramen 0.8986 0.9213 0.9098 750
ravioli 0.8532 0.8293 0.8411 750
red_velvet_cake 0.9330 0.8907 0.9113 750
risotto 0.8809 0.8680 0.8744 750
samosa 0.9153 0.9227 0.9190 750
sashimi 0.9248 0.9187 0.9217 750
scallops 0.8564 0.8507 0.8535 750
seaweed_salad 0.9597 0.9533 0.9565 750
shrimp_and_grits 0.8995 0.8947 0.8971 750
spaghetti_bolognese 0.9667 0.9667 0.9667 750
spaghetti_carbonara 0.9601 0.9627 0.9614 750
spring_rolls 0.9045 0.9467 0.9251 750
steak 0.6311 0.7027 0.6650 750
strawberry_shortcake 0.8832 0.8467 0.8645 750
sushi 0.9204 0.8947 0.9074 750
tacos 0.9225 0.8893 0.9056 750
takoyaki 0.9419 0.9507 0.9463 750
tiramisu 0.9074 0.8627 0.8845 750
tuna_tartare 0.7691 0.7773 0.7732 750
waffles 0.9629 0.9347 0.9486 750
accuracy 0.8973 75750
macro avg 0.8987 0.8973 0.8977 75750
weighted avg 0.8987 0.8973 0.8977 75750
预期用途
Food-101-93M 模型适用于以下场景:
- 食谱推荐引擎:自动标记食物图像,以推荐食谱。
- 食物记录与卡路里追踪应用:根据照片对餐食进行分类。
- 智能厨房:协助智能电器进行食物识别。
- 餐厅菜单数字化:自动对菜肴进行分类,用于可视化菜单或点餐系统。
- 数据集标注:为其他机器学习模型的训练数据集提供自动标注。
📄 许可证
本项目采用 Apache-2.0 许可证。
信息表格
属性 | 详情 |
---|---|
模型类型 | 图像分类模型 |
训练数据 | 来源于 Food-101 数据集 |
基础模型 | google/siglip2-base-patch16-224 |
库名称 | transformers |
标签 | Food、101、siglip2、vit、biology |
Nsfw Image Detection
Apache-2.0
基于ViT架构的NSFW图像分类模型,通过监督学习在ImageNet-21k数据集上预训练,并在80,000张图像上微调,用于区分正常和NSFW内容。
图像分类
Transformers

N
Falconsai
82.4M
588
Fairface Age Image Detection
Apache-2.0
基于Vision Transformer架构的图像分类模型,在ImageNet-21k数据集上预训练,适用于多类别图像分类任务
图像分类
Transformers

F
dima806
76.6M
10
Dinov2 Small
Apache-2.0
基于DINOv2方法训练的小尺寸视觉Transformer模型,通过自监督学习提取图像特征
图像分类
Transformers

D
facebook
5.0M
31
Vit Base Patch16 224
Apache-2.0
基于ImageNet-21k预训练和ImageNet微调的视觉变换器模型,用于图像分类任务
图像分类
V
google
4.8M
775
Vit Base Patch16 224 In21k
Apache-2.0
基于ImageNet-21k数据集预训练的视觉Transformer模型,用于图像分类任务。
图像分类
V
google
2.2M
323
Dinov2 Base
Apache-2.0
基于DINOv2方法训练的视觉Transformer模型,通过自监督学习提取图像特征
图像分类
Transformers

D
facebook
1.9M
126
Gender Classification
一个基于PyTorch和HuggingPics构建的图像分类模型,用于识别图像中的性别
图像分类
Transformers

G
rizvandwiki
1.8M
48
Vit Base Nsfw Detector
Apache-2.0
基于Vision Transformer (ViT)架构的图像分类模型,专门用于检测图像是否包含NSFW(不安全)内容。
图像分类
Transformers

V
AdamCodd
1.2M
47
Vit Hybrid Base Bit 384
Apache-2.0
混合视觉变换器(ViT)模型结合了卷积网络和Transformer架构,用于图像分类任务,在ImageNet上表现出色。
图像分类
Transformers

V
google
992.28k
6
Gender Classification 2
这是一个基于PyTorch框架和HuggingPics工具生成的图像分类模型,专门用于性别分类任务。
图像分类
Transformers

G
rizvandwiki
906.98k
32
精选推荐AI模型
Llama 3 Typhoon V1.5x 8b Instruct
专为泰语设计的80亿参数指令模型,性能媲美GPT-3.5-turbo,优化了应用场景、检索增强生成、受限生成和推理任务
大型语言模型
Transformers 支持多种语言

L
scb10x
3,269
16
Cadet Tiny
Openrail
Cadet-Tiny是一个基于SODA数据集训练的超小型对话模型,专为边缘设备推理设计,体积仅为Cosmo-3B模型的2%左右。
对话系统
Transformers 英语

C
ToddGoldfarb
2,691
6
Roberta Base Chinese Extractive Qa
基于RoBERTa架构的中文抽取式问答模型,适用于从给定文本中提取答案的任务。
问答系统 中文
R
uer
2,694
98
智启未来,您的人工智能解决方案智库
简体中文