## Model Description

A CLIP ViT-B/32 model trained on the IconStack dataset using OpenCLIP.
It scores 78.82% top-1 accuracy on zero-shot classification on the ui-icon-dataset.
## Usage

### Installation

You need to install [open_clip](https://github.com/mlfoundations/open_clip) to use this model:

```bash
pip install open_clip_torch
```
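As an optional sanity check (a minimal sketch, assuming a standard `open_clip_torch` install), you can confirm the package imports and that the ViT-B-32 architecture used below is available:

```python
import open_clip

# Should print True if open_clip is installed and knows the ViT-B-32 config.
print("ViT-B-32" in open_clip.list_models())
```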
### Icon-to-Text Zero-Shot Classification

```python
import torch
from PIL import Image
import open_clip

CLIP_TEXT_TEMPLATE = "an icon of {}"
ICON_CLASSES = ["add", "close", "play", ...]  # Modify your class names here

model_checkpoint = "<path_to_your_local_model>"

# The checkpoint is a fine-tuned ViT-B/32, so load the matching architecture.
model, _, preprocess = open_clip.create_model_and_transforms('ViT-B-32', pretrained=model_checkpoint)
model.eval()
tokenizer = open_clip.get_tokenizer('ViT-B-32')

# Preprocess the icon image and build one text prompt per class name.
image = preprocess(Image.open("icon.png")).unsqueeze(0)
text = tokenizer([CLIP_TEXT_TEMPLATE.format(cls) for cls in ICON_CLASSES])

with torch.no_grad(), torch.autocast("cuda"):
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)

    # Cosine similarity scaled by 100, softmaxed over the class prompts.
    text_probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print("Label probs:", text_probs)  # prints something like: [[1., 0., 0., ...]]
```
## Model Tree

Base model: [laion/CLIP-ViT-B-32-laion2B-s34B-b79K](https://huggingface.co/laion/CLIP-ViT-B-32-laion2B-s34B-b79K)
## Evaluation Results

| Metric | Dataset         | Value (self-reported) |
|--------|-----------------|-----------------------|
| acc@1  | ui-icon-dataset | 78.815                |
| acc@5  | ui-icon-dataset | 93.966                |