initial upload

Files changed (6)

.gitattributes +1 -0
README.md +58 -3
config.json +52 -0
model.safetensors +3 -0
preprocessor_config.json +26 -0
sample_image.png +3 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+sample_image.png filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -1,3 +1,58 @@
----
-license: mit
----

+---
+library_name: transformers
+license: mit
+language:
+- en
+pipeline_tag: object-detection
+base_model:
+- hustvl/yolos-tiny
+tags:
+- object-detection
+- fashion
+- search
+---
+This model is a fine-tuned version of hustvl/yolos-tiny.
+Details of the model are in this GitHub repo -> [fashion-visual-search](https://github.com/yainage90/fashion-visual-search)
+A companion fashion image feature extractor model is available here -> [yainage90/fashion-image-feature-extractor](https://huggingface.co/yainage90/fashion-image-feature-extractor)
+This model was trained on a combination of two datasets: [modanet](https://github.com/eBay/modanet) and [fashionpedia](https://fashionpedia.github.io/home/).
+The labels, in class-id order, are ['bag', 'bottom', 'dress', 'hat', 'outer', 'shoes', 'top'].
+The best score, mAP 0.697400, was reached at epoch 96 of 100.
+``` python
+from PIL import Image
+import torch
+from transformers import YolosImageProcessor, YolosForObjectDetection
+# Pick the best available device: CUDA, then Apple MPS, then CPU.
+device = torch.device('cpu')
+if torch.cuda.is_available():
+    device = torch.device('cuda')
+elif torch.backends.mps.is_available():
+    device = torch.device('mps')
+ckpt = 'yainage90/fashion-object-detection-yolos-tiny'
+image_processor = YolosImageProcessor.from_pretrained(ckpt)
+model = YolosForObjectDetection.from_pretrained(ckpt).to(device)
+image = Image.open('<path/to/image>').convert('RGB')
+with torch.no_grad():
+    inputs = image_processor(images=[image], return_tensors="pt")
+    outputs = model(**inputs.to(device))
+    # PIL reports (width, height); post-processing expects (height, width).
+    target_sizes = torch.tensor([[image.size[1], image.size[0]]])
+    results = image_processor.post_process_object_detection(outputs, threshold=0.85, target_sizes=target_sizes)[0]
+    items = []
+    for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
+        score = score.item()
+        label = label.item()
+        box = [i.item() for i in box]  # [xmin, ymin, xmax, ymax] in pixels
+        print(f"{model.config.id2label[label]}: {round(score, 3)} at {box}")
+        items.append((score, label, box))
+```
+![sample_image](sample_image.png)
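
The README snippet collects float pixel boxes in `items`. When a detection is handed to a downstream model, such as the feature extractor linked in the README, the box usually has to be clamped to the image and converted to integers first. A minimal sketch of that step, assuming boxes are `[xmin, ymin, xmax, ymax]` as in the snippet; the helper name `to_crop_box` is hypothetical and not part of this repository:

```python
import math


def to_crop_box(box, width, height):
    """Clamp a float [xmin, ymin, xmax, ymax] box to the image bounds.

    Coordinates are rounded outward so the crop never cuts into the detection.
    Returns None when the box does not overlap the image at all.
    """
    xmin, ymin, xmax, ymax = box
    left = max(0, math.floor(xmin))
    top = max(0, math.floor(ymin))
    right = min(width, math.ceil(xmax))
    bottom = min(height, math.ceil(ymax))
    if right <= left or bottom <= top:
        return None  # box lies entirely outside the image
    return (left, top, right, bottom)
```

The returned tuple follows the `(left, upper, right, lower)` order that PIL's `Image.crop` accepts, so `image.crop(to_crop_box(box, *image.size))` would be one way to use it once `None` has been ruled out.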

config.json ADDED

	@@ -0,0 +1,52 @@

+{
+  "_name_or_path": "hustvl/yolos-tiny",
+  "architectures": [
+    "YolosForObjectDetection"
+  ],
+  "attention_probs_dropout_prob": 0.0,
+  "auxiliary_loss": false,
+  "bbox_cost": 5,
+  "bbox_loss_coefficient": 5,
+  "class_cost": 1,
+  "eos_coefficient": 0.1,
+  "giou_cost": 2,
+  "giou_loss_coefficient": 2,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.0,
+  "hidden_size": 192,
+  "id2label": {
+    "0": "bag",
+    "1": "bottom",
+    "2": "dress",
+    "3": "hat",
+    "4": "outer",
+    "5": "shoes",
+    "6": "top"
+  },
+  "image_size": [
+    800,
+    1333
+  ],
+  "initializer_range": 0.02,
+  "intermediate_size": 768,
+  "label2id": {
+    "bag": 0,
+    "bottom": 1,
+    "dress": 2,
+    "hat": 3,
+    "outer": 4,
+    "shoes": 5,
+    "top": 6
+  },
+  "layer_norm_eps": 1e-12,
+  "model_type": "yolos",
+  "num_attention_heads": 3,
+  "num_channels": 3,
+  "num_detection_tokens": 100,
+  "num_hidden_layers": 12,
+  "patch_size": 16,
+  "qkv_bias": true,
+  "torch_dtype": "float32",
+  "transformers_version": "4.48.0",
+  "use_mid_position_embeddings": false
+}
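
Since `id2label` and `label2id` are maintained by hand for a fine-tuned head, a quick consistency check can catch a swapped pair before inference. The following is only a sketch: it embeds an excerpt of the two maps from this `config.json` as a string, and the function name `ordered_labels` is hypothetical.

```python
import json

# Excerpt of the label maps from the config.json in this commit.
CONFIG_EXCERPT = """
{
  "id2label": {"0": "bag", "1": "bottom", "2": "dress", "3": "hat",
               "4": "outer", "5": "shoes", "6": "top"},
  "label2id": {"bag": 0, "bottom": 1, "dress": 2, "hat": 3,
               "outer": 4, "shoes": 5, "top": 6}
}
"""


def ordered_labels(config):
    """Return label names sorted by class id, after checking both maps agree."""
    # JSON object keys are always strings, so convert ids back to int.
    id2label = {int(k): v for k, v in config["id2label"].items()}
    label2id = config["label2id"]
    if {v: k for k, v in id2label.items()} != label2id:
        raise ValueError("id2label and label2id are not inverses of each other")
    return [id2label[i] for i in sorted(id2label)]


labels = ordered_labels(json.loads(CONFIG_EXCERPT))
print(labels)
```

In class-id order, `outer` (4) comes before `shoes` (5).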

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b4638099a1ed79c362037b94f1648a5cf1b141feab309796e54f33f0d74c66bc
+size 25914032
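
The three added lines are a Git LFS pointer, not the weights themselves: the repository stores only the spec version, the SHA-256 object id, and the byte size of the real file. A small sketch of reading such a pointer, assuming the simple `key value` layout shown above; `parse_lfs_pointer` is a hypothetical helper name:

```python
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:b4638099a1ed79c362037b94f1648a5cf1b141feab309796e54f33f0d74c66bc
size 25914032
"""


def parse_lfs_pointer(text):
    """Split a Git LFS pointer into its version, hash algorithm, digest and size."""
    # Each line is "<key> <value>"; split only on the first space.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    hash_algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["size"] / 1e6, "MB")
```

The size field puts the checkpoint at roughly 25.9 MB.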

preprocessor_config.json ADDED

	@@ -0,0 +1,26 @@

+{
+  "do_convert_annotations": true,
+  "do_normalize": true,
+  "do_pad": true,
+  "do_rescale": true,
+  "do_resize": true,
+  "format": "coco_detection",
+  "image_mean": [
+    0.485,
+    0.456,
+    0.406
+  ],
+  "image_processor_type": "YolosImageProcessor",
+  "image_std": [
+    0.229,
+    0.224,
+    0.225
+  ],
+  "pad_size": null,
+  "resample": 2,
+  "rescale_factor": 0.00392156862745098,
+  "size": {
+    "longest_edge": 1333,
+    "shortest_edge": 512
+  }
+}
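
With `do_rescale` and `do_normalize` both enabled, each pixel value is first multiplied by `rescale_factor` (which equals 1/255) and then standardized with `image_mean` and `image_std`. The arithmetic for a single RGB pixel can be sketched as below, assuming rescaling is applied before normalization; `normalize_pixel` is a hypothetical helper, not a Transformers function.

```python
import math

# Values copied from the preprocessor_config.json in this commit.
RESCALE_FACTOR = 0.00392156862745098  # 1 / 255
IMAGE_MEAN = (0.485, 0.456, 0.406)
IMAGE_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Map an 8-bit (R, G, B) pixel to the standardized values fed to the model."""
    return tuple(
        (channel * RESCALE_FACTOR - mean) / std
        for channel, mean, std in zip(rgb, IMAGE_MEAN, IMAGE_STD)
    )


black = normalize_pixel((0, 0, 0))
white = normalize_pixel((255, 255, 255))
print(black, white)
```

A black pixel maps to `-mean / std` per channel and a white pixel to `(1 - mean) / std`, so inputs land roughly in the range -2.1 to 2.6.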

sample_image.png ADDED

Git LFS Details

SHA256: f58eb91b8ba3a0a7f92f6a80feaebff0562cdf96a6a97a0fdab10ae3e65351f8
Pointer size: 131 Bytes
Size of remote file: 621 kB