
YOLOv4-tiny

Introduction

YOLO (You Only Look Once) is a series of object detection models designed for fast inference, which makes them well suited for edge devices.

YOLOv4 [2] was released in 2020 and provides many small improvements over YOLOv3 [3]. These improvements add up to create a more precise network at the same speed.

The model regresses bounding boxes (4 coordinates) and a confidence score for each box. The bounding box decoding and non-maximum suppression (NMS) steps are NOT included in the model. See example.py for an example implementation of box decoding and NMS.
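Since decoding and NMS are left to the application, a minimal sketch of both steps is given below. The reference implementation is example.py; the anchor sizes and decode details here are assumptions based on the standard yolov4-tiny COCO configuration and should be checked against example.py.

```python
# Hedged sketch of YOLO box decoding and greedy NMS.
# ANCHORS_13 are the commonly published yolov4-tiny anchors for the
# 13x13 head (an assumption; verify against example.py).
import numpy as np

ANCHORS_13 = [(81, 82), (135, 169), (344, 319)]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_grid(raw, anchors, input_size=416):
    """Decode one raw (S, S, 255) head into boxes, scores, classes."""
    s = raw.shape[0]
    raw = raw.reshape(s, s, 3, 85)  # 3 anchors x (4 box + 1 obj + 80 classes)
    cy, cx = np.meshgrid(np.arange(s), np.arange(s), indexing="ij")
    boxes, scores, classes = [], [], []
    for a, (aw, ah) in enumerate(anchors):
        x = (sigmoid(raw[..., a, 0]) + cx) / s * input_size
        y = (sigmoid(raw[..., a, 1]) + cy) / s * input_size
        w = np.exp(raw[..., a, 2]) * aw
        h = np.exp(raw[..., a, 3]) * ah
        cls_probs = sigmoid(raw[..., a, 5:])
        conf = sigmoid(raw[..., a, 4]) * cls_probs.max(axis=-1)
        boxes.append(np.stack([x - w / 2, y - h / 2,
                               x + w / 2, y + h / 2], axis=-1).reshape(-1, 4))
        scores.append(conf.reshape(-1))
        classes.append(cls_probs.argmax(axis=-1).reshape(-1))
    return np.concatenate(boxes), np.concatenate(scores), np.concatenate(classes)

def nms(boxes, scores, iou_thresh=0.45):
    """Greedy non-maximum suppression; returns indices of kept boxes."""
    order = scores.argsort()[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_o = ((boxes[order[1:], 2] - boxes[order[1:], 0]) *
                  (boxes[order[1:], 3] - boxes[order[1:], 1]))
        iou = inter / (area_i + area_o - inter + 1e-9)
        order = order[1:][iou < iou_thresh]
    return keep
```

Each 13x13 cell thus yields 3 candidate boxes (507 total for that head), which are filtered by confidence and NMS before display.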

Model Information

| Information | Value |
|---|---|
| Input shape | RGB image (416, 416, 3) |
| Input example | (Image source: Public domain) |
| Output shape | Tensors of size (26, 26, 255) and (13, 13, 255) containing bounding box coordinates (not decoded) and class scores for two resolution levels and 3 anchor boxes per cell. More information in example.py. |
| Output example | |
| FLOPs | 6.9G |
| Number of parameters | 6.05M |
| File size (int8) | 5.9M |
| Source framework | DarkNet |
| Target platform | MPUs |

Version and changelog

Initial release of quantized int8 and float32 models.

Tested configurations

The int8 model has been tested on i.MX 8M Plus and i.MX 93 (BSP LF6.1.22_2.0.0) using benchmark_model.

Training and evaluation

The model has been trained and evaluated on the COCO dataset [1], which features 80 classes. According to the model's source, the floating-point model achieves 40 mAP@0.5 IoU on the test set. Using the evaluate.py script, we evaluated the int8 quantized model on the validation set and obtained 33 mAP@0.5 IoU.
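For reference, mAP@0.5 IoU averages the per-class average precision (AP) at an IoU threshold of 0.5 over all 80 classes. The actual evaluation is performed by evaluate.py; the sketch below is an illustrative single-class AP computation that ignores COCO details such as crowd annotations and score thresholds.

```python
# Hedged sketch of per-class average precision at a fixed IoU threshold.
def iou(a, b):
    """IoU of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1]) +
             (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def average_precision(dets, gts, iou_thresh=0.5):
    """AP for one class. dets: [(score, box)], gts: non-empty [box]."""
    dets = sorted(dets, key=lambda d: -d[0])  # highest confidence first
    matched = [False] * len(gts)
    tps = []
    for score, box in dets:
        best, best_i = 0.0, -1
        for i, g in enumerate(gts):
            o = iou(box, g)
            if o > best and not matched[i]:
                best, best_i = o, i
        if best >= iou_thresh:
            matched[best_i] = True  # each ground truth matches at most once
            tps.append(1)
        else:
            tps.append(0)           # false positive
    # area under the precision-recall curve
    ap, tp, prev_recall = 0.0, 0, 0.0
    for k, t in enumerate(tps, start=1):
        tp += t
        recall = tp / len(gts)
        ap += (recall - prev_recall) * (tp / k)
        prev_recall = recall
    return ap
```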

Instructions to re-train the network can be found in the original repository.

Conversion/Quantization

The original model is converted from the DarkNet framework to TensorFlow Lite.

The export_model.py conversion script performs this conversion and outputs both the int8 quantized model and the float32 model. 100 random images from the COCO 2017 validation dataset are used as calibration data for the quantization.
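The calibration step can be sketched with the standard TensorFlow Lite post-training quantization flow. This mirrors what export_model.py does, but the function and parameter names here are illustrative assumptions, not the script's actual code.

```python
# Hedged sketch of int8 post-training quantization with a representative
# dataset; the real conversion is done by export_model.py.
import numpy as np

def representative_dataset(images, num_samples=100):
    """Yield calibration samples in [1, 416, 416, 3] float32 layout."""
    for img in images[:num_samples]:
        # img: HxWx3 uint8 array, e.g. a decoded COCO 2017 validation image
        sample = img.astype(np.float32) / 255.0  # normalize to [0, 1] (assumed)
        yield [sample[np.newaxis, ...]]          # add batch dimension

def quantize(saved_model_dir, calib_images):
    """Convert a SavedModel to a fully int8-quantized TFLite flatbuffer."""
    import tensorflow as tf  # lazy import so the generator is usable alone
    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = lambda: representative_dataset(calib_images)
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8  # int8 input, as in the released model
    # output type left as float32, matching the model described above
    return converter.convert()
```

The converter runs the representative samples through the network to collect activation ranges, which determine the per-tensor quantization parameters.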

Use case and limitations

This model can be used for fast object detection on 416x416 pixel images. It is not the most accurate model, but it is sufficient for many applications. We noticed that the model performs well on large objects but struggles with small objects. This is likely because it has only two output levels, instead of the three found in larger models.

Performance

Here are performance figures evaluated on i.MX 8M Plus and i.MX 93 (BSP LF6.1.22_2.0.0):

| Model | Average latency | Platform | Accelerator | Command |
|---|---|---|---|---|
| Int8 | 908 ms | i.MX 8M Plus | CPU (1 thread) | `/usr/bin/tensorflow-lite-2.10.0/examples/benchmark_model --graph=yolov4-tiny_416_quant.tflite` |
| Int8 | 363 ms | i.MX 8M Plus | CPU (4 threads) | `/usr/bin/tensorflow-lite-2.10.0/examples/benchmark_model --graph=yolov4-tiny_416_quant.tflite --num_threads=4` |
| Int8 | 18.0 ms | i.MX 8M Plus | NPU | `/usr/bin/tensorflow-lite-2.10.0/examples/benchmark_model --graph=yolov4-tiny_416_quant.tflite --external_delegate_path=/usr/lib/libvx_delegate.so` |
| Int8 | 404 ms | i.MX 93 | CPU (1 thread) | `/usr/bin/tensorflow-lite-2.10.0/examples/benchmark_model --graph=yolov4-tiny_416_quant.tflite` |
| Int8 | 299 ms | i.MX 93 | CPU (2 threads) | `/usr/bin/tensorflow-lite-2.10.0/examples/benchmark_model --graph=yolov4-tiny_416_quant.tflite --num_threads=2` |
| Int8 | 21.1 ms | i.MX 93 | NPU | `/usr/bin/tensorflow-lite-2.10.0/examples/benchmark_model --graph=yolov4-tiny_416_quant_vela.tflite --external_delegate_path=/usr/lib/libethosu_delegate.so` |

Download and run

To create the TensorFlow Lite model fully quantized in int8 (with int8 input and float32 output) together with the float32 model, run:

```bash
bash recipe.sh
```

The TensorFlow Lite model file for i.MX 8M Plus and i.MX 93 CPU is yolov4-tiny_416_quant.tflite. The model for the i.MX 93 NPU will be in model_imx93.

The 32-bit floating point model is yolov4-tiny_416_float32.tflite.

An example of how to use the model is in example.py.
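As an illustration of the inference flow, the sketch below loads the quantized model with the TFLite interpreter. example.py is the reference; the letterbox preprocessing and the handling of the input quantization parameters here are assumptions.

```python
# Hedged sketch of running yolov4-tiny_416_quant.tflite with the TFLite
# interpreter; tensor layouts and preprocessing are assumptions.
import numpy as np

def letterbox(img, size=416):
    """Nearest-neighbour resize with grey padding to a size x size canvas."""
    h, w = img.shape[:2]
    scale = size / max(h, w)
    nh, nw = int(h * scale), int(w * scale)
    rows = (np.arange(nh) / scale).astype(int)
    cols = (np.arange(nw) / scale).astype(int)
    resized = img[rows][:, cols]
    canvas = np.full((size, size, 3), 128, dtype=img.dtype)  # grey padding
    top, left = (size - nh) // 2, (size - nw) // 2
    canvas[top:top + nh, left:left + nw] = resized
    return canvas

def run(model_path, img):
    """Run one HxWx3 uint8 image through the model; return the raw heads."""
    import tensorflow as tf  # or the lighter tflite_runtime package
    interp = tf.lite.Interpreter(model_path=model_path)
    interp.allocate_tensors()
    inp = interp.get_input_details()[0]
    # int8 input: quantize the normalized image with the model's parameters
    scale, zero = inp["quantization"]
    x = letterbox(img).astype(np.float32) / 255.0
    x = (x / scale + zero).astype(np.int8)[np.newaxis, ...]
    interp.set_tensor(inp["index"], x)
    interp.invoke()
    # two float32 heads: (1, 26, 26, 255) and (1, 13, 13, 255), undecoded
    return [interp.get_tensor(o["index"]) for o in interp.get_output_details()]
```

The returned tensors still need the decoding and NMS steps described above before they represent usable detections.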

Origin

Model implementation: https://github.com/AlexeyAB/darknet/

[1] Lin, Tsung-Yi, et al. "Microsoft COCO: Common Objects in Context." European Conference on Computer Vision. Springer, Cham, 2014.

[2] Bochkovskiy, Alexey, Chien-Yao Wang, and Hong-Yuan Mark Liao. "YOLOv4: Optimal Speed and Accuracy of Object Detection." arXiv preprint arXiv:2004.10934 (2020).

[3] Redmon, Joseph, and Ali Farhadi. "YOLOv3: An Incremental Improvement." arXiv preprint arXiv:1804.02767 (2018).
