gbahlnxp committed on
Commit 1d6d5bf · verified · 1 Parent(s): 2914546

Upload folder using huggingface_hub

Files changed (9)
  1. .gitattributes +1 -0
  2. README.md +90 -3
  3. coco-labels-2014_2017.txt +80 -0
  4. evaluate.py +97 -0
  5. example.py +216 -0
  6. example_input.jpg +0 -0
  7. example_output.jpg +3 -0
  8. export_model.py +451 -0
  9. recipe.sh +35 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
+ example_output.jpg filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,90 @@
- ---
- license: apache-2.0
- ---

# YOLOv4-tiny

## Introduction

YOLO (You Only Look Once) is a series of object detection models designed for fast inference, which makes them well suited for edge devices.

YOLOv4 [2] was released in 2020 and provides many small improvements over YOLOv3 [3]. These improvements add up to a more precise network at the same speed.

The model regresses bounding boxes (4 coordinates) and a confidence score for each box. The bounding box decoding and non-maximum suppression (NMS) steps are NOT included in the model.
Please look at `example.py` for an example implementation of box decoding and NMS.

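For reference, the snippet below is a minimal sketch of that flow, reusing the helper functions (and the TFLite interpreter) that `example.py` sets up at import time; run it from the repository root so the model and label files are found:

```python
# Minimal sketch: example.py builds the interpreter and loads the labels when imported.
from example import interpreter, load_image, run_inference, decode_output

orig_image, image = load_image("example_input.jpg")  # original BGR image + normalized 416x416 RGB input
raw_outputs = run_inference(interpreter, image)       # two undecoded output grids
scores, boxes, classes = decode_output(raw_outputs)   # box decoding + NMS

# boxes are in the 416x416 inference referential; example.py shows how to
# rescale them to the original image size before drawing.
print(scores.numpy(), boxes.numpy(), classes.numpy())
```
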
## Model Information

Information | Value
--- | ---
Input shape | RGB image (416, 416, 3)
Input example | <img src="example_input.jpg" width=320px> ([Image source](https://commons.wikimedia.org/wiki/File:Moscow_bus_151872_2022-05.jpg), Public domain)
Output shape | Tensors of size (26, 26, 255) and (13, 13, 255) containing bounding box coordinates (not decoded) and class scores for two resolution levels and 3 anchor boxes per cell. More information in `example.py`.
Output example | <img src="example_output.jpg" width=320px>
FLOPs | 6.9G
Number of parameters | 6.05M
File size (int8) | 5.9 MB
Source framework | DarkNet
Target platform | MPUs

## Version and changelog

Initial release of quantized int8 and float32 models.

## Tested configurations

The int8 model has been tested on i.MX 8M Plus and i.MX 93 (BSP LF6.1.22_2.0.0) using benchmark_model.

## Training and evaluation

The model has been trained and evaluated on the [COCO dataset](https://cocodataset.org/) [1], which features 80 classes.
The floating point model achieved a score of 40 mAP@0.5IoU on the test set, according to [the source of the model](https://github.com/AlexeyAB/darknet/).
Using the `evaluate.py` script, we evaluate the int8 quantized model on the validation set and obtain 33 mAP@0.5IoU.

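For reference, the final scoring step of that evaluation is sketched below, assuming `evaluate.py` has already run and written its detections to `predictions.json` (the script also downloads COCO val2017 and its annotations into `coco/`):

```python
# Scoring sketch: assumes predictions.json was produced by evaluate.py and the
# COCO 2017 validation annotations were extracted to coco/annotations/.
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

annotations = COCO("coco/annotations/instances_val2017.json")
detections = annotations.loadRes("predictions.json")

cocoeval = COCOeval(annotations, detections, "bbox")
cocoeval.evaluate()
cocoeval.accumulate()
cocoeval.summarize()  # prints AP@[.50:.95], AP@0.5 (the figure quoted above), AR, ...
```
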
Instructions to re-train the network can be found [in the original repository](https://github.com/AlexeyAB/darknet/).

## Conversion/Quantization

The original model is converted from the DarkNet framework to TensorFlow Lite.

The `export_model.py` conversion script performs this conversion and outputs both the int8 quantized model and the float32 model.
100 random images from the COCO 2017 validation dataset are used for calibration during quantization.

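The core of that quantization step looks roughly like the sketch below, a simplified excerpt of what `export_model.py` does (`model` is the Keras reconstruction of YOLOv4-tiny and `representative_dataset` is a generator yielding the 100 preprocessed calibration images):

```python
import tensorflow as tf

# Post-training int8 quantization with a representative dataset
# (simplified from export_model.py).
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset  # yields [1, 416, 416, 3] float32 arrays
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8      # int8 input
converter.inference_output_type = tf.float32  # float32 output

with open("yolov4-tiny_416_quant.tflite", "wb") as f:
    f.write(converter.convert())
```
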
## Use case and limitations

This model can be used for fast object detection on 416x416 pixel images.
It is not the most accurate model, but it is accurate enough for many applications.
We noticed that the model performs well on large objects but has issues with small objects,
probably because it only features two output levels instead of the three found in larger models.

## Performance

Here are performance figures evaluated on i.MX 8M Plus and i.MX 93 (BSP LF6.1.22_2.0.0):

Model | Average latency | Platform | Accelerator | Command
--- | --- | --- | --- | ---
Int8 | 908ms | i.MX 8M Plus | CPU (1 thread) | /usr/bin/tensorflow-lite-2.10.0/examples/benchmark_model --graph=yolov4-tiny_416_quant.tflite
Int8 | 363ms | i.MX 8M Plus | CPU (4 threads) | /usr/bin/tensorflow-lite-2.10.0/examples/benchmark_model --graph=yolov4-tiny_416_quant.tflite --num_threads=4
Int8 | 18.0ms | i.MX 8M Plus | NPU | /usr/bin/tensorflow-lite-2.10.0/examples/benchmark_model --graph=yolov4-tiny_416_quant.tflite --external_delegate_path=/usr/lib/libvx_delegate.so
Int8 | 404ms | i.MX 93 | CPU (1 thread) | /usr/bin/tensorflow-lite-2.10.0/examples/benchmark_model --graph=yolov4-tiny_416_quant.tflite
Int8 | 299ms | i.MX 93 | CPU (2 threads) | /usr/bin/tensorflow-lite-2.10.0/examples/benchmark_model --graph=yolov4-tiny_416_quant.tflite --num_threads=2
Int8 | 21.1ms | i.MX 93 | NPU | /usr/bin/tensorflow-lite-2.10.0/examples/benchmark_model --graph=yolov4-tiny_416_quant_vela.tflite --external_delegate_path=/usr/lib/libethosu_delegate.so

## Download and run

To create the TensorFlow Lite model fully quantized in int8 (with int8 input and float32 output), as well as the float32 model, run:

    bash recipe.sh

The TensorFlow Lite model file for the i.MX 8M Plus and i.MX 93 CPU is `yolov4-tiny_416_quant.tflite`. The model for the i.MX 93 NPU will be in `model_imx93`.

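If you run the quantized model from Python on the target rather than through `benchmark_model`, the NPU is reached through the same external delegates listed in the Performance table. A minimal sketch, assuming the BSP's TensorFlow Lite Python bindings are available (the `tflite_runtime` package exposes the same `Interpreter`/`load_delegate` API) and using the delegate paths from the benchmark commands:

```python
import tensorflow as tf

# Sketch only: /usr/lib/libvx_delegate.so is the i.MX 8M Plus NPU delegate used
# in the benchmark commands above; on i.MX 93 use /usr/lib/libethosu_delegate.so
# together with the vela-compiled model from model_imx93/.
delegate = tf.lite.experimental.load_delegate("/usr/lib/libvx_delegate.so")
interpreter = tf.lite.Interpreter(model_path="yolov4-tiny_416_quant.tflite",
                                  experimental_delegates=[delegate])
interpreter.allocate_tensors()
```
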
The 32-bit floating point model is `yolov4-tiny_416_float32.tflite`.

An example of how to use the model is in `example.py`.

## Origin

Model implementation: https://github.com/AlexeyAB/darknet/

[1] Lin, Tsung-Yi, et al. "Microsoft COCO: Common objects in context." European Conference on Computer Vision. Springer, Cham, 2014.

[2] Bochkovskiy, Alexey, Chien-Yao Wang, and Hong-Yuan Mark Liao. "YOLOv4: Optimal speed and accuracy of object detection." arXiv preprint arXiv:2004.10934 (2020).

[3] Redmon, Joseph, and Ali Farhadi. "YOLOv3: An incremental improvement." arXiv preprint arXiv:1804.02767 (2018).
coco-labels-2014_2017.txt ADDED
@@ -0,0 +1,80 @@
person
bicycle
car
motorcycle
airplane
bus
train
truck
boat
traffic light
fire hydrant
stop sign
parking meter
bench
bird
cat
dog
horse
sheep
cow
elephant
bear
zebra
giraffe
backpack
umbrella
handbag
tie
suitcase
frisbee
skis
snowboard
sports ball
kite
baseball bat
baseball glove
skateboard
surfboard
tennis racket
bottle
wine glass
cup
fork
knife
spoon
bowl
banana
apple
sandwich
orange
broccoli
carrot
hot dog
pizza
donut
cake
chair
couch
potted plant
bed
dining table
toilet
tv
laptop
mouse
remote
keyboard
cell phone
microwave
oven
toaster
sink
refrigerator
book
clock
vase
scissors
teddy bear
hair drier
toothbrush
evaluate.py ADDED
@@ -0,0 +1,97 @@
#!/usr/bin/env python3
# Copyright 2023-2024 NXP
# SPDX-License-Identifier: MIT

import wget
import zipfile
import json
import glob
import os
import tensorflow as tf
import numpy as np
from tqdm import tqdm
from pycocotools.cocoeval import COCOeval
from pycocotools.coco import COCO
from example import load_image
from example import decode_output, run_inference, gen_box_colors

OBJECT_DETECTOR_TFLITE = "yolov4-tiny_416_quant.tflite"
SCORE_THRESHOLD = 0.0
NMS_IOU_THRESHOLD = 0.5
INFERENCE_IMG_SIZE = 416

# Maps the contiguous 80-class index predicted by the model to the original
# (non-contiguous) COCO category ids expected by pycocotools.
LABEL_MAP = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 19, 20,
             21, 22, 23, 24, 25, 27, 28, 31, 32, 33, 34, 35, 36, 37, 38, 39,
             40, 41, 42, 43, 44, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56,
             57, 58, 59, 60, 61, 62, 63, 64, 65, 67, 70, 72, 73, 74, 75, 76,
             77, 78, 79, 80, 81, 82, 84, 85, 86, 87, 88, 89, 90]

COCO_WEBSITE = "http://images.cocodataset.org"
VAL_IMG_URL = COCO_WEBSITE + "/zips/val2017.zip"
VAL_ANNO_URL = COCO_WEBSITE + "/annotations/annotations_trainval2017.zip"

BOX_COLORS = gen_box_colors()

print("Downloading COCO validation dataset...")
response = wget.download(VAL_IMG_URL, "val2017.zip")
response = wget.download(VAL_ANNO_URL, "annotations_trainval2017.zip")

with zipfile.ZipFile("val2017.zip", 'r') as zip_ref:
    zip_ref.extractall("coco")

with zipfile.ZipFile("annotations_trainval2017.zip", 'r') as zip_ref:
    zip_ref.extractall("coco")


interpreter = tf.lite.Interpreter(OBJECT_DETECTOR_TFLITE)
interpreter.allocate_tensors()

annotations = COCO(annotation_file="coco/annotations/instances_val2017.json")


def evaluate(interpreter):
    image_filenames = glob.glob("coco/val2017/*")

    results = []

    for image_fn in tqdm(image_filenames, desc="Evaluating"):

        image_id = int(os.path.splitext(os.path.basename(image_fn))[0])

        orig_image, img = load_image(image_fn)
        yolo_output = run_inference(interpreter, img)
        scores, boxes, classes = decode_output(yolo_output,
                                               SCORE_THRESHOLD,
                                               NMS_IOU_THRESHOLD)

        # Rescale boxes from the 416x416 inference referential
        # to the original image size.
        shp = orig_image.shape
        boxes = boxes.numpy()
        boxes /= INFERENCE_IMG_SIZE
        boxes *= np.array([shp[1], shp[0], shp[1], shp[0]])

        boxes = boxes.astype(np.int32)

        # Convert (x1, y1, x2, y2) corners to the COCO (x, y, width, height) format.
        boxes[..., 2] = boxes[..., 2] - boxes[..., 0]
        boxes[..., 3] = boxes[..., 3] - boxes[..., 1]

        for score, box, clas in zip(scores.numpy(), boxes, classes.numpy()):
            results.append({"image_id": image_id,
                            "category_id": int(LABEL_MAP[clas]),
                            "bbox": [float(x) for x in list(box)],
                            "score": float(score)})

    return results


predictions = evaluate(interpreter)

with open("predictions.json", "w") as f:
    json.dump(predictions, f, indent=4)

predictions = annotations.loadRes("predictions.json")

cocoeval = COCOeval(annotations, predictions, "bbox")

cocoeval.evaluate()
cocoeval.accumulate()
cocoeval.summarize()
example.py ADDED
@@ -0,0 +1,216 @@
1
+ #!/usr/bin/env python3
2
+ # Copyright 2023-2024 NXP
3
+ # SPDX-License-Identifier: MIT
4
+
5
+ import cv2
6
+ import tensorflow as tf
7
+ import numpy as np
8
+ import time
9
+ import random
10
+
11
+ random.seed(42)
12
+
13
+ OBJECT_DETECTOR_TFLITE = 'yolov4-tiny_416_quant.tflite'
14
+ LABELS_FILE = 'coco-labels-2014_2017.txt'
15
+ IMAGE_FILENAME = 'example_input.jpg'
16
+
17
+ SCORE_THRESHOLD = 0.20
18
+ NMS_IOU_THRESHOLD = 0.5
19
+ INFERENCE_IMG_SIZE = 416
20
+ MAX_DETS = 100
21
+
22
+ ANCHORS = [[[81, 82], [135, 169], [344, 319]], [[23, 27], [37, 58], [81, 82]]]
23
+ SIGMOID_FACTOR = [1.05, 1.05]
24
+ NUM_ANCHORS = 3
25
+ STRIDES = [32, 16]
26
+ GRID_SIZES = [int(INFERENCE_IMG_SIZE / s) for s in STRIDES]
27
+
28
+ with open(LABELS_FILE, 'r') as f:
29
+ COCO_CLASSES = [line.strip() for line in f.readlines()]
30
+
31
+ interpreter = tf.lite.Interpreter(OBJECT_DETECTOR_TFLITE)
32
+ interpreter.allocate_tensors()
33
+
34
+
35
+ def gen_box_colors():
36
+ colors = []
37
+ for _ in range(len(COCO_CLASSES)):
38
+ r = random.randint(100, 255)
39
+ g = random.randint(100, 255)
40
+ b = random.randint(100, 255)
41
+ colors.append((r, g, b))
42
+
43
+ return colors
44
+
45
+
46
+ BOX_COLORS = gen_box_colors()
47
+
48
+
49
+ def load_image(filename):
50
+ orig_image = cv2.imread(filename, 1)
51
+ image = cv2.cvtColor(orig_image, cv2.COLOR_BGR2RGB)
52
+ image = cv2.resize(image, (INFERENCE_IMG_SIZE, INFERENCE_IMG_SIZE))
53
+ image = np.expand_dims(image, axis=0)
54
+ image = image / 255.0
55
+ return orig_image, image
56
+
57
+
58
+ def np_sigmoid(x):
59
+ return 1 / (1 + np.exp(-x))
60
+
61
+
62
+ def reciprocal_sigmoid(x):
63
+ return -np.log(1 / x - 1)
64
+
65
+
66
+ def decode_boxes_prediction(yolo_output):
67
+ # Each output level represents a grid of predictions.
68
+ # The first output level is a 26x26 grid and the second 13x13.
69
+ # Each cell of each grid is assigned to 3 anchor bounding boxes.
70
+ # The bounding box predictions are regressed
71
+ # relatively to these anchor boxes.
72
+ # Thus, the model predicts 3 bounding boxes per cell per output level.
73
+ # The output is structured as follows:
74
+ # For each cell [[x, y, w, h, conf, cl_0, cl_1, ..., cl_79], # anchor 1
75
+ # [x, y, w, h, conf, cl_0, cl_1, ..., cl_79], # anchor 2
76
+ # [x, y, w, h, conf, cl_0, cl_1, ..., cl_79]] # anchor 3
77
+ # Hence, we have 85 values per anchor box, and thus 255 values per cell.
78
+ # The decoding of the output bounding boxes is described in Figure 2 of
79
+ # the YOLOv3 paper https://arxiv.org/pdf/1804.02767.pdf;
80
+
81
+ boxes_list = []
82
+ scores_list = []
83
+ classes_list = []
84
+
85
+ for idx, feats in enumerate(yolo_output):
86
+
87
+ features = np.reshape(feats, (NUM_ANCHORS * GRID_SIZES[idx] ** 2, 85))
88
+
89
+ anchor = np.array(ANCHORS[idx])
90
+ factor = SIGMOID_FACTOR[idx]
91
+ grid_size = GRID_SIZES[idx]
92
+ stride = STRIDES[idx]
93
+
94
+ cell_confidence = features[..., 4]
95
+ logit_threshold = reciprocal_sigmoid(SCORE_THRESHOLD)
96
+ over_threshold_list = np.where(cell_confidence > logit_threshold)
97
+
98
+ if over_threshold_list[0].size > 0:
99
+ indices = np.array(over_threshold_list[0])
100
+
101
+ box_positions = np.floor_divide(indices, 3)
102
+
103
+ list_xy = np.array(np.divmod(box_positions, grid_size)).T
104
+ list_xy = list_xy[..., ::-1]
105
+ boxes_xy = np.reshape(list_xy, (int(list_xy.size / 2), 2))
106
+
107
+ outxy = features[indices, :2]
108
+
109
+ # boxes center coordinates
110
+ centers = np_sigmoid(outxy * factor) - 0.5 * (factor - 1)
111
+ centers += boxes_xy
112
+ centers *= stride
113
+
114
+ # boxes width and height
115
+ width_height = np.exp(features[indices, 2:4])
116
+ width_height *= anchor[np.divmod(indices, NUM_ANCHORS)[1]]
117
+
118
+ boxes_list.append(np.stack([centers[:, 0] - width_height[:, 0]/2,
119
+ centers[:, 1] - width_height[:, 1]/2,
120
+ centers[:, 0] + width_height[:, 0]/2,
121
+ centers[:, 1] + width_height[:, 1]/2],
122
+ axis=1))
123
+
124
+ # confidence that cell contains an object
125
+ scores_list.append(np_sigmoid(features[indices, 4:5]))
126
+
127
+ # class with the highest probability in this cell
128
+ classes_list.append(np.argmax(features[indices, 5:], axis=1))
129
+
130
+ if len(boxes_list) > 0:
131
+ boxes = np.concatenate(boxes_list, axis=0)
132
+ scores = np.concatenate(scores_list, axis=0)[:, 0]
133
+ classes = np.concatenate(classes_list, axis=0)
134
+
135
+ return boxes, scores, classes
136
+ else:
137
+ return np.zeros((0, 4)), np.zeros((0)), np.zeros((0))
138
+
139
+
140
+ def decode_output(yolo_outputs,
141
+ score_threshold=SCORE_THRESHOLD,
142
+ iou_threshold=NMS_IOU_THRESHOLD):
143
+ '''
144
+ Decode output from YOLOv4 tiny in inference size referential (416x416)
145
+ '''
146
+ boxes, scores, classes = decode_boxes_prediction(yolo_outputs)
147
+
148
+ # apply NMS from tensorflow
149
+ inds = tf.image.non_max_suppression(boxes, scores, MAX_DETS,
150
+ score_threshold=score_threshold,
151
+ iou_threshold=iou_threshold)
152
+
153
+ # keep only selected boxes
154
+ boxes = tf.gather(boxes, inds)
155
+ scores = tf.gather(scores, inds)
156
+ classes = tf.gather(classes, inds)
157
+
158
+ return scores, boxes, classes
159
+
160
+
161
+ def run_inference(interpreter, image, threshold=SCORE_THRESHOLD):
162
+
163
+ input_details = interpreter.get_input_details()
164
+ output_details = interpreter.get_output_details()
165
+ input_scale, input_zero_point = input_details[0]["quantization"]
166
+ image = image / input_scale + input_zero_point
167
+ image = image.astype(np.int8)
168
+
169
+ interpreter.set_tensor(input_details[0]['index'], image)
170
+ interpreter.invoke()
171
+
172
+ boxes = interpreter.get_tensor(output_details[0]['index'])
173
+ boxes2 = interpreter.get_tensor(output_details[1]['index'])
174
+
175
+ return [boxes, boxes2]
176
+
177
+
178
+ if __name__ == "__main__":
179
+
180
+ orig_image, processed_image = load_image(IMAGE_FILENAME)
181
+
182
+ start = time.time()
183
+ yolo_output = run_inference(interpreter, processed_image)
184
+ end = time.time()
185
+
186
+ scores, boxes, classes = decode_output(yolo_output)
187
+
188
+ # rescale boxes for display
189
+ shp = orig_image.shape
190
+ boxes = boxes.numpy()
191
+ boxes /= INFERENCE_IMG_SIZE
192
+ boxes *= np.array([shp[1], shp[0], shp[1], shp[0]])
193
+
194
+ boxes = boxes.astype(np.int32)
195
+
196
+ print("Inference time", end - start, "ms")
197
+ print("Detected", boxes.shape[0], "object(s)")
198
+ print("Box coordinates:")
199
+
200
+ for i in range(boxes.shape[0]):
201
+ box = boxes[i, :]
202
+ print(box, end=" ")
203
+ class_name = COCO_CLASSES[classes[i].numpy()]
204
+ score = scores[i].numpy()
205
+ color = BOX_COLORS[classes[i]]
206
+ print("class", class_name, end=" ")
207
+ print("score", score)
208
+ cv2.rectangle(orig_image, (box[0], box[1]), (box[2], box[3]),
209
+ color, 3)
210
+ cv2.putText(orig_image, f"{class_name} {score:.2f}",
211
+ (box[0], box[1] - 10),
212
+ cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)
213
+
214
+ cv2.imwrite('example_output.jpg', orig_image)
215
+ cv2.imshow('', orig_image)
216
+ cv2.waitKey()
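To make the decoding described in the comment block of `decode_boxes_prediction` above concrete, the per-box transform it applies can be written for a single anchor as follows (a sketch with scalar values; `tx, ty, tw, th` are the raw network outputs, `cx, cy` the cell column/row, `pw, ph` the anchor size in pixels, `stride` the grid stride and `factor` the SIGMOID_FACTOR used above):

```python
import numpy as np

def decode_single_box(tx, ty, tw, th, cx, cy, pw, ph, stride, factor=1.05):
    # Center: scaled sigmoid of the raw offsets (as in decode_boxes_prediction),
    # shifted by the cell index and scaled by the grid stride.
    sig = lambda v: 1.0 / (1.0 + np.exp(-v))
    bx = (sig(tx * factor) - 0.5 * (factor - 1) + cx) * stride
    by = (sig(ty * factor) - 0.5 * (factor - 1) + cy) * stride
    # Size: exponential of the raw outputs times the anchor dimensions.
    bw = np.exp(tw) * pw
    bh = np.exp(th) * ph
    # Return corner coordinates (x1, y1, x2, y2) in the 416x416 referential.
    return bx - bw / 2, by - bh / 2, bx + bw / 2, by + bh / 2
```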
example_input.jpg ADDED
example_output.jpg ADDED

Git LFS Details

  • SHA256: 351d09a268a84019fa6afa014437f188e4824d2df5ba03159f391587e268f851
  • Pointer size: 131 Bytes
  • Size of remote file: 126 kB
export_model.py ADDED
@@ -0,0 +1,451 @@
1
+ # Copyright 2023-2024 NXP
2
+ # SPDX-License-Identifier: BSD-3-Clause
3
+
4
+ import os
5
+ import numpy as np
6
+ import struct
7
+ import tensorflow as tf
8
+ from tensorflow.keras.layers import Conv2D, Input, LeakyReLU
9
+ from tensorflow.keras.layers import ZeroPadding2D, UpSampling2D
10
+ from tensorflow.keras.layers import MaxPool2D, add, concatenate
11
+ from tensorflow.keras.models import Model
12
+ import argparse
13
+ import PIL.Image as im
14
+ import random
15
+
16
+ random.seed(42)
17
+
18
+ N_CALIBRATION_IMAGES = 100
19
+
20
+ parser = argparse.ArgumentParser()
21
+ parser.add_argument('--weights_path', help='path to darknet weights')
22
+ parser.add_argument('--output_path', help='path to save tflite model')
23
+ parser.add_argument('--images_path',
24
+ help='path to representative images for quantization',
25
+ default=None)
26
+ args = parser.parse_args()
27
+
28
+
29
+ def _conv_block(inp, convs, skip=False):
30
+ x = inp
31
+ count = 0
32
+
33
+ for conv in convs:
34
+ if count == (len(convs) - 2) and skip:
35
+ skip_connection = x
36
+ count += 1
37
+
38
+ if conv['stride'] > 1:
39
+ x = ZeroPadding2D(((1, 0), (1, 0)),
40
+ name='zerop_' + str(conv['layer_idx']))(
41
+ x) # peculiar padding as darknet prefer left and top
42
+
43
+ x = Conv2D(conv['filter'],
44
+ conv['kernel'],
45
+ strides=conv['stride'],
46
+ # peculiar padding as darknet prefer left and top
47
+ padding='valid' if conv['stride'] > 1 else 'same',
48
+ name='convn_' + str(conv['layer_idx']) \
49
+ if conv['bnorm'] else 'conv_' + str(conv['layer_idx']),
50
+ activation=None,
51
+ use_bias=True)(x)
52
+
53
+ if conv['activ'] == 1:
54
+ x = LeakyReLU(alpha=0.1, name='leaky_' + str(conv['layer_idx']))(x)
55
+
56
+ return add([skip_connection, x],
57
+ name='add_' + str(conv['layer_idx'] + 1)) if skip else x
58
+
59
+
60
+ def _split_block(input_layer, layer_idx):
61
+ s = tf.split(input_layer,
62
+ num_or_size_splits=2,
63
+ axis=-1,
64
+ name=f"split_{layer_idx}")
65
+ return s[1]
66
+
67
+
68
+ def make_yolov4_tiny_model():
69
+ input_image = Input(shape=(416, 416, 3),
70
+ batch_size=1,
71
+ name='input_0')
72
+ # Layer 0
73
+ x = _conv_block(input_image, [{'filter': 32,
74
+ 'kernel': 3,
75
+ 'stride': 2,
76
+ 'bnorm': True,
77
+ 'activ': 1,
78
+ 'layer_idx': 0}])
79
+ layer_0 = x
80
+ # Layer 1
81
+ x = _conv_block(x, [{'filter': 64,
82
+ 'kernel': 3,
83
+ 'stride': 2,
84
+ 'bnorm': True,
85
+ 'activ': 1,
86
+ 'layer_idx': 1}])
87
+ layer_1 = x
88
+ # Layer 2, concat1
89
+ x = _conv_block(x, [{'filter': 64,
90
+ 'kernel': 3,
91
+ 'stride': 1,
92
+ 'bnorm': True,
93
+ 'activ': 1,
94
+ 'layer_idx': 2}])
95
+ layer_2 = x
96
+ # Layer 3, route group
97
+ x = _split_block(x, layer_idx=3)
98
+ # Layer 4, concat_route_1
99
+ x = _conv_block(x, [{'filter': 32,
100
+ 'kernel': 3,
101
+ 'stride': 1,
102
+ 'bnorm': True,
103
+ 'activ': 1,
104
+ 'layer_idx': 4}])
105
+ layer_4 = x
106
+ # Layer 5, concat_route_2
107
+ x = _conv_block(x, [{'filter': 32,
108
+ 'kernel': 3,
109
+ 'stride': 1,
110
+ 'bnorm': True,
111
+ 'activ': 1,
112
+ 'layer_idx': 5}])
113
+ layer_5 = x
114
+ # Layer 6, concat route
115
+ x = concatenate([layer_5, layer_4], axis=-1, name='concat_6')
116
+ # Layer 7, concat2
117
+ x = _conv_block(x, [{'filter': 64,
118
+ 'kernel': 1,
119
+ 'stride': 1,
120
+ 'bnorm': True,
121
+ 'activ': 1,
122
+ 'layer_idx': 7}])
123
+ layer_7 = x
124
+ # Layer 8, concat
125
+ x = concatenate([layer_2, layer_7], axis=-1, name='concat_8')
126
+ # Layer 9
127
+ x = MaxPool2D(pool_size=(2, 2), padding='same', name='layer_9')(x)
128
+
129
+ # Layer 10, concat 1
130
+ x = _conv_block(x, [{'filter': 128,
131
+ 'kernel': 3,
132
+ 'stride': 1,
133
+ 'bnorm': True,
134
+ 'activ': 1,
135
+ 'layer_idx': 10}])
136
+ layer_10 = x
137
+ # Layer 11
138
+ x = _split_block(x, layer_idx=11)
139
+ # Layer 12, concat route 1
140
+ x = _conv_block(x, [{'filter': 64,
141
+ 'kernel': 3,
142
+ 'stride': 1,
143
+ 'bnorm': True,
144
+ 'activ': 1,
145
+ 'layer_idx': 12}])
146
+ layer_12 = x
147
+ # Layer 13, concat route 2
148
+ x = _conv_block(x, [{'filter': 64,
149
+ 'kernel': 3,
150
+ 'stride': 1,
151
+ 'bnorm': True,
152
+ 'activ': 1,
153
+ 'layer_idx': 13}])
154
+ layer_13 = x
155
+ # Layer 14
156
+ x = concatenate([layer_13, layer_12], axis=-1, name='concat_14')
157
+ # Layer 15, concat 2
158
+ x = _conv_block(x, [{'filter': 128,
159
+ 'kernel': 1,
160
+ 'stride': 1,
161
+ 'bnorm': True,
162
+ 'activ': 1,
163
+ 'layer_idx': 15}])
164
+ layer_15 = x
165
+ # Layer 16
166
+ x = concatenate([layer_10, layer_15], axis=-1, name='concat_16')
167
+ # Layer 17
168
+ x = MaxPool2D(pool_size=(2, 2), padding='same', name='layer_17')(x)
169
+
170
+ # Layer 18, concat 1
171
+ x = _conv_block(x, [{'filter': 256,
172
+ 'kernel': 3,
173
+ 'stride': 1,
174
+ 'bnorm': True,
175
+ 'activ': 1,
176
+ 'layer_idx': 18}])
177
+ layer_18 = x
178
+ # Layer 19
179
+ x = _split_block(x, layer_idx=19)
180
+ # Layer 20, concat route 1
181
+ x = _conv_block(x, [{'filter': 128,
182
+ 'kernel': 3,
183
+ 'stride': 1,
184
+ 'bnorm': True,
185
+ 'activ': 1,
186
+ 'layer_idx': 20}])
187
+ layer_20 = x
188
+ # Layer 21, concat route 2
189
+ x = _conv_block(x, [{'filter': 128,
190
+ 'kernel': 3,
191
+ 'stride': 1,
192
+ 'bnorm': True,
193
+ 'activ': 1,
194
+ 'layer_idx': 21}])
195
+ layer_21 = x
196
+ # Layer 22
197
+ x = concatenate([layer_21, layer_20], axis=-1, name='concat_22')
198
+ # Layer 23, concat 2, output 1 of cspdarknet
199
+ x = _conv_block(x, [{'filter': 256,
200
+ 'kernel': 1,
201
+ 'stride': 1,
202
+ 'bnorm': True,
203
+ 'activ': 1,
204
+ 'layer_idx': 23}])
205
+ layer_23 = x
206
+ # Layer 24
207
+ x = concatenate([layer_18, layer_23], axis=-1, name='concat_24')
208
+ # Layer 25
209
+ x = MaxPool2D(pool_size=(2, 2), padding='same', name='layer_25')(x)
210
+
211
+ # Layer 26, output 2 of cspdarknet
212
+ x = _conv_block(x, [{'filter': 512,
213
+ 'kernel': 3,
214
+ 'stride': 1,
215
+ 'bnorm': True,
216
+ 'activ': 1,
217
+ 'layer_idx': 26}])
218
+ layer_26 = x
219
+
220
+ # After backbone
221
+ # Layer 27, concat 1, branch 1
222
+ x = _conv_block(layer_26, [{'filter': 256,
223
+ 'kernel': 1,
224
+ 'stride': 1,
225
+ 'bnorm': True,
226
+ 'activ': 1,
227
+ 'layer_idx': 27}])
228
+ layer_27 = x
229
+
230
+ # Layer 28
231
+ x = _conv_block(x, [{'filter': 512,
232
+ 'kernel': 3,
233
+ 'stride': 1,
234
+ 'bnorm': True,
235
+ 'activ': 1,
236
+ 'layer_idx': 28}])
237
+ layer_28 = x
238
+ # Layer 29, output of large grid
239
+ x = _conv_block(x, [{'filter': 255,
240
+ 'kernel': 1,
241
+ 'stride': 1,
242
+ 'bnorm': True,
243
+ 'activ': 0,
244
+ 'layer_idx': 29}])
245
+ layer_29 = x
246
+
247
+ # Layer 30, continue from layer_27
248
+ x = _conv_block(layer_27, [{'filter': 128,
249
+ 'kernel': 1,
250
+ 'stride': 1,
251
+ 'bnorm': True,
252
+ 'activ': 1,
253
+ 'layer_idx': 30}])
254
+ layer_30 = x
255
+ # Layer 31
256
+ x = UpSampling2D(size=(2, 2),
257
+ name='upsamp_31',
258
+ interpolation='bilinear')(x)
259
+ layer_31 = x
260
+ # Layer 32
261
+ x = concatenate([layer_31, layer_23], axis=-1, name='concat_32')
262
+ # Layer 33
263
+ x = _conv_block(x, [{'filter': 256,
264
+ 'kernel': 3,
265
+ 'stride': 1,
266
+ 'bnorm': True,
267
+ 'activ': 1,
268
+ 'layer_idx': 33}])
269
+ # Layer 34, output of medium grid
270
+ x = _conv_block(x, [{'filter': 255,
271
+ 'kernel': 1,
272
+ 'stride': 1,
273
+ 'bnorm': True,
274
+ 'activ': 0,
275
+ 'layer_idx': 34}])
276
+ layer_34 = x
277
+
278
+ # End
279
+ model = Model(input_image, [layer_34, layer_29], name='Yolov4-tiny')
280
+ model.summary()
281
+ return model
282
+
283
+
284
+ # Define the model
285
+ model = make_yolov4_tiny_model()
286
+
287
+ model.summary()
288
+
289
+
290
+ # load weights in keras
291
+
292
+ class WeightReader:
293
+ def __init__(self, weight_file):
294
+ with open(weight_file, 'rb') as w_f:
295
+ major, = struct.unpack('i', w_f.read(4))
296
+ minor, = struct.unpack('i', w_f.read(4))
297
+ revision, = struct.unpack('i', w_f.read(4))
298
+
299
+ if (major * 10 + minor) >= 2 and major < 1000 and minor < 1000:
300
+ print("reading 64 bytes")
301
+ w_f.read(8)
302
+ else:
303
+ print("reading 32 bytes")
304
+ w_f.read(4)
305
+
306
+ transpose = (major > 1000) or (minor > 1000)
307
+
308
+ binary = w_f.read()
309
+
310
+ self.offset = 0
311
+ self.all_weights = np.frombuffer(binary, dtype='float32')
312
+ print(f"weight total length {len(self.all_weights)}")
313
+
314
+ def read_bytes(self, size):
315
+ self.offset = self.offset + size
316
+ return self.all_weights[self.offset - size:self.offset]
317
+
318
+ def load_weights(self, model):
319
+ count = 0
320
+ ncount = 0
321
+ for i in range(35):
322
+ try:
323
+
324
+ conv_layer = model.get_layer('convn_' + str(i))
325
+
326
+ filter = conv_layer.kernel.shape[-1]
327
+ # kernel*kernel*c*filter
328
+ nweights = np.prod(conv_layer.kernel.shape)
329
+
330
+ print(f"loading weights of convolution #" +
331
+ str(i) + "- nb parameters: " +
332
+ str(nweights + filter))
333
+
334
+ if i in [29, 34]:
335
+ bias = self.read_bytes(filter) # bias
336
+ weights = self.read_bytes(nweights) # weights
337
+
338
+ else:
339
+ bias = self.read_bytes(filter) # bias
340
+ scale = self.read_bytes(filter) # scale
341
+ mean = self.read_bytes(filter) # mean
342
+ var = self.read_bytes(filter) # variance
343
+ weights = self.read_bytes(nweights) # weights
344
+
345
+ # normalize bias
346
+ bias = bias - scale * mean / (np.sqrt(var + 0.00001))
347
+
348
+ # normalize weights
349
+ weights = np.reshape(weights,
350
+ (filter, int(nweights / filter)))
351
+ A = scale / (np.sqrt(var + 0.00001))
352
+ A = np.expand_dims(A, axis=0)
353
+ weights = weights * A.T
354
+ weights = np.reshape(weights, (nweights))
355
+
356
+ shp = list(reversed(conv_layer.get_weights()[0].shape))
357
+ weights = weights.reshape(shp)
358
+ weights = weights.transpose([2, 3, 1, 0])
359
+
360
+ if len(conv_layer.get_weights()) > 1:
361
+ a = conv_layer.set_weights([weights, bias])
362
+ else:
363
+ a = conv_layer.set_weights([weights])
364
+
365
+ count = count + 1
366
+ ncount = ncount + nweights + filter
367
+
368
+ except ValueError:
369
+ print("no convolution #" + str(i))
370
+
371
+ print(count,
372
+ "Convolution Normalized Layers are loaded with ",
373
+ ncount,
374
+ " parameters")
375
+
376
+ def reset(self):
377
+ self.offset = 0
378
+
379
+
380
+ darknet_model = args.weights_path + '/yolov4-tiny.weights'
381
+ weight_reader = WeightReader(darknet_model)
382
+ weight_reader.load_weights(model)
383
+
384
+
385
+ def image_resize(image, resize_shape):
386
+ image_copy = np.copy(image)
387
+ resize_h, resize_w = resize_shape
388
+ orig_h, orig_w, _ = image_copy.shape
389
+
390
+ scale = min(resize_h / orig_h, resize_w / orig_w)
391
+ temp_w, temp_h = int(scale * orig_w), int(scale * orig_h)
392
+ image_resized = image.resize((temp_w, temp_h), im.BILINEAR)
393
+ image_paded = np.full(shape=[resize_h, resize_w, 3], fill_value=128.0)
394
+ r_w = (resize_w - temp_w) // 2 # real_w
395
+ r_h = (resize_h - temp_h) // 2 # real_h
396
+ image_paded[r_h:temp_h + r_h, r_w:temp_w + r_w, :] = image_resized
397
+ image_paded = image_paded / 255.
398
+ return image_paded
399
+
400
+
401
+ def representative_dataset():
402
+ _, h, w, _ = model.input_shape
403
+ image_folder = args.images_path
404
+ image_files = os.listdir(image_folder)
405
+ random.shuffle(image_files)
406
+ image_files = image_files[:N_CALIBRATION_IMAGES]
407
+ for image_file in image_files:
408
+ image_path = os.path.join(image_folder, image_file)
409
+ original_image = im.open(image_path)
410
+ if original_image.mode != "RGB":
411
+ continue
412
+ image_data = image_resize(original_image, [h, w])
413
+ img_in = image_data[np.newaxis, ...].astype(np.float32)
414
+ yield [img_in]
415
+
416
+
417
+ def dummy_dataset():
418
+ _, h, w, _ = model.input_shape
419
+ for i in range(N_CALIBRATION_IMAGES):
420
+ # Tensorflow basic format : NHWC
421
+ img_in = np.random.randn(1, h, w, 3).astype('float32')
422
+ yield [img_in]
423
+
424
+
425
+ converter = tf.lite.TFLiteConverter.from_keras_model(model)
426
+
427
+ # quantized model
428
+ tflite_quant = args.output_path + '/yolov4-tiny_416_quant.tflite'
429
+
430
+ converter.optimizations = [tf.lite.Optimize.DEFAULT]
431
+ if args.images_path is not None:
432
+ converter.representative_dataset = representative_dataset
433
+ else: # Dummy dataset if no representative dataset is given
434
+ converter.representative_dataset = dummy_dataset
435
+ converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
436
+ converter.inference_input_type = tf.int8
437
+ converter.inference_output_type = tf.float32
438
+
439
+ tflite_model = converter.convert()
440
+ with open(tflite_quant, 'wb') as f:
441
+ f.write(tflite_model)
442
+
443
+
444
+ # float32 model
445
+ converter = tf.lite.TFLiteConverter.from_keras_model(model)
446
+
447
+ tflite_float = args.output_path + '/yolov4-tiny_416_float32.tflite'
448
+
449
+ tflite_model = converter.convert()
450
+ with open(tflite_float, 'wb') as f:
451
+ f.write(tflite_model)
recipe.sh ADDED
@@ -0,0 +1,35 @@
#!/usr/bin/env bash
# Copyright 2023-2024 NXP
# SPDX-License-Identifier: MIT

set -e

wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-tiny.weights
wget https://raw.githubusercontent.com/amikelive/coco-labels/master/coco-labels-2014_2017.txt


# tensorflow -> tflite
python3.8 -m venv env
source ./env/bin/activate

pip install --upgrade pip
pip install tensorflow==2.10.0
pip install Pillow

wget --no-check-certificate https://images.cocodataset.org/zips/val2017.zip
unzip val2017.zip

# convert model from darknet to tensorflow lite
python3.8 export_model.py --weights_path=./ --output_path=./ --images_path=val2017

# install vela
pip install numpy==1.20
pip install git+https://github.com/nxp-imx/ethos-u-vela.git@lf-6.1.22-2.0.0

vela --output-dir model_imx93 yolov4-tiny_416_quant.tflite

# cleanup
deactivate
rm -rf val2017 env
rm val2017.zip
rm yolov4-tiny.weights