gbahlnxp committed on
Commit 1d6d5bf · verified · 1 Parent(s): 2914546

Upload folder using huggingface_hub

Files changed (9)
  1. .gitattributes +1 -0
  2. README.md +90 -3
  3. coco-labels-2014_2017.txt +80 -0
  4. evaluate.py +97 -0
  5. example.py +216 -0
  6. example_input.jpg +0 -0
  7. example_output.jpg +3 -0
  8. export_model.py +451 -0
  9. recipe.sh +35 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
+ example_output.jpg filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,90 @@
- ---
- license: apache-2.0
- ---

# YOLOv4-tiny

## Introduction

YOLO (You Only Look Once) is a series of object detection models designed for fast inference, which makes them well suited for edge devices.

YOLOv4 [2] was released in 2020 and provides many small improvements over YOLOv3 [3]. These improvements add up to a more precise network at the same speed.

The model regresses bounding boxes (4 coordinates) and a confidence score for each box. The bounding box decoding and non-maximum suppression (NMS) steps are NOT included in the model.
Please look at `example.py` for an example implementation of box decoding and NMS.

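For reference, the snippet below is a minimal sketch of that flow, reusing the helper functions (and the TFLite interpreter) that `example.py` sets up at import time; run it from the repository root so the model and label files are found:

```python
# Minimal sketch: example.py builds the interpreter and loads the labels when imported.
from example import interpreter, load_image, run_inference, decode_output

orig_image, image = load_image("example_input.jpg")  # original BGR image + normalized 416x416 RGB input
raw_outputs = run_inference(interpreter, image)       # two undecoded output grids
scores, boxes, classes = decode_output(raw_outputs)   # box decoding + NMS

# boxes are in the 416x416 inference referential; example.py shows how to
# rescale them to the original image size before drawing.
print(scores.numpy(), boxes.numpy(), classes.numpy())
```
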
## Model Information

Information | Value
--- | ---
Input shape | RGB image (416, 416, 3)
Input example | <img src="example_input.jpg" width=320px> ([Image source](https://commons.wikimedia.org/wiki/File:Moscow_bus_151872_2022-05.jpg), Public domain)
Output shape | Tensors of size (26, 26, 255) and (13, 13, 255) containing bounding box coordinates (not decoded) and class scores for two resolution levels and 3 anchor boxes per cell. More information in `example.py`.
Output example | <img src="example_output.jpg" width=320px>
FLOPs | 6.9G
Number of parameters | 6.05M
File size (int8) | 5.9 MB
Source framework | DarkNet
Target platform | MPUs

## Version and changelog

Initial release of quantized int8 and float32 models.

## Tested configurations

The int8 model has been tested on i.MX 8M Plus and i.MX 93 (BSP LF6.1.22_2.0.0) using benchmark_model.

## Training and evaluation

The model has been trained and evaluated on the [COCO dataset](https://cocodataset.org/) [1], which features 80 classes.
The floating point model achieved a score of 40 mAP@0.5IoU on the test set, according to [the source of the model](https://github.com/AlexeyAB/darknet/).
Using the `evaluate.py` script, we evaluate the int8 quantized model on the validation set and obtain 33 mAP@0.5IoU.

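For reference, the final scoring step of that evaluation is sketched below, assuming `evaluate.py` has already run and written its detections to `predictions.json` (the script also downloads COCO val2017 and its annotations into `coco/`):

```python
# Scoring sketch: assumes predictions.json was produced by evaluate.py and the
# COCO 2017 validation annotations were extracted to coco/annotations/.
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

annotations = COCO("coco/annotations/instances_val2017.json")
detections = annotations.loadRes("predictions.json")

cocoeval = COCOeval(annotations, detections, "bbox")
cocoeval.evaluate()
cocoeval.accumulate()
cocoeval.summarize()  # prints AP@[.50:.95], AP@0.5 (the figure quoted above), AR, ...
```
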
Instructions to re-train the network can be found [in the original repository](https://github.com/AlexeyAB/darknet/).

## Conversion/Quantization

The original model is converted from the DarkNet framework to TensorFlow Lite.

The `export_model.py` conversion script performs this conversion and outputs both the int8 quantized model and the float32 model.
100 random images from the COCO 2017 validation dataset are used for calibration during quantization.

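The core of that quantization step looks roughly like the sketch below, a simplified excerpt of what `export_model.py` does (`model` is the Keras reconstruction of YOLOv4-tiny and `representative_dataset` is a generator yielding the 100 preprocessed calibration images):

```python
import tensorflow as tf

# Post-training int8 quantization with a representative dataset
# (simplified from export_model.py).
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset  # yields [1, 416, 416, 3] float32 arrays
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8      # int8 input
converter.inference_output_type = tf.float32  # float32 output

with open("yolov4-tiny_416_quant.tflite", "wb") as f:
    f.write(converter.convert())
```
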
## Use case and limitations

This model can be used for fast object detection on 416x416 pixel images.
It is not the most accurate model, but it is accurate enough for many applications.
We noticed that the model performs well on large objects but has issues with small objects,
probably because it only features two output levels instead of the three found in larger models.

## Performance

Here are performance figures evaluated on i.MX 8M Plus and i.MX 93 (BSP LF6.1.22_2.0.0):

Model | Average latency | Platform | Accelerator | Command
--- | --- | --- | --- | ---
Int8 | 908ms | i.MX 8M Plus | CPU (1 thread) | /usr/bin/tensorflow-lite-2.10.0/examples/benchmark_model --graph=yolov4-tiny_416_quant.tflite
Int8 | 363ms | i.MX 8M Plus | CPU (4 threads) | /usr/bin/tensorflow-lite-2.10.0/examples/benchmark_model --graph=yolov4-tiny_416_quant.tflite --num_threads=4
Int8 | 18.0ms | i.MX 8M Plus | NPU | /usr/bin/tensorflow-lite-2.10.0/examples/benchmark_model --graph=yolov4-tiny_416_quant.tflite --external_delegate_path=/usr/lib/libvx_delegate.so
Int8 | 404ms | i.MX 93 | CPU (1 thread) | /usr/bin/tensorflow-lite-2.10.0/examples/benchmark_model --graph=yolov4-tiny_416_quant.tflite
Int8 | 299ms | i.MX 93 | CPU (2 threads) | /usr/bin/tensorflow-lite-2.10.0/examples/benchmark_model --graph=yolov4-tiny_416_quant.tflite --num_threads=2
Int8 | 21.1ms | i.MX 93 | NPU | /usr/bin/tensorflow-lite-2.10.0/examples/benchmark_model --graph=yolov4-tiny_416_quant_vela.tflite --external_delegate_path=/usr/lib/libethosu_delegate.so

## Download and run

To create the TensorFlow Lite model fully quantized in int8 (with int8 input and float32 output), as well as the float32 model, run:

    bash recipe.sh

The TensorFlow Lite model file for the i.MX 8M Plus and i.MX 93 CPU is `yolov4-tiny_416_quant.tflite`. The model for the i.MX 93 NPU will be in `model_imx93`.

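If you run the quantized model from Python on the target rather than through `benchmark_model`, the NPU is reached through the same external delegates listed in the Performance table. A minimal sketch, assuming the BSP's TensorFlow Lite Python bindings are available (the `tflite_runtime` package exposes the same `Interpreter`/`load_delegate` API) and using the delegate paths from the benchmark commands:

```python
import tensorflow as tf

# Sketch only: /usr/lib/libvx_delegate.so is the i.MX 8M Plus NPU delegate used
# in the benchmark commands above; on i.MX 93 use /usr/lib/libethosu_delegate.so
# together with the vela-compiled model from model_imx93/.
delegate = tf.lite.experimental.load_delegate("/usr/lib/libvx_delegate.so")
interpreter = tf.lite.Interpreter(model_path="yolov4-tiny_416_quant.tflite",
                                  experimental_delegates=[delegate])
interpreter.allocate_tensors()
```
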
The 32-bit floating point model is `yolov4-tiny_416_float32.tflite`.

An example of how to use the model is in `example.py`.

## Origin

Model implementation: https://github.com/AlexeyAB/darknet/

[1] Lin, Tsung-Yi, et al. "Microsoft COCO: Common objects in context." European Conference on Computer Vision. Springer, Cham, 2014.

[2] Bochkovskiy, Alexey, Chien-Yao Wang, and Hong-Yuan Mark Liao. "YOLOv4: Optimal speed and accuracy of object detection." arXiv preprint arXiv:2004.10934 (2020).

[3] Redmon, Joseph, and Ali Farhadi. "YOLOv3: An incremental improvement." arXiv preprint arXiv:1804.02767 (2018).
coco-labels-2014_2017.txt ADDED
@@ -0,0 +1,80 @@
person
bicycle
car
motorcycle
airplane
bus
train
truck
boat
traffic light
fire hydrant
stop sign
parking meter
bench
bird
cat
dog
horse
sheep
cow
elephant
bear
zebra
giraffe
backpack
umbrella
handbag
tie
suitcase
frisbee
skis
snowboard
sports ball
kite
baseball bat
baseball glove
skateboard
surfboard
tennis racket
bottle
wine glass
cup
fork
knife
spoon
bowl
banana
apple
sandwich
orange
broccoli
carrot
hot dog
pizza
donut
cake
chair
couch
potted plant
bed
dining table
toilet
tv
laptop
mouse
remote
keyboard
cell phone
microwave
oven
toaster
sink
refrigerator
book
clock
vase
scissors
teddy bear
hair drier
toothbrush
evaluate.py ADDED
@@ -0,0 +1,97 @@
#!/usr/bin/env python3
# Copyright 2023-2024 NXP
# SPDX-License-Identifier: MIT

import wget
import zipfile
import json
import glob
import os
import tensorflow as tf
import numpy as np
from tqdm import tqdm
from pycocotools.cocoeval import COCOeval
from pycocotools.coco import COCO
from example import load_image
from example import decode_output, run_inference, gen_box_colors

OBJECT_DETECTOR_TFLITE = "yolov4-tiny_416_quant.tflite"
SCORE_THRESHOLD = 0.0
NMS_IOU_THRESHOLD = 0.5
INFERENCE_IMG_SIZE = 416

# Maps the contiguous 80-class index predicted by the model to the original
# (non-contiguous) COCO category ids expected by pycocotools.
LABEL_MAP = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 19, 20,
             21, 22, 23, 24, 25, 27, 28, 31, 32, 33, 34, 35, 36, 37, 38, 39,
             40, 41, 42, 43, 44, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56,
             57, 58, 59, 60, 61, 62, 63, 64, 65, 67, 70, 72, 73, 74, 75, 76,
             77, 78, 79, 80, 81, 82, 84, 85, 86, 87, 88, 89, 90]

COCO_WEBSITE = "http://images.cocodataset.org"
VAL_IMG_URL = COCO_WEBSITE + "/zips/val2017.zip"
VAL_ANNO_URL = COCO_WEBSITE + "/annotations/annotations_trainval2017.zip"

BOX_COLORS = gen_box_colors()

print("Downloading COCO validation dataset...")
response = wget.download(VAL_IMG_URL, "val2017.zip")
response = wget.download(VAL_ANNO_URL, "annotations_trainval2017.zip")

with zipfile.ZipFile("val2017.zip", 'r') as zip_ref:
    zip_ref.extractall("coco")

with zipfile.ZipFile("annotations_trainval2017.zip", 'r') as zip_ref:
    zip_ref.extractall("coco")


interpreter = tf.lite.Interpreter(OBJECT_DETECTOR_TFLITE)
interpreter.allocate_tensors()

annotations = COCO(annotation_file="coco/annotations/instances_val2017.json")


def evaluate(interpreter):
    image_filenames = glob.glob("coco/val2017/*")

    results = []

    for image_fn in tqdm(image_filenames, desc="Evaluating"):

        image_id = int(os.path.splitext(os.path.basename(image_fn))[0])

        orig_image, img = load_image(image_fn)
        yolo_output = run_inference(interpreter, img)
        scores, boxes, classes = decode_output(yolo_output,
                                               SCORE_THRESHOLD,
                                               NMS_IOU_THRESHOLD)

        # Rescale boxes from the 416x416 inference referential
        # to the original image size.
        shp = orig_image.shape
        boxes = boxes.numpy()
        boxes /= INFERENCE_IMG_SIZE
        boxes *= np.array([shp[1], shp[0], shp[1], shp[0]])

        boxes = boxes.astype(np.int32)

        # Convert (x1, y1, x2, y2) corners to the COCO (x, y, width, height) format.
        boxes[..., 2] = boxes[..., 2] - boxes[..., 0]
        boxes[..., 3] = boxes[..., 3] - boxes[..., 1]

        for score, box, clas in zip(scores.numpy(), boxes, classes.numpy()):
            results.append({"image_id": image_id,
                            "category_id": int(LABEL_MAP[clas]),
                            "bbox": [float(x) for x in list(box)],
                            "score": float(score)})

    return results


predictions = evaluate(interpreter)

with open("predictions.json", "w") as f:
    json.dump(predictions, f, indent=4)

predictions = annotations.loadRes("predictions.json")

cocoeval = COCOeval(annotations, predictions, "bbox")

cocoeval.evaluate()
cocoeval.accumulate()
cocoeval.summarize()
example.py ADDED
@@ -0,0 +1,216 @@
1
+ #!/usr/bin/env python3
2
+ # Copyright 2023-2024 NXP
3
+ # SPDX-License-Identifier: MIT
4
+
5
+ import cv2
6
+ import tensorflow as tf
7
+ import numpy as np
8
+ import time
9
+ import random
10
+
11
+ random.seed(42)
12
+
13
+ OBJECT_DETECTOR_TFLITE = 'yolov4-tiny_416_quant.tflite'
14
+ LABELS_FILE = 'coco-labels-2014_2017.txt'
15
+ IMAGE_FILENAME = 'example_input.jpg'
16
+
17
+ SCORE_THRESHOLD = 0.20
18
+ NMS_IOU_THRESHOLD = 0.5
19
+ INFERENCE_IMG_SIZE = 416
20
+ MAX_DETS = 100
21
+
22
+ ANCHORS = [[[81, 82], [135, 169], [344, 319]], [[23, 27], [37, 58], [81, 82]]]
23
+ SIGMOID_FACTOR = [1.05, 1.05]
24
+ NUM_ANCHORS = 3
25
+ STRIDES = [32, 16]
26
+ GRID_SIZES = [int(INFERENCE_IMG_SIZE / s) for s in STRIDES]
27
+
28
+ with open(LABELS_FILE, 'r') as f:
29
+ COCO_CLASSES = [line.strip() for line in f.readlines()]
30
+
31
+ interpreter = tf.lite.Interpreter(OBJECT_DETECTOR_TFLITE)
32
+ interpreter.allocate_tensors()
33
+
34
+
35
+ def gen_box_colors():
36
+ colors = []
37
+ for _ in range(len(COCO_CLASSES)):
38
+ r = random.randint(100, 255)
39
+ g = random.randint(100, 255)
40
+ b = random.randint(100, 255)
41
+ colors.append((r, g, b))
42
+
43
+ return colors
44
+
45
+
46
+ BOX_COLORS = gen_box_colors()
47
+
48
+
49
+ def load_image(filename):
50
+ orig_image = cv2.imread(filename, 1)
51
+ image = cv2.cvtColor(orig_image, cv2.COLOR_BGR2RGB)
52
+ image = cv2.resize(image, (INFERENCE_IMG_SIZE, INFERENCE_IMG_SIZE))
53
+ image = np.expand_dims(image, axis=0)
54
+ image = image / 255.0
55
+ return orig_image, image
56
+
57
+
58
+ def np_sigmoid(x):
59
+ return 1 / (1 + np.exp(-x))
60
+
61
+
62
+ def reciprocal_sigmoid(x):
63
+ return -np.log(1 / x - 1)
64
+
65
+
66
+ def decode_boxes_prediction(yolo_output):
67
+ # Each output level represents a grid of predictions.
68
+ # The first output level is a 26x26 grid and the second 13x13.
69
+ # Each cell of each grid is assigned to 3 anchor bounding boxes.
70
+ # The bounding box predictions are regressed
71
+ # relatively to these anchor boxes.
72
+ # Thus, the model predicts 3 bounding boxes per cell per output level.
73
+ # The output is structured as follows:
74
+ # For each cell [[x, y, w, h, conf, cl_0, cl_1, ..., cl_79], # anchor 1
75
+ # [x, y, w, h, conf, cl_0, cl_1, ..., cl_79], # anchor 2
76
+ # [x, y, w, h, conf, cl_0, cl_1, ..., cl_79]] # anchor 3
77
+ # Hence, we have 85 values per anchor box, and thus 255 values per cell.
78
+ # The decoding of the output bounding boxes is described in Figure 2 of
79
+ # the YOLOv3 paper https://arxiv.org/pdf/1804.02767.pdf;
80
+
81
+ boxes_list = []
82
+ scores_list = []
83
+ classes_list = []
84
+
85
+ for idx, feats in enumerate(yolo_output):
86
+
87
+ features = np.reshape(feats, (NUM_ANCHORS * GRID_SIZES[idx] ** 2, 85))
88
+
89
+ anchor = np.array(ANCHORS[idx])
90
+ factor = SIGMOID_FACTOR[idx]
91
+ grid_size = GRID_SIZES[idx]
92
+ stride = STRIDES[idx]
93
+
94
+ cell_confidence = features[..., 4]
95
+ logit_threshold = reciprocal_sigmoid(SCORE_THRESHOLD)
96
+ over_threshold_list = np.where(cell_confidence > logit_threshold)
97
+
98
+ if over_threshold_list[0].size > 0:
99
+ indices = np.array(over_threshold_list[0])
100
+
101
+ box_positions = np.floor_divide(indices, 3)
102
+
103
+ list_xy = np.array(np.divmod(box_positions, grid_size)).T
104
+ list_xy = list_xy[..., ::-1]
105
+ boxes_xy = np.reshape(list_xy, (int(list_xy.size / 2), 2))
106
+
107
+ outxy = features[indices, :2]
108
+
109
+ # boxes center coordinates
110
+ centers = np_sigmoid(outxy * factor) - 0.5 * (factor - 1)
111
+ centers += boxes_xy
112
+ centers *= stride
113
+
114
+ # boxes width and height
115
+ width_height = np.exp(features[indices, 2:4])
116
+ width_height *= anchor[np.divmod(indices, NUM_ANCHORS)[1]]
117
+
118
+ boxes_list.append(np.stack([centers[:, 0] - width_height[:, 0]/2,
119
+ centers[:, 1] - width_height[:, 1]/2,
120
+ centers[:, 0] + width_height[:, 0]/2,
121
+ centers[:, 1] + width_height[:, 1]/2],
122
+ axis=1))
123
+
124
+ # confidence that cell contains an object
125
+ scores_list.append(np_sigmoid(features[indices, 4:5]))
126
+
127
+ # class with the highest probability in this cell
128
+ classes_list.append(np.argmax(features[indices, 5:], axis=1))
129
+
130
+ if len(boxes_list) > 0:
131
+ boxes = np.concatenate(boxes_list, axis=0)
132
+ scores = np.concatenate(scores_list, axis=0)[:, 0]
133
+ classes = np.concatenate(classes_list, axis=0)
134
+
135
+ return boxes, scores, classes
136
+ else:
137
+ return np.zeros((0, 4)), np.zeros((0)), np.zeros((0))
138
+
139
+
140
+ def decode_output(yolo_outputs,
141
+ score_threshold=SCORE_THRESHOLD,
142
+ iou_threshold=NMS_IOU_THRESHOLD):
143
+ '''
144
+ Decode output from YOLOv4 tiny in inference size referential (416x416)
145
+ '''
146
+ boxes, scores, classes = decode_boxes_prediction(yolo_outputs)
147
+
148
+ # apply NMS from tensorflow
149
+ inds = tf.image.non_max_suppression(boxes, scores, MAX_DETS,
150
+ score_threshold=score_threshold,
151
+ iou_threshold=iou_threshold)
152
+
153
+ # keep only selected boxes
154
+ boxes = tf.gather(boxes, inds)
155
+ scores = tf.gather(scores, inds)
156
+ classes = tf.gather(classes, inds)
157
+
158
+ return scores, boxes, classes
159
+
160
+
161
+ def run_inference(interpreter, image, threshold=SCORE_THRESHOLD):
162
+
163
+ input_details = interpreter.get_input_details()
164
+ output_details = interpreter.get_output_details()
165
+ input_scale, input_zero_point = input_details[0]["quantization"]
166
+ image = image / input_scale + input_zero_point
167
+ image = image.astype(np.int8)
168
+
169
+ interpreter.set_tensor(input_details[0]['index'], image)
170
+ interpreter.invoke()
171
+
172
+ boxes = interpreter.get_tensor(output_details[0]['index'])
173
+ boxes2 = interpreter.get_tensor(output_details[1]['index'])
174
+
175
+ return [boxes, boxes2]
176
+
177
+
178
+ if __name__ == "__main__":
179
+
180
+ orig_image, processed_image = load_image(IMAGE_FILENAME)
181
+
182
+ start = time.time()
183
+ yolo_output = run_inference(interpreter, processed_image)
184
+ end = time.time()
185
+
186
+ scores, boxes, classes = decode_output(yolo_output)
187
+
188
+ # rescale boxes for display
189
+ shp = orig_image.shape
190
+ boxes = boxes.numpy()
191
+ boxes /= INFERENCE_IMG_SIZE
192
+ boxes *= np.array([shp[1], shp[0], shp[1], shp[0]])
193
+
194
+ boxes = boxes.astype(np.int32)
195
+
196
+ print("Inference time", end - start, "ms")
197
+ print("Detected", boxes.shape[0], "object(s)")
198
+ print("Box coordinates:")
199
+
200
+ for i in range(boxes.shape[0]):
201
+ box = boxes[i, :]
202
+ print(box, end=" ")
203
+ class_name = COCO_CLASSES[classes[i].numpy()]
204
+ score = scores[i].numpy()
205
+ color = BOX_COLORS[classes[i]]
206
+ print("class", class_name, end=" ")
207
+ print("score", score)
208
+ cv2.rectangle(orig_image, (box[0], box[1]), (box[2], box[3]),
209
+ color, 3)
210
+ cv2.putText(orig_image, f"{class_name} {score:.2f}",
211
+ (box[0], box[1] - 10),
212
+ cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)
213
+
214
+ cv2.imwrite('example_output.jpg', orig_image)
215
+ cv2.imshow('', orig_image)
216
+ cv2.waitKey()
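To make the decoding described in the comment block of `decode_boxes_prediction` above concrete, the per-box transform it applies can be written for a single anchor as follows (a sketch with scalar values; `tx, ty, tw, th` are the raw network outputs, `cx, cy` the cell column/row, `pw, ph` the anchor size in pixels, `stride` the grid stride and `factor` the SIGMOID_FACTOR used above):

```python
import numpy as np

def decode_single_box(tx, ty, tw, th, cx, cy, pw, ph, stride, factor=1.05):
    # Center: scaled sigmoid of the raw offsets (as in decode_boxes_prediction),
    # shifted by the cell index and scaled by the grid stride.
    sig = lambda v: 1.0 / (1.0 + np.exp(-v))
    bx = (sig(tx * factor) - 0.5 * (factor - 1) + cx) * stride
    by = (sig(ty * factor) - 0.5 * (factor - 1) + cy) * stride
    # Size: exponential of the raw outputs times the anchor dimensions.
    bw = np.exp(tw) * pw
    bh = np.exp(th) * ph
    # Return corner coordinates (x1, y1, x2, y2) in the 416x416 referential.
    return bx - bw / 2, by - bh / 2, bx + bw / 2, by + bh / 2
```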
example_input.jpg ADDED
example_output.jpg ADDED

Git LFS Details

  • SHA256: 351d09a268a84019fa6afa014437f188e4824d2df5ba03159f391587e268f851
  • Pointer size: 131 Bytes
  • Size of remote file: 126 kB
export_model.py ADDED
@@ -0,0 +1,451 @@
1
+ # Copyright 2023-2024 NXP
2
+ # SPDX-License-Identifier: BSD-3-Clause
3
+
4
+ import os
5
+ import numpy as np
6
+ import struct
7
+ import tensorflow as tf
8
+ from tensorflow.keras.layers import Conv2D, Input, LeakyReLU
9
+ from tensorflow.keras.layers import ZeroPadding2D, UpSampling2D
10
+ from tensorflow.keras.layers import MaxPool2D, add, concatenate
11
+ from tensorflow.keras.models import Model
12
+ import argparse
13
+ import PIL.Image as im
14
+ import random
15
+
16
+ random.seed(42)
17
+
18
+ N_CALIBRATION_IMAGES = 100
19
+
20
+ parser = argparse.ArgumentParser()
21
+ parser.add_argument('--weights_path', help='path to darknet weights')
22
+ parser.add_argument('--output_path', help='path to save tflite model')
23
+ parser.add_argument('--images_path',
24
+ help='path to representative images for quantization',
25
+ default=None)
26
+ args = parser.parse_args()
27
+
28
+
29
+ def _conv_block(inp, convs, skip=False):
30
+ x = inp
31
+ count = 0
32
+
33
+ for conv in convs:
34
+ if count == (len(convs) - 2) and skip:
35
+ skip_connection = x
36
+ count += 1
37
+
38
+ if conv['stride'] > 1:
39
+ x = ZeroPadding2D(((1, 0), (1, 0)),
40
+ name='zerop_' + str(conv['layer_idx']))(
41
+ x) # peculiar padding as darknet prefer left and top
42
+
43
+ x = Conv2D(conv['filter'],
44
+ conv['kernel'],
45
+ strides=conv['stride'],
46
+ # peculiar padding as darknet prefer left and top
47
+ padding='valid' if conv['stride'] > 1 else 'same',
48
+ name='convn_' + str(conv['layer_idx']) \
49
+ if conv['bnorm'] else 'conv_' + str(conv['layer_idx']),
50
+ activation=None,
51
+ use_bias=True)(x)
52
+
53
+ if conv['activ'] == 1:
54
+ x = LeakyReLU(alpha=0.1, name='leaky_' + str(conv['layer_idx']))(x)
55
+
56
+ return add([skip_connection, x],
57
+ name='add_' + str(conv['layer_idx'] + 1)) if skip else x
58
+
59
+
60
+ def _split_block(input_layer, layer_idx):
61
+ s = tf.split(input_layer,
62
+ num_or_size_splits=2,
63
+ axis=-1,
64
+ name=f"split_{layer_idx}")
65
+ return s[1]
66
+
67
+
68
+ def make_yolov4_tiny_model():
69
+ input_image = Input(shape=(416, 416, 3),
70
+ batch_size=1,
71
+ name='input_0')
72
+ # Layer 0
73
+ x = _conv_block(input_image, [{'filter': 32,
74
+ 'kernel': 3,
75
+ 'stride': 2,
76
+ 'bnorm': True,
77
+ 'activ': 1,
78
+ 'layer_idx': 0}])
79
+ layer_0 = x
80
+ # Layer 1
81
+ x = _conv_block(x, [{'filter': 64,
82
+ 'kernel': 3,
83
+ 'stride': 2,
84
+ 'bnorm': True,
85
+ 'activ': 1,
86
+ 'layer_idx': 1}])
87
+ layer_1 = x
88
+ # Layer 2, concat1
89
+ x = _conv_block(x, [{'filter': 64,
90
+ 'kernel': 3,
91
+ 'stride': 1,
92
+ 'bnorm': True,
93
+ 'activ': 1,
94
+ 'layer_idx': 2}])
95
+ layer_2 = x
96
+ # Layer 3, route group
97
+ x = _split_block(x, layer_idx=3)
98
+ # Layer 4, concat_route_1
99
+ x = _conv_block(x, [{'filter': 32,
100
+ 'kernel': 3,
101
+ 'stride': 1,
102
+ 'bnorm': True,
103
+ 'activ': 1,
104
+ 'layer_idx': 4}])
105
+ layer_4 = x
106
+ # Layer 5, concat_route_2
107
+ x = _conv_block(x, [{'filter': 32,
108
+ 'kernel': 3,
109
+ 'stride': 1,
110
+ 'bnorm': True,
111
+ 'activ': 1,
112
+ 'layer_idx': 5}])
113
+ layer_5 = x
114
+ # Layer 6, concat route
115
+ x = concatenate([layer_5, layer_4], axis=-1, name='concat_6')
116
+ # Layer 7, concat2
117
+ x = _conv_block(x, [{'filter': 64,
118
+ 'kernel': 1,
119
+ 'stride': 1,
120
+ 'bnorm': True,
121
+ 'activ': 1,
122
+ 'layer_idx': 7}])
123
+ layer_7 = x
124
+ # Layer 8, concat
125
+ x = concatenate([layer_2, layer_7], axis=-1, name='concat_8')
126
+ # Layer 9
127
+ x = MaxPool2D(pool_size=(2, 2), padding='same', name='layer_9')(x)
128
+
129
+ # Layer 10, concat 1
130
+ x = _conv_block(x, [{'filter': 128,
131
+ 'kernel': 3,
132
+ 'stride': 1,
133
+ 'bnorm': True,
134
+ 'activ': 1,
135
+ 'layer_idx': 10}])
136
+ layer_10 = x
137
+ # Layer 11
138
+ x = _split_block(x, layer_idx=11)
139
+ # Layer 12, concat route 1
140
+ x = _conv_block(x, [{'filter': 64,
141
+ 'kernel': 3,
142
+ 'stride': 1,
143
+ 'bnorm': True,
144
+ 'activ': 1,
145
+ 'layer_idx': 12}])
146
+ layer_12 = x
147
+ # Layer 13, concat route 2
148
+ x = _conv_block(x, [{'filter': 64,
149
+ 'kernel': 3,
150
+ 'stride': 1,
151
+ 'bnorm': True,
152
+ 'activ': 1,
153
+ 'layer_idx': 13}])
154
+ layer_13 = x
155
+ # Layer 14
156
+ x = concatenate([layer_13, layer_12], axis=-1, name='concat_14')
157
+ # Layer 15, concat 2
158
+ x = _conv_block(x, [{'filter': 128,
159
+ 'kernel': 1,
160
+ 'stride': 1,
161
+ 'bnorm': True,
162
+ 'activ': 1,
163
+ 'layer_idx': 15}])
164
+ layer_15 = x
165
+ # Layer 16
166
+ x = concatenate([layer_10, layer_15], axis=-1, name='concat_16')
167
+ # Layer 17
168
+ x = MaxPool2D(pool_size=(2, 2), padding='same', name='layer_17')(x)
169
+
170
+ # Layer 18, concat 1
171
+ x = _conv_block(x, [{'filter': 256,
172
+ 'kernel': 3,
173
+ 'stride': 1,
174
+ 'bnorm': True,
175
+ 'activ': 1,
176
+ 'layer_idx': 18}])
177
+ layer_18 = x
178
+ # Layer 19
179
+ x = _split_block(x, layer_idx=19)
180
+ # Layer 20, concat route 1
181
+ x = _conv_block(x, [{'filter': 128,
182
+ 'kernel': 3,
183
+ 'stride': 1,
184
+ 'bnorm': True,
185
+ 'activ': 1,
186
+ 'layer_idx': 20}])
187
+ layer_20 = x
188
+ # Layer 21, concat route 2
189
+ x = _conv_block(x, [{'filter': 128,
190
+ 'kernel': 3,
191
+ 'stride': 1,
192
+ 'bnorm': True,
193
+ 'activ': 1,
194
+ 'layer_idx': 21}])
195
+ layer_21 = x
196
+ # Layer 22
197
+ x = concatenate([layer_21, layer_20], axis=-1, name='concat_22')
198
+ # Layer 23, concat 2, output 1 of cspdarknet
199
+ x = _conv_block(x, [{'filter': 256,
200
+ 'kernel': 1,
201
+ 'stride': 1,
202
+ 'bnorm': True,
203
+ 'activ': 1,
204
+ 'layer_idx': 23}])
205
+ layer_23 = x
206
+ # Layer 24
207
+ x = concatenate([layer_18, layer_23], axis=-1, name='concat_24')
208
+ # Layer 25
209
+ x = MaxPool2D(pool_size=(2, 2), padding='same', name='layer_25')(x)
210
+
211
+ # Layer 26, output 2 of cspdarknet
212
+ x = _conv_block(x, [{'filter': 512,
213
+ 'kernel': 3,
214
+ 'stride': 1,
215
+ 'bnorm': True,
216
+ 'activ': 1,
217
+ 'layer_idx': 26}])
218
+ layer_26 = x
219
+
220
+ # After backbone
221
+ # Layer 27, concat 1, branch 1
222
+ x = _conv_block(layer_26, [{'filter': 256,
223
+ 'kernel': 1,
224
+ 'stride': 1,
225
+ 'bnorm': True,
226
+ 'activ': 1,
227
+ 'layer_idx': 27}])
228
+ layer_27 = x
229
+
230
+ # Layer 28
231
+ x = _conv_block(x, [{'filter': 512,
232
+ 'kernel': 3,
233
+ 'stride': 1,
234
+ 'bnorm': True,
235
+ 'activ': 1,
236
+ 'layer_idx': 28}])
237
+ layer_28 = x
238
+ # Layer 29, output of large grid
239
+ x = _conv_block(x, [{'filter': 255,
240
+ 'kernel': 1,
241
+ 'stride': 1,
242
+ 'bnorm': True,
243
+ 'activ': 0,
244
+ 'layer_idx': 29}])
245
+ layer_29 = x
246
+
247
+ # Layer 30, continue from layer_27
248
+ x = _conv_block(layer_27, [{'filter': 128,
249
+ 'kernel': 1,
250
+ 'stride': 1,
251
+ 'bnorm': True,
252
+ 'activ': 1,
253
+ 'layer_idx': 30}])
254
+ layer_30 = x
255
+ # Layer 31
256
+ x = UpSampling2D(size=(2, 2),
257
+ name='upsamp_31',
258
+ interpolation='bilinear')(x)
259
+ layer_31 = x
260
+ # Layer 32
261
+ x = concatenate([layer_31, layer_23], axis=-1, name='concat_32')
262
+ # Layer 33
263
+ x = _conv_block(x, [{'filter': 256,
264
+ 'kernel': 3,
265
+ 'stride': 1,
266
+ 'bnorm': True,
267
+ 'activ': 1,
268
+ 'layer_idx': 33}])
269
+ # Layer 34, output of medium grid
270
+ x = _conv_block(x, [{'filter': 255,
271
+ 'kernel': 1,
272
+ 'stride': 1,
273
+ 'bnorm': True,
274
+ 'activ': 0,
275
+ 'layer_idx': 34}])
276
+ layer_34 = x
277
+
278
+ # End
279
+ model = Model(input_image, [layer_34, layer_29], name='Yolov4-tiny')
280
+ model.summary()
281
+ return model
282
+
283
+
284
+ # Define the model
285
+ model = make_yolov4_tiny_model()
286
+
287
+ model.summary()
288
+
289
+
290
+ # load weights in keras
291
+
292
+ class WeightReader:
293
+ def __init__(self, weight_file):
294
+ with open(weight_file, 'rb') as w_f:
295
+ major, = struct.unpack('i', w_f.read(4))
296
+ minor, = struct.unpack('i', w_f.read(4))
297
+ revision, = struct.unpack('i', w_f.read(4))
298
+
299
+ if (major * 10 + minor) >= 2 and major < 1000 and minor < 1000:
300
+ print("reading 64 bytes")
301
+ w_f.read(8)
302
+ else:
303
+ print("reading 32 bytes")
304
+ w_f.read(4)
305
+
306
+ transpose = (major > 1000) or (minor > 1000)
307
+
308
+ binary = w_f.read()
309
+
310
+ self.offset = 0
311
+ self.all_weights = np.frombuffer(binary, dtype='float32')
312
+ print(f"weight total length {len(self.all_weights)}")
313
+
314
+ def read_bytes(self, size):
315
+ self.offset = self.offset + size
316
+ return self.all_weights[self.offset - size:self.offset]
317
+
318
+ def load_weights(self, model):
319
+ count = 0
320
+ ncount = 0
321
+ for i in range(35):
322
+ try:
323
+
324
+ conv_layer = model.get_layer('convn_' + str(i))
325
+
326
+ filter = conv_layer.kernel.shape[-1]
327
+ # kernel*kernel*c*filter
328
+ nweights = np.prod(conv_layer.kernel.shape)
329
+
330
+ print(f"loading weights of convolution #" +
331
+ str(i) + "- nb parameters: " +
332
+ str(nweights + filter))
333
+
334
+ if i in [29, 34]:
335
+ bias = self.read_bytes(filter) # bias
336
+ weights = self.read_bytes(nweights) # weights
337
+
338
+ else:
339
+ bias = self.read_bytes(filter) # bias
340
+ scale = self.read_bytes(filter) # scale
341
+ mean = self.read_bytes(filter) # mean
342
+ var = self.read_bytes(filter) # variance
343
+ weights = self.read_bytes(nweights) # weights
344
+
345
+ # normalize bias
346
+ bias = bias - scale * mean / (np.sqrt(var + 0.00001))
347
+
348
+ # normalize weights
349
+ weights = np.reshape(weights,
350
+ (filter, int(nweights / filter)))
351
+ A = scale / (np.sqrt(var + 0.00001))
352
+ A = np.expand_dims(A, axis=0)
353
+ weights = weights * A.T
354
+ weights = np.reshape(weights, (nweights))
355
+
356
+ shp = list(reversed(conv_layer.get_weights()[0].shape))
357
+ weights = weights.reshape(shp)
358
+ weights = weights.transpose([2, 3, 1, 0])
359
+
360
+ if len(conv_layer.get_weights()) > 1:
361
+ a = conv_layer.set_weights([weights, bias])
362
+ else:
363
+ a = conv_layer.set_weights([weights])
364
+
365
+ count = count + 1
366
+ ncount = ncount + nweights + filter
367
+
368
+ except ValueError:
369
+ print("no convolution #" + str(i))
370
+
371
+ print(count,
372
+ "Convolution Normalized Layers are loaded with ",
373
+ ncount,
374
+ " parameters")
375
+
376
+ def reset(self):
377
+ self.offset = 0
378
+
379
+
380
+ darknet_model = args.weights_path + '/yolov4-tiny.weights'
381
+ weight_reader = WeightReader(darknet_model)
382
+ weight_reader.load_weights(model)
383
+
384
+
385
+ def image_resize(image, resize_shape):
386
+ image_copy = np.copy(image)
387
+ resize_h, resize_w = resize_shape
388
+ orig_h, orig_w, _ = image_copy.shape
389
+
390
+ scale = min(resize_h / orig_h, resize_w / orig_w)
391
+ temp_w, temp_h = int(scale * orig_w), int(scale * orig_h)
392
+ image_resized = image.resize((temp_w, temp_h), im.BILINEAR)
393
+ image_paded = np.full(shape=[resize_h, resize_w, 3], fill_value=128.0)
394
+ r_w = (resize_w - temp_w) // 2 # real_w
395
+ r_h = (resize_h - temp_h) // 2 # real_h
396
+ image_paded[r_h:temp_h + r_h, r_w:temp_w + r_w, :] = image_resized
397
+ image_paded = image_paded / 255.
398
+ return image_paded
399
+
400
+
401
+ def representative_dataset():
402
+ _, h, w, _ = model.input_shape
403
+ image_folder = args.images_path
404
+ image_files = os.listdir(image_folder)
405
+ random.shuffle(image_files)
406
+ image_files = image_files[:N_CALIBRATION_IMAGES]
407
+ for image_file in image_files:
408
+ image_path = os.path.join(image_folder, image_file)
409
+ original_image = im.open(image_path)
410
+ if original_image.mode != "RGB":
411
+ continue
412
+ image_data = image_resize(original_image, [h, w])
413
+ img_in = image_data[np.newaxis, ...].astype(np.float32)
414
+ yield [img_in]
415
+
416
+
417
+ def dummy_dataset():
418
+ _, h, w, _ = model.input_shape
419
+ for i in range(N_CALIBRATION_IMAGES):
420
+ # Tensorflow basic format : NHWC
421
+ img_in = np.random.randn(1, h, w, 3).astype('float32')
422
+ yield [img_in]
423
+
424
+
425
+ converter = tf.lite.TFLiteConverter.from_keras_model(model)
426
+
427
+ # quantized model
428
+ tflite_quant = args.output_path + '/yolov4-tiny_416_quant.tflite'
429
+
430
+ converter.optimizations = [tf.lite.Optimize.DEFAULT]
431
+ if args.images_path is not None:
432
+ converter.representative_dataset = representative_dataset
433
+ else: # Dummy dataset if no representative dataset is given
434
+ converter.representative_dataset = dummy_dataset
435
+ converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
436
+ converter.inference_input_type = tf.int8
437
+ converter.inference_output_type = tf.float32
438
+
439
+ tflite_model = converter.convert()
440
+ with open(tflite_quant, 'wb') as f:
441
+ f.write(tflite_model)
442
+
443
+
444
+ # float32 model
445
+ converter = tf.lite.TFLiteConverter.from_keras_model(model)
446
+
447
+ tflite_float = args.output_path + '/yolov4-tiny_416_float32.tflite'
448
+
449
+ tflite_model = converter.convert()
450
+ with open(tflite_float, 'wb') as f:
451
+ f.write(tflite_model)
recipe.sh ADDED
@@ -0,0 +1,35 @@
#!/usr/bin/env bash
# Copyright 2023-2024 NXP
# SPDX-License-Identifier: MIT

set -e

wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-tiny.weights
wget https://raw.githubusercontent.com/amikelive/coco-labels/master/coco-labels-2014_2017.txt


# tensorflow -> tflite
python3.8 -m venv env
source ./env/bin/activate

pip install --upgrade pip
pip install tensorflow==2.10.0
pip install Pillow

wget --no-check-certificate https://images.cocodataset.org/zips/val2017.zip
unzip val2017.zip

# convert model from darknet to tensorflow lite
python3.8 export_model.py --weights_path=./ --output_path=./ --images_path=val2017

# install vela
pip install numpy==1.20
pip install git+https://github.com/nxp-imx/ethos-u-vela.git@lf-6.1.22-2.0.0

vela --output-dir model_imx93 yolov4-tiny_416_quant.tflite

# cleanup
deactivate
rm -rf val2017 env
rm val2017.zip
rm yolov4-tiny.weights