henry000 committed on
Commit 46b5efe Β· 2 Parent(s): 60c4943 8fc6449

πŸ”€ [Merge] branch 'DATASET' into TRAIN

Files changed (50)
  1. .github/PULL_REQUEST_TEMPLATE/pull_request_template.md +8 -2
  2. .github/workflows/deploy.yaml +1 -1
  3. .github/workflows/develop.yaml +1 -1
  4. .readthedocs.yaml +5 -24
  5. README.md +3 -1
  6. demo/hf_demo.py +9 -9
  7. docs/0_get_start/0_quick_start.rst +71 -0
  8. docs/0_get_start/1_introduction.rst +66 -0
  9. docs/0_get_start/2_installations.rst +101 -0
  10. docs/1_tutorials/0_allIn1.rst +204 -0
  11. docs/1_tutorials/1_setup.rst +35 -0
  12. docs/1_tutorials/2_buildmodel.rst +62 -0
  13. docs/1_tutorials/3_dataset.rst +77 -0
  14. docs/1_tutorials/4_train.rst +55 -0
  15. docs/1_tutorials/5_inference.rst +20 -0
  16. docs/2_model_zoo/0_object_detection.rst +169 -0
  17. docs/2_model_zoo/1_segmentation.rst +11 -0
  18. docs/2_model_zoo/2_classification.rst +4 -0
  19. docs/3_custom/0_model.rst +12 -0
  20. docs/3_custom/1_data_augment.rst +4 -0
  21. docs/3_custom/2_loss.rst +2 -0
  22. docs/3_custom/3_task.rst +2 -0
  23. docs/4_deploy/1_deploy.rst +10 -0
  24. docs/4_deploy/2_onnx.rst +4 -0
  25. docs/4_deploy/3_tensorrt.rst +5 -0
  26. docs/5_features/0_small_object.rst +2 -0
  27. docs/5_features/1_version_convert.rst +2 -0
  28. docs/5_features/2_IPython.rst +2 -0
  29. docs/6_function_docs/0_solver.rst +12 -0
  30. docs/6_function_docs/1_tools.rst +4 -0
  31. docs/6_function_docs/2_module.rst +4 -0
  32. docs/6_function_docs/3_config.rst +188 -0
  33. docs/6_function_docs/4_dataloader.rst +8 -0
  34. docs/MODELS.md +0 -30
  35. docs/Makefile +20 -0
  36. docs/conf.py +50 -0
  37. docs/index.rst +90 -0
  38. docs/make.bat +35 -0
  39. docs/requirements.txt +6 -0
  40. examples/notebook_TensorRT.ipynb +15 -7
  41. examples/notebook_inference.ipynb +19 -2
  42. examples/notebook_smallobject.ipynb +15 -5
  43. examples/sample_inference.py +26 -16
  44. examples/sample_train.py +22 -16
  45. requirements-dev.txt +1 -0
  46. tests/test_tools/test_data_loader.py +1 -1
  47. yolo/__init__.py +2 -1
  48. yolo/config/config.py +3 -4
  49. yolo/config/dataset/coco.yaml +3 -0
  50. yolo/config/dataset/dev.yaml +3 -0
.github/PULL_REQUEST_TEMPLATE/pull_request_template.md CHANGED
@@ -11,7 +11,7 @@ assignees: ''

[Please include a summary of the changes and the related issue. (Just overwrite this session directly)]

- ## Type of change
+ ## Type of Change

Please delete options that are not relevant.

@@ -26,11 +26,17 @@ Please delete options that are not relevant.
- [ ] Code and files are well organized.
- [ ] All tests pass.
- [ ] New code is covered by tests.
+ - [ ] The pull request is directed to the corresponding topic branch.
- [ ] [Optional] We would be very happy if gitmoji :technologist: could be used to assist the commit message :speech_balloon:!

## Licensing:

- By submitting this pull request, I confirm that my contribution is made under the MIT License.
+ By submitting this pull request, I confirm that:
+
+ - [ ] My contribution is made under the MIT License.
+ - [ ] I have not included any code from questionable or non-compliant sources (GPL, AGPL, ... etc).
+ - [ ] I understand that all contributions to this repository must comply with the MIT License, and I promise that my contributions do not violate this license.
+ - [ ] I have not used any code or content from sources that conflict with the MIT License or are otherwise legally questionable.

## Additional Information
.github/workflows/deploy.yaml CHANGED
@@ -13,7 +13,7 @@ jobs:
    strategy:
      matrix:
        operating-system: [ubuntu-latest, macos-latest]
-         python-version: [3.8, '3.10', '3.12']
+         python-version: [3.8, '3.10']
      fail-fast: false

    steps:
.github/workflows/develop.yaml CHANGED
@@ -23,7 +23,7 @@ jobs:
        uses: actions/cache@v2
        with:
          path: ~/.cache/pip
-           key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
+           key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}-3.10
          restore-keys: |
            ${{ runner.os }}-pip-

.readthedocs.yaml CHANGED
@@ -1,32 +1,13 @@
- # .readthedocs.yaml
- # Read the Docs configuration file
- # See https://docs.readthedocs.io/en/stable/config-file/v2.html for details
-
- # Required
version: 2

- # Set the OS, Python version and other tools you might need
build:
  os: ubuntu-22.04
  tools:
    python: "3.12"
-   # You can also specify other tool versions:
-   # nodejs: "19"
-   # rust: "1.64"
-   # golang: "1.19"

- # Build documentation in the "docs/" directory with Sphinx
- # sphinx:
- #   configuration: docs/conf.py
+ sphinx:
+   configuration: docs/conf.py

- # Optionally build your docs in additional formats such as PDF and ePub
- # formats:
- #   - pdf
- #   - epub
-
- # Optional but recommended, declare the Python requirements required
- # to build your documentation
- # See https://docs.readthedocs.io/en/stable/guides/reproducible-builds.html
- # python:
- #   install:
- #     - requirements: docs/requirements.txt
+ python:
+   install:
+     - requirements: docs/requirements.txt
README.md CHANGED
@@ -1,9 +1,11 @@
# YOLO: Official Implementation of YOLOv9, YOLOv7

+ [![Documentation Status](https://readthedocs.org/projects/yolo-docs/badge/?version=latest)](https://yolo-docs.readthedocs.io/en/latest/?badge=latest)
![GitHub License](https://img.shields.io/github/license/WongKinYiu/YOLO)
![WIP](https://img.shields.io/badge/status-WIP-orange)
- ![](https://img.shields.io/github/actions/workflow/status/WongKinYiu/YOLO/deploy.yaml)

+ [![Developer Mode Build & Test](https://github.com/WongKinYiu/YOLO/actions/workflows/develop.yaml/badge.svg)](https://github.com/WongKinYiu/YOLO/actions/workflows/develop.yaml)
+ [![Deploy Mode Validation & Inference](https://github.com/WongKinYiu/YOLO/actions/workflows/deploy.yaml/badge.svg)](https://github.com/WongKinYiu/YOLO/actions/workflows/deploy.yaml)

[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/yolov9-learning-what-you-want-to-learn-using/real-time-object-detection-on-coco)](https://paperswithcode.com/sota/real-time-object-detection-on-coco)

demo/hf_demo.py CHANGED
@@ -11,7 +11,7 @@ from yolo import (
    AugmentationComposer,
    NMSConfig,
    PostProccess,
-     Vec2Box,
+     create_converter,
    create_model,
    draw_bboxes,
)
@@ -25,22 +25,22 @@ def load_model(model_name, device):
    model_cfg.model.auxiliary = {}
    model = create_model(model_cfg, True)
    model.to(device).eval()
-     return model
+     return model, model_cfg


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
- model = load_model(DEFAULT_MODEL, device)
- v2b = Vec2Box(model, IMAGE_SIZE, device)
- class_list = OmegaConf.load("yolo/config/general.yaml").class_list
+ model, model_cfg = load_model(DEFAULT_MODEL, device)
+ converter = create_converter(model_cfg.name, model, model_cfg.anchor, IMAGE_SIZE, device)
+ class_list = OmegaConf.load("yolo/config/dataset/coco.yaml").class_list

transform = AugmentationComposer([])


def predict(model_name, image, nms_confidence, nms_iou):
-     global DEFAULT_MODEL, model, device, v2b, class_list, post_proccess
+     global DEFAULT_MODEL, model, device, converter, class_list, post_proccess
    if model_name != DEFAULT_MODEL:
-         model = load_model(model_name, device)
-         v2b = Vec2Box(model, IMAGE_SIZE, device)
+         model, model_cfg = load_model(model_name, device)
+         converter = create_converter(model_cfg.name, model, model_cfg.anchor, IMAGE_SIZE, device)
        DEFAULT_MODEL = model_name

    image_tensor, _, rev_tensor = transform(image)
@@ -49,7 +49,7 @@ def predict(model_name, image, nms_confidence, nms_iou):
    rev_tensor = rev_tensor.to(device)[None]

    nms_config = NMSConfig(nms_confidence, nms_iou)
-     post_proccess = PostProccess(v2b, nms_config)
+     post_proccess = PostProccess(converter, nms_config)

    with torch.no_grad():
        predict = model(image_tensor)
docs/0_get_start/0_quick_start.rst ADDED
@@ -0,0 +1,71 @@
+ Quick Start
+ ===========
+
+ .. note::
+     We expect all customizations to be done primarily by passing arguments or modifying the YAML config files.
+     If more detailed modifications are needed, custom content should be modularized as much as possible to avoid extensive code modifications.
+
+ .. _QuickInstallYOLO:
+
+ Install YOLO
+ ------------
+
+ Clone the repository and install the dependencies:
+
+ .. code-block:: bash
+
+     git clone https://github.com/WongKinYiu/YOLO.git
+     cd YOLO
+     pip install -r requirements-dev.txt
+     # Make sure to work inside the cloned folder.
+
+ Alternatively, if you are only planning to make a simple change, install via pip:
+
+ .. code-block:: bash
+
+     pip install git+https://github.com/WongKinYiu/YOLO.git
+
+ **Note**: Most tasks are already included in ``yolo/lazy.py``, so you can run them with the prefix ``python yolo/lazy.py`` followed by arguments. If you installed via pip, replace ``python yolo/lazy.py`` with ``yolo`` in the following examples.
+
+ Train Model
+ -----------
+
+ To train the model, use the following command:
+
+ .. code-block:: bash
+
+     python yolo/lazy.py task=train
+
+     yolo task=train  # if installed via pip
+
+ - Override the ``dataset`` parameter to customize your dataset via a dataset config.
+ - Override the YOLO model by setting the ``model`` parameter to ``{v9-c, v9-m, ...}``.
+ - More details can be found at :ref:`Train Tutorials<Train>`.
+
+ For example:
+
+ .. code-block:: bash
+
+     python yolo/lazy.py task=train dataset=AYamlFilePath model=v9-m
+
+     yolo task=train dataset=AYamlFilePath model=v9-m  # if installed via pip
+
+ Inference & Deployment
+ ----------------------
+
+ Inference is the default task of ``yolo/lazy.py``. To run inference and deploy the model, use:
+
+ .. code-block:: bash
+
+     python yolo/lazy.py task.data.source=AnySource
+
+     yolo task.data.source=AnySource  # if installed via pip
+
+ More details can be found at :ref:`Inference Tutorials <Inference>`.
+
+ You can enable fast inference modes by adding the parameter ``task.fast_inference={onnx, trt, deploy}``:
+
+ - Theoretical acceleration following :ref:`YOLOv9 <Deploy>`.
+ - Hardware acceleration like :ref:`ONNX <ONNX>` and :ref:`TensorRT <TensorRT>` for optimized deployment.
docs/0_get_start/1_introduction.rst ADDED
@@ -0,0 +1,66 @@
+ What is YOLO
+ ============
+
+ ``YOLO`` (You Only Look Once) is a state-of-the-art, real-time object detection system. It is designed to predict bounding boxes and class probabilities for objects in an image with high accuracy and speed. YOLO models, including the latest YOLOv9, are known for their efficiency in detecting objects in a single forward pass through the network, making them highly suitable for real-time applications.
+
+ YOLOv9 introduces improvements in both architecture and loss functions to enhance prediction accuracy and inference speed.
+
+ Forward Process
+ ---------------
+
+ The forward process of YOLOv9 can be visualized as follows:
+
+ .. mermaid::
+
+     graph LR
+         subgraph YOLOv9
+             Auxiliary
+             AP["Auxiliary Prediction"]
+         end
+         BackBone-->FPN;
+         FPN-->PAN;
+         PAN-->MP["Main Prediction"];
+         BackBone-->Auxiliary;
+         Auxiliary-->AP;
+
+ - **BackBone**: Extracts features from the input image.
+ - **FPN (Feature Pyramid Network)**: Aggregates features at different scales.
+ - **PAN (Path Aggregation Network)**: Adds a bottom-up path that fuses the aggregated features before prediction.
+ - **Main Prediction**: The primary detection output.
+ - **Auxiliary Prediction**: Additional predictions to assist the main prediction.
+
+ Loss Function
+ -------------
+
+ The loss function of YOLOv9 combines several components to optimize the model's performance:
+
+ .. mermaid::
+
+     flowchart LR
+         gtb-->cls
+         gtb["Ground Truth"]-->iou
+         pdm-.->cls["Max Class"]
+         pdm["Main Prediction"]-.->iou["Closest IoU"]
+         pdm-.->anc["box in anchor"]
+         cls-->gt
+         iou-->gt["Matched GT Box"]
+         anc-.->gt
+
+         gt-->Liou["IoU Loss"]
+         pdm-->Liou
+         pdm-->Lbce
+         gt-->Lbce["BCE Loss"]
+         gt-->Ldfl["DFL Loss"]
+         pdm-->Ldfl
+
+         Lbce-->ML
+         Liou-->ML
+         Ldfl-->ML["Total Loss"]
+
+ - **Ground Truth**: The actual labels and bounding boxes in the dataset.
+ - **Main Prediction**: The model's predicted bounding boxes and class scores.
+ - **IoU (Intersection over Union)**: Measures the overlap between the predicted and ground truth boxes.
+ - **BCE (Binary Cross-Entropy) Loss**: Used for class prediction.
+ - **DFL (Distribution Focal Loss)**: Used for improving the precision of bounding box regression.
+
+ By optimizing these components, YOLOv9 aims to achieve high accuracy and robustness in object detection tasks.
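To make the "Total Loss" node in the diagram concrete, here is a minimal sketch of how such a composite objective is combined. The weight names and values are placeholders for illustration; the actual objective and its weights live in the task config (``loss.objective``), not in this snippet.

.. code-block:: python

    # Conceptual sketch only, not the repository's trainer code.
    import torch

    iou_loss = torch.tensor(0.8)  # placeholder IoU loss for one batch
    bce_loss = torch.tensor(0.4)  # placeholder BCE classification loss
    dfl_loss = torch.tensor(0.6)  # placeholder DFL regression loss

    w_iou, w_bce, w_dfl = 1.0, 1.0, 1.0  # placeholder weights
    total_loss = w_iou * iou_loss + w_bce * bce_loss + w_dfl * dfl_loss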
docs/0_get_start/2_installations.rst ADDED
@@ -0,0 +1,101 @@
+ Install YOLO
+ ============
+
+ This guide will help you set up YOLO on your machine.
+ We recommend starting with `GitHub Settings <#git-github>`_ for more flexible customization.
+ If you are planning to perform inference only or require only simple customization, you can choose to install via `PyPI <#pypi-pip-install>`_.
+
+ Torch Requirements
+ ------------------
+
+ The following summarizes the torch requirements for different operating systems and hardware configurations:
+
+ .. tabs::
+
+     .. tab:: Linux
+
+         .. tabs::
+
+             .. tab:: CUDA
+
+                 PyTorch: 1.12+
+
+             .. tab:: CPU
+
+                 PyTorch: 1.12+
+
+     .. tab:: MacOS
+
+         .. tabs::
+
+             .. tab:: MPS
+
+                 PyTorch: 2.2+
+
+             .. tab:: CPU
+
+                 PyTorch: 2.2+
+
+     .. tab:: Windows
+
+         .. tabs::
+
+             .. tab:: CUDA
+
+                 [WIP]
+
+             .. tab:: CPU
+
+                 [WIP]
+
+ Git & GitHub
+ ------------
+
+ First, clone the repository:
+
+ .. code-block:: bash
+
+     git clone https://github.com/WongKinYiu/YOLO.git
+
+ Alternatively, you can directly download the repository via this `link <https://github.com/WongKinYiu/YOLO/archive/refs/heads/main.zip>`_.
+
+ Next, install the required packages:
+
+ .. code-block:: bash
+
+     # For the minimal requirements, use:
+     pip install -r requirements.txt
+     # For a full installation, use:
+     pip install -r requirements-dev.txt
+
+ Moreover, if you plan to utilize ONNX or TensorRT, please follow :ref:`ONNX` and :ref:`TensorRT` for more installation details.
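Before moving on, a quick sanity check that the installed torch build matches the table above can save debugging time later. A minimal sketch (the MPS line only applies on MacOS):

.. code-block:: python

    import torch

    print(torch.__version__)                 # expect 1.12+ (2.2+ for MacOS MPS)
    print(torch.cuda.is_available())         # True if the CUDA build works
    mps = getattr(torch.backends, "mps", None)
    print(bool(mps and mps.is_available()))  # True if the MacOS MPS backend works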
+ PyPI (pip install)
+ ------------------
+
+ .. note::
+     Because the name :guilabel:`yolo` is already taken on PyPI, we are still determining the package name.
+     Currently, we provide an alternative way to install via the GitHub repository. Ensure your shell has `git` and `pip3` (or `pip`).
+
+ To install YOLO via GitHub:
+
+ .. code-block:: bash
+
+     pip install git+https://github.com/WongKinYiu/YOLO.git
+
+ Docker
+ ------
+
+ To run YOLO using NVIDIA Docker, you can pull the Docker image and run it with GPU support:
+
+ .. code-block:: bash
+
+     docker pull henrytsui000/yolo
+     docker run --gpus all -it henrytsui000/yolo
+
+ Make sure you have the NVIDIA Docker toolkit installed. For more details on setting up NVIDIA Docker, refer to the `NVIDIA Docker documentation <https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html>`_.
+
+ Conda
+ -----
+
+ We will publish it in the near future!
docs/1_tutorials/0_allIn1.rst ADDED
@@ -0,0 +1,204 @@
+ All In 1
+ ========
+
+ :file:`yolo.lazy` is a packaged file that includes the :guilabel:`training`, :guilabel:`validation`, and :guilabel:`inference` tasks.
+ For detailed function documentation, check out the IPython notebooks to learn how to import and use these functions.
+ The following sections break down the operations inside ``lazy``; each function can also be imported and called directly.
+
+ Contents: setup, build model, dataset, train, validation, inference.
+
+ Train Model
+ -----------
+
+ Before training starts, the task checks the batch size (and CUDA availability), the training time, the built model, and the dataset.
+
+ To train the model, you can run:
+
+ .. code-block:: bash
+
+     python yolo/lazy.py task=train
+
+ You can customize the training process by overriding the following common arguments:
+
+ - ``name``: :guilabel:`str`
+     The experiment name.
+
+ - ``model``: :guilabel:`str`
+     The model backbone; options include v9-c, v7, v9-e, etc. (see the model zoo).
+
+ - ``cpu_num``: :guilabel:`int`
+     Number of CPU workers (num_workers).
+
+ - ``out_path``: :guilabel:`Path`
+     The output path for saving models and logs.
+
+ - ``weight``: :guilabel:`Path | bool | None`
+     The path to pre-trained weights; ``False`` for training from scratch, ``None`` for default weights.
+
+ - ``use_wandb``: :guilabel:`bool`
+     Whether to use Weights and Biases for experiment tracking.
+
+ - ``use_TensorBoard``: :guilabel:`bool`
+     Whether to use TensorBoard for logging.
+
+ - ``image_size``: :guilabel:`int | [int, int]`
+     The input image size.
+
+ - ``+quiet``: :guilabel:`bool`
+     Optional; disables all output.
+
+ - ``task.epoch``: :guilabel:`int`
+     Total number of training epochs.
+
+ - ``task.data.batch_size``: :guilabel:`int`
+     The size of each batch (auto-batch sizing [WIP]).
+
+ Examples
+ ~~~~~~~~
+
+ To train a model with a specific batch size and image size, you can run:
+
+ .. code-block:: bash
+
+     python yolo/lazy.py task=train task.data.batch_size=12 image_size=1280
+
+ Multi-GPU Training with DDP
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ For multi-GPU training, we use Distributed Data Parallel (DDP) for efficient and scalable training.
+ DDP enables training the model with multiple GPUs, even when the GPUs are not on the same machine. For more details, you can refer to the `DDP tutorial <https://pytorch.org/tutorials/intermediate/ddp_tutorial.html>`_.
+
+ To train on multiple GPUs, replace the ``python`` command with ``torchrun --nproc_per_node=[GPU_NUM]``. The ``nproc_per_node`` argument specifies the number of GPUs to use.
+
+ .. tabs::
+
+     .. tab:: bash
+         .. code-block:: bash
+
+             torchrun --nproc_per_node=2 yolo/lazy.py task=train device=[0,1]
+
+     .. tab:: zsh
+         .. code-block:: bash
+
+             torchrun --nproc_per_node=2 yolo/lazy.py task=train device=\[0,1\]
+
+ Training on a Custom Dataset
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ To use the auto-download module, we suggest users construct the dataset config in the following format.
+ If the config file includes ``auto_download``, the model will automatically download the dataset when creating the dataloader.
+
+ Here is an example dataset config file:
+
+ .. literalinclude:: ../../yolo/config/dataset/dev.yaml
+     :language: YAML
+
+ Both of the following formats are acceptable:
+
+ - ``path``: :guilabel:`str`
+     The path to the dataset.
+
+ - ``train, validation``: :guilabel:`str`
+     The training and validation directory names under ``/images``. If using txt files as ground truth, these should also be the names under ``/labels/``.
+
+ - ``class_num``: :guilabel:`int`
+     The number of dataset classes.
+
+ - ``class_list``: :guilabel:`List[str]`
+     Optional; the list of class names, used only for visualizing the bounding box classes.
+
+ - ``auto_download``: :guilabel:`dict`
+     Optional; whether to auto-download the dataset.
+
+ The dataset should include labels or annotations, preferably in JSON format for compatibility with pycocotools during inference:
+
+ .. code-block:: text
+
+     DataSetName/
+     β”œβ”€β”€ annotations
+     β”‚   β”œβ”€β”€ train_json_name.json
+     β”‚   └── val_json_name.json
+     β”œβ”€β”€ labels/
+     β”‚   β”œβ”€β”€ train/
+     β”‚   β”‚   β”œβ”€β”€ AnyLabelName.txt
+     β”‚   β”‚   └── ...
+     β”‚   └── validation/
+     β”‚       └── ...
+     └── images/
+         β”œβ”€β”€ train/
+         β”‚   β”œβ”€β”€ AnyImageNameN.{png,jpg,jpeg}
+         β”‚   └── ...
+         └── validation/
+             └── ...
+
+ Validation Model
+ ----------------
+
+ During training, this block will be auto-executed. You may also run this task manually to generate a JSON file representing the predictions for a given validation dataset. If the validation set includes JSON annotations, it will run pycocotools for evaluation.
+
+ We recommend setting ``task.data.shuffle`` to ``False`` and turning off ``task.data.data_augment``.
+
+ You can customize the validation process by overriding the following arguments:
+
+ - ``task.nms.min_confidence``: :guilabel:`float`
+     The minimum confidence of model prediction.
+
+ - ``task.nms.min_iou``: :guilabel:`float`
+     The minimum IoU threshold for NMS (Non-Maximum Suppression).
+
+ Examples
+ ~~~~~~~~
+
+ .. tabs::
+
+     .. tab:: git-cloned
+         .. code-block:: bash
+
+             python yolo/lazy.py task=validation task.nms.min_iou=0.9
+
+     .. tab:: PyPI
+         .. code-block:: bash
+
+             yolo task=validation task.nms.min_iou=0.9
+
+ Model Inference
+ ---------------
+
+ .. note::
+     The ``dataset`` parameter shouldn't be overridden because the model requires the ``class_num`` of the dataset. If the classes have names, please provide the ``class_list``.
+
+ You can customize the inference process by overriding the following arguments:
+
+ - ``task.fast_inference``: :guilabel:`str`
+     Optional. Values can be ``onnx``, ``trt``, ``deploy``, or ``None``; ``deploy`` will detach the model's auxiliary head.
+
+ - ``task.data.source``: :guilabel:`str | Path | int`
+     This argument is auto-resolved and can be a webcam ID, an image folder path, or a video/image path.
+
+ - ``task.nms.min_confidence``: :guilabel:`float`
+     The minimum confidence of model prediction.
+
+ - ``task.nms.min_iou``: :guilabel:`float`
+     The minimum IoU threshold for NMS (Non-Maximum Suppression).
+
+ Examples
+ ~~~~~~~~
+
+ .. tabs::
+
+     .. tab:: git-cloned
+         .. code-block:: bash
+
+             python yolo/lazy.py model=v9-m task.nms.min_confidence=0.1 task.data.source=0 task.fast_inference=onnx
+
+     .. tab:: PyPI
+         .. code-block:: bash
+
+             yolo model=v9-m task.nms.min_confidence=0.1 task.data.source=0 task.fast_inference=onnx
docs/1_tutorials/1_setup.rst ADDED
@@ -0,0 +1,35 @@
+ Setup Config
+ ============
+
+ To set up your configuration, you will need to generate a configuration class based on :class:`~yolo.config.config.Config`, which can be achieved using `hydra <https://hydra.cc/>`_.
+ The configuration will include all the necessary settings for your ``task``, including general configuration, ``dataset`` information, and task-specific information (``train``, ``inference``, ``validation``).
+
+ Next, create the progress logger to handle the output and progress bar. This class is based on `rich <https://github.com/Textualize/rich>`_'s progress bar and customizes the logger (print function) using `loguru <https://loguru.readthedocs.io/>`_.
+
+ .. tabs::
+
+     .. tab:: decorator
+         .. code-block:: python
+
+             import hydra
+             from yolo import ProgressLogger
+             from yolo.config.config import Config
+
+             @hydra.main(config_path="config", config_name="config", version_base=None)
+             def main(cfg: Config):
+                 progress = ProgressLogger(cfg, exp_name=cfg.name)
+
+     .. tab:: initialize & compose
+         .. code-block:: python
+
+             from hydra import compose, initialize
+             from yolo import ProgressLogger
+             from yolo.config.config import Config
+
+             with initialize(config_path="config", version_base=None):
+                 cfg = compose(config_name="config", overrides=["task=train", "model=v9-c"])
+
+             progress = ProgressLogger(cfg, exp_name=cfg.name)
+
+ TODO: add a config overview
docs/1_tutorials/2_buildmodel.rst ADDED
@@ -0,0 +1,62 @@
+ Build Model
+ ===========
+
+ In YOLOv7 the raw prediction is anchor-based (``Anchor``), while YOLOv9 predicts a ``Vector``. The converter turns these raw predictions into bounding boxes.
+
+ The overall model flowchart is as follows:
+
+ .. mermaid::
+
+     flowchart LR
+         Input-->Model;
+         Model--Class-->NMS;
+         Model--Anc/Vec-->Converter;
+         Converter--Box-->NMS;
+         NMS-->Output;
+
+ Load Model
+ ~~~~~~~~~~
+
+ Using `create_model`, it will automatically create the :class:`~yolo.model.yolo.YOLO` model and load the provided weights.
+
+ Arguments:
+
+ - **model**: :class:`~yolo.config.config.ModelConfig`
+     The model configuration.
+ - **class_num**: :guilabel:`int`
+     The number of classes in the dataset, used for the YOLO prediction head.
+ - **weight_path**: :guilabel:`Path | bool`
+     The path to the model weights.
+
+     - If ``False``, weights are not loaded.
+     - If :guilabel:`True | None`, default weights are loaded.
+     - If a ``Path``, the model weights are loaded from the specified path.
+
+ .. code-block:: python
+
+     model = create_model(cfg.model, class_num=cfg.dataset.class_num, weight_path=cfg.weight)
+     model = model.to(device)
+
+ Deploy Model
+ ~~~~~~~~~~~~
+
+ In the deployment version, we remove the auxiliary branch of the model for fast inference. If the config includes ONNX or TensorRT, it will load/compile the model to ONNX or TensorRT format after removing the auxiliary branch.
+
+ .. code-block:: python
+
+     model = FastModelLoader(cfg).load_model(device)
+
+ Autoload Converter
+ ~~~~~~~~~~~~~~~~~~
+
+ Autoload the converter based on the model type (v7 or v9).
+
+ Arguments:
+
+ - **Model Name**: :guilabel:`str`
+     Used for choosing ``Vec2Box`` or ``Anc2Box``.
+ - **Anchor Config**: The anchor configuration, used to generate the anchor grid.
+ - **model**, **image_size**: Used for auto-detecting the anchor grid.
+
+ .. code-block:: python
+
+     converter = create_converter(cfg.model.name, model, cfg.model.anchor, cfg.image_size, device)
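Once the converter exists, decoding raw predictions into NMS-ready boxes follows the pattern used in ``examples/notebook_TensorRT.ipynb`` from this commit. A minimal sketch, assuming ``model`` and a preprocessed ``image`` tensor are prepared as in the sections above:

.. code-block:: python

    import torch
    from yolo import bbox_nms
    from yolo.config.config import NMSConfig

    with torch.no_grad():
        raw = model(image)                # image: [1, 3, H, W] tensor on the device
        predict = converter(raw["Main"])  # decode anchors/vectors into boxes
    # Same call pattern as the notebooks: class scores and boxes go into NMS
    predict_box = bbox_nms(predict[0], predict[2], NMSConfig(0.5, 0.5))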
docs/1_tutorials/3_dataset.rst ADDED
@@ -0,0 +1,77 @@
+ Create Dataset
+ ==============
+
+ In this section, we will prepare the dataset and create a dataloader.
+
+ Overall, the dataloader can be created by:
+
+ .. code-block:: python
+
+     from yolo import create_dataloader
+     dataloader = create_dataloader(cfg.task.data, cfg.dataset, cfg.task.task, use_ddp)
+
+ For inference, the dataset will be handled by :class:`~yolo.tools.data_loader.StreamDataLoader`, while for training and validation, it will be handled by :class:`~yolo.tools.data_loader.YoloDataLoader`.
+
+ The input arguments are:
+
+ - **DataConfig**: :class:`~yolo.config.config.DataConfig`, the relevant configuration for the dataloader.
+ - **DatasetConfig**: :class:`~yolo.config.config.DatasetConfig`, the relevant configuration for the dataset.
+ - **task_name**: :guilabel:`str`, the task name, which can be `inference`, `validation`, or `train`.
+ - **use_ddp**: :guilabel:`bool`, whether to use DDP (Distributed Data Parallel). Default is `False`.
+
+ Train and Validation
+ --------------------
+
+ Dataloader Return Type
+ ~~~~~~~~~~~~~~~~~~~~~~
+
+ For each iteration, the return type includes (see the iteration sketch below):
+
+ - **batch_size**: the size of each batch, used to calculate the batch-average loss.
+ - **images**: the input images.
+ - **targets**: the ground truth of the images according to the task.
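A minimal iteration sketch, assuming the loader yields the three fields above in this order; the model and loss calls are placeholders for the trainer's real logic:

.. code-block:: python

    for batch_size, images, targets in dataloader:
        predictions = model(images.to(device))
        # ... compute the loss against `targets`, normalized by `batch_size`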
+ Auto Download Dataset
+ ~~~~~~~~~~~~~~~~~~~~~
+
+ The dataset will be auto-downloaded if the user provides the `auto_download` configuration. For example, if the configuration is as follows:
+
+ .. literalinclude:: ../../yolo/config/dataset/mock.yaml
+     :language: YAML
+
+ First, it will download and unzip the dataset from `{prefix}/{postfix}`, and verify that the dataset has `{file_num}` files.
+
+ Once the dataset is verified, it will generate `{train, validation}.cache` in Tensor format, which accelerates the dataset preparation speed.
+
+ Inference
+ ---------
+
+ In streaming mode, the model will infer the most recent frame and draw the bounding boxes by default; given the save flag, it will also save the image. In other modes, it will save the predictions to `runs/inference/{exp_name}/outputs/` by default.
+
+ Dataloader Return Type
+ ~~~~~~~~~~~~~~~~~~~~~~
+
+ For each iteration, the return type of `StreamDataLoader` includes:
+
+ - **images**: tensor, the preprocessed input images for the model.
+ - **rev_tensor**: tensor, the reverse tensor for reverting the bounding boxes and images to the input shape.
+ - **origin_frame**: tensor, the original input image.
+
+ Input Type
+ ~~~~~~~~~~
+
+ - **Stream Input**:
+
+     - **webcam**: :guilabel:`int`, ID of the webcam, for example, 0, 1.
+     - **rtmp**: :guilabel:`str`, RTMP address.
+
+ - **Single Source**:
+
+     - **image**: :guilabel:`Path`, path to image files (`jpeg`, `jpg`, `png`, `tiff`).
+     - **video**: :guilabel:`Path`, path to video files (`mp4`).
+
+ - **Folder**:
+
+     - **folder of images**: :guilabel:`Path`, the relative or absolute path to the folder containing images.
docs/1_tutorials/4_train.rst ADDED
@@ -0,0 +1,55 @@
+ Train & Validation
+ ==================
+
+ Training Model
+ --------------
+
+ To train a model, the :class:`~yolo.tools.solver.ModelTrainer` can help manage the training process. Initialize the :class:`~yolo.tools.solver.ModelTrainer` and use the :func:`~yolo.tools.solver.ModelTrainer.solve` function to start the training.
+
+ Before starting the training, don't forget to start the progress logger to enable logging the process status. This will also enable `Weights & Biases (wandb) <https://wandb.ai/site>`_ or TensorBoard if configured.
+
+ .. code-block:: python
+
+     from yolo import ModelTrainer
+     solver = ModelTrainer(cfg, model, converter, progress, device, use_ddp)
+     progress.start()
+     solver.solve(dataloader)
+
+ Training Diagram
+ ~~~~~~~~~~~~~~~~
+
+ The following diagram illustrates the training process:
+
+ .. mermaid::
+
+     flowchart LR
+         subgraph TS["trainer.solve"]
+             subgraph TE["train one epoch"]
+                 subgraph "train one batch"
+                     backpropagation-->TF[forward]
+                     TF-->backpropagation
+                 end
+             end
+             subgraph validator.solve
+                 VC["calculate mAP"]-->VF[forward]
+                 VF[forward]-->VC
+             end
+         end
+         TE-->validator.solve
+         validator.solve-->TE
+
+ Validation Model
+ ----------------
+
+ To validate the model performance, we follow a similar approach to the training process, using :class:`~yolo.tools.solver.ModelValidator`.
+
+ .. code-block:: python
+
+     from yolo import ModelValidator
+     solver = ModelValidator(cfg, model, converter, progress, device, use_ddp)
+     progress.start()
+     solver.solve(dataloader)
+
+ The :class:`~yolo.tools.solver.ModelValidator` class helps manage the validation process, ensuring that the model's performance is evaluated accurately.
+
+ .. note:: The original training process already includes the validation phase. Call this separately if you want to run the validation again after the training is completed.
docs/1_tutorials/5_inference.rst ADDED
@@ -0,0 +1,20 @@
+ Inference
+ =========
+
+ Inference Video
+ ---------------
+
+ Inference Image
+ ---------------
+
+ The inference task is driven by its task config, for example:
+
+ .. code-block:: yaml
+
+     task: inference
+
+     fast_inference:  # onnx, trt, deploy or Empty
+     data:
+         source: demo/images/inference/image.png
+         image_size: ${image_size}
+         data_augment: {}
+     nms:
+         min_confidence: 0.5
+         min_iou: 0.5
+     # save_predict: True
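For running inference from Python rather than the CLI, the pattern from ``examples/notebook_inference.ipynb`` in this commit can be adapted. A sketch; ``IMAGE_PATH``, ``CLASS_NUM``, the relative ``config_path``, and the image loading are assumptions to adjust for your setup:

.. code-block:: python

    import torch
    from hydra import compose, initialize
    from PIL import Image

    from yolo import AugmentationComposer, PostProccess, create_converter, create_model, draw_bboxes
    from yolo.config.config import Config

    IMAGE_PATH = "demo/images/inference/image.png"  # assumption: any image path
    CLASS_NUM = 80
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    with initialize(config_path="../yolo/config", version_base=None):
        cfg: Config = compose(config_name="config", overrides=["task=inference", f"task.data.source={IMAGE_PATH}"])

    model = create_model(cfg.model, class_num=CLASS_NUM).to(device)
    transform = AugmentationComposer([], cfg.image_size)
    converter = create_converter(cfg.model.name, model, cfg.model.anchor, cfg.image_size, device)
    post_proccess = PostProccess(converter, cfg.task.nms)

    pil_image = Image.open(IMAGE_PATH)
    image, _, rev_tensor = transform(pil_image, torch.zeros(0, 5))
    with torch.no_grad():
        predict = model(image.to(device)[None])
        pred_bbox = post_proccess(predict, rev_tensor.to(device)[None])
    draw_bboxes(pil_image, pred_bbox, idx2label=cfg.dataset.class_list)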
docs/2_model_zoo/0_object_detection.rst ADDED
@@ -0,0 +1,169 @@
+ Object Detection
+ ================
+
+ YOLOv7
+ ~~~~~~
+
+ .. list-table::
+     :header-rows: 1
+
+     * - Model
+       - State
+       - Test Size
+       - :math:`AP^{val}`
+       - :math:`AP_{50}^{val}`
+       - :math:`AP_{75}^{val}`
+       - Param.
+       - FLOPs
+     * - `YOLOv7 <https://github.com/WongKinYiu/YOLO/releases/download/v1.0-alpha/v7.pt>`_
+       - πŸ”§
+       - 640
+       - **51.4%**
+       - **69.7%**
+       - **55.9%**
+       -
+       -
+     * - `YOLOv7-X <URL>`_
+       - πŸ”§
+       - 640
+       - **53.1%**
+       - **71.2%**
+       - **57.8%**
+       -
+       -
+     * - `YOLOv7-W6 <URL>`_
+       - πŸ”§
+       - 1280
+       - **54.9%**
+       - **72.6%**
+       - **60.1%**
+       -
+       -
+     * - `YOLOv7-E6 <URL>`_
+       - πŸ”§
+       - 1280
+       - **56.0%**
+       - **73.5%**
+       - **61.2%**
+       -
+       -
+     * - `YOLOv7-D6 <URL>`_
+       - πŸ”§
+       - 1280
+       - **56.6%**
+       - **74.0%**
+       - **61.8%**
+       -
+       -
+     * - `YOLOv7-E6E <URL>`_
+       - πŸ”§
+       - 1280
+       - **56.8%**
+       - **74.4%**
+       - **62.1%**
+       -
+       -
+
+ YOLOv9
+ ~~~~~~
+
+ .. list-table::
+     :header-rows: 1
+
+     * - Model
+       - State
+       - Test Size
+       - :math:`AP^{val}`
+       - :math:`AP_{50}^{val}`
+       - :math:`AP_{75}^{val}`
+       - Param.
+       - FLOPs
+     * - `YOLOv9-T <https://github.com/WongKinYiu/YOLO/releases/download/v1.0-alpha/v9-t.pt>`_
+       - πŸ”§
+       - 640
+       -
+       -
+       -
+       -
+       -
+     * - `YOLOv9-S <https://github.com/WongKinYiu/YOLO/releases/download/v1.0-alpha/v9-s.pt>`_
+       - βœ…
+       - 640
+       - **46.8%**
+       - **63.4%**
+       - **50.7%**
+       - **7.1M**
+       - **26.4G**
+     * - `YOLOv9-M <https://github.com/WongKinYiu/YOLO/releases/download/v1.0-alpha/v9-m.pt>`_
+       - βœ…
+       - 640
+       - **51.4%**
+       - **68.1%**
+       - **56.1%**
+       - **20.0M**
+       - **76.3G**
+     * - `YOLOv9-C <https://github.com/WongKinYiu/YOLO/releases/download/v1.0-alpha/v9-c.pt>`_
+       - βœ…
+       - 640
+       - **53.0%**
+       - **70.2%**
+       - **57.8%**
+       - **25.3M**
+       - **102.1G**
+     * - `YOLOv9-E <https://github.com/WongKinYiu/YOLO/releases/download/v1.0-alpha/v9-e.pt>`_
+       - πŸ”§
+       - 640
+       - **55.6%**
+       - **72.8%**
+       - **60.6%**
+       - **57.3M**
+       - **189.0G**
+
+ .. mermaid::
+
+     graph LR
+         subgraph BackBone
+             B1-->B2;
+             B2-->B3;
+             B3-->B4;
+             B4-->B5;
+         end
+
+         subgraph FPN
+             B3-->N3;
+             B4-->N4;
+             B5-->N5;
+             N5-->N4;
+             N4-->N3;
+         end
+
+         subgraph PAN
+             P3-->P4;
+             P4-->P5;
+             N3-->P3;
+             N4-->P4;
+             N5-->P5;
+         end
+
+         P3-->Main_Head;
+         P4-->Main_Head;
+         P5-->Main_Head;
+
+         subgraph Aux
+             B3-->R3;
+             B4-->R4;
+             B5-->R5;
+             R3-->A3;
+             R4-->A3;
+             R4-->A4;
+             R5-->A3;
+             R5-->A4;
+             R5-->A5;
+         end
+         A3-->Auxiliary_Head;
+         A4-->Auxiliary_Head;
+         A5-->Auxiliary_Head;
docs/2_model_zoo/1_segmentation.rst ADDED
@@ -0,0 +1,11 @@
+ Segmentations
+ =============
+
+ .. _YOLOv7-seg:
+
+ YOLOv7
+ ------
+
+ .. _YOLOv9-seg:
+
+ YOLOv9
+ ------
docs/2_model_zoo/2_classification.rst ADDED
@@ -0,0 +1,4 @@
+ Classification
+ ==============
+
+ [WIP]
docs/3_custom/0_model.rst ADDED
@@ -0,0 +1,12 @@
+ Model
+ =====
+
+ Modified Architecture
+ ---------------------
+
+ Modified Model Module
+ ---------------------
docs/3_custom/1_data_augment.rst ADDED
@@ -0,0 +1,4 @@
+ .. _DataAugment:
+
+ Data Augment
+ ============
docs/3_custom/2_loss.rst ADDED
@@ -0,0 +1,2 @@
+ Loss Function
+ =============
docs/3_custom/3_task.rst ADDED
@@ -0,0 +1,2 @@
+ Custom Task
+ ===========
docs/4_deploy/1_deploy.rst ADDED
@@ -0,0 +1,10 @@
+ .. _Deploy:
+
+ Deploy Model
+ ============
+
+ Deploy YOLOv9
+ -------------
+
+ Deploy YOLOv7
+ -------------
docs/4_deploy/2_onnx.rst ADDED
@@ -0,0 +1,4 @@
+ .. _ONNX:
+
+ Compile to ONNX
+ ===============
docs/4_deploy/3_tensorrt.rst ADDED
@@ -0,0 +1,5 @@
+ .. _TensorRT:
+
+ Compile to TensorRT
+ ===================
docs/5_features/0_small_object.rst ADDED
@@ -0,0 +1,2 @@
+ Small Object
+ ============
docs/5_features/1_version_convert.rst ADDED
@@ -0,0 +1,2 @@
+ Version Convert
+ ===============
docs/5_features/2_IPython.rst ADDED
@@ -0,0 +1,2 @@
+ IPython
+ =======
docs/6_function_docs/0_solver.rst ADDED
@@ -0,0 +1,12 @@
+ Solver
+ ======
+
+ .. automodule:: yolo.tools.solver
+     :members:
+     :undoc-members:
+     :show-inheritance:
+
+ .. automodule:: yolo.utils.bounding_box_utils
+     :members:
+     :undoc-members:
+     :show-inheritance:
docs/6_function_docs/1_tools.rst ADDED
@@ -0,0 +1,4 @@
+ .. _Tools:
+
+ Useful Tools
+ ============
docs/6_function_docs/2_module.rst ADDED
@@ -0,0 +1,4 @@
+ .. _Module:
+
+ Model Module
+ ============
docs/6_function_docs/3_config.rst ADDED
@@ -0,0 +1,188 @@
+ Config
+ ======
+
+ .. autoclass:: yolo.config.config.Config
+     :members:
+     :undoc-members:
+
+ .. automodule:: yolo.config.config
+     :members:
+     :undoc-members:
+
+ .. mermaid::
+
+     classDiagram
+         class AnchorConfig {
+             List~int~ strides
+             Optional~int~ reg_max
+             Optional~int~ anchor_num
+             List~List~int~~ anchor
+         }
+
+         class LayerConfig {
+             Dict args
+             Union~List~int~~ source
+             str tags
+         }
+
+         class BlockConfig {
+             List~Dict~LayerConfig~~ block
+         }
+
+         class ModelConfig {
+             Optional~str~ name
+             AnchorConfig anchor
+             Dict~BlockConfig~ model
+         }
+
+         AnchorConfig --> ModelConfig
+         LayerConfig --> BlockConfig
+         BlockConfig --> ModelConfig
+
+ .. mermaid::
+
+     classDiagram
+         class DownloadDetail {
+             str url
+             int file_size
+         }
+
+         class DownloadOptions {
+             Dict~DownloadDetail~ details
+         }
+
+         class DatasetConfig {
+             str path
+             int class_num
+             List~str~ class_list
+             Optional~DownloadOptions~ auto_download
+         }
+
+         class DataConfig {
+             bool shuffle
+             int batch_size
+             bool pin_memory
+             int cpu_num
+             List~int~ image_size
+             Dict~int~ data_augment
+             Optional~Union~str~~ source
+         }
+
+         DownloadDetail --> DownloadOptions
+         DownloadOptions --> DatasetConfig
+
+ .. mermaid::
+
+     classDiagram
+         class OptimizerArgs {
+             float lr
+             float weight_decay
+         }
+
+         class OptimizerConfig {
+             str type
+             OptimizerArgs args
+         }
+
+         class MatcherConfig {
+             str iou
+             int topk
+             Dict~str~ factor
+         }
+
+         class LossConfig {
+             Dict~str~ objective
+             Union~bool~ aux
+             MatcherConfig matcher
+         }
+
+         class SchedulerConfig {
+             str type
+             Dict~str~ warmup
+             Dict~str~ args
+         }
+
+         class EMAConfig {
+             bool enabled
+             float decay
+         }
+
+         class TrainConfig {
+             str task
+             int epoch
+             DataConfig data
+             OptimizerConfig optimizer
+             LossConfig loss
+             SchedulerConfig scheduler
+             EMAConfig ema
+             ValidationConfig validation
+         }
+
+         class NMSConfig {
+             int min_confidence
+             int min_iou
+         }
+
+         class InferenceConfig {
+             str task
+             NMSConfig nms
+             DataConfig data
+             Optional~None~ fast_inference
+             bool save_predict
+         }
+
+         class ValidationConfig {
+             str task
+             NMSConfig nms
+             DataConfig data
+         }
+
+         OptimizerArgs --> OptimizerConfig
+         OptimizerConfig --> TrainConfig
+         MatcherConfig --> LossConfig
+         LossConfig --> TrainConfig
+         SchedulerConfig --> TrainConfig
+         EMAConfig --> TrainConfig
+         NMSConfig --> InferenceConfig
+         NMSConfig --> ValidationConfig
+
+ .. mermaid::
+
+     classDiagram
+         class GeneralConfig {
+             str name
+             Union~str~ device
+             int cpu_num
+             List~int~ class_idx_id
+             List~int~ image_size
+             str out_path
+             bool exist_ok
+             int lucky_number
+             bool use_wandb
+             bool use_TensorBoard
+             Optional~str~ weight
+         }
+
+ .. mermaid::
+
+     classDiagram
+         class Config {
+             Union~ValidationConfig~ task
+             DatasetConfig dataset
+             ModelConfig model
+             GeneralConfig model
+         }
+
+         DatasetConfig --> Config
+         DataConfig --> TrainConfig
+         DataConfig --> InferenceConfig
+         DataConfig --> ValidationConfig
+         InferenceConfig --> Config
+         ValidationConfig --> Config
+         TrainConfig --> Config
+         GeneralConfig --> Config
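To see how these dataclasses are populated in practice, the hydra pattern from the setup tutorial applies here as well. A sketch; the relative ``config_path`` is an assumption that depends on where the snippet is run from:

.. code-block:: python

    from hydra import compose, initialize
    from yolo.config.config import Config

    with initialize(config_path="../../yolo/config", version_base=None):
        cfg: Config = compose(config_name="config", overrides=["task=validation", "task.nms.min_iou=0.9"])

    # Dotted CLI overrides map directly onto the nested dataclasses above.
    print(cfg.task.nms.min_iou)   # 0.9
    print(cfg.dataset.class_num)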
docs/6_function_docs/4_dataloader.rst ADDED
@@ -0,0 +1,8 @@
+ Dataloader
+ ==========
+
+ .. automodule:: yolo.tools.data_loader
+     :members:
+     :undoc-members:
docs/MODELS.md DELETED
@@ -1,30 +0,0 @@
- # YOLO Model Zoo
-
- Welcome to the YOLOv9 Model Zoo! Here, you will find a variety of pre-trained models tailored to different use cases and performance needs. Each model comes with detailed information about its training regime, performance metrics, and usage instructions.
-
- ## Standard Models
-
- These models are trained on common datasets like COCO and provide a balance between speed and accuracy.
-
- | Model | Support? |Test Size | AP<sup>val</sup> | AP<sub>50</sub><sup>val</sup> | AP<sub>75</sub><sup>val</sup> | Param. | FLOPs |
- | :-- | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
- | [**YOLOv9-S**]() |βœ… | 640 | **46.8%** | **63.4%** | **50.7%** | **7.1M** | **26.4G** |
- | [**YOLOv9-M**]() |βœ… | 640 | **51.4%** | **68.1%** | **56.1%** | **20.0M** | **76.3G** |
- | [**YOLOv9-C**]() |βœ… | 640 | **53.0%** | **70.2%** | **57.8%** | **25.3M** | **102.1G** |
- | [**YOLOv9-E**]() | πŸ”§ | 640 | **55.6%** | **72.8%** | **60.6%** | **57.3M** | **189.0G** |
- | | | | | | | |
- | [**YOLOv7**]() |πŸ”§ | 640 | **51.4%** | **69.7%** | **55.9%** |
- | [**YOLOv7-X**]() |πŸ”§ | 640 | **53.1%** | **71.2%** | **57.8%** |
- | [**YOLOv7-W6**]() | πŸ”§ | 1280 | **54.9%** | **72.6%** | **60.1%** |
- | [**YOLOv7-E6**]() | πŸ”§ | 1280 | **56.0%** | **73.5%** | **61.2%** |
- | [**YOLOv7-D6**]() | πŸ”§ | 1280 | **56.6%** | **74.0%** | **61.8%** |
- | [**YOLOv7-E6E**]() | πŸ”§ | 1280 | **56.8%** | **74.4%** | **62.1%** |
-
- ## Download and Usage Instructions
-
- To use these models, download them from the links provided and use the following command to run detection:
-
- ```bash
- $yolo detect weights=path/to/model.pt img=640 conf=0.25 source=your_image.jpg
- ```
docs/Makefile ADDED
@@ -0,0 +1,20 @@
+ # Minimal makefile for Sphinx documentation
+ #
+
+ # You can set these variables from the command line, and also
+ # from the environment for the first two.
+ SPHINXOPTS    ?=
+ SPHINXBUILD   ?= sphinx-build
+ SOURCEDIR     = .
+ BUILDDIR      = _build
+
+ # Put it first so that "make" without argument is like "make help".
+ help:
+ 	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+
+ .PHONY: help Makefile
+
+ # Catch-all target: route all unknown targets to Sphinx using the new
+ # "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
+ %: Makefile
+ 	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
docs/conf.py ADDED
@@ -0,0 +1,50 @@
+ # Configuration file for the Sphinx documentation builder.
+ #
+ # For the full list of built-in configuration values, see the documentation:
+ # https://www.sphinx-doc.org/en/master/usage/configuration.html
+
+ # -- Project information -----------------------------------------------------
+ # https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information
+
+ project = "YOLO-docs"
+ copyright = "2024, Kin-Yiu, Wong and Hao-Tang, Tsui"
+ author = "Kin-Yiu, Wong and Hao-Tang, Tsui"
+
+ # -- General configuration ---------------------------------------------------
+ # https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
+
+ import os
+ import sys
+
+ sys.path.insert(0, os.path.abspath(".."))
+
+ extensions = [
+     "sphinx_rtd_theme",
+     "sphinx_tabs.tabs",
+     "sphinxcontrib.mermaid",
+     "sphinx.ext.autodoc",
+     "sphinx.ext.autosectionlabel",
+     "sphinx.ext.viewcode",
+     "sphinx.ext.napoleon",
+     "linuxdoc.rstFlatTable",
+     "myst_parser",
+ ]
+
+ myst_enable_extensions = [
+     "dollarmath",
+     "amsmath",
+     "deflist",
+ ]
+ html_theme = "sphinx_rtd_theme"
+ html_theme_options = {
+     "sticky_navigation": False,
+ }
+
+ templates_path = ["_templates"]
+ exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]
+
+
+ # -- Options for HTML output -------------------------------------------------
+ # https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output
+
+ html_static_path = ["_static"]
docs/index.rst ADDED
@@ -0,0 +1,90 @@
+ YOLO documentation
+ ==================
+
+ Introduction
+ ------------
+
+ YOLO (You Only Look Once) is a state-of-the-art, real-time object detection system that is designed for both efficiency and accuracy. This documentation provides comprehensive guidance on how to set up, configure, and effectively use YOLO for object detection tasks.
+
+ **Note: This project and some sections of this documentation are currently a work in progress.**
+
+ Project Features
+ ----------------
+
+ - **Real-time Processing**: YOLO can process images in real-time with high accuracy, making it suitable for applications that require instant detection.
+ - **Multitasking Capabilities**: Our enhanced version of YOLO supports multitasking, allowing it to handle multiple object detection tasks simultaneously.
+ - **Open Source**: YOLO is open source, released under the MIT License, encouraging a broad community of developers to contribute and build upon the existing framework.
+
+ Documentation Contents
+ ----------------------
+
+ Explore our documentation:
+
+ .. toctree::
+     :maxdepth: 1
+     :caption: Get Started
+
+     0_get_start/0_quick_start
+     0_get_start/1_introduction
+     0_get_start/2_installations
+
+ .. toctree::
+     :maxdepth: 1
+     :caption: Tutorials
+
+     1_tutorials/0_allIn1
+     1_tutorials/1_setup
+     1_tutorials/2_buildmodel
+     1_tutorials/3_dataset
+     1_tutorials/4_train
+     1_tutorials/5_inference
+
+ .. toctree::
+     :maxdepth: 1
+     :caption: Model Zoo
+
+     2_model_zoo/0_object_detection
+     2_model_zoo/1_segmentation
+     2_model_zoo/2_classification
+
+ .. toctree::
+     :maxdepth: 1
+     :caption: Custom YOLO
+
+     3_custom/0_model
+     3_custom/1_data_augment
+     3_custom/2_loss
+     3_custom/3_task
+
+ .. toctree::
+     :maxdepth: 1
+     :caption: Deploy
+
+     4_deploy/1_deploy
+     4_deploy/2_onnx
+     4_deploy/3_tensorrt
+
+ .. toctree::
+     :maxdepth: 1
+     :caption: Features
+
+     5_features/0_small_object
+     5_features/1_version_convert
+     5_features/2_IPython
+
+ .. toctree::
+     :maxdepth: 1
+     :caption: Function Docs
+
+     6_function_docs/0_solver
+     6_function_docs/1_tools
+     6_function_docs/2_module
+
+ License
+ -------
+
+ YOLO is provided under the MIT License, which allows extensive freedom for reuse and distribution. See the LICENSE file for full license text.
docs/make.bat ADDED
@@ -0,0 +1,35 @@
+ @ECHO OFF
+
+ pushd %~dp0
+
+ REM Command file for Sphinx documentation
+
+ if "%SPHINXBUILD%" == "" (
+ 	set SPHINXBUILD=sphinx-build
+ )
+ set SOURCEDIR=.
+ set BUILDDIR=_build
+
+ %SPHINXBUILD% >NUL 2>NUL
+ if errorlevel 9009 (
+ 	echo.
+ 	echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
+ 	echo.installed, then set the SPHINXBUILD environment variable to point
+ 	echo.to the full path of the 'sphinx-build' executable. Alternatively you
+ 	echo.may add the Sphinx directory to PATH.
+ 	echo.
+ 	echo.If you don't have Sphinx installed, grab it from
+ 	echo.https://www.sphinx-doc.org/
+ 	exit /b 1
+ )
+
+ if "%1" == "" goto help
+
+ %SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
+ goto end
+
+ :help
+ %SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
+
+ :end
+ popd
docs/requirements.txt ADDED
@@ -0,0 +1,6 @@
+ myst-parser
+ linuxdoc
+ sphinx
+ sphinx-tabs
+ sphinx_rtd_theme
+ sphinxcontrib-mermaid
examples/notebook_TensorRT.ipynb CHANGED
@@ -18,7 +18,15 @@
  "project_root = Path().resolve().parent\n",
  "sys.path.append(str(project_root))\n",
  "\n",
- "from yolo import AugmentationComposer, bbox_nms, create_model, custom_logger, draw_bboxes, Vec2Box\n",
+ "from yolo import (\n",
+ "    AugmentationComposer, \n",
+ "    bbox_nms, \n",
+ "    create_model, \n",
+ "    custom_logger, \n",
+ "    create_converter,\n",
+ "    draw_bboxes, \n",
+ "    Vec2Box\n",
+ ")\n",
  "from yolo.config.config import NMSConfig"
  ]
  },
@@ -49,6 +57,8 @@
  "metadata": {},
  "outputs": [],
  "source": [
+ "with open(MODEL_CONFIG) as stream:\n",
+ "    cfg_model = OmegaConf.load(stream)\n",
  "if os.path.exists(TRT_WEIGHT_PATH):\n",
  "    from torch2trt import TRTModule\n",
  "\n",
@@ -57,8 +67,6 @@
  "else:\n",
  "    from torch2trt import torch2trt\n",
  "\n",
- "    with open(MODEL_CONFIG) as stream:\n",
- "        cfg_model = OmegaConf.load(stream)\n",
  "\n",
  "    model = create_model(cfg_model, weight_path=WEIGHT_PATH)\n",
  "    model = model.to(device).eval()\n",
@@ -70,7 +78,7 @@
  "    logger.info(f\"πŸ“₯ TensorRT model saved to oonx.pt\")\n",
  "\n",
  "transform = AugmentationComposer([], IMAGE_SIZE)\n",
- "vec2box = Vec2Box(model_trt, IMAGE_SIZE, device)\n"
+ "converter = create_converter(cfg_model.name, model_trt, cfg_model.anchor, IMAGE_SIZE, device)\n"
  ]
  },
  {
@@ -79,7 +87,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "image, bbox = transform(image, torch.zeros(0, 5))\n",
+ "image, bbox, rev_tensor = transform(image, torch.zeros(0, 5))\n",
  "image = image.to(device)[None]"
  ]
  },
@@ -91,7 +99,7 @@
  "source": [
  "with torch.no_grad():\n",
  "    predict = model_trt(image)\n",
- "    predict = vec2box(predict[\"Main\"])\n",
+ "    predict = converter(predict[\"Main\"])\n",
  "predict_box = bbox_nms(predict[0], predict[2], NMSConfig(0.5, 0.5))\n",
  "draw_bboxes(image, predict_box)"
  ]
@@ -122,7 +130,7 @@
  "name": "python",
  "nbconvert_exporter": "python",
  "pygments_lexer": "ipython3",
- "version": "3.1.undefined"
+ "version": "3.10.14"
  }
  },
  "nbformat": 4,
examples/notebook_inference.ipynb CHANGED
@@ -1,5 +1,15 @@
  {
  "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%load_ext autoreload\n",
+ "%autoreload 2"
+ ]
+ },
  {
  "cell_type": "code",
  "execution_count": null,
@@ -35,7 +45,7 @@
  "source": [
  "CONFIG_PATH = \"../yolo/config\"\n",
  "CONFIG_NAME = \"config\"\n",
- "MODEL = \"v7-base\"\n",
+ "MODEL = \"v9-c\"\n",
  "\n",
  "DEVICE = 'cuda:0'\n",
  "CLASS_NUM = 80\n",
@@ -54,7 +64,9 @@
  "with initialize(config_path=CONFIG_PATH, version_base=None, job_name=\"notebook_job\"):\n",
  "    cfg: Config = compose(config_name=CONFIG_NAME, overrides=[\"task=inference\", f\"task.data.source={IMAGE_PATH}\", f\"model={MODEL}\"])\n",
  "    model = create_model(cfg.model, class_num=CLASS_NUM).to(device)\n",
+ "\n",
  "    transform = AugmentationComposer([], cfg.image_size)\n",
+ "\n",
  "    converter = create_converter(cfg.model.name, model, cfg.model.anchor, cfg.image_size, device)\n",
  "    post_proccess = PostProccess(converter, cfg.task.nms)"
  ]
@@ -81,7 +93,7 @@
  "    predict = model(image)\n",
  "    pred_bbox = post_proccess(predict, rev_tensor)\n",
  "\n",
- "draw_bboxes(pil_image, pred_bbox, idx2label=cfg.class_list)"
+ "draw_bboxes(pil_image, pred_bbox, idx2label=cfg.dataset.class_list)"
  ]
  },
  {
@@ -92,6 +104,11 @@
  "\n",
  "![image](../demo/images/output/visualize.png)"
  ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": []
  }
  ],
  "metadata": {
examples/notebook_smallobject.ipynb CHANGED
@@ -30,7 +30,17 @@
  "project_root = Path().resolve().parent\n",
  "sys.path.append(str(project_root))\n",
  "\n",
- "from yolo import AugmentationComposer, bbox_nms, Config, create_model, custom_logger, draw_bboxes, Vec2Box, NMSConfig, PostProccess"
+ "from yolo import (\n",
+ "    AugmentationComposer, \n",
+ "    Config, \n",
+ "    NMSConfig, \n",
+ "    PostProccess,\n",
+ "    bbox_nms, \n",
+ "    create_model, \n",
+ "    create_converter, \n",
+ "    custom_logger, \n",
+ "    draw_bboxes, \n",
+ ")"
  ]
  },
  {
@@ -62,8 +72,8 @@
  "    cfg: Config = compose(config_name=CONFIG_NAME, overrides=[\"task=inference\", f\"task.data.source={IMAGE_PATH}\", f\"model={MODEL}\"])\n",
  "    model = create_model(cfg.model, class_num=CLASS_NUM).to(device)\n",
  "    transform = AugmentationComposer([], cfg.image_size)\n",
- "    vec2box = Vec2Box(model, cfg.image_size, device)\n",
- "    post_proccess = PostProccess(vec2box, NMSConfig(0.5, 0.9))\n",
+ "    converter = create_converter(cfg.model.name, model, cfg.model.anchor, cfg.image_size, device)\n",
+ "    post_proccess = PostProccess(converter, NMSConfig(0.5, 0.9))\n",
  "    "
  ]
  },
@@ -112,7 +122,7 @@
  "with torch.no_grad():\n",
  "    total_image, total_shift = slide_image(image)\n",
  "    predict = model(total_image)\n",
- "    pred_class, _, pred_bbox = vec2box(predict[\"Main\"])\n",
+ "    pred_class, _, pred_bbox = converter(predict[\"Main\"])\n",
  "pred_bbox[1:] = (pred_bbox[1: ] + total_shift[:, None]) / SLIDE\n",
  "pred_bbox = pred_bbox.view(1, -1, 4)\n",
  "pred_class = pred_class.view(1, -1, 80)\n",
@@ -126,7 +136,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "draw_bboxes(pil_image, predict_box, idx2label=cfg.dataset.class_list)"
+ "draw_bboxes(pil_image, predict_box, idx2label=cfg.dataset.class_list)"
  ]
  },
  {
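The arithmetic after the converter call is the heart of the sliding-window trick: each tile's predictions are translated by that tile's `total_shift` offset and divided by the `SLIDE` factor (presumably undoing the zoom each tile was inferred at) so all boxes land in one shared coordinate frame, and the per-tile results are then flattened into a single `(1, N, 4)` batch so ordinary NMS can merge duplicate detections across overlapping tiles.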
examples/sample_inference.py CHANGED
@@ -2,29 +2,39 @@ import sys
  from pathlib import Path
  
  import hydra
- import torch
  
  project_root = Path(__file__).resolve().parent.parent
  sys.path.append(str(project_root))
  
- from yolo.config.config import Config
- from yolo.model.yolo import create_model
- from yolo.tools.data_loader import create_dataloader
- from yolo.tools.solver import ModelTester
- from yolo.utils.logging_utils import custom_logger, validate_log_directory
+ from yolo import (
+     Config,
+     FastModelLoader,
+     ModelTester,
+     ProgressLogger,
+     create_converter,
+     create_dataloader,
+     create_model,
+ )
+ from yolo.utils.model_utils import get_device
  
  
- @hydra.main(config_path="../yolo/config", config_name="config", version_base=None)
- def main(cfg: Config):
-     custom_logger()
-     save_path = validate_log_directory(cfg, cfg.name)
-     dataloader = create_dataloader(cfg)
- 
-     device = torch.device(cfg.device)
-     model = create_model(cfg).to(device)
- 
-     tester = ModelTester(cfg, model, save_path, device)
-     tester.solve(dataloader)
+ @hydra.main(config_path="config", config_name="config", version_base=None)
+ def main(cfg: Config):
+     progress = ProgressLogger(cfg, exp_name=cfg.name)
+     device, use_ddp = get_device(cfg.device)
+     dataloader = create_dataloader(cfg.task.data, cfg.dataset, cfg.task.task, use_ddp)
+     if getattr(cfg.task, "fast_inference", False):
+         model = FastModelLoader(cfg).load_model(device)
+     else:
+         model = create_model(cfg.model, class_num=cfg.dataset.class_num, weight_path=cfg.weight)
+         model = model.to(device)
+ 
+     converter = create_converter(cfg.model.name, model, cfg.model.anchor, cfg.image_size, device)
+ 
+     solver = ModelTester(cfg, model, converter, progress, device)
+     progress.start()
+     solver.solve(dataloader)
  
  
  if __name__ == "__main__":
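Since the entry point is a plain `@hydra.main`, configuration comes from CLI overrides; by analogy with the notebook `compose` calls above, a plausible invocation is `python examples/sample_inference.py task=inference model=v9-c task.data.source=<image_or_dir> weight=<checkpoint>` (the override keys are assumed from the config usage in this diff). When `task.fast_inference` is set, model loading routes through `FastModelLoader` from `yolo.utils.deploy_utils` instead of `create_model`, which is presumably the TensorRT/ONNX path.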
examples/sample_train.py CHANGED
@@ -2,29 +2,35 @@ import sys
  from pathlib import Path
  
  import hydra
- import torch
  
  project_root = Path(__file__).resolve().parent.parent
  sys.path.append(str(project_root))
  
- from yolo.config.config import Config
- from yolo.model.yolo import create_model
- from yolo.tools.data_loader import create_dataloader
- from yolo.tools.solver import ModelTrainer
- from yolo.utils.logging_utils import custom_logger, validate_log_directory
+ from yolo import (
+     Config,
+     ModelTrainer,
+     ProgressLogger,
+     create_converter,
+     create_dataloader,
+     create_model,
+ )
+ from yolo.utils.model_utils import get_device
  
  
- @hydra.main(config_path="../yolo/config", config_name="config", version_base=None)
+ @hydra.main(config_path="config", config_name="config", version_base=None)
  def main(cfg: Config):
-     custom_logger()
-     save_path = validate_log_directory(cfg, cfg.name)
-     dataloader = create_dataloader(cfg)
-     # TODO: get_device or rank, for DDP mode
-     device = torch.device(cfg.device)
-     model = create_model(cfg).to(device)
- 
-     trainer = ModelTrainer(cfg, model, save_path, device)
-     trainer.solve(dataloader, cfg.task.epoch)
+     progress = ProgressLogger(cfg, exp_name=cfg.name)
+     device, use_ddp = get_device(cfg.device)
+     dataloader = create_dataloader(cfg.task.data, cfg.dataset, cfg.task.task, use_ddp)
+     model = create_model(cfg.model, class_num=cfg.dataset.class_num, weight_path=cfg.weight)
+     model = model.to(device)
+ 
+     converter = create_converter(cfg.model.name, model, cfg.model.anchor, cfg.image_size, device)
+ 
+     solver = ModelTrainer(cfg, model, converter, progress, device)
+     progress.start()
+     solver.solve(dataloader)
  
  
  if __name__ == "__main__":
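Training mirrors the tester wiring, with one detail worth noting: `get_device` returns `(device, use_ddp)`, and `Config.device` is typed `Union[str, int, List[int]]` in the config diff below, so multi-GPU DDP is presumably selected by passing a list of device indices (e.g. `device=[0,1]` as a CLI override) rather than by a separate flag.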
requirements-dev.txt CHANGED
@@ -3,3 +3,4 @@ pytest
  pytest-cov
  pre-commit
  pycocotools
+ tensorboard
tests/test_tools/test_data_loader.py CHANGED
@@ -66,4 +66,4 @@ def test_directory_stream_data_loader_frame(directory_stream_data_loader: Stream
  frame, rev_tensor, origin_frame = next(iter(directory_stream_data_loader))
  assert frame.shape == (1, 3, 640, 640)
  assert rev_tensor.shape == (1, 5)
- assert origin_frame.size == (480, 640) or origin_frame.size == (512, 640)
+ assert origin_frame.size != (640, 640)
yolo/__init__.py CHANGED
@@ -5,12 +5,13 @@ from yolo.tools.drawer import draw_bboxes
  from yolo.tools.solver import ModelTester, ModelTrainer, ModelValidator
  from yolo.utils.bounding_box_utils import Anc2Box, Vec2Box, bbox_nms, create_converter
  from yolo.utils.deploy_utils import FastModelLoader
- from yolo.utils.logging_utils import custom_logger
+ from yolo.utils.logging_utils import ProgressLogger, custom_logger
  from yolo.utils.model_utils import PostProccess
  
  all = [
      "create_model",
      "Config",
+     "ProgressLogger",
      "NMSConfig",
      "custom_logger",
      "validate_log_directory",
yolo/config/config.py CHANGED
@@ -45,6 +45,8 @@ class DownloadOptions:
  @dataclass
  class DatasetConfig:
      path: str
+     class_num: int
+     class_list: List[str]
      auto_download: Optional[DownloadOptions]
  
  
@@ -142,9 +144,6 @@
      device: Union[str, int, List[int]]
      cpu_num: int
  
-     class_num: int
-     class_list: List[str]
-     class_idx_id: List[int]
      image_size: List[int]
  
      out_path: str
@@ -152,7 +151,7 @@
  
      lucky_number: 10
      use_wandb: bool
-     use_TensorBoard: bool
+     use_tensorboard: bool
  
      weight: Optional[str]
  
yolo/config/dataset/coco.yaml CHANGED
@@ -2,6 +2,9 @@ path: data/coco
  train: train2017
  validation: val2017
  
+ class_num: 80
+ class_list: ['Person', 'Bicycle', 'Car', 'Motorcycle', 'Airplane', 'Bus', 'Train', 'Truck', 'Boat', 'Traffic light', 'Fire hydrant', 'Stop sign', 'Parking meter', 'Bench', 'Bird', 'Cat', 'Dog', 'Horse', 'Sheep', 'Cow', 'Elephant', 'Bear', 'Zebra', 'Giraffe', 'Backpack', 'Umbrella', 'Handbag', 'Tie', 'Suitcase', 'Frisbee', 'Skis', 'Snowboard', 'Sports ball', 'Kite', 'Baseball bat', 'Baseball glove', 'Skateboard', 'Surfboard', 'Tennis racket', 'Bottle', 'Wine glass', 'Cup', 'Fork', 'Knife', 'Spoon', 'Bowl', 'Banana', 'Apple', 'Sandwich', 'Orange', 'Broccoli', 'Carrot', 'Hot dog', 'Pizza', 'Donut', 'Cake', 'Chair', 'Couch', 'Potted plant', 'Bed', 'Dining table', 'Toilet', 'Tv', 'Laptop', 'Mouse', 'Remote', 'Keyboard', 'Cell phone', 'Microwave', 'Oven', 'Toaster', 'Sink', 'Refrigerator', 'Book', 'Clock', 'Vase', 'Scissors', 'Teddy bear', 'Hair drier', 'Toothbrush']
+ 
  auto_download:
    images:
      base_url: http://images.cocodataset.org/zips/
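With `class_num` and `class_list` stored side by side they can silently drift, so a one-line sanity check is cheap insurance; a sketch, assuming the file is loaded with OmegaConf as elsewhere in this merge:

```python
from omegaconf import OmegaConf

dataset_cfg = OmegaConf.load("yolo/config/dataset/coco.yaml")
# COCO defines 80 classes; class_num should match the list length.
assert dataset_cfg.class_num == len(dataset_cfg.class_list) == 80
```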
yolo/config/dataset/dev.yaml CHANGED
@@ -2,4 +2,7 @@ path: data/dev
  train: train
  validation: val
  
+ class_num: 80
+ class_list: ['Person', 'Bicycle', 'Car', 'Motorcycle', 'Airplane', 'Bus', 'Train', 'Truck', 'Boat', 'Traffic light', 'Fire hydrant', 'Stop sign', 'Parking meter', 'Bench', 'Bird', 'Cat', 'Dog', 'Horse', 'Sheep', 'Cow', 'Elephant', 'Bear', 'Zebra', 'Giraffe', 'Backpack', 'Umbrella', 'Handbag', 'Tie', 'Suitcase', 'Frisbee', 'Skis', 'Snowboard', 'Sports ball', 'Kite', 'Baseball bat', 'Baseball glove', 'Skateboard', 'Surfboard', 'Tennis racket', 'Bottle', 'Wine glass', 'Cup', 'Fork', 'Knife', 'Spoon', 'Bowl', 'Banana', 'Apple', 'Sandwich', 'Orange', 'Broccoli', 'Carrot', 'Hot dog', 'Pizza', 'Donut', 'Cake', 'Chair', 'Couch', 'Potted plant', 'Bed', 'Dining table', 'Toilet', 'Tv', 'Laptop', 'Mouse', 'Remote', 'Keyboard', 'Cell phone', 'Microwave', 'Oven', 'Toaster', 'Sink', 'Refrigerator', 'Book', 'Clock', 'Vase', 'Scissors', 'Teddy bear', 'Hair drier', 'Toothbrush']
+ 
  auto_download: