[YOLO-NAS]
Deci's SuperGradients
{QSP, QCI, Re-Parameterization, 8-Bit Quantization}
Paper:
Code: https://github.com/Deci-AI/super-gradients/blob/master/YOLONAS.md
Object detection has revolutionized how machines perceive and interpret the world around them.
A new YOLO-based architecture can redefine state-of-the-art (SOTA) object detection by addressing the limitations of existing models and incorporating recent advances in deep learning.
YOLO-NAS was designed with exactly this goal: better detection of small objects, improved localization accuracy, and a higher performance-per-compute ratio, making the model practical for real-time edge-device applications.
QSP and QCI blocks combine the advantages of re-parameterization and 8-bit quantization. These blocks allow for minimal accuracy loss during post-training quantization.
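Deci has not published the internals of QSP and QCI, but they follow the RepVGG-style re-parameterization idea: a multi-branch block at training time that collapses into a single convolution for inference. The sketch below illustrates only that general idea in PyTorch; the class name and structure are illustrative, not Deci's code.

import torch
import torch.nn as nn

class RepBlock(nn.Module):
    # Training-time multi-branch block (3x3 conv, 1x1 conv, identity) that
    # re-parameterizes into a single 3x3 conv for inference. Assumes equal
    # in/out channels and stride 1 so the identity branch is valid.
    def __init__(self, ch):
        super().__init__()
        self.conv3, self.bn3 = nn.Conv2d(ch, ch, 3, padding=1, bias=False), nn.BatchNorm2d(ch)
        self.conv1, self.bn1 = nn.Conv2d(ch, ch, 1, bias=False), nn.BatchNorm2d(ch)
        self.bn_id = nn.BatchNorm2d(ch)  # BN-only identity branch

    def forward(self, x):
        return self.bn3(self.conv3(x)) + self.bn1(self.conv1(x)) + self.bn_id(x)

    @torch.no_grad()
    def fuse(self):
        # Fold each conv+BN pair into an equivalent kernel/bias, then sum
        # the three branches into one 3x3 conv (the inference-time form).
        def fold(w, bn):
            std = (bn.running_var + bn.eps).sqrt()
            return (w * (bn.weight / std).reshape(-1, 1, 1, 1),
                    bn.bias - bn.running_mean * bn.weight / std)
        k3, b3 = fold(self.conv3.weight, self.bn3)
        k1, b1 = fold(nn.functional.pad(self.conv1.weight, [1, 1, 1, 1]), self.bn1)  # 1x1 -> 3x3
        ch = k3.shape[0]
        kid = torch.zeros_like(k3)
        idx = torch.arange(ch)
        kid[idx, idx, 1, 1] = 1.0  # identity expressed as a 3x3 kernel
        kid, bid = fold(kid, self.bn_id)
        fused = nn.Conv2d(ch, ch, 3, padding=1)
        fused.weight.copy_(k3 + k1 + kid)
        fused.bias.copy_(b3 + b1 + bid)
        return fused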
Deci's AutoNAC neural architecture search was used to determine the optimal size and structure of each stage, including the block type, the number of blocks, and the number of channels per stage.
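AutoNAC itself is proprietary, but the search dimensions named above can be pictured as a per-stage candidate space. The toy enumeration below is purely illustrative (all values are made up); the real search is hardware-aware and far more efficient than brute force.

from dataclasses import dataclass
from itertools import product

@dataclass
class StageCandidate:
    # One candidate configuration for a single backbone stage, mirroring
    # the search dimensions in the text: block type, depth, and width.
    block_type: str  # e.g. "QSP" or "QCI"
    num_blocks: int
    channels: int

# Brute-force enumeration of one stage's candidates (illustration only).
stage_space = [StageCandidate(bt, n, c)
               for bt, n, c in product(("QSP", "QCI"), (2, 3, 4), (64, 96, 128))]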
A hybrid quantization method selectively quantizes certain parts of the model, reducing information loss and balancing latency and accuracy. Standard quantization affects all model layers and often causes significant accuracy loss; our hybrid method maintains accuracy by quantizing only selected layers while leaving others untouched.
Our layer selection algorithm considers each layer’s impact on accuracy and latency, as well as the effects of switching between 8-bit and 16-bit quantization on overall latency.
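The selection algorithm itself is not public; the sketch below shows one plausible shape of such a procedure: measure each layer's accuracy drop under int8 in isolation, then greedily keep int8 where the latency gain per unit of accuracy loss is best. eval_fn, quantize_layer, and latency_gain are hypothetical placeholders, not SuperGradients APIs.

import copy

def layer_sensitivity(model, layer_names, eval_fn, quantize_layer):
    # Quantize one layer at a time (int8) and record the accuracy drop.
    # eval_fn(model) -> accuracy; quantize_layer(model, name) -> model.
    baseline = eval_fn(model)
    return {name: baseline - eval_fn(quantize_layer(copy.deepcopy(model), name))
            for name in layer_names}

def select_int8_layers(drops, latency_gain, accuracy_budget):
    # Greedy pick: best latency gain per unit of accuracy drop first,
    # stopping once the cumulative drop would exceed the budget.
    order = sorted(drops, key=lambda n: drops[n] / max(latency_gain[n], 1e-9))
    chosen, total = [], 0.0
    for name in order:
        if total + max(drops[name], 0.0) > accuracy_budget:
            continue
        chosen.append(name)
        total += max(drops[name], 0.0)
    return chosen  # remaining layers stay in higher precision (e.g. 16-bit)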
A pre-training regimen that includes automatically labeled data, self-distillation, and large datasets.
The YOLO-NAS architecture is available under an open-source license. Its pre-trained weights are available for non-commercial research use on SuperGradients, Deci's PyTorch-based, open-source computer vision training library.
Figure 3. Efficiency Frontier plot for object detection on the COCO2017 dataset (validation) comparing YOLO-NAS vs other YOLO architectures.
YOLO-NAS's multi-phase training process involves pre-training on Objects365, COCO pseudo-labeled data, Knowledge Distillation (KD), and Distribution Focal Loss (DFL).
The model is pre-trained on Objects365 for only 25-40 epochs (depending on the model variant), since each epoch takes 50-80 minutes on 8 NVIDIA RTX A5000 GPUs.
An accurate model is first trained on COCO and used to pseudo-label the unlabeled COCO images; these pseudo-labeled images are then used to train our model together with the original 118k labeled train images.
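Deci has not released the pseudo-labeling code; a minimal sketch of the idea, assuming a torchvision-style detection interface (boxes/scores/labels per image) and a hypothetical confidence threshold:

import torch

@torch.no_grad()
def build_pseudo_labels(teacher, unlabeled_loader, conf_thresh=0.5):
    # Run the COCO-trained teacher over unlabeled images and keep only
    # confident detections as training targets for the student.
    teacher.eval()
    pseudo = []
    for images, paths in unlabeled_loader:
        for path, pred in zip(paths, teacher(images)):  # assumed output format
            keep = pred["scores"] >= conf_thresh
            pseudo.append({"image": path,
                           "boxes": pred["boxes"][keep],
                           "labels": pred["labels"][keep]})
    return pseudo  # later mixed with the 118k human-labeled images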
The YOLO-NAS architecture also incorporates Knowledge Distillation (KD) and Distribution Focal Loss (DFL) to enhance its training process.
Knowledge Distillation is applied by adding a KD term to the loss function, enabling the student network to mimic the teacher network's logits for both the classification and DFL predictions.
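A minimal sketch of such a KD term, assuming raw logits for the classification head and the DFL (box-distribution) head; the temperature and equal weighting are illustrative choices, not the exact published recipe:

import torch.nn.functional as F

def kd_term(student_cls, student_dfl, teacher_cls, teacher_dfl, T=1.0):
    # Soft-label KL divergence between student and teacher logits, applied
    # to both the classification and the DFL (box-distribution) outputs.
    def soft_kl(s, t):
        return F.kl_div(F.log_softmax(s / T, dim=-1),
                        F.softmax(t / T, dim=-1),
                        reduction="batchmean") * (T * T)
    return soft_kl(student_cls, teacher_cls) + soft_kl(student_dfl, teacher_dfl)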
DFL is employed by learning box regression as a classification task, discretizing box predictions into finite values, and predicting distributions over these values, which are then converted to final predictions through a weighted sum.
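This maps directly onto the Distribution Focal Loss from the Generalized Focal Loss paper. A minimal sketch, assuming reg_max + 1 discrete bins per box side; tensor shapes are illustrative:

import torch
import torch.nn.functional as F

def dfl_decode(box_logits, reg_max=16):
    # box_logits: (..., 4, reg_max + 1) logits over discrete bins per side.
    # Final offset = probability-weighted sum (expectation) over bin values.
    probs = F.softmax(box_logits, dim=-1)
    bins = torch.arange(reg_max + 1, dtype=probs.dtype, device=probs.device)
    return (probs * bins).sum(dim=-1)  # (..., 4) continuous box offsets

def dfl_loss(box_logits, target, reg_max=16):
    # target: (..., 4) continuous offsets in [0, reg_max]. Supervise the two
    # neighboring bins, weighted by proximity to the true offset.
    tl = target.floor().long().clamp(0, reg_max - 1)
    tr = tl + 1
    wl, wr = tr.to(target.dtype) - target, target - tl.to(target.dtype)
    logp = F.log_softmax(box_logits, dim=-1)
    return -(wl * logp.gather(-1, tl.unsqueeze(-1)).squeeze(-1) +
             wr * logp.gather(-1, tr.unsqueeze(-1)).squeeze(-1)).mean()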
Objects365, a comprehensive dataset with 2 million images and 365 categories.
The COCO dataset provides an additional 123k unlabeled images, which are used to generate pseudo-labeled data.
RoboFlow100 dataset (RF100), a collection of 100 datasets from diverse domains, used to demonstrate YOLO-NAS's ability to handle complex object detection tasks.
Install Library:
!pip install super_gradients
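After installation, a pre-trained model can be loaded and run through the SuperGradients API (per the repository's documentation at the time of writing; the image path below is a placeholder):

from super_gradients.training import models

# Load YOLO-NAS (large variant) with COCO pre-trained weights
# (non-commercial research use, as noted above).
model = models.get("yolo_nas_l", pretrained_weights="coco")

# predict() accepts image paths, URLs, or numpy arrays.
model.predict("path/to/image.jpg").show()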