Deep SORT - Simple Online and Realtime Tracking with a Deep Association Metric (2017)
{Matching Cascade}
Paper: https://arxiv.org/abs/1703.07402
Code:
Unlike SORT, which uses only the IOU-based distance for object association, DeepSORT adds a visual appearance distance obtained from a ReID model. In particular, when a detection i is compared with an existing tracklet j, their visual appearance features, i.e., the embeddings, are compared to obtain their cosine distance. A weighted sum of a motion-based distance (the Mahalanobis distance in the original paper) and the cosine distance is then computed and used as the cost for object association.
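As a rough illustration of the cost described above, the following sketch computes the cosine distance between embeddings and blends it with a precomputed spatial distance. The function names, the `weight` parameter, and the layout of `spatial_dists` (one row per track, one column per detection) are assumptions made for this example, not part of any reference implementation.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two non-zero embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def association_cost(spatial_dist, appearance_dist, weight=0.5):
    """Weighted sum of a spatial distance and an appearance distance.

    `weight` is a hypothetical mixing factor in [0, 1].
    """
    return weight * spatial_dist + (1.0 - weight) * appearance_dist


def cost_matrix(track_feats, det_feats, spatial_dists, weight=0.5):
    """Build a tracks x detections cost matrix.

    spatial_dists[j][i] is the precomputed spatial distance between
    track j and detection i (assumed to be supplied by the caller).
    """
    return [
        [
            association_cost(spatial_dists[j][i], cosine_distance(t, d), weight)
            for i, d in enumerate(det_feats)
        ]
        for j, t in enumerate(track_feats)
    ]
```

The resulting matrix can be handed to any assignment solver; a lower entry means the track and the detection are more likely to be the same object.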
Another difference between SORT and DeepSORT is that DeepSORT performs cascade matching, in which tracklets with a smaller age are compared and associated with new detections before those with a larger age. The term age in this context refers to the number of consecutive frames in which a tracklet has not been matched with any new detection. If it is matched in the current frame, the age is reset to zero; otherwise, it is increased by one. The method therefore gives priority to objects that have been seen more recently.
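The age-based priority can be sketched as a loop over ages, from the most recently seen tracklets to the oldest. This is a minimal sketch under stated assumptions: a simple greedy matcher stands in for the Hungarian algorithm, and `max_age` and `max_cost` are illustrative parameters chosen for the example.

```python
def min_cost_pairs(cost, rows, cols, max_cost):
    """Greedy stand-in for the Hungarian algorithm.

    Repeatedly takes the cheapest admissible (row, col) pair whose
    row and column are both still free.
    """
    candidates = sorted(
        (cost[r][c], r, c) for r in rows for c in cols if cost[r][c] <= max_cost
    )
    used_rows, used_cols, pairs = set(), set(), []
    for _, r, c in candidates:
        if r not in used_rows and c not in used_cols:
            pairs.append((r, c))
            used_rows.add(r)
            used_cols.add(c)
    return pairs


def matching_cascade(cost, track_ages, num_dets, max_age=30, max_cost=0.7):
    """Match tracks to detections, youngest age first.

    cost[t][d] is the association cost between track t and detection d.
    track_ages[t] is the number of consecutive unmatched frames of track t.
    """
    matches = []
    unmatched_dets = list(range(num_dets))
    for age in range(max_age + 1):
        if not unmatched_dets:
            break
        tracks_at_age = [t for t, a in enumerate(track_ages) if a == age]
        if not tracks_at_age:
            continue
        pairs = min_cost_pairs(cost, tracks_at_age, unmatched_dets, max_cost)
        matches.extend(pairs)
        taken = {d for _, d in pairs}
        unmatched_dets = [d for d in unmatched_dets if d not in taken]
    matched_tracks = {t for t, _ in matches}
    unmatched_tracks = [t for t in range(len(track_ages)) if t not in matched_tracks]
    return matches, unmatched_tracks, unmatched_dets
```

For example, with `cost = [[0.1, 0.9], [0.2, 0.3]]` and `track_ages = [2, 0]`, track 1 (age 0) is served first and takes detection 0, even though track 0 would have matched that detection at a lower cost.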
After this age-based cascade matching, the original SORT association, which is based solely on the IOU distance, is applied as the final step to the remaining unmatched tracklets and detections. This final step accounts for sudden changes in visual appearance, which may occur from time to time.
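The IOU distance used in this final step can be sketched as follows. The box format `(x1, y1, x2, y2)` and the function names are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_cost_matrix(track_boxes, det_boxes):
    """IOU distance (1 - IoU) for every track/detection pair."""
    return [[1.0 - iou(t, d) for d in det_boxes] for t in track_boxes]
```

A distance of 0 means the boxes coincide and a distance of 1 means they do not overlap at all.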
Figure. A usual workflow of ReID-based MOT
Linear assumption: SORT uses a linear Kalman filter, which is not suitable for objects that move non-linearly.
ID switches: SORT relates detections to tracks by IoU only, i.e., by box position and shape.
ID switches occur repeatedly when objects are occluded or overlap.
The factors that improve performance:
Data association: linking detected objects to stored tracks.
Track life-cycle management: managing when tracks are started, stopped, and removed.
In SORT, data association uses the Hungarian algorithm on IoU alone (a weakness). In Deep SORT, the link also takes into account:
The distance between detection and track (how well the detection fits the track's predicted state).
The cosine distance between the two representative vectors extracted from the detection and the track (small if they belong to the same object, large if they belong to two different objects).
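For reference, the two distances and their combination can be written as below. This is a sketch of the formulation in the Deep SORT paper, re-indexed so that i denotes a detection and j a track, as in the text above. The symbols are assumptions of this note: d_i is the detected box, (y_j, S_j) is the mean and covariance of track j projected into measurement space, r_i is the unit-length embedding of detection i, R_j is the gallery of stored embeddings of track j, and λ is the mixing weight.

```latex
\begin{align}
d^{(1)}(i,j) &= (d_i - y_j)^{\top} S_j^{-1} (d_i - y_j)
  && \text{(squared Mahalanobis distance)} \\
d^{(2)}(i,j) &= \min \left\{ 1 - r_i^{\top} r_k^{(j)} \;\middle|\; r_k^{(j)} \in R_j \right\}
  && \text{(smallest cosine distance to the gallery)} \\
c_{i,j} &= \lambda \, d^{(1)}(i,j) + (1 - \lambda) \, d^{(2)}(i,j)
  && \text{(combined association cost)}
\end{align}
```

A pair is only admissible for matching if each distance stays below its own gating threshold.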
Architecture: Wide Residual Network
Dataset: large-scale re-ID datasets (Market-1501, MARS...)
Paper: Deep Cosine Metric Learning for Person Re-Identification (2018) [Paper]
Deep networks are costly to train and to run at inference time.
Use a shallow network instead: a WRN with 16 layers.
Parameterize the standard softmax classifier. (Above is standard and below is cosine)
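The two classifiers can be written out as follows, as a sketch following the cosine metric learning paper. The notation is an assumption of this note: r_i is the embedding of sample i, w_k and b_k are the weight vector and bias of class k, C is the number of classes, and κ is a free scale parameter.

```latex
% Standard softmax classifier
p(y_i = k \mid r_i) =
  \frac{\exp\left(w_k^{\top} r_i + b_k\right)}
       {\sum_{n=1}^{C} \exp\left(w_n^{\top} r_i + b_n\right)}

% Cosine softmax classifier: unit-length embeddings and weights, no bias
p(y_i = k \mid r_i) =
  \frac{\exp\left(\kappa \, \tilde{w}_k^{\top} r_i\right)}
       {\sum_{n=1}^{C} \exp\left(\kappa \, \tilde{w}_n^{\top} r_i\right)},
\qquad
\tilde{w}_k = \frac{w_k}{\lVert w_k \rVert_2},
\quad
\lVert r_i \rVert_2 = 1
```

Because both vectors have unit length, the inner product in the second form is the cosine similarity, so training the classifier directly shapes the embedding space that the cosine distance is later computed in.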
Three losses are compared:
Softmax Cross Entropy
Magnet Loss
Triplet Loss
Softmax cross-entropy gives good results.
1. Mahalanobis
Provides information about possible object positions based on motion.
Good for short-term predictions.
2. Cosine
Considers the appearance (representative) information.
Good for long-term predictions, e.g., recovering an identity after a long occlusion.
3. Deep SORT measurement
Combines the Mahalanobis and cosine distances as a weighted sum.
Uses motion for short-term and appearance for long-term association.
4. Matching Cascade
Improves the accuracy of links when an object has disappeared for some frames.
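5. Track Life-cycle Management
Tentative: the state of a newly created track.
Confirmed: the track has been matched in 3 consecutive frames; it is then kept for up to 30 frames without a match.
Deleted: a tentative track that is not matched during its first 3 frames, or a confirmed track that stays unmatched for more than 30 frames.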
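The life cycle above can be sketched as a small state machine. The class, the method names, and the parameters `n_init=3` and `max_age=30` are assumptions chosen to mirror the numbers in the list; the detection that creates the track is counted as its first hit.

```python
from enum import Enum


class TrackState(Enum):
    TENTATIVE = 1
    CONFIRMED = 2
    DELETED = 3


class Track:
    def __init__(self, n_init=3, max_age=30):
        self.n_init = n_init      # hits needed before confirmation
        self.max_age = max_age    # allowed consecutive misses once confirmed
        self.state = TrackState.TENTATIVE
        self.hits = 1             # the detection that created the track
        self.age = 0              # consecutive frames without a match

    def mark_matched(self):
        """Call when the track is associated with a detection."""
        if self.state is TrackState.DELETED:
            return
        self.hits += 1
        self.age = 0
        if self.state is TrackState.TENTATIVE and self.hits >= self.n_init:
            self.state = TrackState.CONFIRMED

    def mark_missed(self):
        """Call when no detection is associated in the current frame."""
        if self.state is TrackState.DELETED:
            return
        self.age += 1
        if self.state is TrackState.TENTATIVE:
            # A tentative track must be matched in every early frame.
            self.state = TrackState.DELETED
        elif self.age > self.max_age:
            self.state = TrackState.DELETED
```

With these defaults, a track becomes confirmed after two matches following its creation and is deleted on the 31st consecutive miss.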
Each track state is composed of 8 components:
u, v: the center of the BBox.
gamma: the aspect ratio of the BBox.
h: the height of the BBox.
The other four: the corresponding velocities.
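A minimal sketch of this 8-component state and its constant-velocity prediction is given below. Only the state mean is propagated; the covariance that a full Kalman filter also tracks is left out. The function names are hypothetical, and gamma is assumed to be width divided by height.

```python
def initial_state(u, v, gamma, h):
    """State [u, v, gamma, h, du, dv, dgamma, dh] with zero initial velocities."""
    return [u, v, gamma, h, 0.0, 0.0, 0.0, 0.0]


def predict_state(state, dt=1.0):
    """Constant-velocity prediction: each position moves by velocity * dt."""
    positions, velocities = state[:4], state[4:]
    new_positions = [p + dt * vel for p, vel in zip(positions, velocities)]
    return new_positions + list(velocities)


def to_corner_box(state):
    """Convert (u, v, gamma, h) to (x1, y1, x2, y2), assuming gamma = width / height."""
    u, v, gamma, h = state[:4]
    w = gamma * h
    return (u - w / 2.0, v - h / 2.0, u + w / 2.0, v + h / 2.0)
```

The corner-box conversion is what an IoU-based step would consume after the prediction.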
Process:
Detection: Faster R-CNN (VGG16 backbone)
Track state prediction: Kalman filter. Each track is in one of three states:
- Tentative (new)
- Confirmed (matched in 3 consecutive frames, then kept for up to 30 unmatched frames)
- Deleted
Matching Cascade: link the detected objects to the confirmed tracks based on the distance measurement.
Hungarian Algorithm: the remaining unmatched tracks are assigned by IoU in a second linking step.
Process and classify the detections and tracks (matched or unmatched).
Kalman Filter: correct the track states with their linked detections and create new tracks from unmatched detections.
https://viblo.asia/p/sort-deep-sort-mot-goc-nhin-ve-object-tracking-phan-2-djeZ1m78ZWz
Deep SORT: https://arxiv.org/abs/1703.07402
Wide Residual Network: https://arxiv.org/abs/1605.07146
Cosine Metric Learning: https://arxiv.org/abs/1812.00442
Mahalanobis Distance: https://en.wikipedia.org/wiki/Mahalanobis_distance