Non-Maximum Weighted

Huajun Zhou, Zechao Li, Chengcheng Ning, Jinhui Tang


Motivation, Objectives and Related Works

In existing detection frameworks, non-maximum suppression (NMS) is the most widespread method to eliminate redundant prediction boxes.
If the Jaccard overlap between two prediction boxes surpasses a given threshold, they are treated as detections of the same object and the more confident one becomes the final prediction.
Suppose $B$ is a group of boxes identified as detections of the same object and $b_{pre}$ denotes the final prediction bounding box.
NMS implements the following function:

$$b_{pre} = b_{\arg\max_i c_i}$$

Here $c_i$ indicates the confidence of bounding box $b_i$ in $B$.
For detection boxes of the same object, this method simply adopts the most confident box and ignores all the non-maximum boxes.
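The suppression rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `(x1, y1, x2, y2)` box layout, the function names `iou` and `greedy_nms`, the greedy processing order, and the threshold of 0.5 are all assumptions made for the example.

```python
def iou(a, b):
    """Jaccard overlap of two boxes given as (x1, y1, x2, y2) (assumed layout)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, confs, threshold=0.5):
    """Return the indices of the boxes kept by greedy NMS.

    The threshold value is an illustrative assumption.
    """
    # Visit boxes from most to least confident.
    order = sorted(range(len(boxes)), key=lambda i: confs[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Discard every remaining box whose overlap with the kept box
        # surpasses the threshold: it is treated as the same object.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= threshold]
    return keep


boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
confs = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, confs)  # the second box is suppressed by the first
```

In this toy input the first two boxes overlap heavily, so only the more confident of the two survives, and the coordinates of the suppressed box are discarded entirely.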

Intuitively, lower-confidence bounding boxes may capture latent information that the most confident box ignores.
Consider an image of a man stretching out his hands, as shown in Figure 5. A prediction box that captures only the upper body and one that captures the main body without the outstretched hands are both inferior detections.
However, the average of these two inferior boxes appears to capture the entire person well. Especially when the two boxes have similar confidences, predicting the average box is more convincing than predicting the more confident one.

Taking a weighted average of the non-maximum boxes slightly improves detection performance without slowing down detection.

Non-Maximum Weighted implements the following function:

$$b_{pre} = \frac{\sum_{i} w_i \, b_i}{\sum_{i} w_i}, \qquad w_i = c_i \cdot \mathrm{iou}\left(b_i,\; b_{\arg\max_i c_i}\right)$$

Here $b_i$ is the $i$-th instance in box set $B$.
$c_i$ represents its maximum category confidence.
$w_i$ is the related-confidence of each prediction box, and the iou function computes the Jaccard overlap between $b_i$ and the most confident box $b_{\arg\max_i c_i}$.
To obtain these related-confidences, we calculate the product of each box's own confidence and its overlap with the most confident prediction, which achieved the greatest improvement among the weighting expressions we tested in our experiments.
Eventually, the final prediction box is generated by calculating the weighted average over box set $B$.
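The weighted average described above can be sketched as follows for a single group of same-object boxes. This is a hedged illustration rather than the authors' code: the `(x1, y1, x2, y2)` box layout, the names `iou` and `nmw_merge`, and the fallback for a zero weight sum are assumptions introduced for the example.

```python
def iou(a, b):
    """Jaccard overlap of two boxes given as (x1, y1, x2, y2) (assumed layout)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nmw_merge(boxes, confs):
    """Merge one group of same-object boxes into a single weighted box."""
    # Index of the most confident box in the group.
    top = max(range(len(boxes)), key=lambda i: confs[i])
    # Related-confidence: own confidence times overlap with the top box.
    weights = [c * iou(b, boxes[top]) for b, c in zip(boxes, confs)]
    total = sum(weights)
    if total == 0:
        # Assumed fallback for degenerate input: behave like plain NMS.
        return tuple(float(v) for v in boxes[top])
    # Weighted average of each coordinate over the group.
    return tuple(
        sum(w * b[k] for w, b in zip(weights, boxes)) / total
        for k in range(4)
    )


# Two boxes for the same object: one covers the upper part, one the full extent.
group = [(0, 0, 10, 10), (0, 0, 10, 20)]
scores = [0.8, 0.6]
merged = nmw_merge(group, scores)  # roughly (0, 0, 10, 12.73)
```

Here the weights are 0.8 for the top box (overlap 1 with itself) and 0.6 × 0.5 = 0.3 for the second box, so the merged bottom edge moves from 10 toward 20 in proportion to the second box's related-confidence. Plain NMS would have returned the first box unchanged.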