By user961627


2014-08-17 12:28:04 8 Comments

In testing an object detection algorithm in large images, we check our detected bounding boxes against the coordinates given for the ground truth rectangles.

According to the Pascal VOC challenges, there's this:

A predicted bounding box is considered correct if it overlaps more than 50% with a ground-truth bounding box, otherwise the bounding box is considered a false positive detection. Multiple detections are penalized. If a system predicts several bounding boxes that overlap with a single ground-truth bounding box, only one prediction is considered correct, the others are considered false positives.

This means that we need to calculate the percentage of overlap. Does this mean that the ground-truth box is 50% covered by the detected bounding box? Or that 50% of the detected bounding box lies inside the ground-truth box?

I've searched but I haven't found a standard algorithm for this - which is surprising because I would have thought that this is something pretty common in computer vision. (I'm new to it). Have I missed it? Does anyone know what the standard algorithm is for this type of problem?


@Reno Fiedler 2018-10-08 02:43:04

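How about this approach? It could be extended to any number of overlapping shapes.

import numpy as np

# Rasterise both boxes onto one canvas; cells covered by both end up at 2.
surface = np.zeros([1024, 1024])
surface[1:1+10, 1:1+10] += 1
surface[100:100+500, 100:100+100] += 1
# Cells with value 2 are the intersection (not the union) of the two boxes.
intersectionArea = (surface == 2).sum()
print(intersectionArea)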
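The same rasterising idea can be pushed one step further to get an intersection-over-union value. This is only a minimal sketch: the helper name mask_iou and the small 20 x 20 canvas are made up for illustration, and it assumes both shapes have already been drawn as boolean masks of the same size.

```python
import numpy as np

def mask_iou(mask_a, mask_b):
    """IoU of two boolean masks of identical shape (hypothetical helper)."""
    intersection = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    # Two empty masks have no union; report 0.0 instead of dividing by zero.
    return float(intersection) / float(union) if union > 0 else 0.0

# Two boxes rasterised onto a small canvas (rows = y, columns = x).
a = np.zeros((20, 20), dtype=bool)
b = np.zeros((20, 20), dtype=bool)
a[0:10, 0:10] = True   # 10 x 10 box
b[5:15, 5:15] = True   # 10 x 10 box, shifted by 5 in both directions

iou_ab = mask_iou(a, b)  # intersection is 25 cells, union is 175 cells
```

Since nothing here depends on the shapes being rectangles, the same function should work for arbitrary masks, at the cost of memory proportional to the canvas size.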

@Jindil 2018-08-07 15:32:09

For the intersection area, shouldn't we add +1 to each side length, so that we have

intersection_area = (x_right - x_left + 1) * (y_bottom - y_top + 1)

(and the same for the areas of the two AABBs)?
Like in this PyImageSearch post

I agree that (x_right - x_left) * (y_bottom - y_top) is correct mathematically for point coordinates, but since we are dealing with pixels I think it is different.

Consider a 1D example:
- 2 points: x1 = 1 and x2 = 3; the distance is indeed x2 - x1 = 2
- 2 pixel indices: i1 = 1 and i2 = 3; the segment from pixel i1 to pixel i2 contains 3 pixels, i.e. l = i2 - i1 + 1
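To make the pixel convention concrete, here is a rough sketch of the +1 variant. It assumes boxes are (x1, y1, x2, y2) tuples of inclusive pixel indices; the function name iou_inclusive and the tuple layout are my own choices for illustration, not taken from the answer below.

```python
def iou_inclusive(box_a, box_b):
    """IoU for boxes given as (x1, y1, x2, y2) with inclusive pixel indices."""
    x_left = max(box_a[0], box_b[0])
    y_top = max(box_a[1], box_b[1])
    x_right = min(box_a[2], box_b[2])
    y_bottom = min(box_a[3], box_b[3])

    # With inclusive indices, sharing a single pixel row or column still
    # counts as overlap, so the boxes are disjoint only when right < left
    # or bottom < top.
    if x_right < x_left or y_bottom < y_top:
        return 0.0

    # Every side length gets +1 because both end pixels belong to the box.
    inter = (x_right - x_left + 1) * (y_bottom - y_top + 1)
    area_a = (box_a[2] - box_a[0] + 1) * (box_a[3] - box_a[1] + 1)
    area_b = (box_b[2] - box_b[0] + 1) * (box_b[3] - box_b[1] + 1)
    return inter / float(area_a + area_b - inter)

# Two 3 x 3 pixel boxes that share exactly one corner pixel.
corner_iou = iou_inclusive((0, 0, 2, 2), (2, 2, 4, 4))  # 1 / (9 + 9 - 1)
```

Which convention is right depends on whether the annotations store pixel indices or continuous corner coordinates, so it is worth checking the dataset documentation before picking one.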

@Martin Thoma 2017-03-18 12:25:35

For axis-aligned bounding boxes it is relatively simple:

def get_iou(bb1, bb2):
    """
    Calculate the Intersection over Union (IoU) of two bounding boxes.

    Parameters
    ----------
    bb1 : dict
        Keys: {'x1', 'x2', 'y1', 'y2'}
        The (x1, y1) position is at the top left corner,
        the (x2, y2) position is at the bottom right corner
    bb2 : dict
        Keys: {'x1', 'x2', 'y1', 'y2'}
        The (x1, y1) position is at the top left corner,
        the (x2, y2) position is at the bottom right corner

    Returns
    -------
    float
        in [0, 1]
    """
    assert bb1['x1'] < bb1['x2']
    assert bb1['y1'] < bb1['y2']
    assert bb2['x1'] < bb2['x2']
    assert bb2['y1'] < bb2['y2']

    # determine the coordinates of the intersection rectangle
    x_left = max(bb1['x1'], bb2['x1'])
    y_top = max(bb1['y1'], bb2['y1'])
    x_right = min(bb1['x2'], bb2['x2'])
    y_bottom = min(bb1['y2'], bb2['y2'])

    if x_right < x_left or y_bottom < y_top:
        return 0.0

    # The intersection of two axis-aligned bounding boxes is always an
    # axis-aligned bounding box
    intersection_area = (x_right - x_left) * (y_bottom - y_top)

    # compute the area of both AABBs
    bb1_area = (bb1['x2'] - bb1['x1']) * (bb1['y2'] - bb1['y1'])
    bb2_area = (bb2['x2'] - bb2['x1']) * (bb2['y2'] - bb2['y1'])

    # compute the intersection over union by taking the intersection
    # area and dividing it by the sum of the two box areas minus the
    # intersection area (i.e. the area of the union)
    iou = intersection_area / float(bb1_area + bb2_area - intersection_area)
    assert iou >= 0.0
    assert iou <= 1.0
    return iou

Explanation

[Two images, not preserved in this copy]

Images are from this answer

@James Meakin 2018-03-14 09:56:47

There is a bug in this code: y_top = max(bb1['y1'], bb2['y1']) should use min. Similarly, y_bottom should use max.

@Cris Luengo 2018-06-26 15:17:09

@JamesMeakin: The code is correct. y=0 is at the top, and increases downwards.

@markroxor 2018-10-01 11:23:06

What if the bounding box is not a rectangle?

@Martin Thoma 2018-10-01 12:00:36

Then copy-paste will not work. I have only had axis-aligned bounding boxes in detection so far. For semantic segmentation there are arbitrarily complex shapes. But the concept is the same.

@Chaine 2018-10-02 09:03:38

@MartinThoma what is the format for bb1 and bb2?

@Martin Thoma 2018-10-02 10:56:01

@Chaine I'm not sure what I should write. Don't the docstrings answer your question?

@Chaine 2018-11-21 15:09:36

Thanks a lot! Great job!

@prb 2018-11-27 08:36:16

@MartinThoma will this work for a rectangle inside another rectangle?
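On the nested case: when one rectangle lies entirely inside the other, the intersection is just the inner rectangle and the union is the outer one, so the value should come out as inner area / outer area. A quick check, using a condensed tuple-based helper (iou_xyxy is a hypothetical name, not the dict-based get_iou from the answer) with the same point-coordinate convention:

```python
def iou_xyxy(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2) tuples."""
    # Width and height of the overlap; non-positive means no overlap.
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    if w <= 0 or h <= 0:
        return 0.0
    inter = w * h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)

outer = (0, 0, 10, 10)   # area 100
inner = (2, 2, 6, 6)     # area 16, fully inside outer
nested_iou = iou_xyxy(outer, inner)  # 16 / 100
```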

@Stefan van der Walt 2014-10-15 15:44:12

In the snippet below, I construct a polygon along the edges of the first box. I then use Matplotlib to clip the polygon to the second box. The resulting polygon contains four vertices, but we are only interested in the top left and bottom right corners, so I take the max and the min of the coordinates to get a bounding box, which is returned to the user.

import numpy as np
from matplotlib import path, transforms

def clip_boxes(box0, box1):
    path_coords = np.array([[box0[0, 0], box0[0, 1]],
                            [box0[1, 0], box0[0, 1]],
                            [box0[1, 0], box0[1, 1]],
                            [box0[0, 0], box0[1, 1]]])

    poly = path.Path(np.vstack((path_coords[:, 0],
                                path_coords[:, 1])).T, closed=True)
    clip_rect = transforms.Bbox(box1)

    poly_clipped = poly.clip_to_bbox(clip_rect).to_polygons()[0]

    return np.array([np.min(poly_clipped, axis=0),
                     np.max(poly_clipped, axis=0)])

box0 = np.array([[0, 0], [1, 1]])
box1 = np.array([[0, 0], [0.5, 0.5]])

print(clip_boxes(box0, box1))
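Once the clipped box is available, turning it into an overlap ratio is only a little arithmetic. The sketch below assumes the same [[x1, y1], [x2, y2]] layout as above and that the two boxes really do overlap; box_area and iou_from_clip are illustrative names, and the clipped value is written out by hand here (box1 lies inside box0, so clipping box0 to box1 gives box1) so that the snippet runs without matplotlib.

```python
def box_area(box):
    """Area of a box given as [[x1, y1], [x2, y2]]."""
    (x1, y1), (x2, y2) = box
    return (x2 - x1) * (y2 - y1)

def iou_from_clip(box0, box1, clipped):
    """Intersection over union, given the two boxes and their clipped overlap."""
    inter = box_area(clipped)
    union = box_area(box0) + box_area(box1) - inter
    return inter / float(union)

box0 = [[0, 0], [1, 1]]
box1 = [[0, 0], [0.5, 0.5]]
# Hand-written stand-in for the clipping result in this particular example.
clipped = [[0, 0], [0.5, 0.5]]

overlap = iou_from_clip(box0, box1, clipped)  # 0.25 / 1.0
```

The tuple unpacking in box_area also accepts 2 x 2 NumPy arrays, so the array returned by the clipping function can be passed in directly.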

@user961627 2014-10-16 06:41:27

In terms of coordinates, the returned value represents: [[ x1 y1 ] [ x2 y2 ]], am I right?

@user961627 2014-10-16 06:55:12

And the input boxes should conform to the same coordinates representation as well, right?

@Stefan van der Walt 2014-10-16 10:24:06

Yes, quite right.

@user961627 2014-11-03 10:52:00

Thanks - I've been using it fine for a while! But now it's running into an error sometimes, I'm not sure why: stackoverflow.com/questions/26712637/…

@user961627 2014-08-18 10:18:05

I found that the conceptual answer is here: http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2012/htmldoc/devkit_doc.html#SECTION00054000000000000000

from this thread: Compare two bounding boxes with each other Matlab

I should be able to code this in python!
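As I read the devkit documentation, the overlap is the area of the intersection of the predicted and ground-truth boxes divided by the area of their union, and it has to exceed 0.5. A rough Python sketch of that rule, including the "only one detection per ground-truth box" part, is below. The function names, the (score, box) input layout and the (x1, y1, x2, y2) tuples are my own assumptions, and details such as "difficult" objects are left out.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    if w <= 0 or h <= 0:
        return 0.0
    inter = w * h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)

def match_detections(detections, ground_truths, threshold=0.5):
    """Label each (score, box) detection True (correct) or False (false positive).

    Detections are processed in order of decreasing score; the returned
    list follows that order. Each ground-truth box can be claimed once.
    """
    ordered = sorted(detections, key=lambda d: d[0], reverse=True)
    claimed = [False] * len(ground_truths)
    labels = []
    for _score, box in ordered:
        # Find the ground-truth box this detection overlaps most.
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truths):
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_iou > threshold and not claimed[best_idx]:
            claimed[best_idx] = True
            labels.append(True)
        else:
            # Too little overlap, or a duplicate of an already matched box.
            labels.append(False)
    return labels

gts = [(0, 0, 10, 10)]
dets = [(0.9, (0, 0, 10, 9)), (0.8, (0, 1, 10, 10)), (0.7, (20, 20, 30, 30))]
labels = match_detections(dets, gts)
```

In this example the first detection is correct, the second overlaps the same ground-truth box and so counts as a false positive, and the third overlaps nothing.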

@Stefan van der Walt 2014-08-22 18:31:27

Are you looking for polygon intersection code? Because that I have available.

@user961627 2014-10-11 18:19:13

Hey thanks, yes! My polygons will all be just boxes but I guess that's not an issue, right?

@Stefan van der Walt 2014-10-11 21:45:27

Using matplotlib, here's how to compute polygon clipping: github.com/scikit-image/scikit-image/pull/1177/…

@user961627 2014-10-14 07:11:03

Thanks - but I'm not that familiar with GitHub, so I'm not sure what to do with the link you sent me. It looks like a set of changes to the _geometry.py and draw.py files. Which of these files do I actually need to import? And what would a one- or two-liner look like that assigns two rectangles to this polygon type and returns how much they intersect?
