Boxes

Backend functionality for 2D bounding boxes

apply_non_max_suppression

paz.backend.boxes.apply_non_max_suppression(boxes, scores, iou_thresh=0.45, top_k=200)

Apply non maximum suppression.

Arguments

boxes: Numpy array, box coordinates of shape (num_boxes, 4) where each column corresponds to x_min, y_min, x_max, y_max.
scores: Numpy array of scores, one for each box in boxes.
iou_thresh: float, intersection over union threshold for removing boxes.
top_k: int, maximum number of objects per class.

Returns

selected_indices: Numpy array, selected indices of kept boxes.
num_selected_boxes: int, number of selected boxes.
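
The greedy procedure behind non maximum suppression can be sketched in plain NumPy. This is an illustration under assumptions, not the PAZ implementation: the name nms_sketch is hypothetical, top_k is assumed to limit the candidates before suppression, and only the kept indices are returned.

```python
import numpy as np


def nms_sketch(boxes, scores, iou_thresh=0.45, top_k=200):
    """Greedy NMS over corner-form boxes (illustrative, not PAZ's code)."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    # Candidate indices, best score first, limited to top_k candidates.
    order = np.argsort(scores)[::-1][:top_k]
    selected = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        selected.append(best)
        # Intersection of the best box with every remaining candidate.
        x_min = np.maximum(boxes[best, 0], boxes[rest, 0])
        y_min = np.maximum(boxes[best, 1], boxes[rest, 1])
        x_max = np.minimum(boxes[best, 2], boxes[rest, 2])
        y_max = np.minimum(boxes[best, 3], boxes[rest, 3])
        W = np.clip(x_max - x_min, 0, None)
        H = np.clip(y_max - y_min, 0, None)
        intersection = W * H
        union = areas[best] + areas[rest] - intersection
        iou = intersection / union
        # Keep only candidates that do not overlap the best box too much.
        order = rest[iou <= iou_thresh]
    selected_indices = np.array(selected, dtype=int)
    return selected_indices, len(selected_indices)


boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]])
scores = np.array([0.9, 0.8, 0.7])
selected_indices, num_selected_boxes = nms_sketch(boxes, scores)
```

In this example the second box overlaps the first with an IoU of roughly 0.68, so it is suppressed, while the third box does not overlap and is kept.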

[source]

nms_per_class

paz.backend.boxes.nms_per_class(box_data, nms_thresh=0.45, epsilon=0.01, top_k=200)

Applies non maximum suppression per class. This function takes all the detections from the detector, which consist of boxes and their corresponding class scores, applies non maximum suppression to every class independently, and then combines the results.

Arguments

box_data: Array of shape (num_nms_boxes, 4 + num_classes) containing the box coordinates as well as the predicted scores of all the classes for all non suppressed boxes.
nms_thresh: Float, Non-maximum suppression threshold.
epsilon: Float, scores with a confidence below this value are filtered out before performing non-maximum suppression.
top_k: Int, maximum number of boxes per class output by NMS.

Returns

Tuple: Containing an array of non suppressed boxes of shape (num_nms_boxes, 4 + num_classes) and an array of corresponding class labels of shape (num_nms_boxes, ).

[source]

_nms_per_class

paz.backend.boxes._nms_per_class(nms_boxes, class_labels, class_arg, decoded_boxes, class_predictions, epsilon, nms_thresh, top_k)

Applies non maximum suppression for a given class. This function takes the detections that belong to the given class, applies non maximum suppression to that class alone, and returns the resulting non suppressed boxes.

Arguments

nms_boxes: Array of shape (num_boxes, 4 + num_classes).
class_labels: Array of shape (num_boxes, ).
class_arg: Int, class index.
decoded_boxes: Array of shape (num_prior_boxes, 4) containing the box coordinates of all the non suppressed boxes.
class_predictions: Array of shape (num_nms_boxes, num_classes) containing the predicted scores of all the classes for all the non suppressed boxes.
epsilon: Float, scores with a confidence below this value are filtered out before performing non-maximum suppression.
nms_thresh: Float, Non-maximum suppression threshold.
top_k: Int, maximum number of boxes per class output by NMS.

Returns

Tuple: Containing an array of non suppressed boxes per class of shape (num_nms_boxes_per_class, 4 + num_classes) and an array of corresponding class labels of shape (num_nms_boxes_per_class, ).

[source]

pre_filter_nms

paz.backend.boxes.pre_filter_nms(class_arg, class_predictions, epsilon)

Applies score filtering. This function takes all the predicted scores of a given class and filters out all the predictions less than the given epsilon value.

Arguments

class_arg: Int, class index.
class_predictions: Array of shape (num_nms_boxes, num_classes) containing the predicted scores of all the classes for all the non suppressed boxes.
epsilon: Float, threshold value for score filtering.

Returns

Tuple: Containing an array of filtered scores of shape (num_pre_filtered_boxes, ) and a filter mask array of shape (num_prior_boxes, ).
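
The score filtering described above can be sketched with a boolean mask. The name pre_filter_sketch is hypothetical, and treating scores equal to epsilon as kept is an assumption of this sketch.

```python
import numpy as np


def pre_filter_sketch(class_arg, class_predictions, epsilon):
    # Scores of the requested class for every box.
    class_scores = class_predictions[:, class_arg]
    # True where the score reaches the threshold (inclusive comparison
    # is an assumption of this sketch).
    mask = class_scores >= epsilon
    return class_scores[mask], mask


class_predictions = np.array([[0.90, 0.005],
                              [0.002, 0.60],
                              [0.30, 0.40]])
filtered_scores, mask = pre_filter_sketch(0, class_predictions, epsilon=0.01)
```

The mask keeps the length of the input so that it can later be used to select the matching boxes.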

[source]

merge_nms_box_with_class

paz.backend.boxes.merge_nms_box_with_class(box_data, class_labels)

Merges box coordinates with their corresponding class, defined by class_labels, into a single output. The class is decided by the best box geometry found by non maximum suppression, not by the best scoring class. This function retains only the predicted score of the class to which the box belongs and sets the scores of all the remaining classes to zero, thereby combining box and class information in a single variable.

Arguments

box_data: Array of shape (num_nms_boxes, 4 + num_classes) containing the box coordinates as well as the predicted scores of all the classes for all non suppressed boxes.
class_labels: Array of shape (num_nms_boxes, ) that contains the indices of the class whose score is to be retained.

Returns

boxes: Array of shape (num_nms_boxes, 4 + num_classes), containing coordinates of non suppressed boxes along with scores of the class to which the box belongs. The scores of the other classes are zeros.

[source]

suppress_other_class_scores

paz.backend.boxes.suppress_other_class_scores(class_predictions, class_labels)

Retains the score of the class given in class_labels and sets the other class scores to zero.

Arguments

class_predictions: Array of shape (num_nms_boxes, num_classes) containing the predicted scores of all the classes for all the non suppressed boxes.
class_labels: Array of shape (num_nms_boxes, ) that contains the indices of the class whose score is to be retained.

Returns

retained_class_score: Array of shape (num_nms_boxes, num_classes) that contains scores only at the locations specified by class_labels and zeros at the other class locations.

Note

This approach retains the scores of the class in class_predictions defined by class_labels by generating a boolean mask, score_suppress_mask, whose elements are True where the score in class_predictions is to be retained and False where the class score is to be suppressed. Retaining and suppressing scores this way does not use for loops, if-else conditions, or direct value assignment to arrays.
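
One way to build such a mask without loops, branches, or item assignment is to broadcast the labels against the class indices. This is a sketch of the idea only: suppress_other_scores_sketch is a hypothetical name, and the library may construct its mask differently.

```python
import numpy as np


def suppress_other_scores_sketch(class_predictions, class_labels):
    num_classes = class_predictions.shape[1]
    # Compare each label against every class index via broadcasting:
    # (num_boxes, 1) == (num_classes,) -> (num_boxes, num_classes).
    score_suppress_mask = class_labels[:, np.newaxis] == np.arange(num_classes)
    # Multiplying by the mask zeroes every score outside the labeled class.
    return class_predictions * score_suppress_mask


class_predictions = np.array([[0.1, 0.7, 0.2],
                              [0.6, 0.3, 0.1]])
class_labels = np.array([1, 0])
retained = suppress_other_scores_sketch(class_predictions, class_labels)
```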

[source]

offset

paz.backend.boxes.offset(coordinates, offset_scales)

Apply offsets to box coordinates.

Arguments

coordinates: List of floats containing coordinates in point form.
offset_scales: List of floats having x and y scales respectively.

Returns

coordinates: List of floats containing coordinates in point form, i.e. [x_min, y_min, x_max, y_max].

[source]

clip

paz.backend.boxes.clip(coordinates, image_shape)

Clip box to valid image coordinates.

Arguments

coordinates: List of floats containing coordinates in point form, i.e. [x_min, y_min, x_max, y_max].
image_shape: List of two integers indicating height and width of image respectively.

Returns

List of clipped coordinates.
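
Clipping can be sketched as clamping each coordinate to the image bounds. The name clip_sketch is hypothetical, and clamping every coordinate to both bounds is an assumption of this sketch.

```python
def clip_sketch(coordinates, image_shape):
    height, width = image_shape[:2]
    x_min, y_min, x_max, y_max = coordinates
    # Clamp x values to [0, width] and y values to [0, height].
    x_min = max(0, min(x_min, width))
    x_max = max(0, min(x_max, width))
    y_min = max(0, min(y_min, height))
    y_max = max(0, min(y_max, height))
    return x_min, y_min, x_max, y_max


clipped = clip_sketch([-5, 10, 700, 500], image_shape=(480, 640))
```

Note that image_shape is given as (height, width), so the x coordinates are bounded by the second entry.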

[source]

compute_iou

paz.backend.boxes.compute_iou(box, boxes)

Calculates the intersection over union between 'box' and all 'boxes'. Both box and boxes are in corner coordinates.

Arguments

box: Numpy array with a length of at least 4.
boxes: Numpy array with shape (num_boxes, 4).

Returns

Numpy array of shape (num_boxes, 1).

[source]

compute_ious

paz.backend.boxes.compute_ious(boxes_A, boxes_B)

Calculates the intersection over union between boxes_A and boxes_B. For each box in the rows of boxes_A it calculates the intersection over union with respect to all boxes in boxes_B. Both boxes_A and boxes_B hold corner coordinates: the top-left corner (x_min, y_min) and the bottom-right corner (x_max, y_max).

Arguments

boxes_A: Numpy array with shape (num_boxes_A, 4).
boxes_B: Numpy array with shape (num_boxes_B, 4).

Returns

Numpy array of shape (num_boxes_A, num_boxes_B).
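
The pairwise computation can be sketched with NumPy broadcasting. The name pairwise_iou_sketch is hypothetical, and the sketch assumes boxes with non-zero area in corner form.

```python
import numpy as np


def pairwise_iou_sketch(boxes_A, boxes_B):
    boxes_A = np.asarray(boxes_A, dtype=float)
    boxes_B = np.asarray(boxes_B, dtype=float)
    # Add an axis so every box in A is compared against every box in B.
    top_left = np.maximum(boxes_A[:, np.newaxis, :2], boxes_B[np.newaxis, :, :2])
    bottom_right = np.minimum(boxes_A[:, np.newaxis, 2:], boxes_B[np.newaxis, :, 2:])
    # Non-overlapping pairs get a zero-sized intersection.
    sides = np.clip(bottom_right - top_left, 0, None)
    intersection = sides[..., 0] * sides[..., 1]
    area_A = np.prod(boxes_A[:, 2:] - boxes_A[:, :2], axis=1)
    area_B = np.prod(boxes_B[:, 2:] - boxes_B[:, :2], axis=1)
    union = area_A[:, np.newaxis] + area_B[np.newaxis, :] - intersection
    return intersection / union


boxes_A = np.array([[0, 0, 2, 2], [0, 0, 1, 1]])
boxes_B = np.array([[1, 1, 3, 3], [0, 0, 2, 2], [5, 5, 6, 6]])
ious = pairwise_iou_sketch(boxes_A, boxes_B)
```

Row i of the result holds the IoU of boxes_A[i] with every box in boxes_B, which matches the (num_boxes_A, num_boxes_B) shape documented above.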

[source]

decode

paz.backend.boxes.decode(predictions, priors, variances=[0.1, 0.1, 0.2, 0.2])

Decode default boxes into the ground truth boxes.

Arguments

predictions: Numpy array of shape (num_priors, 4).
priors: Numpy array of shape (num_priors, 4).
variances: List of four floats. Variances of prior boxes.

Returns

decoded boxes: Numpy array of shape (num_priors, 4).
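
The formulas are not given here, so the following sketch assumes the common SSD-style parameterization: priors in center form (center_x, center_y, W, H), center offsets scaled by the first two variances, and log-size offsets scaled by the last two. The name decode_sketch and the choice to return corner form are assumptions of this sketch.

```python
import numpy as np


def decode_sketch(predictions, priors, variances=(0.1, 0.1, 0.2, 0.2)):
    # Priors are assumed to be in center form: (center_x, center_y, W, H).
    center_x = predictions[:, 0] * priors[:, 2] * variances[0] + priors[:, 0]
    center_y = predictions[:, 1] * priors[:, 3] * variances[1] + priors[:, 1]
    W = priors[:, 2] * np.exp(predictions[:, 2] * variances[2])
    H = priors[:, 3] * np.exp(predictions[:, 3] * variances[3])
    # Convert from center form to corner form (assumed output format).
    x_min, y_min = center_x - W / 2.0, center_y - H / 2.0
    x_max, y_max = center_x + W / 2.0, center_y + H / 2.0
    return np.stack([x_min, y_min, x_max, y_max], axis=1)


priors = np.array([[0.5, 0.5, 0.2, 0.4]])
predictions = np.zeros((1, 4))
decoded = decode_sketch(predictions, priors)
```

With all offsets at zero the decoded box is simply the prior box expressed in corner form.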

[source]

denormalize_box

paz.backend.boxes.denormalize_box(box, image_shape)

Scales corner box coordinates from normalized values to image dimensions.

Arguments

box: Numpy array containing corner box coordinates.
image_shape: List of integers with (height, width).

Returns

Box corner coordinates in image dimensions.
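
As a sketch, the scaling multiplies x values by the image width and y values by the image height. The name denormalize_box_sketch is hypothetical, and converting the result to ints is an assumption of this sketch.

```python
def denormalize_box_sketch(box, image_shape):
    height, width = image_shape[:2]
    x_min, y_min, x_max, y_max = box[:4]
    # x values scale with the width, y values with the height.
    return (int(x_min * width), int(y_min * height),
            int(x_max * width), int(y_max * height))


pixel_box = denormalize_box_sketch((0.25, 0.5, 0.75, 1.0), image_shape=(480, 640))
```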

[source]

encode

paz.backend.boxes.encode(matched, priors, variances=[0.1, 0.1, 0.2, 0.2])

Encodes the variances from the prior box layers into the ground truth boxes that were matched (based on Jaccard overlap) with the prior boxes.

Arguments

matched: Numpy array of shape (num_priors, 4) with boxes in point-form.
priors: Numpy array of shape (num_priors, 4) with boxes in center-form.
variances: List of floats. Variances of prior boxes.

Returns

encoded boxes: Numpy array of shape (num_priors, 4).

[source]

flip_left_right

paz.backend.boxes.flip_left_right(boxes, width)

Flips box coordinates from left-to-right and vice-versa.

Arguments

boxes: Numpy array of shape [num_boxes, 4].

Returns

Numpy array of shape [num_boxes, 4].
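
A horizontal flip mirrors the x coordinates around the image width, so the old x_max becomes the new x_min. The sketch below assumes corner-form boxes and treats width as the image width in pixels; flip_left_right_sketch is a hypothetical name.

```python
import numpy as np


def flip_left_right_sketch(boxes, width):
    flipped = boxes.copy()
    # The mirrored x_min comes from the old x_max, and vice versa.
    flipped[:, [0, 2]] = width - boxes[:, [2, 0]]
    return flipped


boxes = np.array([[10, 20, 30, 40]])
flipped = flip_left_right_sketch(boxes, width=100)
```

Swapping the two x columns while subtracting keeps x_min smaller than x_max after the flip.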

[source]

make_box_square

paz.backend.boxes.make_box_square(box)

Makes box coordinates square with sides equal to the longest original side.

Arguments

box: Numpy array with shape (4) containing corner coordinates in point form.

Returns

List of box coordinates as ints.
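
One way to square a box is to pad its shorter side equally on both ends until it matches the longer side. Keeping the box centered this way is an assumption of this sketch, and make_box_square_sketch is a hypothetical name.

```python
def make_box_square_sketch(box):
    x_min, y_min, x_max, y_max = box[:4]
    box_W, box_H = x_max - x_min, y_max - y_min
    if box_H > box_W:
        # Widen the box symmetrically until the width matches the height.
        pad = (box_H - box_W) / 2.0
        x_min, x_max = x_min - pad, x_max + pad
    elif box_W > box_H:
        # Heighten the box symmetrically until the height matches the width.
        pad = (box_W - box_H) / 2.0
        y_min, y_max = y_min - pad, y_max + pad
    return (int(x_min), int(y_min), int(x_max), int(y_max))


square = make_box_square_sketch([10, 10, 30, 50])
```

The resulting box may extend beyond the image, so a clipping step is often applied afterwards.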

[source]

match

paz.backend.boxes.match(boxes, prior_boxes, iou_threshold=0.5)

Matches each prior box with a ground truth box (a box from boxes). It then selects which matched boxes are considered positive (e.g. iou > 0.5) and returns, for each prior box, a ground truth box that is either positive (with a class argument different from 0) or negative.

Arguments

boxes: Numpy array of shape (num_ground_truth_boxes, 4 + 1), where the first four coordinates correspond to box coordinates and the last coordinate is the class argument. These boxes should be the ground truth boxes.
prior_boxes: Numpy array of shape (num_prior_boxes, 4), where the four coordinates are in center form.
iou_threshold: Float between [0, 1]. Intersection over union used to determine which box is considered a positive box.

Returns

Numpy array of shape (num_prior_boxes, 4 + 1), where the first four coordinates correspond to point form box coordinates and the last coordinate is the class argument.

[source]

to_image_coordinates

paz.backend.boxes.to_image_coordinates(boxes, image)

Transforms normalized box coordinates into image coordinates.

Arguments

image: Numpy array.
boxes: Numpy array of shape [num_boxes, N] where N >= 4.

Returns

Numpy array of shape [num_boxes, N].

[source]

to_center_form

paz.backend.boxes.to_center_form(boxes)

Transform from corner coordinates to center coordinates.

Arguments

boxes: Numpy array with shape (num_boxes, 4).

Returns

Numpy array with shape (num_boxes, 4).
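
A minimal sketch of the conversion, assuming center form is ordered as (center_x, center_y, W, H); to_center_form_sketch is a hypothetical name.

```python
import numpy as np


def to_center_form_sketch(boxes):
    x_min, y_min, x_max, y_max = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    # The center is the midpoint of the corners; the size is their difference.
    center_x = (x_min + x_max) / 2.0
    center_y = (y_min + y_max) / 2.0
    W = x_max - x_min
    H = y_max - y_min
    return np.stack([center_x, center_y, W, H], axis=1)


corner_boxes = np.array([[10, 20, 30, 60]])
center_boxes = to_center_form_sketch(corner_boxes)
```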

[source]

to_one_hot

paz.backend.boxes.to_one_hot(class_indices, num_classes)

Transform from class index to one-hot encoded vector.

Arguments

class_indices: Numpy array. One dimensional array specifying the index argument of the class for each sample.
num_classes: Integer. Total number of classes.

Returns

Numpy array with shape (num_samples, num_classes).
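
One-hot encoding can be sketched with integer array indexing: each row receives a single 1 in the column named by its class index. The name to_one_hot_sketch is hypothetical.

```python
import numpy as np


def to_one_hot_sketch(class_indices, num_classes):
    class_indices = np.asarray(class_indices, dtype=int)
    num_samples = len(class_indices)
    one_hot = np.zeros((num_samples, num_classes))
    # Set a single 1 per row at the column given by the class index.
    one_hot[np.arange(num_samples), class_indices] = 1
    return one_hot


one_hot = to_one_hot_sketch([0, 2, 1], num_classes=3)
```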

[source]

to_normalized_coordinates

paz.backend.boxes.to_normalized_coordinates(boxes, image)

Transforms coordinates in image dimensions to normalized coordinates.

Arguments

image: Numpy array.
boxes: Numpy array of shape [num_boxes, N] where N >= 4.

Returns

Numpy array of shape [num_boxes, N].
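
Normalization divides x values by the image width and y values by the image height, leaving any extra columns (for example a class label) untouched. The sketch assumes the first four columns are x_min, y_min, x_max, y_max and that the image is laid out as (height, width, channels); to_normalized_sketch is a hypothetical name.

```python
import numpy as np


def to_normalized_sketch(boxes, image):
    height, width = image.shape[:2]
    normalized = boxes.astype(float)  # astype returns a copy
    # Columns 0 and 2 are x values, columns 1 and 3 are y values.
    normalized[:, [0, 2]] = normalized[:, [0, 2]] / width
    normalized[:, [1, 3]] = normalized[:, [1, 3]] / height
    return normalized


image = np.zeros((480, 640, 3))
boxes = np.array([[160, 240, 480, 480, 1]])
normalized = to_normalized_sketch(boxes, image)
```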

[source]

to_corner_form

paz.backend.boxes.to_corner_form(boxes)

Transform from center coordinates to corner coordinates.

Arguments

boxes: Numpy array with shape (num_boxes, 4).

Returns

Numpy array with shape (num_boxes, 4).
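
A minimal sketch of the conversion, assuming center form is ordered as (center_x, center_y, W, H); to_corner_form_sketch is a hypothetical name.

```python
import numpy as np


def to_corner_form_sketch(boxes):
    center_x, center_y, W, H = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    # Each corner lies half a side length away from the center.
    x_min = center_x - W / 2.0
    y_min = center_y - H / 2.0
    x_max = center_x + W / 2.0
    y_max = center_y + H / 2.0
    return np.stack([x_min, y_min, x_max, y_max], axis=1)


center_boxes = np.array([[20, 40, 20, 40]])
corner_boxes = to_corner_form_sketch(center_boxes)
```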

[source]

extract_bounding_box_corners

paz.backend.boxes.extract_bounding_box_corners(points3D)

Extracts the (x_min, y_min, z_min) and the (x_max, y_max, z_max) coordinates from an array of points3D.

Arguments

points3D: Array of shape (num_points, 3).

Returns

Left-down-bottom corner (x_min, y_min, z_min) and right-up-top (x_max, y_max, z_max) corner.
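
The two corners are the per-axis minimum and maximum of the point cloud, which can be sketched as follows; extract_corners_sketch is a hypothetical name.

```python
import numpy as np


def extract_corners_sketch(points3D):
    # Reduce over the points axis to get one value per coordinate axis.
    min_corner = np.min(points3D, axis=0)
    max_corner = np.max(points3D, axis=0)
    return min_corner, max_corner


points3D = np.array([[0, 5, 2], [3, -1, 4], [1, 2, -6]])
min_corner, max_corner = extract_corners_sketch(points3D)
```

The corners need not coincide with any single input point, since each axis is reduced independently.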

[source]

scale_box

paz.backend.boxes.scale_box(predictions, image_scales)

Arguments

predictions: Array of shape (num_boxes, num_classes+N) model predictions.
image_scales: Array of shape (), scale value of boxes.

Returns

predictions: Array of shape (num_boxes, num_classes+N) model predictions.

[source]

change_box_coordinates

paz.backend.boxes.change_box_coordinates(outputs)

Converts box coordinates format from (y_min, x_min, y_max, x_max) to (x_min, y_min, x_max, y_max).

Arguments

outputs: Tensor, model output.

Returns

outputs: Array, Processed outputs by merging the features at all levels. Each row corresponds to box coordinate offsets and sigmoid of the class logits.

[source]

add_class_and_score

paz.backend.boxes.add_class_and_score(predictions, box)

Adds class and score to box.

Arguments

predictions: Dictionary with keys class_name and scores.
box: Array of shape (num_nms_boxes, 4 + num_classes).