Boxes

Backend functionality for 2D bounding boxes

[source]

apply_non_max_suppression

paz.backend.boxes.apply_non_max_suppression(boxes, scores, iou_thresh=0.45, top_k=200)

Apply non maximum suppression.

Arguments

  • boxes: Numpy array, box coordinates of shape (num_boxes, 4) where the columns correspond to x_min, y_min, x_max, y_max.
  • scores: Numpy array, scores for each box in boxes.
  • iou_thresh: float, intersection over union threshold for removing boxes.
  • top_k: int, maximum number of objects per class.

Returns

  • selected_indices: Numpy array, selected indices of kept boxes.
  • num_selected_boxes: int, number of selected boxes.
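
The arguments above map onto the usual greedy procedure, sketched below in plain NumPy. This is an illustrative sketch, not the library's implementation: the name `greedy_nms_sketch` is hypothetical, and it assumes boxes are suppressed when their IoU with a higher scoring box exceeds `iou_thresh`, and that `top_k` limits the candidates considered.

```python
import numpy as np


def greedy_nms_sketch(boxes, scores, iou_thresh=0.45, top_k=200):
    """Greedy NMS sketch; returns (selected_indices, num_selected_boxes)."""
    x_min, y_min, x_max, y_max = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x_max - x_min) * (y_max - y_min)
    # Candidate indices sorted by descending score, limited to the top_k best.
    order = np.argsort(scores)[::-1][:top_k]
    selected = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        selected.append(int(best))
        # Overlap of the best box with every remaining candidate.
        width = np.minimum(x_max[best], x_max[rest]) - np.maximum(x_min[best], x_min[rest])
        height = np.minimum(y_max[best], y_max[rest]) - np.maximum(y_min[best], y_min[rest])
        intersection = np.clip(width, 0.0, None) * np.clip(height, 0.0, None)
        union = areas[best] + areas[rest] - intersection
        # Keep only the candidates that do not overlap the best box too much.
        order = rest[intersection / union <= iou_thresh]
    return np.array(selected, dtype=int), len(selected)
```

In this sketch the returned index array contains only the kept boxes, ordered by descending score.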

[source]

nms_per_class

paz.backend.boxes.nms_per_class(box_data, nms_thresh=0.45, epsilon=0.01, top_k=200)

Applies non maximum suppression per class. This function takes all the detections from the detector, consisting of boxes and their corresponding class scores, applies non maximum suppression to every class independently, and then combines the results.

Arguments

  • box_data: Array of shape (num_nms_boxes, 4 + num_classes) containing the box coordinates as well as the predicted scores of all the classes for all non suppressed boxes.
  • nms_thresh: Float, Non-maximum suppression threshold.
  • epsilon: Float, Filter scores with a lower confidence value before performing non-maximum suppression.
  • top_k: Int, Maximum number of boxes per class output by NMS.

Returns

  • Tuple: Containing an array of non suppressed boxes of shape (num_nms_boxes, 4 + num_classes) and an array of corresponding class labels of shape (num_nms_boxes, ).

[source]

_nms_per_class

paz.backend.boxes._nms_per_class(nms_boxes, class_labels, class_arg, decoded_boxes, class_predictions, epsilon, nms_thresh, top_k)

Applies non maximum suppression for a given class. This function takes the detections that belong to the given class, applies non maximum suppression to that class alone, and returns the resulting non suppressed boxes.

Arguments

  • nms_boxes: Array of shape (num_boxes, 4 + num_classes).
  • class_labels: Array of shape (num_boxes, ).
  • class_arg: Int, class index.
  • decoded_boxes: Array of shape (num_prior_boxes, 4) containing the box coordinates of all the non suppressed boxes.
  • class_predictions: Array of shape (num_nms_boxes, num_classes) containing the predicted scores of all the classes for all the non suppressed boxes.
  • epsilon: Float, Filter scores with a lower confidence value before performing non-maximum suppression.
  • nms_thresh: Float, Non-maximum suppression threshold.
  • top_k: Int, Maximum number of boxes per class output by NMS.

Returns

  • Tuple: Containing an array of non suppressed boxes per class of shape (num_nms_boxes_per_class, 4 + num_classes) and an array of corresponding class labels of shape (num_nms_boxes_per_class, ).

[source]

pre_filter_nms

paz.backend.boxes.pre_filter_nms(class_arg, class_predictions, epsilon)

Applies score filtering. This function takes all the predicted scores of a given class and filters out all the predictions less than the given epsilon value.

Arguments

  • class_arg: Int, class index.
  • class_predictions: Array of shape (num_nms_boxes, num_classes) containing the predicted scores of all the classes for all the non suppressed boxes.
  • epsilon: Float, threshold value for score filtering.

Returns

  • Tuple: Containing an array of filtered scores of shape (num_pre_filtered_boxes, ) and an array with the filter mask of shape (num_prior_boxes, ).
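
The score filtering described above amounts to a boolean mask over one column of the score matrix. The sketch below is a minimal NumPy illustration; `filter_scores_sketch` is a hypothetical name, and it assumes scores equal to epsilon are kept.

```python
import numpy as np


def filter_scores_sketch(class_arg, class_predictions, epsilon):
    """Returns the scores of one class that pass the threshold, plus the mask."""
    scores = class_predictions[:, class_arg]
    # True for every box whose score for this class is at least epsilon.
    mask = scores >= epsilon
    return scores[mask], mask
```

The mask has one entry per input box, so it can also be used to select the matching rows of the box coordinates.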

[source]

merge_nms_box_with_class

paz.backend.boxes.merge_nms_box_with_class(box_data, class_labels)

Merges box coordinates with their corresponding class into a single output. The class of each box is given by class_labels, which non maximum suppression decides by best box geometry (and not by the best scoring class). This function retains only the predicted score of the class to which the box belongs and sets the scores of all remaining classes to zero, thereby combining box and class information in a single variable.

Arguments

  • box_data: Array of shape (num_nms_boxes, 4 + num_classes) containing the box coordinates as well as the predicted scores of all the classes for all non suppressed boxes.
  • class_labels: Array of shape (num_nms_boxes, ) that contains the indices of the class whose score is to be retained.

Returns

  • boxes: Array of shape (num_nms_boxes, 4 + num_classes), containing coordinates of non suppressed boxes along with scores of the class to which the box belongs. The scores of the other classes are zeros.

[source]

suppress_other_class_scores

paz.backend.boxes.suppress_other_class_scores(class_predictions, class_labels)

Retains the score of the class in class_labels and sets the other class scores to zero.

Arguments

  • class_predictions: Array of shape (num_nms_boxes, num_classes) containing the predicted scores of all the classes for all the non suppressed boxes.
  • class_labels: Array of shape (num_nms_boxes, ) that contains the indices of the class whose score is to be retained.

Returns

  • retained_class_score: Array of shape (num_nms_boxes, num_classes) that holds the score only at the locations specified by class_labels and zeros at all other class locations.

Note

This approach retains the scores of the class given by class_labels in class_predictions by generating a boolean mask, score_suppress_mask, whose elements are True where the score in class_predictions is to be retained and False where the class score is to be suppressed. Retaining and suppressing scores this way avoids for loops, if-else conditions, and direct value assignment to arrays.
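
One way to build such a mask without loops or item assignment is to compare a range of class indices against the labels using broadcasting. This is a sketch of the idea, not necessarily how the library builds the mask; the function name `keep_labeled_scores_sketch` is hypothetical.

```python
import numpy as np


def keep_labeled_scores_sketch(class_predictions, class_labels):
    """Zeroes every score except the one in the column named by class_labels."""
    num_classes = class_predictions.shape[1]
    # Broadcasting (num_classes,) against (num_boxes, 1) gives a
    # (num_boxes, num_classes) mask that is True only at column class_labels[i].
    score_suppress_mask = np.arange(num_classes) == class_labels[:, np.newaxis]
    return class_predictions * score_suppress_mask
```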


[source]

offset

paz.backend.boxes.offset(coordinates, offset_scales)

Apply offsets to box coordinates.

Arguments

  • coordinates: List of floats containing coordinates in point form.
  • offset_scales: List of floats having x and y scales respectively.

Returns

  • coordinates: List of floats containing coordinates in point form. i.e. [x_min, y_min, x_max, y_max].

[source]

clip

paz.backend.boxes.clip(coordinates, image_shape)

Clip box to valid image coordinates.

Arguments

  • coordinates: List of floats containing coordinates in point form i.e. [x_min, y_min, x_max, y_max].
  • image_shape: List of two integers indicating height and width of image respectively.

Returns

List of clipped coordinates.
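
Clipping can be sketched in a few lines of plain Python. The name `clip_sketch` is hypothetical, and the sketch assumes x is bounded by the image width and y by the image height, with 0 as the lower bound.

```python
def clip_sketch(coordinates, image_shape):
    """Clamps a point-form box to the image boundaries."""
    height, width = image_shape[:2]
    x_min, y_min, x_max, y_max = coordinates
    # Minimum corners cannot be negative; maximum corners cannot exceed the image.
    x_min, y_min = max(x_min, 0), max(y_min, 0)
    x_max, y_max = min(x_max, width), min(y_max, height)
    return x_min, y_min, x_max, y_max


print(clip_sketch([-5, 10, 700, 500], (480, 640)))  # (0, 10, 640, 480)
```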


[source]

compute_iou

paz.backend.boxes.compute_iou(box, boxes)

Calculates the intersection over union between 'box' and all 'boxes'. Both box and boxes are in corner coordinates.

Arguments

  • box: Numpy array of length at least 4.
  • boxes: Numpy array with shape (num_boxes, 4).

Returns

Numpy array of shape (num_boxes, 1).


[source]

compute_ious

paz.backend.boxes.compute_ious(boxes_A, boxes_B)

Calculates the intersection over union between boxes_A and boxes_B. For each box present in the rows of boxes_A it calculates the intersection over union with respect to all boxes in boxes_B. The variables boxes_A and boxes_B contain the corner coordinates of the left-top corner (x_min, y_min) and the right-bottom (x_max, y_max) corner.

Arguments

  • boxes_A: Numpy array with shape (num_boxes_A, 4).
  • boxes_B: Numpy array with shape (num_boxes_B, 4).

Returns

Numpy array of shape (num_boxes_A, num_boxes_B).
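
The pairwise computation can be sketched with NumPy broadcasting. This is an illustration under the corner-form convention stated above, not the library's code; `pairwise_iou_sketch` is a hypothetical name.

```python
import numpy as np


def pairwise_iou_sketch(boxes_A, boxes_B):
    """IoU between every box in boxes_A and every box in boxes_B."""
    A = boxes_A[:, np.newaxis, :4]  # shape (num_boxes_A, 1, 4)
    B = boxes_B[np.newaxis, :, :4]  # shape (1, num_boxes_B, 4)
    # Corners of the intersection rectangle for every (A, B) pair.
    top_left = np.maximum(A[..., :2], B[..., :2])
    bottom_right = np.minimum(A[..., 2:], B[..., 2:])
    # Negative side lengths mean the boxes do not overlap.
    sides = np.clip(bottom_right - top_left, 0.0, None)
    intersection = sides[..., 0] * sides[..., 1]
    area_A = (A[..., 2] - A[..., 0]) * (A[..., 3] - A[..., 1])
    area_B = (B[..., 2] - B[..., 0]) * (B[..., 3] - B[..., 1])
    return intersection / (area_A + area_B - intersection)
```

Passing a single box as a one-row boxes_A gives the one-against-many case that compute_iou covers.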


[source]

decode

paz.backend.boxes.decode(predictions, priors, variances=[0.1, 0.1, 0.2, 0.2])

Decode default boxes into the ground truth boxes.

Arguments

  • predictions: Numpy array of shape (num_priors, 4).
  • priors: Numpy array of shape (num_priors, 4).
  • variances: List of four floats. Variances of prior boxes.

Returns

decoded boxes: Numpy array of shape (num_priors, 4).


[source]

denormalize_box

paz.backend.boxes.denormalize_box(box, image_shape)

Scales corner box coordinates from normalized values to image dimensions

Arguments

  • box: Numpy array containing corner box coordinates.
  • image_shape: List of integers with (height, width).

Returns

  • returns: Box corner coordinates in image dimensions.

[source]

encode

paz.backend.boxes.encode(matched, priors, variances=[0.1, 0.1, 0.2, 0.2])

Encode the variances from the prior box layers into the ground truth boxes we have matched (based on Jaccard overlap) with the prior boxes.

Arguments

  • matched: Numpy array of shape (num_priors, 4) with boxes in point-form.
  • priors: Numpy array of shape (num_priors, 4) with boxes in center-form.
  • variances: List of floats. Variances of prior boxes.

Returns

encoded boxes: Numpy array of shape (num_priors, 4).
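
The page does not spell out the encoding formula. The sketch below assumes the parameterization commonly used by SSD-style detectors: center offsets are scaled by the prior size and the first two variances, and log size ratios are scaled by the last two. The names `encode_sketch` and `decode_sketch` are hypothetical, and the sketch assumes decoding returns boxes in point form.

```python
import numpy as np


def encode_sketch(matched, priors, variances=(0.1, 0.1, 0.2, 0.2)):
    """Encodes point-form boxes relative to center-form priors (assumed SSD scheme)."""
    v = np.asarray(variances)
    center = (matched[:, :2] + matched[:, 2:4]) / 2.0
    size = matched[:, 2:4] - matched[:, :2]
    encoded_center = (center - priors[:, :2]) / (priors[:, 2:4] * v[:2])
    encoded_size = np.log(size / priors[:, 2:4]) / v[2:]
    return np.concatenate([encoded_center, encoded_size], axis=1)


def decode_sketch(predictions, priors, variances=(0.1, 0.1, 0.2, 0.2)):
    """Inverts encode_sketch and returns point-form boxes."""
    v = np.asarray(variances)
    center = priors[:, :2] + predictions[:, :2] * v[:2] * priors[:, 2:4]
    size = priors[:, 2:4] * np.exp(predictions[:, 2:4] * v[2:])
    return np.concatenate([center - size / 2.0, center + size / 2.0], axis=1)
```

Under these assumptions the two functions are inverses of each other, and a ground truth box that coincides with its prior encodes to zeros.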


[source]

flip_left_right

paz.backend.boxes.flip_left_right(boxes, width)

Flips box coordinates from left-to-right and vice-versa.

Arguments

  • boxes: Numpy array of shape [num_boxes, 4].
  • width: Width of the image in which the boxes are flipped.

Returns

Numpy array of shape [num_boxes, 4].
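
A horizontal flip mirrors the x coordinates about the image width, which also swaps the roles of x_min and x_max. A minimal sketch, assuming `width` is the image width and using the hypothetical name `flip_left_right_sketch`:

```python
import numpy as np


def flip_left_right_sketch(boxes, width):
    """Mirrors point-form boxes horizontally inside an image of the given width."""
    flipped = boxes.copy()
    # The mirrored x_max becomes the new x_min and vice versa, so x_min <= x_max still holds.
    flipped[:, [0, 2]] = width - boxes[:, [2, 0]]
    return flipped
```

Applying the flip twice returns the original boxes.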


[source]

make_box_square

paz.backend.boxes.make_box_square(box)

Makes box coordinates square with sides equal to the longest original side.

Arguments

  • box: Numpy array with shape (4) with point corner coordinates.

Returns

  • returns: List of box coordinates ints.
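
The squaring step can be sketched as follows. The sketch assumes the square stays centered on the original box, which the description above does not state explicitly; `make_box_square_sketch` is a hypothetical name.

```python
def make_box_square_sketch(box):
    """Expands the shorter side so both sides equal the longest original side."""
    x_min, y_min, x_max, y_max = box[:4]
    center_x = (x_min + x_max) / 2.0
    center_y = (y_min + y_max) / 2.0
    half_side = max(x_max - x_min, y_max - y_min) / 2.0
    # Coordinates are truncated to ints, matching the documented return type.
    return (int(center_x - half_side), int(center_y - half_side),
            int(center_x + half_side), int(center_y + half_side))


print(make_box_square_sketch([10, 20, 50, 40]))  # (10, 10, 50, 50)
```

The squared box can extend past the image borders, so a clipping step may be needed afterwards.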

[source]

match

paz.backend.boxes.match(boxes, prior_boxes, iou_threshold=0.5)

Matches each prior box with a ground truth box (a box from boxes). It then selects which matched boxes are considered positive (e.g. IoU > 0.5) and returns, for each prior box, a ground truth box that is either positive (with a class argument different from 0) or negative.

Arguments

  • boxes: Numpy array of shape (num_ground_truth_boxes, 4 + 1), where the first four coordinates correspond to box coordinates and the last coordinate is the class argument. These boxes should be the ground truth boxes.
  • prior_boxes: Numpy array of shape (num_prior_boxes, 4), where the four coordinates are in center form.
  • iou_threshold: Float in [0, 1]. Intersection over union used to determine which box is considered a positive box.

Returns

Numpy array of shape (num_prior_boxes, 4 + 1), where the first four coordinates correspond to point form box coordinates and the last coordinate is the class argument.


[source]

to_image_coordinates

paz.backend.boxes.to_image_coordinates(boxes, image)

Transforms normalized box coordinates into image coordinates.

Arguments

  • image: Numpy array.
  • boxes: Numpy array of shape [num_boxes, N] where N >= 4.

Returns

Numpy array of shape [num_boxes, N].


[source]

to_center_form

paz.backend.boxes.to_center_form(boxes)

Transform from corner coordinates to center coordinates.

Arguments

  • boxes: Numpy array with shape (num_boxes, 4).

Returns

Numpy array with shape (num_boxes, 4).


[source]

to_one_hot

paz.backend.boxes.to_one_hot(class_indices, num_classes)

Transform from class index to one-hot encoded vector.

Arguments

  • class_indices: Numpy array. One dimensional array specifying the index argument of the class for each sample.
  • num_classes: Integer. Total number of classes.

Returns

Numpy array with shape (num_samples, num_classes).
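
One-hot encoding can be sketched with integer array indexing. The name `to_one_hot_sketch` is hypothetical, and the sketch assumes class_indices holds integers in the range [0, num_classes).

```python
import numpy as np


def to_one_hot_sketch(class_indices, num_classes):
    """Builds a (num_samples, num_classes) array with a single 1 per row."""
    one_hot = np.zeros((len(class_indices), num_classes))
    # Row i receives a 1 in column class_indices[i].
    one_hot[np.arange(len(class_indices)), class_indices] = 1.0
    return one_hot
```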


[source]

to_normalized_coordinates

paz.backend.boxes.to_normalized_coordinates(boxes, image)

Transforms coordinates in image dimensions to normalized coordinates.

Arguments

  • image: Numpy array.
  • boxes: Numpy array of shape [num_boxes, N] where N >= 4.

Returns

Numpy array of shape [num_boxes, N].
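
Normalizing and its inverse, converting back to image coordinates, can be sketched as a pair. The sketch assumes x coordinates are scaled by the image width and y coordinates by the image height, and that columns beyond the first four (for example class information) pass through unchanged. Both function names are hypothetical.

```python
import numpy as np


def to_normalized_sketch(boxes, image):
    """Divides x by the image width and y by the image height."""
    height, width = image.shape[:2]
    normalized = boxes.astype(float)  # astype returns a copy
    normalized[:, [0, 2]] /= width
    normalized[:, [1, 3]] /= height
    return normalized


def to_image_sketch(boxes, image):
    """Multiplies normalized x by the image width and y by the image height."""
    height, width = image.shape[:2]
    image_boxes = boxes.astype(float)
    image_boxes[:, [0, 2]] *= width
    image_boxes[:, [1, 3]] *= height
    return image_boxes
```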


[source]

to_corner_form

paz.backend.boxes.to_corner_form(boxes)

Transform from center coordinates to corner coordinates.

Arguments

  • boxes: Numpy array with shape (num_boxes, 4).

Returns

Numpy array with shape (num_boxes, 4).
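
The two box formats can be sketched as a round trip. The sketch assumes center form means (center_x, center_y, width, height) and corner form means (x_min, y_min, x_max, y_max); the function names are hypothetical.

```python
import numpy as np


def to_center_form_sketch(boxes):
    """(x_min, y_min, x_max, y_max) -> (center_x, center_y, width, height)."""
    center = (boxes[:, :2] + boxes[:, 2:4]) / 2.0
    size = boxes[:, 2:4] - boxes[:, :2]
    return np.concatenate([center, size], axis=1)


def to_corner_form_sketch(boxes):
    """(center_x, center_y, width, height) -> (x_min, y_min, x_max, y_max)."""
    half_size = boxes[:, 2:4] / 2.0
    return np.concatenate([boxes[:, :2] - half_size, boxes[:, :2] + half_size], axis=1)
```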


[source]

extract_bounding_box_corners

paz.backend.boxes.extract_bounding_box_corners(points3D)

Extracts the (x_min, y_min, z_min) and the (x_max, y_max, z_max) coordinates from an array of points3D.

Arguments

  • points3D: Array of shape (num_points, 3).

Returns

Left-down-bottom corner (x_min, y_min, z_min) and right-up-top (x_max, y_max, z_max) corner.
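
The two corners are the per-axis minimum and maximum of the point set. A minimal sketch, with the hypothetical name `extract_corners_sketch`:

```python
import numpy as np


def extract_corners_sketch(points3D):
    """Returns (x_min, y_min, z_min) and (x_max, y_max, z_max) of the points."""
    # Reducing over axis 0 takes the extreme value of each coordinate separately.
    return points3D.min(axis=0), points3D.max(axis=0)
```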


[source]

scale_box

paz.backend.boxes.scale_box(predictions, image_scales)

Arguments

  • predictions: Array of shape (num_boxes, num_classes+N) model predictions.
  • image_scales: Array of shape (), scale value of boxes.

Returns

  • predictions: Array of shape (num_boxes, num_classes+N) model predictions.

[source]

change_box_coordinates

paz.backend.boxes.change_box_coordinates(outputs)

Converts box coordinates format from (y_min, x_min, y_max, x_max) to (x_min, y_min, x_max, y_max).

Arguments

  • outputs: Tensor, model output.

Returns

  • outputs: Array, Processed outputs by merging the features at all levels. Each row corresponds to box coordinate offsets and sigmoid of the class logits.

[source]

add_class_and_score

paz.backend.boxes.add_class_and_score(predictions, box)

Adds class and score to box.

Arguments

  • predictions: Dictionary with keys class_name and scores.
  • box: Array of shape (num_nms_boxes, 4 + num_classes).