Boxes
Backend functionality for 2D bounding boxes
apply_non_max_suppression
paz.backend.boxes.apply_non_max_suppression(boxes, scores, iou_thresh=0.45, top_k=200)
Apply non maximum suppression.
Arguments
- boxes: Numpy array, box coordinates of shape (num_boxes, 4) where each column corresponds to x_min, y_min, x_max, y_max.
- scores: Numpy array of scores given for each box in boxes.
- iou_thresh: float, intersection over union threshold for removing boxes.
- top_k: int, maximum number of objects per class.
Returns
- selected_indices: Numpy array, selected indices of kept boxes.
- num_selected_boxes: int, number of selected boxes.
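To make the procedure concrete, here is a minimal NumPy sketch of greedy non maximum suppression. It is an illustration under stated assumptions, not the library's implementation: the name greedy_nms_sketch is hypothetical, top_k is assumed to limit the number of candidates considered, and a box is assumed to be removed when its IoU with a kept box exceeds iou_thresh.

```python
import numpy as np


def greedy_nms_sketch(boxes, scores, iou_thresh=0.45, top_k=200):
    """Illustrative greedy NMS (hypothetical helper, not the paz code)."""
    selected = []
    # Candidate indices sorted by descending score, limited to top_k.
    order = np.argsort(scores)[::-1][:top_k]
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    while order.size > 0:
        best, rest = order[0], order[1:]
        selected.append(best)
        # Intersection of the best box with every remaining candidate.
        x_min = np.maximum(boxes[best, 0], boxes[rest, 0])
        y_min = np.maximum(boxes[best, 1], boxes[rest, 1])
        x_max = np.minimum(boxes[best, 2], boxes[rest, 2])
        y_max = np.minimum(boxes[best, 3], boxes[rest, 3])
        width = np.clip(x_max - x_min, 0, None)
        height = np.clip(y_max - y_min, 0, None)
        intersection = width * height
        iou = intersection / (areas[best] + areas[rest] - intersection)
        # Keep only candidates whose overlap does not exceed the threshold.
        order = rest[iou <= iou_thresh]
    return np.array(selected, dtype=int), len(selected)


boxes = np.array([[0.0, 0.0, 10.0, 10.0],
                  [1.0, 1.0, 11.0, 11.0],
                  [20.0, 20.0, 30.0, 30.0]])
scores = np.array([0.9, 0.8, 0.7])
selected_indices, num_selected_boxes = greedy_nms_sketch(boxes, scores)
```

In this example the second box overlaps the first with an IoU of about 0.68, so it is suppressed, while the third box does not overlap and is kept.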
nms_per_class
paz.backend.boxes.nms_per_class(box_data, nms_thresh=0.45, epsilon=0.01, top_k=200)
Applies non maximum suppression per class. This function takes all the detections from the detector, which consist of boxes and their corresponding class scores, applies non maximum suppression to every class independently, and then combines the results.
Arguments
- box_data: Array of shape (num_nms_boxes, 4 + num_classes) containing the box coordinates as well as the predicted scores of all the classes for all non suppressed boxes.
- nms_thresh: Float, non-maximum suppression threshold.
- epsilon: Float, scores with a lower confidence value are filtered out before performing non-maximum suppression.
- top_k: Int, maximum number of boxes per class output by NMS.
Returns
- Tuple: Containing an array of non suppressed boxes of shape (num_nms_boxes, 4 + num_classes) and an array of corresponding class labels of shape (num_nms_boxes, ).
_nms_per_class
paz.backend.boxes._nms_per_class(nms_boxes, class_labels, class_arg, decoded_boxes, class_predictions, epsilon, nms_thresh, top_k)
Applies non maximum suppression for a given class. This function takes all the detections that belong only to the given single class and applies non maximum suppression for that class alone and returns the resulting non suppressed boxes.
Arguments
- nms_boxes: Array of shape (num_boxes, 4 + num_classes).
- class_labels: Array of shape (num_boxes, ).
- class_arg: Int, class index.
- decoded_boxes: Array of shape (num_prior_boxes, 4) containing the box coordinates of all the non suppressed boxes.
- class_predictions: Array of shape (num_nms_boxes, num_classes) containing the predicted scores of all the classes for all the non suppressed boxes.
- epsilon: Float, scores with a lower confidence value are filtered out before performing non-maximum suppression.
- nms_thresh: Float, non-maximum suppression threshold.
- top_k: Int, maximum number of boxes per class output by NMS.
Returns
- Tuple: Containing an array of non suppressed boxes per class of shape (num_nms_boxes_per_class, 4 + num_classes) and an array of corresponding class labels of shape (num_nms_boxes_per_class, ).
pre_filter_nms
paz.backend.boxes.pre_filter_nms(class_arg, class_predictions, epsilon)
Applies score filtering.
This function takes all the predicted scores of a given class and filters out all the predictions less than the given epsilon value.
Arguments
- class_arg: Int, class index.
- class_predictions: Array of shape (num_nms_boxes, num_classes) containing the predicted scores of all the classes for all the non suppressed boxes.
- epsilon: Float, threshold value for score filtering.
Returns
- Tuple: Containing an array of filtered scores of shape (num_pre_filtered_boxes, ) and a filter mask array of shape (num_prior_boxes, ).
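The score filtering described for pre_filter_nms can be sketched with a boolean mask. This is a hedged illustration only: pre_filter_sketch is a hypothetical name, and the strict comparison (scores greater than epsilon are kept) is an assumption.

```python
import numpy as np


def pre_filter_sketch(class_arg, class_predictions, epsilon):
    """Illustrative score filter (hypothetical helper, not the paz code)."""
    class_scores = class_predictions[:, class_arg]
    # Boolean mask over all boxes: True where the class score exceeds epsilon.
    mask = class_scores > epsilon
    # Scores of the given class for the boxes that pass the filter.
    return class_scores[mask], mask


class_predictions = np.array([[0.900, 0.050],
                              [0.005, 0.600],
                              [0.300, 0.001]])
filtered_scores, filter_mask = pre_filter_sketch(0, class_predictions, epsilon=0.01)
```

The mask has one entry per input box, while the filtered scores only contain the entries that passed, which matches the two different shapes listed under Returns.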
merge_nms_box_with_class
paz.backend.boxes.merge_nms_box_with_class(box_data, class_labels)
Merges box coordinates with their corresponding class defined by class_labels, which is decided by the best box geometry found by non maximum suppression (and not by the best scoring class), into a single output.
This function retains only the predicted score of the class to which the box belongs and sets the scores of all the remaining classes to zero, thereby combining box and class information in a single variable.
Arguments
- box_data: Array of shape (num_nms_boxes, 4 + num_classes) containing the box coordinates as well as the predicted scores of all the classes for all non suppressed boxes.
- class_labels: Array of shape (num_nms_boxes, ) that contains the indices of the class whose score is to be retained.
Returns
- boxes: Array of shape (num_nms_boxes, 4 + num_classes), containing coordinates of non suppressed boxes along with scores of the class to which the box belongs. The scores of the other classes are zeros.
suppress_other_class_scores
paz.backend.boxes.suppress_other_class_scores(class_predictions, class_labels)
Retains the score of the class given in class_labels and sets the other class scores to zero.
Arguments
- class_predictions: Array of shape (num_nms_boxes, num_classes) containing the predicted scores of all the classes for all the non suppressed boxes.
- class_labels: Array of shape (num_nms_boxes, ) that contains the indices of the class whose score is to be retained.
Returns
- retained_class_score: Array of shape (num_nms_boxes, num_classes) that contains the score only at the locations specified by class_labels and zeros at all other class locations.
Note
This approach retains the scores of the class in class_predictions defined by class_labels by generating a boolean mask score_suppress_mask with elements True at the locations where the score in class_predictions is to be retained and False wherever the class score is to be suppressed. It does not use for loops, if-else conditions, or direct value assignment to arrays.
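One way to build such a mask without loops, branches, or assignment is NumPy broadcasting. The sketch below is an assumption about how this could be done, not the library's code; suppress_other_scores_sketch is a hypothetical name.

```python
import numpy as np


def suppress_other_scores_sketch(class_predictions, class_labels):
    """Illustrative mask-based suppression (hypothetical helper)."""
    num_classes = class_predictions.shape[1]
    class_args = np.arange(num_classes)
    # Broadcasting (num_boxes, 1) against (1, num_classes) yields a boolean
    # mask of shape (num_boxes, num_classes), True only at each box's label.
    score_suppress_mask = class_labels[:, np.newaxis] == class_args[np.newaxis, :]
    # Multiplying by the mask zeroes every score except the labelled one.
    return class_predictions * score_suppress_mask


class_predictions = np.array([[0.1, 0.7, 0.2],
                              [0.6, 0.3, 0.1]])
class_labels = np.array([1, 0])
retained_class_score = suppress_other_scores_sketch(class_predictions, class_labels)
```

Each row of the result keeps a single non-zero entry at the column given by class_labels.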
offset
paz.backend.boxes.offset(coordinates, offset_scales)
Apply offsets to box coordinates.
Arguments
- coordinates: List of floats containing coordinates in point form.
- offset_scales: List of floats having x and y scales respectively.
Returns
- coordinates: List of floats containing coordinates in point form, i.e. [x_min, y_min, x_max, y_max].
clip
paz.backend.boxes.clip(coordinates, image_shape)
Clip box to valid image coordinates.
Arguments
- coordinates: List of floats containing coordinates in point form i.e. [x_min, y_min, x_max, y_max].
- image_shape: List of two integers indicating height and width of image respectively.
Returns
List of clipped coordinates.
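As a rough illustration of clipping, the following sketch limits x values to [0, width] and y values to [0, height]. The helper name clip_sketch and the choice of inclusive upper bounds are assumptions made for this example.

```python
import numpy as np


def clip_sketch(coordinates, image_shape):
    """Illustrative box clipping (hypothetical helper, not the paz code)."""
    height, width = image_shape[:2]
    x_min, y_min, x_max, y_max = coordinates[:4]
    # x values are limited by the image width, y values by its height.
    x_min, x_max = np.clip([x_min, x_max], 0, width)
    y_min, y_max = np.clip([y_min, y_max], 0, height)
    return [float(x_min), float(y_min), float(x_max), float(y_max)]


clipped = clip_sketch([-5.0, 10.0, 700.0, 500.0], image_shape=(480, 640))
```

Note that image_shape is ordered as (height, width), so the second entry bounds the x coordinates.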
compute_iou
paz.backend.boxes.compute_iou(box, boxes)
Calculates the intersection over union between box and all boxes. Both box and boxes are in corner coordinates.
Arguments
- box: Numpy array of length at least 4.
- boxes: Numpy array with shape (num_boxes, 4).
Returns
Numpy array of shape (num_boxes, 1).
compute_ious
paz.backend.boxes.compute_ious(boxes_A, boxes_B)
Calculates the intersection over union between boxes_A and boxes_B.
For each box present in the rows of boxes_A it calculates the intersection over union with respect to all boxes in boxes_B.
The variables boxes_A and boxes_B contain the corner coordinates of the left-top corner (x_min, y_min) and the right-bottom corner (x_max, y_max).
Arguments
- boxes_A: Numpy array with shape (num_boxes_A, 4).
- boxes_B: Numpy array with shape (num_boxes_B, 4).
Returns
Numpy array of shape (num_boxes_A, num_boxes_B).
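The pairwise computation can be sketched with broadcasting, so that every row of boxes_A is compared against every row of boxes_B at once. This is an illustrative sketch (pairwise_iou_sketch is a hypothetical name) and may differ from how the library computes the result.

```python
import numpy as np


def pairwise_iou_sketch(boxes_A, boxes_B):
    """Illustrative pairwise IoU for corner-form boxes (hypothetical helper)."""
    A = boxes_A[:, np.newaxis, :]  # shape (num_boxes_A, 1, 4)
    B = boxes_B[np.newaxis, :, :]  # shape (1, num_boxes_B, 4)
    top_left = np.maximum(A[..., :2], B[..., :2])
    bottom_right = np.minimum(A[..., 2:4], B[..., 2:4])
    # Negative extents mean the boxes do not overlap, so clip them to zero.
    sides = np.clip(bottom_right - top_left, 0, None)
    intersection = sides[..., 0] * sides[..., 1]
    area_A = (A[..., 2] - A[..., 0]) * (A[..., 3] - A[..., 1])
    area_B = (B[..., 2] - B[..., 0]) * (B[..., 3] - B[..., 1])
    return intersection / (area_A + area_B - intersection)


boxes_A = np.array([[0.0, 0.0, 2.0, 2.0],
                    [0.0, 0.0, 1.0, 1.0]])
boxes_B = np.array([[1.0, 1.0, 3.0, 3.0],
                    [0.0, 0.0, 2.0, 2.0],
                    [5.0, 5.0, 6.0, 6.0]])
ious = pairwise_iou_sketch(boxes_A, boxes_B)
```

Entry (i, j) of the result is the IoU between row i of boxes_A and row j of boxes_B; identical boxes give 1 and disjoint boxes give 0.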
decode
paz.backend.boxes.decode(predictions, priors, variances=[0.1, 0.1, 0.2, 0.2])
Decode default boxes into the ground truth boxes
Arguments
- predictions: Numpy array of shape (num_priors, 4).
- priors: Numpy array of shape (num_priors, 4).
- variances: List of four floats. Variances of prior boxes.
Returns
decoded boxes: Numpy array of shape (num_priors, 4).
denormalize_box
paz.backend.boxes.denormalize_box(box, image_shape)
Scales corner box coordinates from normalized values to image dimensions.
Arguments
- box: Numpy array containing corner box coordinates.
- image_shape: List of integers with (height, width).
Returns
- returns: Box corner coordinates in image dimensions.
encode
paz.backend.boxes.encode(matched, priors, variances=[0.1, 0.1, 0.2, 0.2])
Encodes the variances from the prior box layers into the ground truth boxes that were matched (based on Jaccard overlap) with the prior boxes.
Arguments
- matched: Numpy array of shape (num_priors, 4) with boxes in point-form.
- priors: Numpy array of shape (num_priors, 4) with boxes in center-form.
- variances: List of floats. Variances of prior boxes.
Returns
encoded boxes: Numpy array of shape (num_priors, 4).
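To show how encode and decode relate, here is a sketch of the common SSD-style parameterization, in which centers are expressed as offsets relative to the prior size and sizes as log ratios, each divided by a variance. That this matches the library exactly is an assumption; encode_sketch and decode_sketch are hypothetical names, and center form is assumed to mean (center_x, center_y, width, height).

```python
import numpy as np

VARIANCES = [0.1, 0.1, 0.2, 0.2]


def encode_sketch(matched, priors, variances=VARIANCES):
    """matched in point form, priors in center form (assumed layout)."""
    variances = np.asarray(variances)
    center = (matched[:, :2] + matched[:, 2:4]) / 2.0
    size = matched[:, 2:4] - matched[:, :2]
    # Center offsets relative to the prior size, scaled by the variances.
    encoded_center = (center - priors[:, :2]) / (priors[:, 2:4] * variances[:2])
    # Log size ratios, scaled by the variances.
    encoded_size = np.log(size / priors[:, 2:4]) / variances[2:]
    return np.concatenate([encoded_center, encoded_size], axis=1)


def decode_sketch(predictions, priors, variances=VARIANCES):
    """Inverse of encode_sketch; returns boxes in point form."""
    variances = np.asarray(variances)
    center = predictions[:, :2] * variances[:2] * priors[:, 2:4] + priors[:, :2]
    size = priors[:, 2:4] * np.exp(predictions[:, 2:4] * variances[2:])
    return np.concatenate([center - size / 2.0, center + size / 2.0], axis=1)


priors = np.array([[0.5, 0.5, 0.2, 0.2],
                   [0.3, 0.3, 0.4, 0.4]])
matched = np.array([[0.40, 0.40, 0.65, 0.60],
                    [0.10, 0.20, 0.50, 0.50]])
encoded = encode_sketch(matched, priors)
decoded = decode_sketch(encoded, priors)
```

Under these assumptions, decoding the encoded boxes with the same priors and variances recovers the original point-form boxes.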
flip_left_right
paz.backend.boxes.flip_left_right(boxes, width)
Flips box coordinates from left-to-right and vice-versa.
Arguments
- boxes: Numpy array of shape [num_boxes, 4].
Returns
Numpy array of shape [num_boxes, 4].
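A horizontal flip can be sketched as mirroring the x coordinates about the image width. The sketch assumes width is the image width in the same units as the box coordinates; flip_left_right_sketch is a hypothetical name.

```python
import numpy as np


def flip_left_right_sketch(boxes, width):
    """Illustrative horizontal flip of corner-form boxes (hypothetical helper)."""
    flipped = boxes.copy()
    # After mirroring, the old x_max becomes the new x_min and vice versa.
    flipped[:, [0, 2]] = width - boxes[:, [2, 0]]
    return flipped


boxes = np.array([[10.0, 20.0, 30.0, 40.0]])
flipped = flip_left_right_sketch(boxes, width=100)
```

Swapping the two x columns keeps x_min smaller than x_max after the flip; the y coordinates are unchanged.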
make_box_square
paz.backend.boxes.make_box_square(box)
Makes box coordinates square with sides equal to the longest original side.
Arguments
- box: Numpy array with shape (4) with point corner coordinates.
Returns
- returns: List of integer box coordinates.
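A possible reading of this operation is sketched below: the box is expanded around its center until both sides equal the longest original side. Keeping the center fixed and truncating to int are assumptions of this sketch, and make_box_square_sketch is a hypothetical name.

```python
def make_box_square_sketch(box):
    """Illustrative squaring of a corner-form box (hypothetical helper)."""
    x_min, y_min, x_max, y_max = box[:4]
    center_x = (x_min + x_max) / 2.0
    center_y = (y_min + y_max) / 2.0
    # Half of the longest original side.
    half_side = max(x_max - x_min, y_max - y_min) / 2.0
    square = (center_x - half_side, center_y - half_side,
              center_x + half_side, center_y + half_side)
    return [int(value) for value in square]


square_box = make_box_square_sketch([10, 20, 50, 40])
```

Here the 40 by 20 box becomes a 40 by 40 box with the same center.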
match
paz.backend.boxes.match(boxes, prior_boxes, iou_threshold=0.5)
Matches each prior box with a ground truth box (a box from boxes). It then selects which matched boxes will be considered positive, e.g. iou > 0.5, and returns for each prior box a ground truth box that is either positive (with a class argument different from 0) or negative.
Arguments
- boxes: Numpy array of shape (num_ground_truth_boxes, 4 + 1), where the first four coordinates correspond to box coordinates and the last coordinate is the class argument. These boxes should be the ground truth boxes.
- prior_boxes: Numpy array of shape (num_prior_boxes, 4), where the four coordinates are in center form.
- iou_threshold: Float in [0, 1]. Intersection over union threshold used to determine which box is considered a positive box.
Returns
Numpy array of shape (num_prior_boxes, 4 + 1), where the first four coordinates correspond to point form box coordinates and the last coordinate is the class argument.
to_image_coordinates
paz.backend.boxes.to_image_coordinates(boxes, image)
Transforms normalized box coordinates into image coordinates.
Arguments
- image: Numpy array.
- boxes: Numpy array of shape [num_boxes, N] where N >= 4.
Returns
Numpy array of shape [num_boxes, N].
to_center_form
paz.backend.boxes.to_center_form(boxes)
Transform from corner coordinates to center coordinates.
Arguments
- boxes: Numpy array with shape (num_boxes, 4).
Returns
Numpy array with shape (num_boxes, 4).
to_one_hot
paz.backend.boxes.to_one_hot(class_indices, num_classes)
Transform from class index to one-hot encoded vector.
Arguments
- class_indices: Numpy array. One dimensional array specifying the index argument of the class for each sample.
- num_classes: Integer. Total number of classes.
Returns
Numpy array with shape (num_samples, num_classes).
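One-hot encoding can be sketched with integer array indexing. This is an illustration rather than the library's code; to_one_hot_sketch is a hypothetical name.

```python
import numpy as np


def to_one_hot_sketch(class_indices, num_classes):
    """Illustrative one-hot encoding (hypothetical helper)."""
    class_indices = np.asarray(class_indices, dtype=int)
    num_samples = len(class_indices)
    one_hot = np.zeros((num_samples, num_classes))
    # Set a single 1 per row at the column given by the class index.
    one_hot[np.arange(num_samples), class_indices] = 1
    return one_hot


one_hot_vectors = to_one_hot_sketch([0, 2, 1], num_classes=3)
```

Each row sums to one, and taking the argmax of a row recovers the original class index.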
to_normalized_coordinates
paz.backend.boxes.to_normalized_coordinates(boxes, image)
Transforms coordinates in image dimensions to normalized coordinates.
Arguments
- image: Numpy array.
- boxes: Numpy array of shape [num_boxes, N] where N >= 4.
Returns
Numpy array of shape [num_boxes, N].
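The relationship between normalized and image coordinates can be sketched as a division or multiplication by the image size. The sketch assumes that x values scale with the image width and y values with the image height, that only the first four columns are touched, and that results stay as floats (the library may cast to integers). Both function names are hypothetical.

```python
import numpy as np


def to_normalized_sketch(boxes, image):
    """Illustrative image-to-normalized conversion (hypothetical helper)."""
    height, width = image.shape[:2]
    normalized = boxes.astype(float)  # astype returns a new array
    normalized[:, [0, 2]] = boxes[:, [0, 2]] / width
    normalized[:, [1, 3]] = boxes[:, [1, 3]] / height
    return normalized


def to_image_sketch(boxes, image):
    """Illustrative normalized-to-image conversion (hypothetical helper)."""
    height, width = image.shape[:2]
    image_boxes = boxes.astype(float)
    image_boxes[:, [0, 2]] = boxes[:, [0, 2]] * width
    image_boxes[:, [1, 3]] = boxes[:, [1, 3]] * height
    return image_boxes


image = np.zeros((480, 640, 3), dtype=np.uint8)
boxes = np.array([[64.0, 48.0, 320.0, 240.0, 1.0]])
normalized_boxes = to_normalized_sketch(boxes, image)
restored_boxes = to_image_sketch(normalized_boxes, image)
```

The fifth column (for example a class argument) passes through unchanged, which is consistent with the N >= 4 shape in the argument description.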
to_corner_form
paz.backend.boxes.to_corner_form(boxes)
Transform from center coordinates to corner coordinates.
Arguments
- boxes: Numpy array with shape (num_boxes, 4).
Returns
Numpy array with shape (num_boxes, 4).
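The corner and center representations can be converted into each other as sketched below. The sketch assumes center form means (center_x, center_y, width, height) and corner form means (x_min, y_min, x_max, y_max); both function names are hypothetical.

```python
import numpy as np


def to_center_form_sketch(boxes):
    """Corner form -> (center_x, center_y, width, height), assumed layout."""
    center = (boxes[:, :2] + boxes[:, 2:4]) / 2.0
    size = boxes[:, 2:4] - boxes[:, :2]
    return np.concatenate([center, size], axis=1)


def to_corner_form_sketch(boxes):
    """(center_x, center_y, width, height) -> corner form, assumed layout."""
    half_size = boxes[:, 2:4] / 2.0
    return np.concatenate([boxes[:, :2] - half_size,
                           boxes[:, :2] + half_size], axis=1)


corner_boxes = np.array([[10.0, 20.0, 30.0, 60.0]])
center_boxes = to_center_form_sketch(corner_boxes)
round_trip = to_corner_form_sketch(center_boxes)
```

Applying one conversion after the other returns the original boxes, which is a convenient sanity check when mixing the two forms.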
extract_bounding_box_corners
paz.backend.boxes.extract_bounding_box_corners(points3D)
Extracts the (x_min, y_min, z_min) and the (x_max, y_max, z_max) coordinates from an array of points3D.
Arguments
- points3D: Array of shape (num_points, 3).
Returns
Left-down-bottom corner (x_min, y_min, z_min) and right-up-top (x_max, y_max, z_max) corner.
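The two corners are the per-axis minimum and maximum of the points, which can be sketched in NumPy as follows. The variable names are illustrative only.

```python
import numpy as np

points3D = np.array([[0.0, 1.0, 2.0],
                     [-1.0, 3.0, 0.5],
                     [2.0, -2.0, 1.0]])
# Reduce over the points axis to get one value per coordinate axis.
min_corner = points3D.min(axis=0)  # (x_min, y_min, z_min)
max_corner = points3D.max(axis=0)  # (x_max, y_max, z_max)
```

Note that the two corners generally do not coincide with any single input point, since each coordinate is reduced independently.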
scale_box
paz.backend.boxes.scale_box(predictions, image_scales)
Arguments
- predictions: Array of shape (num_boxes, num_classes+N), model predictions.
- image_scales: Array of shape (), scale value of boxes.
Returns
- predictions: Array of shape (num_boxes, num_classes+N), model predictions.
change_box_coordinates
paz.backend.boxes.change_box_coordinates(outputs)
Converts box coordinates format from (y_min, x_min, y_max, x_max) to (x_min, y_min, x_max, y_max).
Arguments
- outputs: Tensor, model output.
Returns
- outputs: Array, Processed outputs by merging the features at all levels. Each row corresponds to box coordinate offsets and sigmoid of the class logits.
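The coordinate reordering itself can be sketched as a column permutation. This sketch assumes a 2D array whose first four columns hold (y_min, x_min, y_max, x_max); the actual layout of outputs (for example a batch dimension) is not specified here, and swap_box_columns_sketch is a hypothetical name.

```python
import numpy as np


def swap_box_columns_sketch(outputs):
    """Illustrative (y, x, y, x) -> (x, y, x, y) reordering (hypothetical)."""
    swapped = outputs.copy()
    # Assumed layout: the first four columns are the box coordinates.
    swapped[:, [0, 1, 2, 3]] = outputs[:, [1, 0, 3, 2]]
    return swapped


outputs = np.array([[1.0, 2.0, 3.0, 4.0, 0.9]])
reordered = swap_box_columns_sketch(outputs)
```

Columns after the fourth, such as class scores, are left in place.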
add_class_and_score
paz.backend.boxes.add_class_and_score(predictions, box)
Adds class and score to box.
Arguments
- predictions: Dictionary with keys class_name and scores.
- box: Array of shape (num_nms_boxes, 4 + num_classes).