
Classification

Models for object classification


MiniXception

paz.models.classification.xception.MiniXception(input_shape, num_classes, weights=None)

Build MiniXception (see references).

Arguments

  • input_shape: List of three integers e.g. [H, W, 3]
  • num_classes: Int.
  • weights: None or string with pre-trained dataset. Valid datasets include only FER.

Returns

TensorFlow-Keras model.

References



ProtoEmbedding

paz.models.classification.protonet.ProtoEmbedding(image_shape, num_blocks)

Convolutional embedding network used for prototypical networks.

Arguments:

  • image_shape: List with image shape (H, W, channels).
  • num_blocks: Int. Number of convolution blocks.

Returns:

Keras model.

References:

prototypical networks



ProtoNet

paz.models.classification.protonet.ProtoNet(embed, num_classes, num_support, num_queries, image_shape)

Prototypical networks used for few-shot classification.

Arguments:

  • embed: Keras network for embedding images into metric space.
  • num_classes: Number of ways for few-shot classification.
  • num_support: Number of shots used for meta learning.
  • num_queries: Number of test images to query.
  • image_shape: List with image shape (H, W, channels).

Returns:

Keras model.
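To make the roles of num_classes, num_support and num_queries concrete, the following is a minimal NumPy sketch of the classification step used by prototypical networks in general. It is not the paz implementation: the helper name `prototypical_probabilities`, the squared Euclidean distance, and the random embeddings standing in for the output of `embed` are all assumptions made for illustration.

```python
import numpy as np


def prototypical_probabilities(support, queries):
    """Classify query embeddings by their distance to class prototypes.

    support: array (num_classes, num_support, dim) of embedded support images.
    queries: array (total_queries, dim) of embedded query images.
    Returns an array (total_queries, num_classes) of class probabilities.
    """
    # Each prototype is the mean embedding of one class' support set.
    prototypes = support.mean(axis=1)  # (num_classes, dim)
    # Squared Euclidean distance from every query to every prototype.
    differences = queries[:, None, :] - prototypes[None, :, :]
    distances = np.sum(differences ** 2, axis=-1)  # (total_queries, num_classes)
    # Softmax over negative distances: closer prototypes get more probability.
    logits = -distances
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exponentials = np.exp(logits)
    return exponentials / exponentials.sum(axis=1, keepdims=True)


# Hypothetical 5-way, 3-shot episode with 4 queries per class.
# The 16-dimensional embeddings are synthetic stand-ins for embed(images).
num_classes, num_support, num_queries, dim = 5, 3, 4, 16
rng = np.random.default_rng(0)
class_centers = 10.0 * rng.normal(size=(num_classes, dim))
support = class_centers[:, None, :] + rng.normal(
    size=(num_classes, num_support, dim))
query_labels = np.repeat(np.arange(num_classes), num_queries)
queries = class_centers[query_labels] + rng.normal(
    size=(len(query_labels), dim))

probabilities = prototypical_probabilities(support, queries)
predictions = probabilities.argmax(axis=1)
```

In this sketch num_classes is the number of ways, num_support the number of shots per class, and num_queries the number of query images per class, so one episode contains num_classes * (num_support + num_queries) images.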

References:

prototypical networks



CNN2Plus1D

paz.models.classification.cnn2Plus1.CNN2Plus1D(weights=None, input_shape=(38, 96, 96, 3), seed=305865, architecture='CNN2Plus1D')

Binary classification for videos with 2+1D CNNs.

Arguments

  • weights: None or string with pre-trained dataset. Valid datasets include only VVAD-LRS3.
  • input_shape: List of integers. Input shape to the model in the following format: (frames, height, width, channels), e.g. (38, 96, 96, 3).

  • seed: Integer. Seed for random number generator.
  • architecture: String. Name of the architecture to use. Currently supported: 'CNN2Plus1D', 'CNN2Plus1D_Filters', 'CNN2Plus1D_Layers', 'CNN2Plus1D_Light'. 'CNN2Plus1D_18' is only available without weights.
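The argument rules above can be summarised as a small pre-check. This is a hypothetical helper, not part of paz: the function name `check_cnn2plus1d_arguments` and the exact weights string 'VVAD-LRS3' (taken from the dataset name in the argument list) are assumptions, and the check only mirrors the constraints documented here.

```python
SUPPORTED_ARCHITECTURES = (
    'CNN2Plus1D', 'CNN2Plus1D_Filters', 'CNN2Plus1D_Layers',
    'CNN2Plus1D_Light', 'CNN2Plus1D_18')
PRETRAINED_DATASETS = ('VVAD-LRS3',)  # assumed spelling of the weights string


def check_cnn2plus1d_arguments(weights=None, input_shape=(38, 96, 96, 3),
                               architecture='CNN2Plus1D'):
    """Hypothetical pre-check mirroring the documented argument rules."""
    if weights is not None and weights not in PRETRAINED_DATASETS:
        raise ValueError(f'Unknown pre-trained dataset: {weights!r}')
    if architecture not in SUPPORTED_ARCHITECTURES:
        raise ValueError(f'Unknown architecture: {architecture!r}')
    # The 18-layer variant is documented as available only without weights.
    if architecture == 'CNN2Plus1D_18' and weights is not None:
        raise ValueError("'CNN2Plus1D_18' has no pre-trained weights")
    # The shape must be (frames, height, width, channels).
    if len(input_shape) != 4:
        raise ValueError('input_shape must have four entries')
    if not all(isinstance(size, int) and size > 0 for size in input_shape):
        raise ValueError('input_shape entries must be positive integers')
    frames, height, width, channels = input_shape
    return {'frames': frames, 'height': height,
            'width': width, 'channels': channels}


# Default arguments describe clips of 38 RGB frames of 96 x 96 pixels.
default_layout = check_cnn2plus1d_arguments()
```

Running such a check before building the model would surface an invalid combination, for example pre-trained weights together with 'CNN2Plus1D_18', as a clear error message.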

Reference

(https://www.tensorflow.org/tutorials/video/video_classification#load_and_preprocess_video_data)



VVAD_LRS3_LSTM

paz.models.classification.vvad_lrs3.VVAD_LRS3_LSTM(weights=None, input_shape=(38, 96, 96, 3), seed=305865)

Binary classification for videos using a CNN-based MobileNet with a TimeDistributed layer (LSTM).

Arguments

  • weights: None or string with pre-trained dataset. Valid datasets include only VVAD-LRS3.
  • input_shape: List of integers. Input shape to the model in the following format: (frames, height, width, channels), e.g. (38, 96, 96, 3).

  • seed: Integer. Seed for random number generator.
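As an illustration of the (frames, height, width, channels) layout, the sketch below cuts a longer frame sequence into fixed-length clips that match the default input_shape. The helper `split_into_clips` is hypothetical, and whether paz windows videos this way is an assumption; the sketch only shows how the batch dimension relates to the documented shape.

```python
import numpy as np


def split_into_clips(frames, clip_length=38):
    """Split (num_frames, H, W, C) into (num_clips, clip_length, H, W, C).

    Trailing frames that do not fill a whole clip are dropped.
    """
    num_clips = len(frames) // clip_length
    usable_frames = frames[:num_clips * clip_length]
    return usable_frames.reshape((num_clips, clip_length) + frames.shape[1:])


# Synthetic video of 100 RGB frames of 96 x 96 pixels.
video = np.zeros((100, 96, 96, 3), dtype=np.float32)
clips = split_into_clips(video)
```

Each entry of `clips` then has the per-sample shape (38, 96, 96, 3), and the leading axis acts as the batch dimension.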

Reference

  • [The VVAD-LRS3 Dataset for Visual Voice Activity Detection](https://api.semanticscholar.org/CorpusID:238198700)
  • [VVAD-LRS3 GitHub Repository](https://github.com/adriandavidauer/VVAD)