recognite.model package

class recognite.model.Recognizer(model_name: str, num_classes: int | None = None, weights: Weights | str | None = None, clf_bias: bool = False, normalize: bool = True)[source]

A recognition model consisting of a backbone and a classifier.

The task of the backbone is to compute an embedding for a given input image. During training, the embedding is passed through the classifier (a single fully-connected layer). Training the classifier acts as a proxy objective for the optimization of the backbone.

During inference (or evaluation), the classifier is ignored and the model returns the embedding that comes out of the backbone.

We support all classifier models available in the torchvision library, apart from the squeezenet-based models. The entire list of supported models is available as the global variable SUPPORTED_MODELS.

We normalize both the embeddings as the classifier weights. As such, the columns in the classifier’s weights can be interpreted as reference embeddings for the corresponding classes and the optimization of the classifier directly optimizes the cosine similarity. As such, after sufficient training, extracted embeddings can be easily compared via a dot product.

backbone: The backbone model.

classifier: The fully-connected layer that acts as the classifier during training.

__init__(model_name: str, num_classes: int | None = None, weights: Weights | str | None = None, clf_bias: bool = False, normalize: bool = True)[source]

Parameters:

model_name – The name of the model to use. The classification layer of this model will be extracted and replaced by a fully-connected layer that outputs the desired number of classes (see num_classes). Note that we normalize the weights of the fully connected layer so that the columns have norm 1.
num_classes – The number of classes outputted by the classifier. If None, don’t use classifier and instead also return the backbone’s embedding during training.
weights – The pretrained weights to initialize the model with. If None, the weights are randomly initialized. See also <https://pytorch.org/vision/stable/models.html>.
clf_bias – If True, use a bias in the classification layer. Using bias in the fully connected layer together with weight normalization can worsen the results (see <https://dl.acm.org/doi/10.1145/3123266.3123359>), so we suggest to keep it turned off.
normalize – If True, normalize the embedding returned by the backbone, as well as the weights of the classifier (if applicable).

forward(x: Tensor) → Tensor[source]

Passes a batch of input images through the model.

During training, the input is passed through the backbone and the classifier, and the classification logits are returned. During inference, only the backbone is used and the extracted embeddings are returned.

Parameters:: x – The batch of input images.
Returns:: The classification logits during training, and the embeddings during inference (evaluation).

recognite.model.update_classifier(model: Module, num_classes: int, bias: bool = False) → Module[source]

Updates the classifier according to the given number of classes.

The classifier (fully connected layer) contained in the given model is updated in-place such that it outputs num_classes elements. This function supports all models that are available from torchvision.models.

Parameters:

model – The model to update. We assume that the last layer of the model is a classifier. This can be a single nn.Linear (like ResNet), or an nn.Sequential with an nn.Linear as final layer (like AlexNet).
num_classes – The new number of classes for the classifier.
bias – If True, use a bias in the updated classifier. Else, don’t.

Returns:

The updated model.

recognite.model.split_backbone_classifier(model: Module)[source]

Splits the given model into a backbone and a classifier module.

Parameters:: model – The model to split. We assume that the last layer of the model is a classifier. This can be a single nn.Linear (like ResNet), or an nn.Sequential with an nn.Linear as final layer (like AlexNet).

recognite.model.get_path_to_ultimate_classifier(model: Module)[source]

Returns the path to the ultimate fully-connected layer.

The path is returned as a list of strings, where each element is the attribute to get from the parent module to retrieve the corresponding module. To get the module from this path, use get_module_at_path(model, path). To change this module with another module, use set_module_at_path(module, path, new_module).

Parameters:: model – The model. We assume that the last layer of the model is a classifier. This can be a single nn.Linear (like ResNet) or an nn.Sequential with an nn.Linear as final layer (like AlexNet)

recognite.model.get_ultimate_classifier(model: Module)[source]

Returns the ultimate fully-connected layer.

Parameters:: model – The model. We assume that the last layer of the model is a classifier. This can be a single nn.Linear (like ResNet), or an nn.Sequential with an nn.Linear as final layer (like AlexNet).

recognite.model.get_module_at_path(model: Module, path: List[str]) → Module[source]

Returns the module at the given path in the model.

Parameters:

model – The model.
path – List of strings, where each element is the attribute to get from the parent module to retrieve the corresponding module.

Returns:

The module at the given path.

recognite.model.set_module_at_path(model: Module, path: List[str], new_module: Module)[source]

Replaces a module in a model.

Parameters:

model – The model.
path – List of strings, where each element is the attribute to get from the parent module to retrieve the corresponding module.
new_module – The module to put at the given path.