DALIB Basic Modules

Classifier

class dalib.modules.classifier.Classifier(backbone: torch.nn.modules.module.Module, num_classes: int, bottleneck: Optional[torch.nn.modules.module.Module] = None, bottleneck_dim: Optional[int] = -1, head: Optional[torch.nn.modules.module.Module] = None)[source]

Bases: torch.nn.modules.module.Module

A generic Classifier class for domain adaptation.

Parameters:
  • backbone (nn.Module): Any backbone that extracts 1-d features from data
  • num_classes (int): Number of classes
  • bottleneck (nn.Module, optional): Any bottleneck layer. No bottleneck is used by default
  • bottleneck_dim (int, optional): Feature dimension of the bottleneck layer. Default: -1
  • head (nn.Module, optional): Any classifier head. nn.Linear is used by default

Note

Different classifiers are used in different domain adaptation algorithms to achieve better accuracy, and we provide a suggested Classifier for each algorithm. Keep in mind that the classifier is not the core of an algorithm: you can implement your own Classifier and combine it with any domain adaptation algorithm in this library.

Note

By default, the learning rate of this classifier is set to 10 times that of the feature extractor for better accuracy. If you use another optimization strategy, please override get_parameters.

Inputs:
  • x (tensor): input data fed to backbone
Outputs: predictions, features
  • predictions: classifier’s predictions
  • features: features after bottleneck layer and before head layer
Shape:
  • Inputs: (minibatch, *) where * means any number of additional dimensions
  • predictions: (minibatch, num_classes)
  • features: (minibatch, features_dim)
features_dim

The dimension of features before the final head layer

get_parameters() → List[Dict[KT, VT]][source]

Return a parameter list that decides optimization hyper-parameters, such as the relative learning rate of each layer
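
A minimal usage sketch (the toy backbone is illustrative, and we assume Classifier reads the backbone's output dimension from an out_features attribute, as the backbones shipped with this library provide):

    import torch
    import torch.nn as nn
    from dalib.modules.classifier import Classifier

    # Toy backbone producing 1-d features of dimension 32.
    backbone = nn.Sequential(
        nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
    )
    backbone.out_features = 32  # assumed convention, see note above

    classifier = Classifier(backbone, num_classes=10)
    x = torch.randn(4, 3, 32, 32)
    predictions, features = classifier(x)   # outputs as documented above
    print(predictions.shape)  # torch.Size([4, 10])
    print(features.shape)     # torch.Size([4, 32])

    # get_parameters() feeds the relative per-layer learning rates into the optimizer.
    optimizer = torch.optim.SGD(classifier.get_parameters(), lr=0.01, momentum=0.9)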

Domain Discriminator

class dalib.modules.domain_discriminator.DomainDiscriminator(in_feature: int, hidden_size: int)[source]

Bases: torch.nn.modules.module.Module

Domain discriminator model from “Domain-Adversarial Training of Neural Networks”

Distinguishes whether the input features come from the source domain or the target domain. The source domain label is 1 and the target domain label is 0.

Parameters:
  • in_feature (int): dimension of the input feature
  • hidden_size (int): dimension of the hidden features
Shape:
  • Inputs: (minibatch, in_feature)
  • Outputs: (minibatch, 1)
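
A minimal sketch of the adversarial objective this discriminator is typically trained with (the feature tensors and sizes are illustrative; we assume the outputs are sigmoid probabilities, matching the domain labels above):

    import torch
    import torch.nn.functional as F
    from dalib.modules.domain_discriminator import DomainDiscriminator

    discriminator = DomainDiscriminator(in_feature=256, hidden_size=1024)
    f_s = torch.randn(32, 256)   # source-domain features
    f_t = torch.randn(32, 256)   # target-domain features
    d_s, d_t = discriminator(f_s), discriminator(f_t)   # each (32, 1)

    # Source domain label is 1, target domain label is 0, as documented above.
    loss = F.binary_cross_entropy(d_s, torch.ones_like(d_s)) \
         + F.binary_cross_entropy(d_t, torch.zeros_like(d_t))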

GRL

class dalib.modules.grl.WarmStartGradientReverseLayer(alpha: Optional[float] = 1.0, lo: Optional[float] = 0.0, hi: Optional[float] = 1.0, max_iters: Optional[int] = 1000, auto_step: Optional[bool] = False)[source]

Bases: torch.nn.modules.module.Module

Gradient Reverse Layer \(\mathcal{R}(x)\) with warm start

The forward and backward behaviours are:

\[\mathcal{R}(x) = x, \qquad \dfrac{d\mathcal{R}}{dx} = -\lambda I.\]

\(\lambda\) is initialized at \(lo\) and is gradually changed to \(hi\) using the following schedule:

\[\lambda = \dfrac{2(hi-lo)}{1+\exp(-\alpha \dfrac{i}{N})} - (hi-lo) + lo\]

where \(i\) is the iteration step. At \(i = 0\) this gives \(\lambda = lo\), and \(\lambda\) approaches \(hi\) as \(i \to \infty\).

Parameters:
  • alpha (float, optional): \(\alpha\). Default: 1.0
  • lo (float, optional): Initial value of \(\lambda\). Default: 0.0
  • hi (float, optional): Final value of \(\lambda\). Default: 1.0
  • max_iters (int, optional): \(N\). Default: 1000
  • auto_step (bool, optional): If True, increase \(i\) each time forward is called. Otherwise, use the function step() to increase \(i\). Default: False
step()[source]

Increase iteration number \(i\) by 1
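
A minimal sketch of the warm-start behaviour (the feature tensor and its size are illustrative):

    import torch
    from dalib.modules.grl import WarmStartGradientReverseLayer

    grl = WarmStartGradientReverseLayer(alpha=1.0, lo=0.0, hi=1.0,
                                        max_iters=1000, auto_step=True)
    features = torch.randn(8, 256, requires_grad=True)
    out = grl(features)    # identity mapping in the forward pass
    out.sum().backward()
    # Gradients flowing back through the layer are scaled by -lambda: close
    # to lo = 0 early in training, approaching hi = 1 as iteration i grows.
    print(features.grad.abs().max())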

Kernels

class dalib.modules.kernels.GaussianKernel(sigma: Optional[float] = None, track_running_stats: Optional[bool] = True, alpha: Optional[float] = 1.0)[source]

Bases: torch.nn.modules.module.Module

Gaussian Kernel Matrix

Gaussian Kernel k is defined by

\[k(x_1, x_2) = \exp \left( - \dfrac{\| x_1 - x_2 \|^2}{2\sigma^2} \right)\]

where \(x_1, x_2 \in \mathbb{R}^d\) are 1-d tensors.

The Gaussian Kernel Matrix K is defined on an input group \(X=(x_1, x_2, \ldots, x_m)\) by

\[K(X)_{i,j} = k(x_i, x_j)\]

By default, during training this layer keeps running estimates of the mean of squared L2 distances, which are then used to set the bandwidth \(\sigma\). Mathematically, the estimate is \(\sigma^2 = \dfrac{\alpha}{n^2}\sum_{i,j} \| x_i - x_j \|^2\). If track_running_stats is set to False, this layer does not keep running estimates and uses a fixed \(\sigma\) instead.

Parameters:
  • sigma (float, optional): bandwidth \(\sigma\). Default: None
  • track_running_stats (bool, optional): If True, this module tracks the running mean of \(\sigma^2\). Otherwise, it does not track such statistics and always uses a fixed \(\sigma^2\). Default: True
  • alpha (float, optional): \(\alpha\), which decides the magnitude of \(\sigma^2\) when track_running_stats is set to True. Default: 1.0
Inputs:
  • X (tensor): input group \(X\)
Shape:
  • Inputs: \((minibatch, F)\) where F means the dimension of input features.
  • Outputs: \((minibatch, minibatch)\)
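
A minimal usage sketch with a fixed bandwidth (the input sizes are illustrative):

    import torch
    from dalib.modules.kernels import GaussianKernel

    # Fixed-bandwidth kernel; with sigma=None and track_running_stats=True,
    # the bandwidth would instead be estimated from the batch as described above.
    kernel = GaussianKernel(sigma=1.0, track_running_stats=False)
    X = torch.randn(16, 32)    # (minibatch, F)
    K = kernel(X)              # (minibatch, minibatch) kernel matrix
    print(K.shape)             # torch.Size([16, 16])
    print(K.diagonal()[:3])    # diagonal entries are exp(0) = 1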