DALIB Basic Modules
Classifier
class dalib.modules.classifier.Classifier(backbone: torch.nn.modules.module.Module, num_classes: int, bottleneck: Optional[torch.nn.modules.module.Module] = None, bottleneck_dim: Optional[int] = -1, head: Optional[torch.nn.modules.module.Module] = None)
Bases: torch.nn.modules.module.Module
A generic Classifier class for domain adaptation.
- Parameters:
- backbone (nn.Module): Any backbone that extracts 1-d features from data
- num_classes (int): Number of classes
- bottleneck (nn.Module, optional): Any bottleneck layer. No bottleneck is used by default
- bottleneck_dim (int, optional): Feature dimension of the bottleneck layer. Default: -1
- head (nn.Module, optional): Any classifier head. nn.Linear is used by default
Note
Different domain adaptation algorithms use different classifiers to achieve better accuracy, and we provide a suggested Classifier for each algorithm. Remember that the classifier is not the core of the algorithm. You can implement your own Classifier and combine it with any domain adaptation algorithm in this library.
Note
By default, the learning rate of this classifier is set to 10 times that of the feature extractor for better accuracy. If you use a different optimization strategy, please override get_parameters.
- Inputs:
- x (tensor): input data fed to backbone
- Outputs: predictions, features
- predictions: classifier’s predictions
- features: features after bottleneck layer and before head layer
- Shape:
- Inputs: (minibatch, *) where * means any number of additional dimensions
- predictions: (minibatch, num_classes)
- features: (minibatch, features_dim)
features_dim: The dimension of features before the final head layer
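A minimal usage sketch (not taken from the library's own examples): the toy backbone, dimensions, and optimizer settings below are made up for illustration, and it assumes the backbone exposes an out_features attribute giving its output feature dimension (as the backbones shipped with this library do) and that get_parameters() returns standard torch.optim parameter groups.

```python
import torch
import torch.nn as nn
from dalib.modules.classifier import Classifier

class ToyBackbone(nn.Module):
    """Stand-in for a real backbone (e.g. a ResNet) that produces 1-d features."""
    def __init__(self, in_dim: int = 32, out_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU())
        # Assumption: the classifier reads the feature dimension from this attribute.
        self.out_features = out_dim

    def forward(self, x):
        return self.net(x)

backbone = ToyBackbone()
# Optional bottleneck mapping backbone features to a 64-d space before the head.
bottleneck = nn.Sequential(nn.Linear(backbone.out_features, 64), nn.ReLU())
classifier = Classifier(backbone, num_classes=10, bottleneck=bottleneck, bottleneck_dim=64)

x = torch.randn(8, 32)                    # (minibatch, *)
predictions, features = classifier(x)     # (minibatch, num_classes), (minibatch, features_dim)
print(predictions.shape, features.shape)  # torch.Size([8, 10]) torch.Size([8, 64])

# get_parameters() applies the 10x learning-rate rule described in the note above
# (assuming it returns standard optimizer parameter groups).
optimizer = torch.optim.SGD(classifier.get_parameters(), lr=0.001, momentum=0.9)
```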
Domain Discriminator
class dalib.modules.domain_discriminator.DomainDiscriminator(in_feature: int, hidden_size: int)
Bases: torch.nn.modules.module.Module
Domain discriminator model from “Domain-Adversarial Training of Neural Networks”
Distinguishes whether the input features come from the source domain or the target domain. The source domain label is 1 and the target domain label is 0.
- Parameters:
- in_feature (int): dimension of the input feature
- hidden_size (int): dimension of the hidden features
- Shape:
- Inputs: (minibatch, in_feature)
- Outputs: (minibatch, 1)
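A brief sketch of applying the discriminator to source and target features; the batch size and feature dimension are arbitrary, and whether the outputs are already sigmoid-activated probabilities is left as an assumption noted in the comments.

```python
import torch
from dalib.modules.domain_discriminator import DomainDiscriminator

discriminator = DomainDiscriminator(in_feature=256, hidden_size=1024)

f_s = torch.randn(16, 256)   # features from source-domain samples
f_t = torch.randn(16, 256)   # features from target-domain samples

d_s = discriminator(f_s)     # (16, 1) domain prediction for each source feature
d_t = discriminator(f_t)     # (16, 1) domain prediction for each target feature

# Domain labels as described above: 1 for source, 0 for target.
# A binary cross-entropy-style loss on (d_s, d_t) against these labels drives the
# adversarial game; whether d_s/d_t are already probabilities is an assumption here.
source_labels = torch.ones_like(d_s)
target_labels = torch.zeros_like(d_t)
```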
GRL
class dalib.modules.grl.WarmStartGradientReverseLayer(alpha: Optional[float] = 1.0, lo: Optional[float] = 0.0, hi: Optional[float] = 1.0, max_iters: Optional[int] = 1000, auto_step: Optional[bool] = False)
Bases: torch.nn.modules.module.Module
Gradient Reverse Layer \(\mathcal{R}(x)\) with warm start
The forward and backward behaviours are:
\[\mathcal{R}(x) = x, \qquad \dfrac{d\mathcal{R}}{dx} = -\lambda I.\]
\(\lambda\) is initialized at \(lo\) and is gradually changed to \(hi\) using the following schedule:
\[\lambda = \dfrac{2(hi - lo)}{1 + \exp\left(-\alpha \dfrac{i}{N}\right)} - (hi - lo) + lo,\]
where \(i\) is the iteration step, so \(\lambda = lo\) at \(i = 0\) and increases monotonically toward \(hi\) as \(i\) grows.
- Parameters:
- alpha (float, optional): \(\alpha\). Default: 1.0
- lo (float, optional): Initial value of \(\lambda\). Default: 0.0
- hi (float, optional): Final value of \(\lambda\). Default: 1.0
- max_iters (int, optional): \(N\). Default: 1000
- auto_step (bool, optional): If True, increase \(i\) each time forward is called. Otherwise use function step to increase \(i\). Default: False
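A small sketch of dropping the layer into an adversarial pipeline (the feature sizes are arbitrary); the gradient check at the end simply illustrates that the backward pass is scaled by \(-\lambda\), which is close to lo early in training.

```python
import torch
from dalib.modules.grl import WarmStartGradientReverseLayer

# lambda ramps from lo=0.0 toward hi=1.0 over roughly max_iters forward calls.
grl = WarmStartGradientReverseLayer(alpha=1.0, lo=0.0, hi=1.0,
                                    max_iters=1000, auto_step=True)

features = torch.randn(8, 256, requires_grad=True)
reversed_features = grl(features)   # identity mapping in the forward pass

# Any loss computed downstream back-propagates a gradient scaled by -lambda into
# `features`; early on lambda is close to lo, so these gradients are near zero.
loss = reversed_features.sum()
loss.backward()
print(features.grad.abs().max())
```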
Kernels
class dalib.modules.kernels.GaussianKernel(sigma: Optional[float] = None, track_running_stats: Optional[bool] = True, alpha: Optional[float] = 1.0)
Bases: torch.nn.modules.module.Module
Gaussian Kernel Matrix
Gaussian Kernel k is defined by
\[k(x_1, x_2) = \exp \left( - \dfrac{\| x_1 - x_2 \|^2}{2\sigma^2} \right),\]
where \(x_1, x_2 \in R^d\) are 1-d tensors.
Gaussian Kernel Matrix K is defined on the input group \(X=(x_1, x_2, ..., x_m)\) as
\[K(X)_{i,j} = k(x_i, x_j).\]
Also by default, during training this layer keeps running estimates of the mean of the L2 distances, which are then used to set the hyperparameter \(\sigma\). Mathematically, the estimate is \(\sigma^2 = \dfrac{\alpha}{n^2}\sum_{i,j} \| x_i - x_j \|^2\). If track_running_stats is set to False, this layer does not keep running estimates, and a fixed \(\sigma\) is used instead.
- Parameters:
- sigma (float, optional): bandwidth \(\sigma\). Default: None
- track_running_stats (bool, optional): If True, this module tracks the running mean of \(\sigma^2\). Otherwise, it won't track such statistics and always uses a fixed \(\sigma^2\). Default: True
- alpha (float, optional): \(\alpha\), which decides the magnitude of \(\sigma^2\) when track_running_stats is set to True. Default: 1.0
- Inputs:
- X (tensor): input group \(X\)
- Shape:
- Inputs: \((minibatch, F)\) where F means the dimension of input features.
- Outputs: \((minibatch, minibatch)\)
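A short sketch of computing a kernel matrix (the batch size and feature dimension below are arbitrary); kernel matrices like these are typically combined into MMD-style losses by the algorithms in this library.

```python
import torch
from dalib.modules.kernels import GaussianKernel

# Fixed bandwidth: sigma is given, so no running statistics are tracked.
fixed_kernel = GaussianKernel(sigma=1.0, track_running_stats=False)

# Adaptive bandwidth: sigma^2 is estimated from the mean pairwise squared L2
# distance of each batch, scaled by alpha, as described above.
adaptive_kernel = GaussianKernel(alpha=2.0)

X = torch.randn(32, 256)        # (minibatch, F)
K = fixed_kernel(X)             # (minibatch, minibatch) kernel matrix
print(K.shape)                  # torch.Size([32, 32])
print(K.diagonal()[:3])         # diagonal entries k(x_i, x_i) are exactly 1
```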