inferno.extensions.optimizers package
Submodules
inferno.extensions.optimizers.adam module
class inferno.extensions.optimizers.adam.Adam(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, lambda_l1=0, weight_decay=0, **kwargs)
Bases: torch.optim.optimizer.Optimizer
Implements the Adam algorithm with the option of adding an L1 penalty.
It has been proposed in Adam: A Method for Stochastic Optimization.
Parameters:
- params (iterable) – iterable of parameters to optimize or dicts defining parameter groups
- lr (float, optional) – learning rate (default: 1e-3)
- betas (Tuple[float, float], optional) – coefficients used for computing running averages of the gradient and its square (default: (0.9, 0.999))
- eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-8)
- lambda_l1 (float, optional) – L1 penalty (default: 0)
- weight_decay (float, optional) – weight decay (L2 penalty) (default: 0)
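A minimal usage sketch follows; the torch.nn.Linear model, the random data, and the chosen penalty values are illustrative assumptions, and only the constructor arguments documented above are used.

import torch
from inferno.extensions.optimizers.adam import Adam

# Toy model and data, purely for illustration.
model = torch.nn.Linear(10, 2)
x, y = torch.randn(16, 10), torch.randn(16, 2)

# lambda_l1 adds an L1 penalty in addition to the L2 weight decay.
optimizer = Adam(model.parameters(), lr=1e-3, lambda_l1=1e-5, weight_decay=1e-4)

loss = torch.nn.functional.mse_loss(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()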
inferno.extensions.optimizers.annealed_adam module
class inferno.extensions.optimizers.annealed_adam.AnnealedAdam(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, lambda_l1=0, weight_decay=0, lr_decay=1.0)
Bases: inferno.extensions.optimizers.adam.Adam
Implements the Adam algorithm with learning rate annealing and an optional L1 penalty.
It has been proposed in Adam: A Method for Stochastic Optimization.
Parameters:
- params (iterable) – iterable of parameters to optimize or dicts defining parameter groups
- lr (float, optional) – learning rate (default: 1e-3)
- betas (Tuple[float, float], optional) – coefficients used for computing running averages of the gradient and its square (default: (0.9, 0.999))
- eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-8)
- lambda_l1 (float, optional) – L1 penalty (default: 0)
- weight_decay (float, optional) – weight decay (L2 penalty) (default: 0)
- lr_decay (float, optional) – decay the learning rate by this factor after every step (default: 1.0)
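A minimal usage sketch of the annealing behaviour follows; the toy model, the random data, and the chosen lr_decay value are illustrative assumptions. With lr_decay below 1, the learning rate shrinks by that factor after every optimizer step.

import torch
from inferno.extensions.optimizers.annealed_adam import AnnealedAdam

# Toy model, purely for illustration.
model = torch.nn.Linear(10, 2)

# lr_decay=0.99: the learning rate is decayed by this factor after every step.
optimizer = AnnealedAdam(model.parameters(), lr=1e-3, lr_decay=0.99)

for _ in range(3):
    x, y = torch.randn(16, 10), torch.randn(16, 2)
    loss = torch.nn.functional.mse_loss(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()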