optimizer_adamax
Adamax optimizer
Description
Adamax optimizer from Section 7 of the Adam paper. It is a variant of Adam based on the infinity norm.
Usage
optimizer_adamax(
learning_rate = 0.002,
beta_1 = 0.9,
beta_2 = 0.999,
epsilon = NULL,
decay = 0,
clipnorm = NULL,
clipvalue = NULL,
... )
Arguments
Arguments | Description |
---|---|
learning_rate | float >= 0. Learning rate. |
beta_1 | The exponential decay rate for the 1st moment estimates. float, 0 < beta < 1. Generally close to 1. |
beta_2 | The exponential decay rate for the 2nd moment estimates. float, 0 < beta < 1. Generally close to 1. |
epsilon | float >= 0. Fuzz factor. If NULL , defaults to k_epsilon() . |
decay | float >= 0. Learning rate decay over each update. |
clipnorm | Gradients will be clipped when their L2 norm exceeds this value. |
clipvalue | Gradients will be clipped when their absolute value exceeds this value. |
… | Unused, present only for backwards compatability |
See Also
Other optimizers: optimizer_adadelta()
, optimizer_adagrad()
, optimizer_adam()
, optimizer_nadam()
, optimizer_rmsprop()
, optimizer_sgd()