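Adamax optimizer from Section 7 of the Adam paper (Kingma and Ba, "Adam: A Method for Stochastic Optimization"). It is a variant of Adam based on the infinity norm: updates are scaled by an exponentially weighted running maximum of the absolute gradients instead of Adam's estimate of the squared gradients.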
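
For reference, the update rule can be written as below. This follows the Adamax algorithm as presented in the paper, using the paper's symbols rather than this function's argument names: $\alpha$ corresponds to lr, $g_t$ is the gradient at step $t$, and $m_0 = u_0 = 0$. Implementations commonly add a small epsilon to the denominator for numerical safety; that term is not shown here.

```latex
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1 - \beta_1)\, g_t \\
u_t &= \max\left(\beta_2 u_{t-1},\ |g_t|\right) \\
\theta_t &= \theta_{t-1} - \frac{\alpha}{1 - \beta_1^{t}} \cdot \frac{m_t}{u_t}
\end{aligned}
```

Only the first moment $m_t$ needs bias correction; the running maximum $u_t$ is not biased towards zero in the way Adam's second-moment estimate is.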

```r
optimizer_adamax(
  lr = 0.002,
  beta_1 = 0.9,
  beta_2 = 0.999,
  epsilon = NULL,
  decay = 0,
  clipnorm = NULL,
  clipvalue = NULL
)
```

## Arguments

| Argument | Description |
|---|---|
| lr | float >= 0. Learning rate. |
| beta_1 | The exponential decay rate for the 1st moment estimates. float, 0 < beta < 1. Generally close to 1. |
| beta_2 | The exponential decay rate for the 2nd moment estimates. float, 0 < beta < 1. Generally close to 1. |
| epsilon | float >= 0. Fuzz factor. If NULL, defaults to k_epsilon(). |
| decay | float >= 0. Learning rate decay over each update. |
| clipnorm | Gradients will be clipped when their L2 norm exceeds this value. |
| clipvalue | Gradients will be clipped when their absolute value exceeds this value. |
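
To make the roles of lr, beta_1, beta_2, epsilon and decay concrete, here is a minimal base-R sketch of a single Adamax step. It is an illustration under stated assumptions, not the code keras runs: `adamax_step()` and its `state` list are hypothetical names, the default `epsilon = 1e-7` stands in for whatever `k_epsilon()` returns, the decay is assumed to take the form `lr / (1 + decay * iterations)`, and gradient clipping (clipnorm, clipvalue) is left out.

```r
# Hypothetical helper (not part of keras): one Adamax update on a numeric vector.
# `state` holds the first moment `m`, the running max `u`, and the step count `t`.
adamax_step <- function(theta, grad, state,
                        lr = 0.002, beta_1 = 0.9, beta_2 = 0.999,
                        epsilon = 1e-7, decay = 0) {
  t <- state$t + 1

  # Assumed time-based decay of the learning rate over completed updates.
  lr_t <- lr / (1 + decay * state$t)
  # Bias correction applies to the first moment only.
  lr_t <- lr_t / (1 - beta_1^t)

  # Exponential moving average of the gradient (1st moment).
  m <- beta_1 * state$m + (1 - beta_1) * grad
  # Exponentially weighted infinity norm: element-wise running maximum.
  u <- pmax(beta_2 * state$u, abs(grad))

  # epsilon guards against division by zero when u is 0.
  theta <- theta - lr_t * m / (u + epsilon)

  list(theta = theta, state = list(m = m, u = u, t = t))
}

# Usage: minimise f(x) = sum(x^2), whose gradient is 2 * x.
theta <- c(1, -2)
state <- list(m = c(0, 0), u = c(0, 0), t = 0)
for (i in 1:2000) {
  out <- adamax_step(theta, 2 * theta, state, lr = 0.01)
  theta <- out$theta
  state <- out$state
}
```

In this sketch the very first step moves each parameter by approximately lr in the direction opposite to its gradient, regardless of the gradient's magnitude, because the bias-corrected first moment and the running maximum have the same absolute value at t = 1.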

Other optimizers: optimizer_adadelta(), optimizer_adagrad(), optimizer_adam(), optimizer_nadam(), optimizer_rmsprop(), optimizer_sgd()