Adaptive Moment Estimation (Adam)
Published on: July 18, 2021
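Adaptive Moment Estimation, better known as Adam, is another adaptive learning rate method, first published in 2014 by Kingma and Ba. In addition to storing an exponentially decaying average of past squared gradients $v_t$, Adam also keeps an exponentially decaying average of past gradients $m_t$, similar to momentum:

$$m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t$$

$$v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2$$

Here $g_t$ is the gradient at time step $t$, and $\beta_1$ and $\beta_2$ are the decay rates. $m_t$ and $v_t$ are estimates of the first moment (the mean) and the second moment (the uncentered variance) of the gradients, hence the name of the method. Because both are initialized as vectors of zeros, they are biased towards zero, especially during the initial time steps and when the decay rates are small (i.e. $\beta_1$ and $\beta_2$ are close to 1).

Adam counteracts these biases by calculating bias-corrected first and second moment estimates:

$$\hat{m}_t = \frac{m_t}{1 - \beta_1^t}$$

$$\hat{v}_t = \frac{v_t}{1 - \beta_2^t}$$

These estimates are then used to update the parameters $\theta$ with learning rate $\eta$, which gives the Adam update rule:

$$\theta_{t+1} = \theta_t - \frac{\eta}{\sqrt{\hat{v}_t} + \epsilon} \hat{m}_t$$

As default values, the authors propose 0.9 for $\beta_1$, 0.999 for $\beta_2$, and $10^{-8}$ for $\epsilon$.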
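To make the update concrete, here is a minimal sketch of a single Adam step in plain Python, following the update described in the Kingma and Ba paper. The function names `adam_step` and `minimize_quadratic`, the list-of-floats parameter representation, and the toy objective $f(x) = (x - 3)^2$ are assumptions made for illustration; they do not come from the original article or from any library.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to parameters stored as lists of floats.

    theta, grad, m, v are lists of equal length; t is the 1-based step count.
    Returns the updated (theta, m, v) as new lists.
    """
    new_theta, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(theta, grad, m, v):
        # Exponentially decaying averages of the gradient and squared gradient.
        m_i = beta1 * m_i + (1 - beta1) * g
        v_i = beta2 * v_i + (1 - beta2) * g * g
        # Bias correction: compensates for m and v starting at zero.
        m_hat = m_i / (1 - beta1 ** t)
        v_hat = v_i / (1 - beta2 ** t)
        # Parameter update scaled by the corrected second-moment estimate.
        new_theta.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_theta, new_m, new_v


def minimize_quadratic(steps=500, lr=0.1):
    """Toy usage (assumed example): minimize f(x) = (x - 3)^2 from x = 0."""
    theta, m, v = [0.0], [0.0], [0.0]
    for t in range(1, steps + 1):
        grad = [2.0 * (theta[0] - 3.0)]  # derivative of (x - 3)^2
        theta, m, v = adam_step(theta, grad, m, v, t, lr=lr)
    return theta[0]
```

One property worth noting: on the first step the bias-corrected estimates reduce to the gradient and its square, so the magnitude of the first update is close to the learning rate regardless of how large the gradient is. A real implementation would typically operate on arrays or tensors rather than Python lists.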
Diederik P. Kingma and Jimmy Ba (2014). Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:1412.6980.
Sebastian Ruder (2016). An overview of gradient descent optimization algorithms. arXiv preprint arXiv:1609.04747.