AdamW

Published on: August 5, 2021


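AdamW is a stochastic optimization method that changes how weight decay is applied in Adam. In the typical Adam implementation, weight decay is folded into the gradient as an L2 penalty, so the decay term is rescaled by Adam's adaptive per-parameter step size along with the rest of the gradient. AdamW decouples weight decay from the gradient update and applies it directly to the weights, which is intended to combat Adam's known convergence problems.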

Code
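
The following is a minimal sketch of a single AdamW update in plain Python, included to make the decoupling concrete. It is not taken from any particular library: the function name `adamw_step`, its argument names, and the default hyperparameters are assumptions chosen for illustration. The point to notice is that `weight_decay` never touches the gradient or the moment estimates; it only shrinks the weights directly.

```python
import math


def adamw_step(params, grads, m, v, t,
               lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-2):
    """Apply one AdamW update to a list of scalar parameters.

    Hypothetical helper for illustration. `t` is the 1-based step count;
    `m` and `v` are the running first and second moment estimates.
    Returns the updated (params, m, v) as new lists.
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        # Moment estimates use the raw gradient only (no L2 term added).
        m_i = beta1 * m_i + (1 - beta1) * g
        v_i = beta2 * v_i + (1 - beta2) * g * g

        # Bias correction for the zero-initialized moments.
        m_hat = m_i / (1 - beta1 ** t)
        v_hat = v_i / (1 - beta2 ** t)

        # Decoupled weight decay: shrink the weight directly.
        p = p - lr * weight_decay * p
        # Standard Adam step on the bias-corrected moments.
        p = p - lr * m_hat / (math.sqrt(v_hat) + eps)

        new_params.append(p)
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


# Usage: a few steps on f(w) = w^2, whose gradient is 2w.
w, m, v = [1.0], [0.0], [0.0]
for step in range(1, 4):
    grads = [2 * w_i for w_i in w]
    w, m, v = adamw_step(w, grads, m, v, t=step, lr=0.1)
```

In Adam with L2 regularization, the term `weight_decay * p` would instead be added to `g` before the moment updates, so it would be divided by `sqrt(v_hat)` like any other gradient component. Keeping it outside, as above, is the change that defines AdamW.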

Resources
