Adam: Bias Correction

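Because the moment estimates \( m_t \) and \( v_t \) are initialized at zero, they are biased toward zero during the early steps, especially when \( \beta_1 \) and \( \beta_2 \) are close to 1. To counteract this initialization bias, Adam computes the bias-corrected estimates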

$$ \hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1 - \beta_2^t}. $$
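
As a rough illustration of the correction, the sketch below tracks both moments for a sequence of scalar gradients. It assumes the standard exponential-moving-average updates \( m_t = \beta_1 m_{t-1} + (1-\beta_1) g_t \) and \( v_t = \beta_2 v_{t-1} + (1-\beta_2) g_t^2 \) with \( m_0 = v_0 = 0 \), which this section does not spell out. The function name `bias_corrected_moments` and the default values of `beta1` and `beta2` are illustrative choices, not part of the text above.

```python
import math


def bias_corrected_moments(grads, beta1=0.9, beta2=0.999):
    """Return a list of (m, v, m_hat, v_hat) tuples, one per step.

    Assumes scalar gradients and zero-initialized moments (m_0 = v_0 = 0).
    """
    m, v = 0.0, 0.0
    history = []
    for t, g in enumerate(grads, start=1):  # t starts at 1, as in the formula
        # Assumed moment updates (exponential moving averages).
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        # Bias correction: divide by (1 - beta^t) to undo the zero init.
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        history.append((m, v, m_hat, v_hat))
    return history


if __name__ == "__main__":
    # With a constant gradient, the raw moments start far below g and g^2,
    # while the corrected ones should match them up to rounding error.
    for step, (m, v, m_hat, v_hat) in enumerate(
        bias_corrected_moments([2.0, 2.0, 2.0]), start=1
    ):
        print(step, m, v, m_hat, v_hat)
        assert math.isclose(m_hat, 2.0) and math.isclose(v_hat, 4.0)
```

For a constant gradient \( g \), the raw estimates satisfy \( m_t = g\,(1 - \beta_1^t) \) and \( v_t = g^2 (1 - \beta_2^t) \), so dividing by \( 1 - \beta_1^t \) and \( 1 - \beta_2^t \) recovers \( g \) and \( g^2 \) from the first step onward. As \( t \) grows, both denominators approach 1 and the correction fades out.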