Gate Intuition and Dynamics
- Forget gate \( f_t \) acts as a soft “erase” signal on the previous cell state \( C_{t-1} \): \( f_t \approx 0 \) discards it, \( f_t \approx 1 \) retains it.
- Input gate \( i_t \) scales how much of the new candidate memory \( \tilde{C}_t \) is written to the cell state.
- Output gate \( o_t \) determines how much of the cell's memory flows into the hidden state \( h_t \).
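- Together, the gates let the LSTM preserve long-term information when needed: with \( f_t \approx 1 \) and \( i_t \approx 0 \), the cell state is carried forward almost unchanged across many time steps.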
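
The gate roles listed above can be checked numerically on a single scalar memory cell. The sketch below is illustrative only: it assumes the standard LSTM update \( C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \) and \( h_t = o_t \odot \tanh(C_t) \), and it takes the gate values as given inputs rather than computing them from learned weights. The function names `lstm_cell_update` and `run` are hypothetical helpers, not part of any library.

```python
import math

def lstm_cell_update(c_prev, f_t, i_t, o_t, c_tilde):
    """Apply already-computed gate values to one scalar memory cell.

    Assumes the standard update: C_t = f_t * C_{t-1} + i_t * C~_t,
    h_t = o_t * tanh(C_t). Gate values are expected to lie in [0, 1].
    """
    c_t = f_t * c_prev + i_t * c_tilde
    h_t = o_t * math.tanh(c_t)
    return c_t, h_t

def run(c0, f_t, i_t, o_t, c_tilde, steps):
    """Repeat the same (fixed) gate settings for `steps` time steps."""
    c, h = c0, 0.0
    for _ in range(steps):
        c, h = lstm_cell_update(c, f_t, i_t, o_t, c_tilde)
    return c, h

# Retain: forget gate open, input gate closed -> memory is carried unchanged.
c_keep, h_keep = run(c0=1.0, f_t=1.0, i_t=0.0, o_t=1.0, c_tilde=0.5, steps=50)

# Erase: forget gate closed, input gate closed -> memory is wiped in one step.
c_erase, _ = run(c0=1.0, f_t=0.0, i_t=0.0, o_t=1.0, c_tilde=0.5, steps=1)

# Overwrite: forget gate closed, input gate open -> memory becomes the candidate.
c_write, _ = run(c0=1.0, f_t=0.0, i_t=1.0, o_t=1.0, c_tilde=0.5, steps=1)

# Hide: output gate closed -> memory is kept but not exposed through h_t.
c_hidden, h_hidden = run(c0=1.0, f_t=1.0, i_t=0.0, o_t=0.0, c_tilde=0.5, steps=1)

print(c_keep, c_erase, c_write, c_hidden, h_hidden)
```

In a trained network the gates are sigmoid outputs, so they approach 0 and 1 without reaching them exactly. Running the same helper with \( f_t = 0.9 \) for 50 steps shows the stored value decaying geometrically, which is why a forget gate close to 1 matters for long-range retention.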