Parameters to train, common settings

With parameter sharing, each filter thus involves \( F\times F\times D_1 \) weights plus one bias parameter.

In total we have

$$ \underbrace{\left(F\times F\times D_1\right) \times K}_{\mathrm{weights}}+\underbrace{K}_{\mathrm{biases}}, $$

parameters to train by back propagation.
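
As a quick check of the count above, here is a minimal sketch in Python. The function name `conv_param_count` and the example values (a \( 3\times 3 \) filter on an input of depth \( D_1=3 \) with \( K=32 \) filters) are illustrative assumptions, not part of the notes.

```python
def conv_param_count(F, D1, K):
    """Trainable parameters of one convolutional layer with parameter sharing.

    F  : spatial extent of each (square) filter
    D1 : depth of the input volume
    K  : number of filters
    """
    weights = F * F * D1 * K  # F*F*D1 weights per filter, K filters
    biases = K                # one bias per filter
    return weights + biases


# Hypothetical example: 3x3 filters, input depth 3, 32 filters
print(conv_param_count(F=3, D1=3, K=32))  # 3*3*3*32 + 32 = 896
```

Note that the count does not depend on the spatial width or height of the input, which is the practical consequence of parameter sharing.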

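It is common to let \( K \) be a power of \( 2 \), for example \( 32 \), \( 64 \), \( 128 \) and so on. Common settings for the filter size \( F \), the stride \( S \) and the zero padding \( P \) are: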

  1. \( \begin{array}{ccc} F=3 & S=1 & P=1 \end{array} \)
  2. \( \begin{array}{ccc} F=5 & S=1 & P=2 \end{array} \)
  3. \( \begin{array}{ccc} F=5 & S=2 & P=\mathrm{open} \end{array} \)
  4. \( \begin{array}{ccc} F=1 & S=1 & P=0 \end{array} \)
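
To see what these settings do to the spatial size, the sketch below uses the standard output-width relation \( W_2=(W_1-F+2P)/S+1 \). This relation, the helper name `conv_output_width`, and the input widths \( 32 \) and \( 31 \) are assumptions made for illustration. Reading \( P=\mathrm{open} \) as "choose the padding that fits the input" is likewise an interpretation, not something stated in the list.

```python
def conv_output_width(W1, F, S, P):
    """Output width (W1 - F + 2P)/S + 1, or None if the filter does not fit evenly."""
    span = W1 - F + 2 * P
    if span < 0 or span % S != 0:
        return None  # stride does not tile the padded input
    return span // S + 1


W1 = 32  # hypothetical input width

# Settings 1, 2 and 4 all have S=1 and P=(F-1)/2, so the width is preserved
for F, S, P in [(3, 1, 1), (5, 1, 2), (1, 1, 0)]:
    print(F, S, P, conv_output_width(W1, F, S, P))

# Setting 3 (F=5, S=2): whether a padding fits depends on the input width
for P in range(3):
    print(P, conv_output_width(32, 5, 2, P), conv_output_width(31, 5, 2, P))
```

With \( S=1 \) the three fixed settings leave the width unchanged, while for \( F=5 \), \( S=2 \) an input of width \( 32 \) admits no integer output for any padding and an input of width \( 31 \) does. This is one plausible reason the padding is left open in that case.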