With parameter sharing, each filter thus has \( F\times F\times D_1 \) weights plus one bias parameter.
In total, with \( K \) filters, we have
$$ \left(F\times F\times D_1\right) \times K+\underbrace{K}_{\mathrm{biases}} $$
parameters to train by backpropagation.
It is common to choose \( K \) as a power of \( 2 \), for example \( 32 \), \( 64 \), \( 128 \), etc.
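
As a quick check of the parameter count, the formula can be evaluated for a few common choices of \( K \). The snippet below is only a minimal sketch: the function name `conv_param_count` and the example values \( F=3 \) and \( D_1=3 \) (a \( 3\times 3 \) filter applied to an RGB image) are assumptions made for illustration and do not come from the text.

```python
def conv_param_count(F: int, D1: int, K: int) -> int:
    """Trainable parameters of one convolutional layer with parameter sharing.

    F  : spatial extent of each (square) filter
    D1 : depth (number of channels) of the input volume
    K  : number of filters
    """
    weights_per_filter = F * F * D1      # weights shared across all positions
    return weights_per_filter * K + K    # plus one bias per filter


# Illustrative values (assumed): 3x3 filters acting on a 3-channel input
for K in (32, 64, 128):
    print(K, conv_param_count(F=3, D1=3, K=K))
# prints:
# 32 896
# 64 1792
# 128 3584
```

Note that the count does not depend on the width and height of the input, which is a direct consequence of parameter sharing.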