The parameters

The network parameters to be optimized (learned) are:

  1. \( \boldsymbol{a} \) represents the visible bias, a vector of the same length \( M \) as \( \boldsymbol{x} \).
  2. \( \boldsymbol{b} \) represents the hidden bias, a vector of the same length \( N \) as \( \boldsymbol{h} \).
  3. \( \boldsymbol{W} \) represents the interaction weights, a matrix of size \( M\times N \).

Note that we have specified the lengths of \( \boldsymbol{x} \) and \( \boldsymbol{h} \). These lengths, \( M \) and \( N \), define the number of visible and hidden units, respectively.
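
To make the shapes concrete, here is a minimal NumPy sketch that allocates the three parameter arrays. The function name `init_rbm_params`, the zero initialization of the biases, the small Gaussian initialization of the weights, and the values of `scale` and `seed` are all assumptions chosen for illustration; the text above fixes only the shapes.

```python
import numpy as np


def init_rbm_params(n_visible, n_hidden, scale=0.01, seed=0):
    """Allocate parameters for M = n_visible and N = n_hidden units.

    The initialization scheme (zeros for biases, small Gaussian weights)
    is an illustrative assumption, not prescribed by the text.
    """
    rng = np.random.default_rng(seed)
    a = np.zeros(n_visible)                                 # visible bias, length M
    b = np.zeros(n_hidden)                                  # hidden bias, length N
    W = rng.normal(0.0, scale, size=(n_visible, n_hidden))  # weights, M x N
    return a, b, W


M, N = 4, 3  # hypothetical sizes
a, b, W = init_rbm_params(M, N)

x = np.ones(M)  # a visible configuration of length M
h = np.ones(N)  # a hidden configuration of length N

# Shape check: with W of size M x N, the product x^T W h is a scalar.
interaction = x @ W @ h
print(a.shape, b.shape, W.shape, np.ndim(interaction))
```

Because \( \boldsymbol{W} \) has size \( M\times N \), it can be contracted with \( \boldsymbol{x} \) on the left and \( \boldsymbol{h} \) on the right, which is why its two dimensions must match the lengths of the visible and hidden vectors.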