The parameters
The network parameters to be optimized (learned) are:
- \( \boldsymbol{a} \) represents the visible bias, a vector of the same length \( M \) as \( \boldsymbol{x} \).
- \( \boldsymbol{b} \) represents the hidden bias, a vector of the same length \( N \) as \( \boldsymbol{h} \).
- \( \boldsymbol{W} \) represents the interaction weights, a matrix of size \( M\times N \).
Note that we have specified the lengths of \( \boldsymbol{x} \) and \( \boldsymbol{h} \). These
lengths define the number of visible and hidden units, respectively.
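
To make the shapes concrete, the parameters can be set up as in the minimal NumPy sketch below. The function name `init_rbm_params`, the zero initial biases, and the small Gaussian initial weights with spread `sigma` are assumptions made for illustration; the text above fixes only the shapes of \( \boldsymbol{a} \), \( \boldsymbol{b} \) and \( \boldsymbol{W} \).

```python
import numpy as np


def init_rbm_params(M, N, sigma=0.01, seed=0):
    """Return (a, b, W) for M visible and N hidden units.

    The initial values are an illustrative assumption:
    zero biases and small Gaussian weights.
    """
    rng = np.random.default_rng(seed)
    a = np.zeros(M)                          # visible bias, same length as x
    b = np.zeros(N)                          # hidden bias, same length as h
    W = rng.normal(0.0, sigma, size=(M, N))  # interaction weights, M x N
    return a, b, W


# Example with M = 4 visible and N = 3 hidden units
a, b, W = init_rbm_params(M=4, N=3)
x = np.ones(4)  # a visible configuration of length M
h = np.ones(3)  # a hidden configuration of length N

# The shapes are consistent: x^T W h contracts to a single number
interaction = x @ W @ h
print(a.shape, b.shape, W.shape, np.ndim(interaction))
```

With this choice the model has \( M + N + MN \) trainable numbers in total, so the number of hidden units \( N \) directly controls the size of the parameter space.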