With the above definition we can write the probability as
p(\boldsymbol{x},\boldsymbol{h};\boldsymbol{\Theta}) = \frac{\exp{(\boldsymbol{a}^T\boldsymbol{x}+\boldsymbol{b}^T\boldsymbol{h}+\boldsymbol{x}^T\boldsymbol{W}\boldsymbol{h})}}{Z(\boldsymbol{\Theta})}, where the biases \boldsymbol{a} and \boldsymbol{b} and the weights defined by the matrix \boldsymbol{W} are the parameters we need to optimize.
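To make the expression concrete, the following minimal sketch evaluates this joint probability for a very small model by summing over all configurations to obtain Z. It assumes binary units taking the values 0 and 1, which the text above does not specify, and the function names and the random parameter values are purely illustrative. The brute-force sum has 2^(M+N) terms for M visible and N hidden units, so it is only feasible for a handful of units.

```python
import itertools
import numpy as np


def unnormalized(x, h, a, b, W):
    """Numerator exp(a^T x + b^T h + x^T W h) for one configuration."""
    return np.exp(a @ x + b @ h + x @ W @ h)


def partition_function(a, b, W):
    """Brute-force Z: sum the numerator over all binary (x, h) pairs.

    Assumption: every unit is binary with values in {0, 1}.
    """
    n_visible, n_hidden = W.shape
    total = 0.0
    for x in itertools.product([0, 1], repeat=n_visible):
        for h in itertools.product([0, 1], repeat=n_hidden):
            total += unnormalized(np.array(x), np.array(h), a, b, W)
    return total


def joint_probability(x, h, a, b, W):
    """p(x, h; Theta) with Theta = (a, b, W)."""
    return unnormalized(x, h, a, b, W) / partition_function(a, b, W)


# Illustrative parameters: 2 visible units, 3 hidden units.
rng = np.random.default_rng(0)
a = rng.normal(size=2)       # visible biases
b = rng.normal(size=3)       # hidden biases
W = rng.normal(size=(2, 3))  # weights coupling visible and hidden units

p = joint_probability(np.array([1, 0]), np.array([0, 1, 1]), a, b, W)
print(p)
```

Summing `joint_probability` over all 32 configurations of this small model should give 1 up to rounding, which is a useful check that Z has been computed consistently with the numerator.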