The function E(\boldsymbol{x},\boldsymbol{h},\boldsymbol{\Theta}) gives the energy of a configuration, that is, of a pair of vectors (\boldsymbol{x}, \boldsymbol{h}). The lower the energy of a configuration, the higher its probability. Through \boldsymbol{\Theta}, the energy also depends on the parameters \boldsymbol{a}, \boldsymbol{b} and W. Thus, when we adjust these parameters during learning, we are reshaping the energy function so that it best fits our problem.
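To make the link between energy and probability concrete, here is a minimal sketch in plain Python. It rests on assumptions not stated above: that the energy takes the usual binary form E = -a·x - b·h - xᵀWh, that \boldsymbol{x} and \boldsymbol{h} are binary vectors, and that probabilities follow exp(-E)/Z. The toy parameter values and the function names `energy` and `joint_distribution` are hypothetical and chosen only for illustration.

```python
import itertools
import math

def energy(x, h, a, b, W):
    """Assumed energy form: E = -(a.x + b.h + x^T W h)."""
    bias_x = sum(a_i * x_i for a_i, x_i in zip(a, x))
    bias_h = sum(b_j * h_j for b_j, h_j in zip(b, h))
    interaction = sum(x[i] * W[i][j] * h[j]
                      for i in range(len(x)) for j in range(len(h)))
    return -(bias_x + bias_h + interaction)

def joint_distribution(a, b, W):
    """Enumerate every binary configuration (x, h) and return its probability.

    Assumes p(x, h) = exp(-E(x, h)) / Z; only feasible for tiny models.
    """
    configs = [(x, h)
               for x in itertools.product((0, 1), repeat=len(a))
               for h in itertools.product((0, 1), repeat=len(b))]
    weights = {c: math.exp(-energy(c[0], c[1], a, b, W)) for c in configs}
    Z = sum(weights.values())  # normalizing constant
    return {c: w / Z for c, w in weights.items()}

# Hypothetical toy parameters: two components in x, two in h.
a = [0.5, -0.2]
b = [0.1, 0.3]
W = [[1.0, -0.5],
     [0.2, 0.4]]

probs = joint_distribution(a, b, W)

# Sort configurations from lowest to highest energy and show both quantities.
ranked = sorted(probs, key=lambda c: energy(c[0], c[1], a, b, W))
for x, h in ranked:
    print(x, h, round(energy(x, h, a, b, W), 3), round(probs[(x, h)], 4))
```

Printed in order of increasing energy, the probabilities never increase, which is the relationship described above. Changing any entry of `a`, `b` or `W` changes the energies and therefore shifts probability mass between configurations, which is what learning exploits.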