Since the binary-binary energy model is linear in the parameters \( a_i \), \( b_j \) and \( w_{ij} \), its derivatives with respect to these optimization parameters, which are needed when evaluating the gradients, take a particularly simple form:
$$ \frac{\partial E(\boldsymbol{x}, \boldsymbol{h};\boldsymbol{\Theta})}{\partial w_{ij}}=-x_ih_j, $$
and
$$ \frac{\partial E(\boldsymbol{x}, \boldsymbol{h};\boldsymbol{\Theta})}{\partial a_i}=-x_i, $$
and
$$ \frac{\partial E(\boldsymbol{x}, \boldsymbol{h};\boldsymbol{\Theta})}{\partial b_j}=-h_j. $$
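
As a quick sanity check, the three derivatives can be compared against a finite-difference estimate. The sketch below assumes the usual binary-binary energy \( E = -\sum_i a_ix_i - \sum_j b_jh_j - \sum_{ij} x_iw_{ij}h_j \); the function names `energy` and `energy_gradients`, the layer sizes and the random test values are illustrative choices, not part of the original notes.

```python
import numpy as np

def energy(x, h, a, b, W):
    """Binary-binary energy (assumed form): E = -a.x - b.h - x^T W h."""
    return -(a @ x) - (b @ h) - (x @ W @ h)

def energy_gradients(x, h):
    """Analytic derivatives of E with respect to W, a and b.

    Because E is linear in the parameters, the derivatives depend
    only on the configuration (x, h), not on the parameter values.
    """
    dW = -np.outer(x, h)      # dE/dw_ij = -x_i h_j
    da = -x.astype(float)     # dE/da_i  = -x_i
    db = -h.astype(float)     # dE/db_j  = -h_j
    return dW, da, db

rng = np.random.default_rng(0)
M, N = 4, 3                        # visible and hidden units (illustrative)
x = rng.integers(0, 2, size=M)     # binary visible configuration
h = rng.integers(0, 2, size=N)     # binary hidden configuration
a = rng.normal(size=M)
b = rng.normal(size=N)
W = rng.normal(size=(M, N))

dW, da, db = energy_gradients(x, h)

# Forward finite difference for one weight w_ij. Since E is linear in
# w_ij, the estimate agrees with the analytic value up to rounding.
eps = 1e-6
i, j = 1, 2
W_shift = W.copy()
W_shift[i, j] += eps
fd_w = (energy(x, h, a, b, W_shift) - energy(x, h, a, b, W)) / eps

# The same check for one visible bias a_i.
a_shift = a.copy()
a_shift[i] += eps
fd_a = (energy(x, h, a_shift, b, W) - energy(x, h, a, b, W)) / eps

print("dE/dw_ij analytic vs finite difference:", dW[i, j], fd_w)
print("dE/da_i  analytic vs finite difference:", da[i], fd_a)
```

Note that none of the gradients depend on the current values of \( a_i \), \( b_j \) or \( w_{ij} \), which is a direct consequence of the linearity of the energy in these parameters.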