Boltzmann machines and deep learning

Energy models

Last week we defined a domain $ \boldsymbol{X} $ of stochastic variables $ \boldsymbol{X}= \{x_0,x_1, \dots , x_{n-1}\} $ with a pertinent probability distribution

$$ p(\boldsymbol{X})=\prod_{x_i\in \boldsymbol{X}}p(x_i), $$

where we have assumed that the random variables $ x_i $ are all independent and identically distributed (iid).
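
As a small illustration of the factorized distribution above, the following sketch evaluates the joint probability of a handful of iid samples as a product of single-variable probabilities. The binary distribution `pmf`, the sample values, and the function names are hypothetical choices made only for this example; they are not part of the lecture material. The log-probability version is included because products of many small probabilities underflow quickly in floating point.

```python
import numpy as np

def joint_probability(samples, pmf):
    """Joint probability of iid samples: the product of the individual p(x_i)."""
    return float(np.prod([pmf[x] for x in samples]))

def joint_log_probability(samples, pmf):
    """Same quantity in log space: the sum of the individual log p(x_i)."""
    return float(np.sum([np.log(pmf[x]) for x in samples]))

# Hypothetical single-variable distribution over a binary variable (assumption).
pmf = {0: 0.25, 1: 0.75}

# Hypothetical observed values x_0, ..., x_{n-1} (assumption).
samples = [1, 0, 1, 1]

p = joint_probability(samples, pmf)
log_p = joint_log_probability(samples, pmf)

print("p(X)     =", p)
print("log p(X) =", log_p)
```

For realistic sample sizes one would normally work with `joint_log_probability` only and never form the product explicitly.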

We will now assume that we can define this distribution in terms of optimization parameters $ \boldsymbol{\Theta} $, which could be the biases and weights of a deep network, and a set of hidden variables, which we also take to be iid random variables. The domain of these hidden variables is $ \boldsymbol{H}= \{h_0,h_1, \dots , h_{m-1}\} $.