Optimizing the logarithm instead

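Computing the derivatives with respect to the parameters \boldsymbol{\Theta} is easier if we work with the logarithm of the probability. The two problems are equivalent: the logarithm is strictly increasing, so it attains its maximum at the same parameters. We will thus optimize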

{\displaystyle \mathrm{arg} \hspace{0.1cm}\max_{\boldsymbol{\Theta}\in {\mathbb{R}}^{p}}} \hspace{0.1cm}\log{p(\boldsymbol{X};\boldsymbol{\Theta})},

whose maximizer must satisfy the stationarity condition

\nabla_{\boldsymbol{\Theta}}\log{p(\boldsymbol{X};\boldsymbol{\Theta})}=0.
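
As a small numerical illustration, the sketch below checks both claims on a toy model that is not part of the notes: the data are assumed to be i.i.d. Gaussian with unit variance, and the only parameter is the mean `mu`. The function names, the synthetic data, and the grid search are illustrative choices. Under these assumptions the likelihood and its logarithm peak at the same `mu`, and the gradient of the log-likelihood vanishes there.

```python
import numpy as np

# Synthetic data (assumption): 50 i.i.d. draws from N(mu=2, sigma=1).
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.0, size=50)

def log_likelihood(mu, x):
    """log p(X; mu) for i.i.d. unit-variance Gaussian samples."""
    return -0.5 * np.sum((x - mu) ** 2) - 0.5 * x.size * np.log(2.0 * np.pi)

def grad_log_likelihood(mu, x):
    """Derivative of log p(X; mu) with respect to mu: sum_i (x_i - mu)."""
    return np.sum(x - mu)

# Evaluate the log-likelihood and the likelihood on a grid of candidate means.
mu_grid = np.linspace(0.0, 4.0, 4001)
log_p = np.array([log_likelihood(m, x) for m in mu_grid])
p = np.exp(log_p)  # the likelihood itself (tiny, but representable here)

mu_hat_log = mu_grid[np.argmax(log_p)]  # maximizer of log p
mu_hat_lik = mu_grid[np.argmax(p)]      # maximizer of p
mu_closed = x.mean()                    # solution of grad log p = 0 for this model

print("argmax of log-likelihood:", mu_hat_log)
print("argmax of likelihood:    ", mu_hat_lik)
print("sample mean:             ", mu_closed)
print("gradient at sample mean: ", grad_log_likelihood(mu_closed, x))
```

For this model the condition \nabla_{\boldsymbol{\Theta}}\log{p(\boldsymbol{X};\boldsymbol{\Theta})}=0 can be solved by hand and gives the sample mean; the grid search is only there to make the comparison with the raw likelihood explicit. With larger data sets the likelihood underflows to zero in floating point while its logarithm stays well behaved, which is a further practical reason to work with the logarithm.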