Data Analysis and Machine Learning Lectures: Optimization and Gradient Methods

Conjugate gradient method

One can show that the solution $\hat{x}$ of $\hat{A}\hat{x}=\hat{b}$ is also the unique minimizer of the quadratic form $\begin{equation*} f(\hat{x}) = \frac{1}{2}\hat{x}^T\hat{A}\hat{x} - \hat{x}^T \hat{b} , \quad \hat{x}\in\mathbf{R}^n. \end{equation*}$ This suggests taking the first basis vector $\hat{p}_1$ to be the gradient of $f$ at $\hat{x}=\hat{x}_0$, which equals $\begin{equation*} \hat{A}\hat{x}_0-\hat{b}, \end{equation*}$ and for $\hat{x}_0=0$ it equals $-\hat{b}$. The other vectors in the basis will be conjugate to the gradient, hence the name conjugate gradient method.
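As a small numerical check of the statements above, the following sketch evaluates the quadratic form $f(\hat{x}) = \frac{1}{2}\hat{x}^T\hat{A}\hat{x} - \hat{x}^T\hat{b}$ and its gradient $\hat{A}\hat{x}-\hat{b}$ for a $2\times 2$ example. The matrix `A`, the vector `b`, and all helper names (`matvec`, `dot`, `numerical_grad`) are illustrative assumptions, not taken from the lecture; `A` is chosen symmetric and positive definite, which is what the gradient formula and the uniqueness of the minimizer rely on.

```python
# Illustrative 2x2 symmetric positive-definite system (values are assumptions).
A = [[4.0, 1.0],
     [1.0, 3.0]]
b = [1.0, 2.0]


def matvec(M, v):
    """Matrix-vector product M v for a list-of-rows matrix."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def dot(u, v):
    """Euclidean inner product u^T v."""
    return sum(u_i * v_i for u_i, v_i in zip(u, v))


def f(x):
    """Quadratic form f(x) = 1/2 x^T A x - x^T b."""
    return 0.5 * dot(x, matvec(A, x)) - dot(x, b)


def grad_f(x):
    """Analytic gradient A x - b (this form requires A to be symmetric)."""
    return [ax_i - b_i for ax_i, b_i in zip(matvec(A, x), b)]


def numerical_grad(x, h=1e-6):
    """Central finite differences, used only to check grad_f."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2.0 * h))
    return g


# First basis vector: the gradient at the starting guess x0 = 0, which equals -b.
x0 = [0.0, 0.0]
p1 = grad_f(x0)

# Exact solution of A x = b via Cramer's rule (only sensible for this 2x2 case).
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
x_star = [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
          (A[0][0] * b[1] - A[1][0] * b[0]) / det]

print("p1 =", p1)
print("gradient at the solution =", grad_f(x_star))
print("finite-difference gradient at x0 =", numerical_grad(x0))
```

The gradient vanishes (up to rounding) at the solution of $\hat{A}\hat{x}=\hat{b}$, which is the sense in which solving the linear system and minimizing $f$ are the same problem.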