One can show that the solution \hat{x} is also the unique minimizer of the quadratic form \begin{equation*} f(\hat{x}) = \frac{1}{2}\hat{x}^T\hat{A}\hat{x} - \hat{x}^T \hat{b} , \quad \hat{x}\in\mathbf{R}^n. \end{equation*} This suggests taking the first basis vector \hat{p}_1 to be the gradient of f at \hat{x}=\hat{x}_0 , which equals \begin{equation*} \hat{A}\hat{x}_0-\hat{b}; \end{equation*} for the starting point \hat{x}_0=0 this is -\hat{b} . The other vectors in the basis will be conjugate to the gradient, hence the name conjugate gradient method.
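The gradient formula can be checked numerically. The following is a minimal sketch in plain Python, not part of the original derivation: the 2x2 matrix `A` and vector `b` are hypothetical values chosen only for illustration, they stand in for \hat{A} and \hat{b} , and `A` is assumed to be symmetric positive definite. The helper names `matvec`, `dot`, `f` and `grad_f` are likewise invented for this example.

```python
# Hypothetical symmetric positive definite system, for illustration only.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]

def matvec(M, v):
    """Matrix-vector product M v for a list-of-rows matrix."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def dot(u, v):
    """Euclidean inner product u^T v."""
    return sum(u_i * v_i for u_i, v_i in zip(u, v))

def f(x):
    """Quadratic form f(x) = 1/2 x^T A x - x^T b."""
    return 0.5 * dot(x, matvec(A, x)) - dot(x, b)

def grad_f(x):
    """Gradient A x - b of f (this form relies on A being symmetric)."""
    return [ax_i - b_i for ax_i, b_i in zip(matvec(A, x), b)]

# Gradient at the starting point x0 = 0 is -b.
x0 = [0.0, 0.0]
g0 = grad_f(x0)

# For this particular A and b the solution of A x = b is (1/11, 7/11);
# the gradient vanishes there, up to rounding.
x_star = [1.0 / 11.0, 7.0 / 11.0]
g_star = grad_f(x_star)
```

With these values `g0` is the negative of `b`, and `g_star` is zero up to floating-point rounding, which matches the claim that the solution of the linear system is the minimizer of f .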