Let \( \hat{r}_k \) be the residual at the \( k \)-th step:
$$ \begin{equation*} \hat{r}_k=\hat{b}-\hat{A}\hat{x}_k. \end{equation*} $$
Note that \( \hat{r}_k \) is the negative gradient of \( f \) at \( \hat{x}=\hat{x}_k \), so the gradient descent method would move in the direction \( \hat{r}_k \). Here, we instead insist that the directions \( \hat{p}_k \) are conjugate to each other, so we take the direction closest to the negative gradient \( \hat{r}_k \) that satisfies the conjugacy constraint. This gives the following expression:
$$ \begin{equation*} \hat{p}_{k+1}=\hat{r}_k-\frac{\hat{p}_k^T \hat{A}\hat{r}_k}{\hat{p}_k^T\hat{A}\hat{p}_k} \hat{p}_k. \end{equation*} $$
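The update above can be checked numerically. The following is a minimal sketch in Python with NumPy, not a full conjugate gradient solver: the matrix `A`, the vectors `b`, `x_k`, `p_k`, and the helper names `residual` and `next_direction` are all illustrative assumptions. It computes the residual, builds the new direction from the formula, and confirms that the result is conjugate to the previous direction, that is \( \hat{p}_k^T\hat{A}\hat{p}_{k+1}=0 \).

```python
import numpy as np


def residual(A, b, x):
    """Residual r = b - A x at the current iterate x."""
    return b - A @ x


def next_direction(A, r, p):
    """New direction: r minus its A-projection onto the previous direction p."""
    coeff = (p @ (A @ r)) / (p @ (A @ p))
    return r - coeff * p


# Illustrative data (assumed, not from the text): A is symmetric positive definite.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
x_k = np.array([2.0, 1.0])   # current iterate
p_k = np.array([1.0, 0.0])   # previous search direction

r_k = residual(A, b, x_k)
p_next = next_direction(A, r_k, p_k)

# Conjugacy check: p_k^T A p_next should vanish up to rounding error.
conjugacy = p_k @ (A @ p_next)
print("r_k       =", r_k)
print("p_{k+1}   =", p_next)
print("p_k^T A p_{k+1} =", conjugacy)
```

The subtraction is a single Gram-Schmidt step in the inner product defined by \( \hat{A} \): it removes from \( \hat{r}_k \) exactly the component that would violate conjugacy with \( \hat{p}_k \), and leaves the rest of the residual untouched.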