The optimization problem is to minimize \( f(\mathbf {x} ) \), where \( \mathbf {x} \) is a vector in \( \mathbb{R}^{n} \) and \( f \) is a differentiable scalar function. There are no constraints on the values that \( \mathbf {x} \) can take.
The algorithm begins at an initial estimate for the optimal value \( \mathbf {x}_{0} \) and proceeds iteratively to get a better estimate at each stage.
The search direction \( \mathbf {p}_{k} \) at stage \( k \) is given by the solution of the analogue of the Newton equation
$$ B_{k}\mathbf {p} _{k}=-\nabla f(\mathbf {x}_{k}), $$where \( B_{k} \) is an approximation to the Hessian matrix, which is updated iteratively at each stage, and \( \nabla f(\mathbf {x} _{k}) \) is the gradient of the function evaluated at \( \mathbf {x}_{k} \). A line search in the direction \( \mathbf {p}_{k} \) is then used to find the next point \( \mathbf {x}_{k+1} \) by minimizing
$$ f(\mathbf {x}_{k}+\alpha \mathbf {p}_{k}), $$over the scalar \( \alpha > 0 \).
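To make the iteration concrete, the sketch below shows one step of this scheme in Python. It is a minimal illustration, assuming NumPy and SciPy's `line_search` helper; the names `bfgs_step`, `f`, and `grad_f` are illustrative, and the update of the Hessian approximation \( B_{k} \) itself is not shown here.

```python
import numpy as np
from scipy.optimize import line_search

def bfgs_step(f, grad_f, x_k, B_k):
    """One quasi-Newton step: solve B_k p_k = -grad f(x_k), then line-search along p_k."""
    g_k = grad_f(x_k)
    # Search direction from the Newton-like equation B_k p_k = -∇f(x_k)
    p_k = np.linalg.solve(B_k, -g_k)
    # Inexact line search for a step length alpha > 0 along p_k
    alpha = line_search(f, grad_f, x_k, p_k)[0]
    if alpha is None:
        alpha = 1e-3  # fall back to a small fixed step if the search fails
    return x_k + alpha * p_k

# Illustrative use on a simple quadratic, taking the identity as the initial Hessian approximation
f = lambda x: x @ x
grad_f = lambda x: 2 * x
x_next = bfgs_step(f, grad_f, np.array([1.0, -2.0]), np.eye(2))
```

In practice the step length \( \alpha \) need only satisfy suitable sufficient-decrease conditions rather than exactly minimize \( f(\mathbf {x}_{k}+\alpha \mathbf {p}_{k}) \), which is why an inexact line search is used in the sketch above.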