Generalizing the fitting procedure as a linear algebra problem

We redefine in turn the matrix \( \boldsymbol{X} \) as

$$ \boldsymbol{X}= \begin{bmatrix} x_{00}& x_{01} &x_{02}& \dots & \dots &x_{0,n-1}\\ x_{10}& x_{11} &x_{12}& \dots & \dots &x_{1,n-1}\\ x_{20}& x_{21} &x_{22}& \dots & \dots &x_{2,n-1}\\ \dots& \dots &\dots& \dots & \dots &\dots\\ x_{n-1,0}& x_{n-1,1} &x_{n-1,2}& \dots & \dots &x_{n-1,n-1}\\ \end{bmatrix} $$

and without loss of generality we rewrite again our equations as

$$ \boldsymbol{y} = \boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{\epsilon}. $$

The left-hand side of this equation is kwown. Our error vector \( \boldsymbol{\epsilon} \) and the parameter vector \( \boldsymbol{\beta} \) are our unknow quantities. How can we obtain the optimal set of \( \beta_i \) values?