We redefine in turn the matrix \boldsymbol{X} as
\boldsymbol{X}= \begin{bmatrix} x_{00}& x_{01} &x_{02}& \dots & \dots &x_{0,n-1}\\ x_{10}& x_{11} &x_{12}& \dots & \dots &x_{1,n-1}\\ x_{20}& x_{21} &x_{22}& \dots & \dots &x_{2,n-1}\\ \dots& \dots &\dots& \dots & \dots &\dots\\ x_{n-1,0}& x_{n-1,1} &x_{n-1,2}& \dots & \dots &x_{n-1,n-1}\\ \end{bmatrix}and without loss of generality we rewrite again our equations as
\boldsymbol{y} = \boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{\epsilon}.The left-hand side of this equation is kwown. Our error vector \boldsymbol{\epsilon} and the parameter vector \boldsymbol{\beta} are our unknow quantities. How can we obtain the optimal set of \beta_i values?