Revisiting our first homework
We will use linear regression as a case study for gradient descent
methods. Linear regression is a good test case for the gradient
descent methods discussed in the lectures since it has several
desirable properties:
- An analytical solution (recall homework set 1).
- A gradient that can be computed analytically.
- A convex cost function, which guarantees that gradient descent converges to the global minimum for sufficiently small learning rates.
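To make the last two points concrete, one common choice (an assumption here; the lectures may use a different normalization) is the mean squared error cost for a model with parameters \( \beta_0, \beta_1 \),
$$
C(\beta_0, \beta_1) = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \beta_0 - \beta_1 x_i\right)^2,
$$
whose gradient has the closed-form components
$$
\frac{\partial C}{\partial \beta_0} = -\frac{2}{n}\sum_{i=1}^{n}\left(y_i - \beta_0 - \beta_1 x_i\right),
\qquad
\frac{\partial C}{\partial \beta_1} = -\frac{2}{n}\sum_{i=1}^{n} x_i\left(y_i - \beta_0 - \beta_1 x_i\right).
$$
Since \( C \) is a quadratic form in \( (\beta_0, \beta_1) \) with a positive semi-definite Hessian, it is convex.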
We revisit the example from homework set 1 where we had
$$
y_i = 5x_i^2 + 0.1\xi_i, \quad i=1,\ldots,100,
$$
with \( x_i \in [0,1] \) drawn independently from a uniform distribution. The term \( \xi_i \) represents stochastic noise drawn from the standard normal distribution \( \mathcal{N}(0,1) \).
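The data set above can be generated with a short NumPy snippet. This is a sketch; the random seed and variable names are illustrative choices, not prescribed by the homework.

```python
import numpy as np

rng = np.random.default_rng(seed=42)            # seed is an arbitrary choice
n = 100
x = rng.uniform(0.0, 1.0, n)                    # x_i uniform on [0, 1]
xi = rng.standard_normal(n)                     # xi_i ~ N(0, 1)
y = 5.0 * x**2 + 0.1 * xi                       # y_i = 5 x_i^2 + 0.1 xi_i
```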
The linear regression model is given by
$$
h_\beta(x) = \hat{y} = \beta_0 + \beta_1 x,
$$
such that
$$
\hat{y}_i = \beta_0 + \beta_1 x_i.
$$