# Exercises week 35
August 26-30, 2024

Deadline: Friday August 30 at midnight
## Exercise 1: Analytical exercises
In this exercise we derive the expressions for various derivatives of products of vectors and matrices. Such derivatives are central to the optimization of various cost functions. Although we will often use automatic differentiation in actual calculations, having analytical expressions is extremely helpful both when simpler derivatives suffice and when we analyze various properties (like second derivatives) of the chosen cost functions. Vectors are always written as boldfaced lower-case letters and matrices as boldfaced upper-case letters. You will find the notes from week 35 on derivatives of vectors and matrices useful. See also the textbook of Faisal et al., chapter 5, in particular sections 5.3-5.5, at CompPhysics/MachineLearning.
Show that

$$
\frac{\partial (\boldsymbol{b}^T\boldsymbol{a})}{\partial \boldsymbol{a}} = \boldsymbol{b},
$$

and

$$
\frac{\partial (\boldsymbol{a}^T\boldsymbol{A}\boldsymbol{a})}{\partial \boldsymbol{a}} = \boldsymbol{a}^T\left(\boldsymbol{A}+\boldsymbol{A}^T\right),
$$

and

$$
\frac{\partial \left(\boldsymbol{x}-\boldsymbol{A}\boldsymbol{s}\right)^T\left(\boldsymbol{x}-\boldsymbol{A}\boldsymbol{s}\right)}{\partial \boldsymbol{s}} = -2\left(\boldsymbol{x}-\boldsymbol{A}\boldsymbol{s}\right)^T\boldsymbol{A},
$$

and finally find the second derivative of this function with respect to the vector $\boldsymbol{s}$. Identifying $\boldsymbol{x}$ with the targets $\boldsymbol{y}$, $\boldsymbol{A}$ with the design matrix $\boldsymbol{X}$ and $\boldsymbol{s}$ with the parameters $\boldsymbol{\beta}$ of ordinary least squares, the second derivative of the mean squared error is then proportional to the so-called Hessian matrix

$$
\boldsymbol{H} = \boldsymbol{X}^T\boldsymbol{X}.
$$
Hint: In these exercises it is always useful to write out the various quantities with explicit summation indices. Take also a look at the weekly slides from week 35 and the various examples included there.
As an example, consider the function

$$
f(\boldsymbol{x}) = \boldsymbol{A}\boldsymbol{x},
$$

which reads for a specific component $f_i$ (with $\boldsymbol{A}$ an $n\times n$ matrix and $\boldsymbol{x}$ a vector of length $n$)

$$
f_i = \sum_{j=0}^{n-1} a_{ij} x_j,
$$

which leads to

$$
\frac{\partial f_i}{\partial x_j} = a_{ij},
$$

and written out in terms of the vector $\boldsymbol{x}$ we have

$$
\frac{\partial f(\boldsymbol{x})}{\partial \boldsymbol{x}} = \boldsymbol{A}.
$$
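Before doing the derivations with pen and paper, it can be reassuring to check the identities numerically. The sketch below (our own addition, not part of the exercise itself) verifies the first two results with central finite differences; the helper `grad_fd`, the seed and the matrix size are arbitrary illustrative choices.

```python
# Numerical sanity check of the analytical derivatives above using
# central finite differences. All names here are illustrative choices.
import numpy as np

rng = np.random.default_rng(2024)
n = 4
A = rng.normal(size=(n, n))
a = rng.normal(size=n)
b = rng.normal(size=n)
h = 1e-6

def grad_fd(f, v):
    """Central finite-difference gradient of a scalar function f at v."""
    g = np.zeros_like(v)
    for i in range(v.size):
        e = np.zeros_like(v)
        e[i] = h
        g[i] = (f(v + e) - f(v - e)) / (2 * h)
    return g

# d(b^T a)/da should equal b
print(np.allclose(grad_fd(lambda v: b @ v, a), b))
# d(a^T A a)/da should equal a^T (A + A^T), i.e. (A + A^T) a as a column
print(np.allclose(grad_fd(lambda v: v @ A @ v, a), (A + A.T) @ a))
```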
## Exercise 2: Making your own data and exploring scikit-learn
We will generate our own dataset for a function $y(x)$ with $x\in [0,1]$, where $x$ is drawn from the uniform distribution and $y$ is a second-order polynomial in $x$ with added stochastic noise drawn from the normal distribution:
```python
import numpy as np

x = np.random.rand(100, 1)
y = 2.0 + 5 * x * x + 0.1 * np.random.randn(100, 1)
```
Write your own code (following the examples in the regression notes) which computes the ordinary least squares parametrization of this data set by fitting a second-order polynomial.
Use thereafter scikit-learn (see again the examples in the slides for week 35) and compare with your own code. Note here that scikit-learn does not include, by default, the intercept. See the discussions on scaling your data in the slides for this week. This type of problem appears in particular if we fit a polynomial with an intercept.
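A minimal sketch of what this comparison could look like is shown below. It assumes the `x` and `y` arrays generated above; the variable names and the pseudoinverse solver are our own choices, not a prescribed solution.

```python
# Fit a second-order polynomial with our own normal-equations code and
# compare with scikit-learn on the same design matrix.
import numpy as np
from sklearn.linear_model import LinearRegression

# Design matrix with columns [1, x, x^2].
X = np.hstack([np.ones_like(x), x, x**2])

# Own code: solve the normal equations via the pseudoinverse.
beta = np.linalg.pinv(X.T @ X) @ X.T @ y
print("own OLS parameters:     ", beta.ravel())

# scikit-learn: fit_intercept=False since the design matrix already
# contains the intercept column.
clf = LinearRegression(fit_intercept=False).fit(X, y)
print("scikit-learn parameters:", clf.coef_.ravel())
```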
Using scikit-learn, compute also the mean squared error, a risk metric corresponding to the expected value of the squared (quadratic) error, defined as

$$
MSE(\boldsymbol{y},\tilde{\boldsymbol{y}}) = \frac{1}{n}\sum_{i=0}^{n-1}\left(y_i-\tilde{y}_i\right)^2,
$$

and the $R^2$ score function

$$
R^2(\boldsymbol{y},\tilde{\boldsymbol{y}}) = 1 - \frac{\sum_{i=0}^{n-1}\left(y_i-\tilde{y}_i\right)^2}{\sum_{i=0}^{n-1}\left(y_i-\bar{y}\right)^2},
$$

where we have defined the mean value of $\boldsymbol{y}$ as

$$
\bar{y} = \frac{1}{n}\sum_{i=0}^{n-1}y_i.
$$
You can use the functionality included in scikit-learn. Alternatively, you can extend your own program with functions which compute the above two metrics. Discuss the meaning of these results. Try also to vary the coefficient in front of the added stochastic noise term and discuss the quality of the fits.
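As a rough sketch, assuming the design matrix `X` and parameters `beta` from the snippet above, the two metrics could be computed with scikit-learn as follows.

```python
# Evaluate the fit with scikit-learn's metric functions.
from sklearn.metrics import mean_squared_error, r2_score

ypredict = X @ beta
print("MSE:", mean_squared_error(y, ypredict))
print("R2: ", r2_score(y, ypredict))
```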
## Exercise 3: Split data in test and training data
In this exercise we want you to compute the MSE for the training data and the test data as a function of the complexity of a polynomial model, that is, the degree of the polynomial. The aim is to reproduce Figure 2.11 of Hastie et al. Feel free to read the discussions leading to Figure 2.11 of Hastie et al.
Our data is defined by $x\in [-3,3]$ with $n=100$ data points:
```python
import numpy as np

np.random.seed()  # pass an integer here if you want reproducible results

n = 100
# Make data set.
x = np.linspace(-3, 3, n).reshape(-1, 1)
y = np.exp(-x**2) + 1.5 * np.exp(-(x - 2)**2) + np.random.normal(0, 0.1, x.shape)
```
where the last term adds stochastic noise drawn from the normal distribution with mean zero and standard deviation $0.1$.
a) Write a first code which sets up a design matrix $\boldsymbol{X}$ defined by a fifth-order polynomial and splits your data into training and test data (scikit-learn's `train_test_split` function is handy here).
b) Write thereafter a code (using either scikit-learn or your own matrix-inversion code based on, for example, numpy) which performs an ordinary least squares fit and computes the mean squared error for the training data and the test data. These calculations should apply to a model given by a fifth-order polynomial. If you compare your own code with scikit-learn, note that the latter does not include the intercept by default. See the discussions on scaling your data in the slides for this week.
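One possible skeleton for this part is sketched below. It assumes the `x` and `y` arrays from the code above; the 20% test fraction and the pseudoinverse solver are our own choices.

```python
# Fifth-order polynomial fit with a train/test split and MSE for both sets.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

degree = 5
# Design matrix with columns [1, x, x^2, ..., x^degree].
X = np.hstack([x**p for p in range(degree + 1)])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Ordinary least squares via the pseudoinverse of X^T X.
beta = np.linalg.pinv(X_train.T @ X_train) @ X_train.T @ y_train
print("train MSE:", mean_squared_error(y_train, X_train @ beta))
print("test MSE: ", mean_squared_error(y_test, X_test @ beta))
```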
c) Add now a model which allows you to make polynomials up to degree 15. Compute and plot the MSE for the training and test data as a function of the polynomial degree and see whether you can reproduce the qualitative behavior of Figure 2.11 of Hastie et al.
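A hedged sketch of the full complexity scan is given below, in the spirit of Figure 2.11 of Hastie et al. It assumes the `x` and `y` arrays from above; the maximum degree, the single fixed split and the plotting details are our own choices.

```python
# Train and test MSE as a function of polynomial degree.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

maxdegree = 15
degrees = range(1, maxdegree + 1)
train_mse, test_mse = [], []

# Split once so that all model complexities see the same data.
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)

for degree in degrees:
    X_train = np.hstack([x_train**p for p in range(degree + 1)])
    X_test = np.hstack([x_test**p for p in range(degree + 1)])
    beta = np.linalg.pinv(X_train.T @ X_train) @ X_train.T @ y_train
    train_mse.append(mean_squared_error(y_train, X_train @ beta))
    test_mse.append(mean_squared_error(y_test, X_test @ beta))

plt.plot(degrees, train_mse, label="train MSE")
plt.plot(degrees, test_mse, label="test MSE")
plt.xlabel("polynomial degree (model complexity)")
plt.ylabel("MSE")
plt.legend()
plt.show()
```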