Data Analysis and Machine Learning Lectures: Optimization and Gradient Methods

Optimization, the central part of any Machine Learning algorithm

Almost every problem in machine learning and data science starts with a dataset $X$, a model $g(\beta)$, which is a function of the parameters $\beta$, and a cost function $C(X, g(\beta))$ that allows us to judge how well the model $g(\beta)$ explains the observations $X$. The model is fit by finding the values of $\beta$ that minimize the cost function. Ideally we would be able to solve for $\beta$ analytically; however, this is not possible in general, and we must use some approximate numerical method to compute the minimum.
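
To make the distinction between an analytical and a numerical minimum concrete, the following is a minimal sketch under stated assumptions, not a prescribed method. It assumes a linear model with a mean-squared-error cost, which is one of the few cases where the minimizer can be written in closed form, and compares that result with plain gradient descent. The synthetic data, the names `cost`, `gradient`, `beta_exact`, and the step size `eta` are all illustrative choices that do not come from the text above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic dataset: y = 2 + 3x + small Gaussian noise.
n = 100
x = rng.uniform(0.0, 1.0, n)
y = 2.0 + 3.0 * x + 0.1 * rng.normal(size=n)

# Design matrix with an intercept column, so the model is g(beta) = A @ beta.
A = np.column_stack([np.ones(n), x])


def cost(beta):
    """Mean squared error between the model predictions and the data."""
    residual = A @ beta - y
    return residual @ residual / n


def gradient(beta):
    """Gradient of the mean squared error with respect to beta."""
    return 2.0 / n * A.T @ (A @ beta - y)


# Analytical route: least squares has a closed-form minimizer.
beta_exact, *_ = np.linalg.lstsq(A, y, rcond=None)

# Numerical route: repeatedly step against the gradient.
beta = np.zeros(2)
eta = 0.5  # step size (assumed; it must be small enough for convergence)
for _ in range(5000):
    beta = beta - eta * gradient(beta)

print("analytical:", beta_exact)
print("numerical: ", beta)
print("cost at numerical minimum:", cost(beta))
```

For this particular cost the two routes should agree to within numerical precision. For most models of interest only the second route is available, since no closed-form expression for the minimizing $\beta$ exists.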