Week 38: Logistic Regression and Optimization

Loading [MathJax]/jax/output/HTML-CSS/jax.js

Optimization, the central part of any Machine Learning algortithm

Overview Video, why do we care about gradient methods?

Almost every problem in machine learning and data science starts with a dataset $X$ , a model $g(\beta)$ , which is a function of the parameters $\beta$ and a cost function $C(X, g(\beta))$ that allows us to judge how well the model $g(\beta)$ explains the observations $X$ . The model is fit by finding the values of $\beta$ that minimize the cost function. Ideally we would be able to solve for $\beta$ analytically, however this is not possible in general and we must use some approximative/numerical method to compute the minimum.