Regression analysis, overarching aims

Regression modeling deals with the description of the sampling distribution of a given random variable y and how it varies as a function of another variable, or a set of such variables, \boldsymbol{x} =[x_0, x_1,\dots, x_{n-1}]^T . The first variable is called the dependent variable, the outcome or the response variable, while the set of variables \boldsymbol{x} is called the independent variable, the predictor variable or the explanatory variable, or simply the inputs.

A regression model aims at finding a likelihood function p(\boldsymbol{y}\vert \boldsymbol{x}) , that is, the conditional distribution of \boldsymbol{y} for a given \boldsymbol{x} , or, in the more traditional sense, a function \boldsymbol{y}(\boldsymbol{x}) . The estimation of p(\boldsymbol{y}\vert \boldsymbol{x}) is made using a data set with the following ingredients (illustrated in the code sketch after the list)

  • n cases i = 0, 1, 2, \dots, n-1
  • Response (target, dependent or outcome) variable y_i with i = 0, 1, 2, \dots, n-1
  • p so-called explanatory (independent or predictor or feature) variables \boldsymbol{x}_i=[x_{i,0}, x_{i,1}, \dots, x_{i,p-1}] with i = 0, 1, 2, \dots, n-1 and the explanatory-variable index running from 0 to p-1 . See below for more explicit examples.
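
As a rough illustration of this bookkeeping, a common way to store such a data set is as an n\times p matrix \boldsymbol{X} whose rows are the feature vectors \boldsymbol{x}_i , together with a length- n response vector \boldsymbol{y} . The following minimal Python/NumPy sketch (with purely made-up random numbers) is only meant to show the shapes involved:

    import numpy as np

    # Hypothetical sizes: n = 5 cases and p = 2 explanatory variables.
    n, p = 5, 2
    rng = np.random.default_rng(2021)

    # Matrix of inputs: row i holds the feature vector x_i = [x_{i,0}, ..., x_{i,p-1}].
    X = rng.normal(size=(n, p))
    # Response vector: y[i] is the outcome for case i.
    y = rng.normal(size=n)

    print(X.shape, y.shape)   # (5, 2) (5,)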

The goal of the regression analysis is to extract/exploit the relationship between \boldsymbol{y} and \boldsymbol{x} in order to infer specific dependencies, approximations to the likelihood function and functional relationships, and to make predictions, perform fits and much more.
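
To make the last point concrete, the sketch below fits a simple straight line \boldsymbol{y}(\boldsymbol{x}) = \beta_0 + \beta_1 x to synthetic data by ordinary least squares and uses it for predictions; the data and parameter values are invented for illustration only.

    import numpy as np

    # Synthetic data: y depends approximately linearly on a single input x.
    rng = np.random.default_rng(2021)
    n = 100
    x = np.linspace(0, 1, n)
    y = 2.0 + 5.0 * x + 0.1 * rng.normal(size=n)

    # Design matrix with a column of ones for the intercept.
    X = np.column_stack([np.ones(n), x])

    # Least-squares estimate of [beta_0, beta_1] and the resulting predictions.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    y_pred = X @ beta

    print(beta)   # close to [2.0, 5.0]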