Extending to more predictors

In a binary classification problem we can easily extend our model to include multiple predictors. With \( p \) predictors, the logarithm of the ratio between the two probabilities (the so-called log-odds) becomes

$$ \log{ \frac{p(\boldsymbol{\theta}\boldsymbol{x})}{1-p(\boldsymbol{\theta}\boldsymbol{x})}} = \theta_0+\theta_1x_1+\theta_2x_2+\dots+\theta_px_p. $$

Here we have defined \( \boldsymbol{x}=[1,x_1,x_2,\dots,x_p] \) and \( \boldsymbol{\theta}=[\theta_0, \theta_1, \dots, \theta_p] \), leading to

$$ p(\boldsymbol{\theta}\boldsymbol{x})=\frac{ \exp{(\theta_0+\theta_1x_1+\theta_2x_2+\dots+\theta_px_p)}}{1+\exp{(\theta_0+\theta_1x_1+\theta_2x_2+\dots+\theta_px_p)}}. $$
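To make the formula concrete, here is a minimal NumPy sketch that evaluates \( p(\boldsymbol{\theta}\boldsymbol{x}) \) for a single sample. The numerical values of \( \boldsymbol{\theta} \) and \( \boldsymbol{x} \) are chosen purely for illustration; in practice the parameters would come from fitting the model to data.

```python
import numpy as np

def sigmoid(t):
    """Logistic function mapping the linear predictor t to a probability."""
    return np.exp(t) / (1.0 + np.exp(t))

# Hypothetical example with p = 2 predictors.
# Note the leading 1 in x, which multiplies the intercept theta_0.
theta = np.array([-1.0, 0.5, 2.0])   # [theta_0, theta_1, theta_2], assumed values
x = np.array([1.0, 3.0, -0.5])       # [1, x_1, x_2], assumed values

t = theta @ x                        # linear predictor theta_0 + theta_1 x_1 + theta_2 x_2
p = sigmoid(t)                       # p(theta, x)
log_odds = np.log(p / (1.0 - p))     # recovers the linear predictor t

print(f"linear predictor = {t:.3f}, probability = {p:.3f}, log-odds = {log_odds:.3f}")
```

Note that the form \( \exp(t)/(1+\exp(t)) \) can overflow for large positive \( t \); a numerically stable alternative is, for example, `scipy.special.expit`.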