We now assume that we have two classes, with y_i either 0 or 1. Furthermore, we assume that the Sigmoid function we fit has only two parameters \beta, that is, we define the probabilities
\begin{align*}
p(y_i=1|x_i,\boldsymbol{\beta}) &= \frac{\exp{(\beta_0+\beta_1x_i)}}{1+\exp{(\beta_0+\beta_1x_i)}},\\
p(y_i=0|x_i,\boldsymbol{\beta}) &= 1 - p(y_i=1|x_i,\boldsymbol{\beta}),
\end{align*}
where \boldsymbol{\beta} are the weights we wish to extract from data, in our case \beta_0 and \beta_1.
Note that, because there are only two classes, the two probabilities must sum to one, so we used
p(y_i=0\vert x_i, \boldsymbol{\beta}) = 1-p(y_i=1\vert x_i, \boldsymbol{\beta}).
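
The two probabilities above can be evaluated numerically. The following is a minimal sketch, assuming NumPy; the function names `prob_class_one` and `prob_class_zero` and the values chosen for \beta_0, \beta_1 and x_i are hypothetical and serve only as an illustration. The code uses the algebraically equivalent form 1/(1+\exp(-(\beta_0+\beta_1x_i))) of the expression defined above.

```python
import numpy as np


def prob_class_one(x, beta0, beta1):
    """p(y=1 | x, beta) for the two-parameter Sigmoid model."""
    t = beta0 + beta1 * np.asarray(x, dtype=float)
    # exp(t) / (1 + exp(t)) rewritten as 1 / (1 + exp(-t)); the two are identical
    return 1.0 / (1.0 + np.exp(-t))


def prob_class_zero(x, beta0, beta1):
    """p(y=0 | x, beta), obtained from the normalization p0 + p1 = 1."""
    return 1.0 - prob_class_one(x, beta0, beta1)


# Hypothetical parameter values and inputs, chosen only for illustration
beta0, beta1 = -1.0, 2.0
x = np.array([-2.0, 0.5, 3.0])

p1 = prob_class_one(x, beta0, beta1)
p0 = prob_class_zero(x, beta0, beta1)

print("p(y=1|x):", p1)
print("p(y=0|x):", p0)
print("sum     :", p1 + p0)
```

With these illustrative values, the input x_i = 0.5 gives \beta_0+\beta_1x_i = 0, so both classes are equally probable there. Since \beta_1 is positive in this example, p(y_i=1|x_i,\boldsymbol{\beta}) increases with x_i.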