
The discriminator attempts to distinguish between samples drawn from the training data and samples drawn from the generator. In other words, it tries to tell the fake data produced by \( g \) apart from the real training samples. The discriminator outputs a probability value given by $$ \begin{equation} d(x; \theta^{(d)}) \tag{2} \end{equation} $$

indicating the probability that \( x \) is a real training example rather than a fake sample produced by the generator. The simplest way to formulate the learning process in a generative adversarial network is as a zero-sum game, in which a function $$ \begin{equation} v(\theta^{(g)}, \theta^{(d)}) \tag{3} \end{equation} $$

determines the reward for the discriminator, while the generator receives the negative of that value as its own reward $$ \begin{equation} -v(\theta^{(g)}, \theta^{(d)}) \tag{4} \end{equation} $$
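
To make the zero-sum structure concrete, here is a minimal Python sketch. The text above does not fix a specific form for \( v \), so the code assumes the commonly used cross-entropy choice, \( v = \mathbb{E}_{x}\log d(x) + \mathbb{E}_{z}\log\left(1 - d(g(z))\right) \). The one-dimensional logistic discriminator, the affine generator, and all parameter values are hypothetical stand-ins chosen only for illustration.

```python
import math
import random


def discriminator(x, theta_d):
    """Toy d(x; theta_d): logistic model returning P(x is real)."""
    w, b = theta_d  # hypothetical parameters: one weight, one bias
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def generator(z, theta_g):
    """Toy g(z; theta_g): affine map from noise z to a fake sample."""
    scale, shift = theta_g  # hypothetical parameters
    return scale * z + shift


def value(theta_g, theta_d, real_samples, noise_samples):
    """Monte Carlo estimate of v, assuming the cross-entropy form."""
    eps = 1e-12  # guards against log(0)
    real_term = sum(
        math.log(discriminator(x, theta_d) + eps) for x in real_samples
    ) / len(real_samples)
    fake_term = sum(
        math.log(1.0 - discriminator(generator(z, theta_g), theta_d) + eps)
        for z in noise_samples
    ) / len(noise_samples)
    return real_term + fake_term


rng = random.Random(0)
real_samples = [rng.gauss(2.0, 0.5) for _ in range(1000)]   # stand-in "training data"
noise_samples = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # generator input noise

theta_g = (1.0, -2.0)  # generator currently produces samples near -2
theta_d = (1.0, 0.0)   # discriminator favours positive x as "real"

discriminator_reward = value(theta_g, theta_d, real_samples, noise_samples)
generator_reward = -discriminator_reward  # zero-sum: the two rewards cancel
```

In this sketch the two rewards always sum to zero, so any change in parameters that helps one player hurts the other by the same amount. The discriminator would adjust `theta_d` to increase `value`, while the generator would adjust `theta_g` to decrease it.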