At convergence we have

$$
\begin{equation}
g^* = \underset{g}{\mathrm{argmin}}\hspace{2pt} \underset{d}{\mathrm{max}}\hspace{2pt} v(\theta^{(g)}, \theta^{(d)}) \tag{5}
\end{equation}
$$

The default choice for \( v \) is

$$
\begin{equation}
v(\theta^{(g)}, \theta^{(d)}) = \mathbb{E}_{x\sim p_\mathrm{data}}\log d(x) + \mathbb{E}_{x\sim p_\mathrm{model}} \log (1 - d(x)) \tag{6}
\end{equation}
$$

The main motivation for the design of GANs is that the learning process requires neither approximate inference (as in variational autoencoders, for example) nor approximation of a partition function. In the case where

$$
\begin{equation}
\underset{d}{\mathrm{max}}\hspace{2pt} v(\theta^{(g)}, \theta^{(d)}) \tag{7}
\end{equation}
$$

is convex in $\theta^{(g)}$, the procedure is guaranteed to converge and is asymptotically consistent (Seth Lloyd on QuGANs).
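
To make the minimax objective concrete, here is a minimal NumPy sketch (not a reference implementation) of Eqs. (5)–(6) on a toy problem: the data distribution is a 1-D Gaussian, the generator is an affine map \( g(z) = a + bz \) of standard normal noise, and the discriminator is a logistic model. The training loop alternates a gradient-ascent step on \( \theta^{(d)} \) (the inner max) with a gradient-descent step on \( \theta^{(g)} \) (the outer argmin); gradients are estimated by finite differences over common random samples. All parameter names and step sizes are illustrative assumptions, not from the original text.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def v(theta_g, theta_d, x_real, z):
    # Monte Carlo estimate of Eq. (6):
    # E_{x ~ p_data} log d(x) + E_{x ~ p_model} log(1 - d(x))
    a, b = theta_g            # generator: g(z) = a + b*z
    w0, w1 = theta_d          # discriminator: d(x) = sigmoid(w0 + w1*x)
    x_fake = a + b * z
    d_real = sigmoid(w0 + w1 * x_real)
    d_fake = sigmoid(w0 + w1 * x_fake)
    eps = 1e-8                # guard against log(0)
    return np.mean(np.log(d_real + eps)) + np.mean(np.log(1.0 - d_fake + eps))

def num_grad(f, theta, h=1e-4):
    # Central finite-difference gradient of a scalar function f at theta
    g = np.zeros_like(theta)
    for i in range(len(theta)):
        e = np.zeros_like(theta)
        e[i] = h
        g[i] = (f(theta + e) - f(theta - e)) / (2.0 * h)
    return g

theta_g = np.array([0.0, 1.0])   # generator starts at N(0, 1)
theta_d = np.array([0.0, 0.0])   # discriminator starts uninformative
lr, n = 0.05, 512

for step in range(2000):
    # Fresh samples each step, held fixed across both finite-difference
    # evaluations (common random numbers) to keep the gradient estimate low-noise.
    x_real = rng.normal(3.0, 1.0, size=n)   # p_data: N(3, 1)
    z = rng.normal(0.0, 1.0, size=n)        # generator noise
    # Ascend v in theta_d (inner max of Eq. (5)) ...
    theta_d += lr * num_grad(lambda td: v(theta_g, td, x_real, z), theta_d)
    # ... then descend v in theta_g (outer argmin of Eq. (5)).
    theta_g -= lr * num_grad(lambda tg: v(tg, theta_d, x_real, z), theta_g)

# With luck the generator parameters drift toward roughly (3, ±1),
# i.e. p_model approaches p_data; alternating updates can also oscillate.
print("generator (a, b):", theta_g)
```

This is only meant to show the structure of the alternating max/min updates on \( v \); practical GANs replace the affine generator and logistic discriminator with neural networks and use stochastic gradients from automatic differentiation rather than finite differences.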