5. Variational Monte Carlo methods
5.1. Quantum Monte Carlo Motivation
We start with the variational principle. Given a hamiltonian $H$ and a trial wave function $\Psi_T$, the variational principle states that the expectation value of $H$,

$$ E[H]=\langle H \rangle = \frac{\int d\mathbf{R}\Psi^{\ast}_T(\mathbf{R})H(\mathbf{R})\Psi_T(\mathbf{R})}{\int d\mathbf{R}\Psi^{\ast}_T(\mathbf{R})\Psi_T(\mathbf{R})}, $$

is an upper bound to the ground state energy $E_0$ of the hamiltonian, that is

$$ E_0 \le \langle H \rangle. $$

In general, the integrals involved in the calculation of various expectation values are multi-dimensional ones. Traditional integration methods such as Gauss-Legendre quadrature are not adequate for, say, the computation of the energy of a many-body system.

The trial wave function can be expanded in the eigenstates of the hamiltonian since they form a complete set, viz.,

$$ \Psi_T(\mathbf{R})=\sum_i a_i\Psi_i(\mathbf{R}), $$

and assuming the set of eigenfunctions to be normalized one obtains

$$ \frac{\sum_{nm}a^*_ma_n \int d\mathbf{R}\Psi^{\ast}_m(\mathbf{R})H(\mathbf{R})\Psi_n(\mathbf{R})}{\sum_{nm}a^*_ma_n \int d\mathbf{R}\Psi^{\ast}_m(\mathbf{R})\Psi_n(\mathbf{R})} =\frac{\sum_{n}a^2_n E_n}{\sum_{n}a^2_n} \ge E_0, $$

where we used that

$$ H(\mathbf{R})\Psi_n(\mathbf{R})=E_n\Psi_n(\mathbf{R}). $$
In most cases, a wave function has only small values in large parts of configuration space, and a straightforward procedure which uses homogeneously distributed random points in configuration space will most likely lead to poor results. This suggests that some kind of importance sampling combined with, e.g., the Metropolis algorithm may be a more efficient way of obtaining the ground state energy. The hope is then that those regions of configuration space where the wave function assumes appreciable values are sampled more efficiently.
The tedious part in a VMC calculation is the search for the variational minimum. A good knowledge of the system is required in order to carry out reasonable VMC calculations. This is not always the case, and often VMC calculations serve rather as the starting point for so-called diffusion Monte Carlo calculations (DMC). DMC is a way of solving exactly the many-body Schroedinger equation by means of a stochastic procedure. A good guess of the binding energy and the wave function is however necessary. A carefully performed VMC calculation can aid in this context.
The basic recipe in a VMC calculation consists of the following elements:

* Construct first a trial wave function $\psi_T(\mathbf{R},\boldsymbol{\alpha})$, for a many-body system consisting of $N$ particles located at positions $\mathbf{R}=(\mathbf{R}_1,\dots,\mathbf{R}_N)$. The trial wave function depends on the variational parameters $\boldsymbol{\alpha}=(\alpha_1,\dots,\alpha_M)$.

* Then we evaluate the expectation value of the hamiltonian $H$,

$$ E[H]=\langle H \rangle = \frac{\int d\mathbf{R}\Psi^{\ast}_{T}(\mathbf{R},\boldsymbol{\alpha})H(\mathbf{R})\Psi_{T}(\mathbf{R},\boldsymbol{\alpha})}{\int d\mathbf{R}\Psi^{\ast}_{T}(\mathbf{R},\boldsymbol{\alpha})\Psi_{T}(\mathbf{R},\boldsymbol{\alpha})}. $$

* Thereafter we vary $\boldsymbol{\alpha}$ according to some minimization algorithm and return to the first step.

With a trial wave function $\psi_T(\mathbf{R})$ we define the quantity

$$ P(\mathbf{R})= \frac{\left|\psi_T(\mathbf{R})\right|^2}{\int \left|\psi_T(\mathbf{R})\right|^2d\mathbf{R}}. $$

This is our new probability distribution function (PDF). The approximation to the expectation value of the Hamiltonian is now

$$ E[H(\boldsymbol{\alpha})] = \frac{\int d\mathbf{R}\Psi^{\ast}_T(\mathbf{R},\boldsymbol{\alpha})H(\mathbf{R})\Psi_T(\mathbf{R},\boldsymbol{\alpha})}{\int d\mathbf{R}\Psi^{\ast}_T(\mathbf{R},\boldsymbol{\alpha})\Psi_T(\mathbf{R},\boldsymbol{\alpha})}. $$

Define a new quantity

$$ E_L(\mathbf{R},\boldsymbol{\alpha})=\frac{1}{\psi_T(\mathbf{R},\boldsymbol{\alpha})}H\psi_T(\mathbf{R},\boldsymbol{\alpha}), $$

called the local energy, which, together with our trial PDF yields

$$ E[H(\boldsymbol{\alpha})]=\int P(\mathbf{R})E_L(\mathbf{R},\boldsymbol{\alpha}) d\mathbf{R}\approx \frac{1}{N}\sum_{i=1}^{N}E_L(\mathbf{R}_i,\boldsymbol{\alpha}), $$

with $N$ being the number of Monte Carlo samples.
The algorithm for performing a variational Monte Carlo calculation runs as this:

* Initialisation: Fix the number of Monte Carlo steps. Choose an initial $\mathbf{R}$ and variational parameters $\alpha$ and calculate $\left|\psi_T^{\alpha}(\mathbf{R})\right|^2$. Initialise the energy and the variance and start the Monte Carlo calculation.

* Calculate a trial position $\mathbf{R}_p=\mathbf{R}+r*\mathrm{step}$, where $r$ is a random variable $r \in [0,1]$.

* Use the Metropolis algorithm to accept or reject this move, that is compute $w = P(\mathbf{R}_p)/P(\mathbf{R})$ and compare with a new random number.

* If the step is accepted, then we set $\mathbf{R}=\mathbf{R}_p$.

* Update averages.

* Finish and compute final averages.

A minimal code sketch of these steps is given after the hydrogen-atom example below.
Observe that the jumping in space is governed by the variable step. This is called brute-force sampling. We need importance sampling to get more relevant sampling, see the lectures below.
5.1.1. Quantum Monte Carlo: hydrogen atom
The radial Schroedinger equation for the hydrogen atom can be written as

$$ -\frac{\hbar^2}{2m}\frac{\partial^2 u(r)}{\partial r^2}- \left(\frac{ke^2}{r}-\frac{\hbar^2l(l+1)}{2mr^2}\right)u(r)=Eu(r), $$

or with dimensionless variables

$$ -\frac{1}{2}\frac{\partial^2 u(\rho)}{\partial \rho^2}- \frac{u(\rho)}{\rho}+\frac{l(l+1)}{2\rho^2}u(\rho)-\lambda u(\rho)=0, $$

with the hamiltonian (for $l=0$)

$$ H=-\frac{1}{2}\frac{\partial^2}{\partial \rho^2}-\frac{1}{\rho}. $$

Use the variational parameter $\alpha$ in the trial wave function

$$ u_T^{\alpha}(\rho)=\alpha\rho e^{-\alpha\rho}. $$

Inserting this wave function into the expression for the local energy $E_L$ gives

$$ E_L(\rho)=-\frac{1}{\rho}-\frac{\alpha}{2}\left(\alpha-\frac{2}{\rho}\right). $$
A simple variational Monte Carlo calculation results in
| $\alpha$ | $\langle H \rangle$ | $\sigma^2$ | $\sigma/\sqrt{N}$ |
| --- | --- | --- | --- |
| 7.00000E-01 | -4.57759E-01 | 4.51201E-02 | 6.71715E-04 |
| 8.00000E-01 | -4.81461E-01 | 3.05736E-02 | 5.52934E-04 |
| 9.00000E-01 | -4.95899E-01 | 8.20497E-03 | 2.86443E-04 |
| 1.00000E+00 | -5.00000E-01 | 0.00000E+00 | 0.00000E+00 |
| 1.10000E+00 | -4.93738E-01 | 1.16989E-02 | 3.42036E-04 |
| 1.20000E+00 | -4.75563E-01 | 8.85899E-02 | 9.41222E-04 |
| 1.30000E+00 | -4.54341E-01 | 1.45171E-01 | 1.20487E-03 |
We note that at $\alpha=1$ we obtain the exact result, and the variance is zero, as it should be. The reason is that we then have the exact wave function, and the action of the hamiltonian on the wave function,

$$ H\psi = \mathrm{constant}\times \psi, $$

yields just a constant. The integral which defines various expectation values involving moments of the hamiltonian becomes then

$$ \langle H^n \rangle = \frac{\int d\mathbf{R}\Psi^{\ast}_T(\mathbf{R})H^n(\mathbf{R})\Psi_T(\mathbf{R})}{\int d\mathbf{R}\Psi^{\ast}_T(\mathbf{R})\Psi_T(\mathbf{R})}= \mathrm{constant}^n\times\frac{\int d\mathbf{R}\Psi^{\ast}_T(\mathbf{R})\Psi_T(\mathbf{R})}{\int d\mathbf{R}\Psi^{\ast}_T(\mathbf{R})\Psi_T(\mathbf{R})}=\mathrm{constant}^n. $$

This gives us an important piece of information: the exact wave function leads to zero variance! Variation is then performed by minimizing both the energy and the variance.
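As a minimal illustration (our own sketch, not part of the tabulated calculation above; function names and the step size are arbitrary choices), the brute-force algorithm outlined earlier can be applied to the hydrogen trial function. At $\alpha=1$ the sampled variance collapses to zero, in agreement with the table.

# Minimal brute-force Metropolis VMC sketch for u_T(rho) = alpha*rho*exp(-alpha*rho)
from math import exp, sqrt
from random import random, seed

def trial(rho, alpha):
    return alpha*rho*exp(-alpha*rho)

def local_energy(rho, alpha):
    # closed-form local energy for this trial function
    return -1.0/rho - 0.5*alpha*(alpha - 2.0/rho)

def vmc(alpha, n_cycles=100000, step=1.0):
    seed()
    rho = 1.0                       # initial position
    e_sum = e2_sum = 0.0
    for _ in range(n_cycles):
        rho_p = rho + step*(random() - 0.5)
        # reject moves outside the domain rho > 0, otherwise standard Metropolis test
        if rho_p > 0 and random() < (trial(rho_p, alpha)/trial(rho, alpha))**2:
            rho = rho_p
        e = local_energy(rho, alpha)
        e_sum += e
        e2_sum += e*e
    e_mean = e_sum/n_cycles
    variance = e2_sum/n_cycles - e_mean**2
    return e_mean, variance, sqrt(variance/n_cycles)

for alpha in [0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3]:
    print(alpha, *vmc(alpha))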
For bosons in a harmonic oscillator-like trap we will use a spherical (S) or an elliptical (E) harmonic trap in one, two and finally three dimensions, with the latter given by

$$ V_{ext}(\mathbf{r}) = \Bigg\{ \begin{array}{ll} \frac{1}{2}m\omega_{ho}^2r^2 & (S)\\ \frac{1}{2}m[\omega_{ho}^2(x^2+y^2) + \omega_z^2z^2] & (E), \end{array} $$

where (S) stands for symmetric and

$$ \hat{H} = \sum_i^N \left(\frac{-\hbar^2}{2m}\nabla_{i}^2 +V_{ext}(\mathbf{r}_i)\right) + \sum_{i<j}^{N} V_{int}(\mathbf{r}_i,\mathbf{r}_j), $$

as the two-body Hamiltonian of the system.

We will represent the inter-boson interaction by a pairwise, repulsive potential

$$ V_{int}(|\mathbf{r}_i-\mathbf{r}_j|) = \Bigg\{ \begin{array}{ll} \infty & |\mathbf{r}_i-\mathbf{r}_j| \leq a\\ 0 & |\mathbf{r}_i-\mathbf{r}_j| > a, \end{array} $$

where $a$ is the so-called hard-core diameter of the bosons.

Our trial wave function for the ground state with $N$ atoms is given by

$$ \Psi_T(\mathbf{R})=\Psi_T(\mathbf{r}_1,\mathbf{r}_2,\dots,\mathbf{r}_N,\alpha,\beta) =\prod_i g(\alpha,\beta,\mathbf{r}_i)\prod_{i<j}f(a,|\mathbf{r}_i-\mathbf{r}_j|), $$

where $\alpha$ and $\beta$ are variational parameters. The single-particle wave function is proportional to the harmonic oscillator function for the ground state,

$$ g(\alpha,\beta,\mathbf{r}_i)=\exp{[-\alpha(x_i^2+y_i^2+\beta z_i^2)]}, $$

while the correlation part is

$$ f(a,|\mathbf{r}_i-\mathbf{r}_j|)=\Bigg\{ \begin{array}{ll} 0 & |\mathbf{r}_i-\mathbf{r}_j| \leq a\\ 1-\frac{a}{|\mathbf{r}_i-\mathbf{r}_j|} & |\mathbf{r}_i-\mathbf{r}_j| > a. \end{array} $$

For spherical traps we have $\beta = 1$, while for non-interacting bosons ($a=0$) we have $f=1$.
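As a simple illustration (our own unoptimized sketch; function names and parameter values are arbitrary), this trial wave function can be evaluated directly as:

# Sketch: evaluate the trap trial wave function Psi_T for N bosons in 3D
import numpy as np

def psi_T(R, alpha, beta, a):
    """R is an (N,3) array of positions; returns the trial wave function value."""
    # one-body part: exp(-alpha*(x^2+y^2+beta*z^2)) for each particle
    g = np.exp(-alpha*(R[:,0]**2 + R[:,1]**2 + beta*R[:,2]**2)).prod()
    # two-body hard-core correlation: f = 1 - a/r_ij for r_ij > a, else 0
    f = 1.0
    N = R.shape[0]
    for i in range(N):
        for j in range(i+1, N):
            rij = np.linalg.norm(R[i]-R[j])
            f *= 0.0 if rij <= a else 1.0 - a/rij
    return g*f

R = np.random.randn(5, 3)          # five bosons at random positions
print(psi_T(R, alpha=0.5, beta=1.0, a=0.0043))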
5.1.2. A simple Python code that solves the two-boson or two-fermion case in two dimensions
%matplotlib inline
# Importing various packages
from math import exp, sqrt
from random import random, seed
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm
from matplotlib.ticker import LinearLocator, FormatStrFormatter
import sys
#Trial wave function for quantum dots in two dims
def WaveFunction(r,alpha,beta):
r1 = r[0,0]**2 + r[0,1]**2
r2 = r[1,0]**2 + r[1,1]**2
r12 = sqrt((r[0,0]-r[1,0])**2 + (r[0,1]-r[1,1])**2)
deno = r12/(1+beta*r12)
return exp(-0.5*alpha*(r1+r2)+deno)
#Local energy for quantum dots in two dims, using analytical local energy
def LocalEnergy(r,alpha,beta):
r1 = (r[0,0]**2 + r[0,1]**2)
r2 = (r[1,0]**2 + r[1,1]**2)
r12 = sqrt((r[0,0]-r[1,0])**2 + (r[0,1]-r[1,1])**2)
deno = 1.0/(1+beta*r12)
deno2 = deno*deno
return 0.5*(1-alpha*alpha)*(r1 + r2) +2.0*alpha + 1.0/r12+deno2*(alpha*r12-deno2+2*beta*deno-1.0/r12)
# The Monte Carlo sampling with the Metropolis algo
def MonteCarloSampling():
NumberMCcycles= 100000
StepSize = 1.0
# positions
PositionOld = np.zeros((NumberParticles,Dimension), np.double)
PositionNew = np.zeros((NumberParticles,Dimension), np.double)
# seed for rng generator
seed()
# start variational parameter
alpha = 0.9
for ia in range(MaxVariations):
alpha += .025
AlphaValues[ia] = alpha
beta = 0.2
for jb in range(MaxVariations):
beta += .01
BetaValues[jb] = beta
energy = energy2 = 0.0
DeltaE = 0.0
#Initial position
for i in range(NumberParticles):
for j in range(Dimension):
PositionOld[i,j] = StepSize * (random() - .5)
wfold = WaveFunction(PositionOld,alpha,beta)
#Loop over MC MCcycles
for MCcycle in range(NumberMCcycles):
#Trial position
for i in range(NumberParticles):
for j in range(Dimension):
PositionNew[i,j] = PositionOld[i,j] + StepSize * (random() - .5)
wfnew = WaveFunction(PositionNew,alpha,beta)
#Metropolis test to see whether we accept the move
if random() < wfnew**2 / wfold**2:
PositionOld = PositionNew.copy()
wfold = wfnew
DeltaE = LocalEnergy(PositionOld,alpha,beta)
energy += DeltaE
energy2 += DeltaE**2
#We calculate mean, variance and error ...
energy /= NumberMCcycles
energy2 /= NumberMCcycles
variance = energy2 - energy**2
error = sqrt(variance/NumberMCcycles)
Energies[ia,jb] = energy
return Energies, AlphaValues, BetaValues
#Here starts the main program with variable declarations
NumberParticles = 2
Dimension = 2
MaxVariations = 10
Energies = np.zeros((MaxVariations,MaxVariations))
AlphaValues = np.zeros(MaxVariations)
BetaValues = np.zeros(MaxVariations)
(Energies, AlphaValues, BetaValues) = MonteCarloSampling()
# Prepare for plots
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
# Plot the surface.
X, Y = np.meshgrid(AlphaValues, BetaValues)
surf = ax.plot_surface(X, Y, Energies,cmap=cm.coolwarm,linewidth=0, antialiased=False)
# Customize the z axis.
zmin = Energies.min()
zmax = Energies.max()
ax.set_zlim(zmin, zmax)
ax.set_xlabel(r'$\alpha$')
ax.set_ylabel(r'$\beta$')
ax.set_zlabel(r'$\langle E \rangle$')
ax.zaxis.set_major_locator(LinearLocator(10))
ax.zaxis.set_major_formatter(FormatStrFormatter('%.02f'))
# Add a color bar which maps values to colors.
fig.colorbar(surf, shrink=0.5, aspect=5)
plt.show()
5.2. Quantum Monte Carlo: the helium atom
The helium atom consists of two electrons and a nucleus with charge $Z=2$. The contribution to the potential energy due to the attraction from the nucleus is

$$ -\frac{2ke^2}{r_1}-\frac{2ke^2}{r_2}, $$

and if we add the repulsion arising from the two interacting electrons, we obtain the potential energy

$$ V(r_1, r_2)=-\frac{2ke^2}{r_1}-\frac{2ke^2}{r_2}+\frac{ke^2}{r_{12}}, $$

with the electrons separated at a distance $r_{12}=|\mathbf{r}_1-\mathbf{r}_2|$. The hamiltonian becomes then

$$ \hat{H}=-\frac{\hbar^2\nabla_{1}^2}{2m}-\frac{\hbar^2\nabla_{2}^2}{2m} -\frac{2ke^2}{r_1}-\frac{2ke^2}{r_2}+\frac{ke^2}{r_{12}}, $$

and Schroedingers equation reads

$$ \hat{H}\psi=E\psi. $$
All observables are evaluated with respect to the probability distribution

$$ P(\mathbf{R})= \frac{\left|\psi_T(\mathbf{R})\right|^2}{\int \left|\psi_T(\mathbf{R})\right|^2d\mathbf{R}}, $$

generated by the trial wave function.
The trial wave function must approximate an exact eigenstate in order for accurate results to be obtained.
Choice of trial wave function for helium: assume $r_1 \rightarrow 0$. For small values of $r_1$, the terms which dominate the local energy are

$$ \lim_{r_1 \rightarrow 0}E_L(\mathbf{R})= \frac{1}{{\cal R}_T(r_1)}\left(-\frac{1}{2}\frac{d^2}{dr_1^2}- \frac{1}{r_1}\frac{d}{dr_1}-\frac{Z}{r_1}\right){\cal R}_T(r_1), $$

since the second derivative does not diverge due to the finiteness of $\Psi$ at the origin. This results in

$$ \frac{1}{{\cal R}_T(r_1)}\frac{d {\cal R}_T(r_1)}{dr_1}=-Z, $$

and

$$ {\cal R}_T(r_1)\propto e^{-Zr_1}. $$

A similar condition applies to electron 2 as well. For orbital momenta $l > 0$ we have

$$ \frac{1}{{\cal R}_T(r)}\frac{d {\cal R}_T(r)}{dr}=-\frac{Z}{l+1}. $$

Similarly, studying the case $r_{12}\rightarrow 0$, we can write a possible trial wave function as

$$ \psi_T(\mathbf{R})=e^{-\alpha(r_1+r_2)}e^{\beta r_{12}}. $$

The last equation can be generalized to

$$ \psi_T(\mathbf{R})=\phi(\mathbf{r}_1)\phi(\mathbf{r}_2)\dots\phi(\mathbf{r}_N) \prod_{i < j}f(r_{ij}), $$

for a system with $N$ electrons or particles.
During the development of our code we need to make several checks. It is also very instructive to compute a closed form expression for the local energy. Since our wave function is rather simple, it is straightforward to find an analytic expression. Consider first the case of the simple helium function

$$ \Psi_T(\mathbf{r}_1,\mathbf{r}_2) = e^{-\alpha(r_1+r_2)}. $$

The local energy is for this case

$$ E_{L1} = \left(\alpha-Z\right)\left(\frac{1}{r_1}+\frac{1}{r_2}\right)+\frac{1}{r_{12}}-\alpha^2, $$

which gives an expectation value for the local energy given by

$$ \langle E_{L1} \rangle = \alpha^2-2\alpha\left(Z-\frac{5}{16}\right). $$
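As a quick symbolic sanity check of $E_{L1}$ (our own test in the spirit of the sympy examples further below; the symbol names are arbitrary and the simplification may take a moment), we can apply the hamiltonian directly:

# Sympy check of the closed-form local energy E_L1 for Psi_T = exp(-alpha*(r1+r2))
from sympy import symbols, exp, sqrt, diff, simplify

x1, y1, z1, x2, y2, z2, alpha, Z = symbols('x1 y1 z1 x2 y2 z2 alpha Z', positive=True)
r1 = sqrt(x1**2 + y1**2 + z1**2)
r2 = sqrt(x2**2 + y2**2 + z2**2)
r12 = sqrt((x1-x2)**2 + (y1-y2)**2 + (z1-z2)**2)
psi = exp(-alpha*(r1 + r2))

def laplacian(f, x, y, z):
    return diff(f, x, 2) + diff(f, y, 2) + diff(f, z, 2)

# H psi = -(lap1 + lap2)/2 psi - Z/r1 psi - Z/r2 psi + psi/r12
Hpsi = -(laplacian(psi, x1, y1, z1) + laplacian(psi, x2, y2, z2))/2 \
       + (-Z/r1 - Z/r2 + 1/r12)*psi
EL = Hpsi/psi
closed_form = (alpha - Z)*(1/r1 + 1/r2) + 1/r12 - alpha**2
print(simplify(EL - closed_form))   # should print 0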
With closed form formulae we can speed up the computation of the correlation. In our case we write it as

$$ \Psi_C= \exp{\left\{\sum_{i<j}\frac{ar_{ij}}{1+\beta r_{ij}}\right\}}, $$

which means that the gradient needed for the so-called quantum force and local energy can be calculated analytically. This will speed up your code, since the computation of the correlation part and the Slater determinant are the most time-consuming parts of your code.

We will refer to this correlation function as $\Psi_C$ or the linear Pade-Jastrow.

We can test this by computing the local energy for our helium wave function

$$ \psi_{T}(\mathbf{r}_1,\mathbf{r}_2) = e^{-\alpha(r_1+r_2)} \exp{\left\{\frac{r_{12}}{2(1+\beta r_{12})}\right\}}, $$

with $\alpha$ and $\beta$ as variational parameters. The local energy is for this case

$$ E_{L2} = E_{L1}+\frac{1}{2(1+\beta r_{12})^2}\left\{\frac{\alpha(r_1+r_2)}{r_{12}}\left(1-\frac{\mathbf{r}_1\cdot\mathbf{r}_2}{r_1r_2}\right)-\frac{1}{2(1+\beta r_{12})^2}-\frac{2}{r_{12}}+\frac{2\beta}{1+\beta r_{12}}\right\}. $$
It is very useful to test your code against these expressions. It means also that you don’t need to compute a derivative numerically as discussed in the code example below.
For the computation of various derivatives with different types of wave functions, you will find it useful to use python with symbolic python, that is sympy, see the online manual. Using sympy allows you to autogenerate both LaTeX code as well as C++, Python or Fortran code. Here you will find some simple examples. We choose the $2s$ hydrogen orbital (not normalized)

$$ \phi_{2s}(\mathbf{r}) = (Zr-2)\exp{\left(-\frac{Zr}{2}\right)}, $$

with $r^2 = x^2 + y^2 + z^2$.
from sympy import symbols, diff, exp, sqrt
x, y, z, Z = symbols('x y z Z')
r = sqrt(x*x + y*y + z*z)
r
phi = (Z*r - 2)*exp(-Z*r/2)
phi
diff(phi, x)
This doesn’t look very nice, but sympy provides several functions that allow for improving and simplifying the output.
We can improve our output by factorizing and substituting expressions
from sympy import symbols, diff, exp, sqrt, factor, Symbol, printing
x, y, z, Z = symbols('x y z Z')
r = sqrt(x*x + y*y + z*z)
phi = (Z*r - 2)*exp(-Z*r/2)
R = Symbol('r') #Creates a symbolic equivalent of r
#print latex and c++ code
print(printing.latex(diff(phi, x).factor().subs(r, R)))
print(printing.ccode(diff(phi, x).factor().subs(r, R)))
We can in turn look at second derivatives
from sympy import symbols, diff, exp, sqrt, factor, Symbol, printing
x, y, z, Z = symbols('x y z Z')
r = sqrt(x*x + y*y + z*z)
phi = (Z*r - 2)*exp(-Z*r/2)
R = Symbol('r') #Creates a symbolic equivalent of r
(diff(diff(phi, x), x) + diff(diff(phi, y), y) + diff(diff(phi, z), z)).factor().subs(r, R)
# Collect the Z values
(diff(diff(phi, x), x) + diff(diff(phi, y), y) +diff(diff(phi, z), z)).factor().collect(Z).subs(r, R)
# Factorize also the r**2 terms
(diff(diff(phi, x), x) + diff(diff(phi, y), y) + diff(diff(phi, z), z)).factor().collect(Z).subs(r, R).subs(r**2, R**2).factor()
print(printing.ccode((diff(diff(phi, x), x) + diff(diff(phi, y), y) + diff(diff(phi, z), z)).factor().collect(Z).subs(r, R).subs(r**2, R**2).factor()))
With some practice this allows one to check one's own calculations and translate them automatically into code.
5.3. The Metropolis algorithm
The Metropolis algorithm, see the original article, was invented by Metropolis et al. and is often simply called the Metropolis algorithm. It is a method to sample a normalized probability distribution by a stochastic process. We define ${\cal P}_i^{(n)}$ to be the probability of finding the system in the state $i$ at step $n$. The algorithm is then

* Sample a possible new state $j$ with some probability $T_{i\rightarrow j}$.

* Accept the new state $j$ with probability $A_{i\rightarrow j}$ and use it as the next sample. With probability $1-A_{i\rightarrow j}$ the move is rejected and the original state $i$ is used again as a sample.
We wish to derive the required properties of $T$ and $A$ such that ${\cal P}_i^{(n\rightarrow \infty)} \rightarrow p_i$, so that starting from any distribution, the method converges to the correct distribution.

The dynamical equation for ${\cal P}_i^{(n)}$ can be written directly from the description above:

$$ {\cal P}^{(n)}_i = \sum_j \left[{\cal P}^{(n-1)}_jT_{j\rightarrow i} A_{j\rightarrow i} +{\cal P}^{(n-1)}_iT_{i\rightarrow j}\left(1- A_{i\rightarrow j}\right)\right]. $$

Since the probability of making some transition must be 1, $\sum_j T_{i\rightarrow j} = 1$, and the above equation becomes

$$ {\cal P}^{(n)}_i = {\cal P}^{(n-1)}_i + \sum_j \left[{\cal P}^{(n-1)}_jT_{j\rightarrow i} A_{j\rightarrow i} -{\cal P}^{(n-1)}_iT_{i\rightarrow j}A_{i\rightarrow j}\right]. $$

For large $n$ we require that ${\cal P}^{(n\rightarrow \infty)}_i = p_i$, the desired probability distribution. Taking this limit gives the balance requirement

$$ \sum_j \left[p_jT_{j\rightarrow i} A_{j\rightarrow i} - p_iT_{i\rightarrow j}A_{i\rightarrow j}\right] = 0. $$

The balance requirement is very weak. Typically the much stronger detailed balance requirement is enforced, that is rather than the sum being set to zero, we set each term separately to zero and use this to determine the acceptance probabilities. Rearranging, the result is

$$ \frac{A_{j\rightarrow i}}{A_{i\rightarrow j}} = \frac{p_iT_{i\rightarrow j}}{p_jT_{j\rightarrow i}}. $$

The Metropolis choice is to maximize the $A$ values, that is

$$ A_{j\rightarrow i} = \min\left(1, \frac{p_iT_{i\rightarrow j}}{p_jT_{j\rightarrow i}}\right). $$

Other choices are possible, but they all correspond to multiplying $A_{i\rightarrow j}$ and $A_{j\rightarrow i}$ by the same constant smaller than unity.

Having chosen the acceptance probabilities, we have guaranteed that if ${\cal P}_i^{(n)}$ has equilibrated, that is if it is equal to $p_i$, it will remain equilibrated.

The dynamical equation can be written as

$$ {\cal P}^{(n)}_i = \sum_j M_{ij}{\cal P}^{(n-1)}_j, $$

with the matrix $M$ given by

$$ M_{ij} = \delta_{ij}\left[1 -\sum_k T_{i\rightarrow k} A_{i\rightarrow k}\right] + T_{j\rightarrow i} A_{j\rightarrow i}. $$

Summing over $i$ shows that $\sum_i M_{ij} = 1$, and since $\sum_k T_{i\rightarrow k} = 1$ and $A_{i\rightarrow k} \leq 1$, the elements of the matrix satisfy $M_{ij} \geq 0$. The matrix $M$ is therefore a stochastic matrix.

The Metropolis method is simply the power method for computing the right eigenvector of $M$ with the largest magnitude eigenvalue. By construction, the correct probability distribution is a right eigenvector with eigenvalue 1. Therefore, for the Metropolis method to converge to this result, we must show that $M$ has only one eigenvalue with this magnitude, and all other eigenvalues are smaller.
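To make the sampling idea concrete, here is a generic Metropolis sketch (our own example with an arbitrary target distribution) where the proposal is symmetric, so the acceptance probability reduces to $\min(1,p_i/p_j)$:

# Generic Metropolis sampler for a 1D unnormalized density p(x); a sketch only
from math import exp
from random import random, seed

def p(x):
    return exp(-0.5*x*x)   # unnormalized standard normal as example target

def metropolis(n_samples, step=1.0):
    seed()
    x = 0.0
    samples = []
    for _ in range(n_samples):
        x_trial = x + step*(random() - 0.5)
        # symmetric proposal: accept with probability min(1, p(x_trial)/p(x))
        if random() < p(x_trial)/p(x):
            x = x_trial
        samples.append(x)
    return samples

s = metropolis(100000)
print(sum(s)/len(s), sum(x*x for x in s)/len(s))  # approx 0 and 1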
5.4. Importance sampling
We need to replace the brute force Metropolis algorithm with a walk in coordinate space biased by the trial wave function. This approach is based on the Fokker-Planck equation and the Langevin equation for generating a trajectory in coordinate space. The link between the Fokker-Planck equation and the Langevin equation is explained, only partly, in the slides below. An excellent reference on topics like Brownian motion, Markov chains, the Fokker-Planck equation and the Langevin equation is the text by Van Kampen. Here we will focus first on the implementation part.
For a diffusion process characterized by a time-dependent probability density $P(x,t)$ in one dimension, the Fokker-Planck equation reads (for one particle/walker)

$$ \frac{\partial P}{\partial t} = D\frac{\partial }{\partial x}\left(\frac{\partial }{\partial x} -F\right)P(x,t), $$

where $F$ is a drift term and $D$ is the diffusion coefficient.

The new positions in coordinate space are given as the solutions of the Langevin equation using Euler's method, namely, we go from the Langevin equation

$$ \frac{\partial x(t)}{\partial t} = DF(x(t)) +\eta, $$

with $\eta$ a random variable, yielding a new position

$$ y = x+DF(x)\Delta t +\xi\sqrt{\Delta t}, $$

where $\xi$ is a gaussian random variable and $\Delta t$ is a chosen time step.
The process of isotropic diffusion characterized by a time-dependent probability density $P(\mathbf{x},t)$ obeys (as an approximation) the Fokker-Planck equation

$$ \frac{\partial P}{\partial t} = \sum_i D\frac{\partial }{\partial \mathbf{x_i}}\left(\frac{\partial }{\partial \mathbf{x_i}} -\mathbf{F_i}\right)P(\mathbf{x},t), $$

where $\mathbf{F}_i$ is the $i$-th component of the drift term (drift velocity) caused by an external potential, and $D$ is the diffusion coefficient.

The drift vector should be of the form $\mathbf{F} = g(\mathbf{x})\frac{\partial P}{\partial \mathbf{x}}$, with the function $g$ determined by requiring a stationary density. The condition of stationary density means that the left hand side equals zero. In other words, the terms containing first and second derivatives have to cancel each other. This is possible only if $g = \frac{1}{P}$, which with $P=|\Psi_T|^2$ yields

$$ \mathbf{F} = 2\frac{1}{\Psi_T}\nabla\Psi_T, $$

which is known as the so-called quantum force. This term is responsible for pushing the walker towards regions of configuration space where the trial wave function is large, increasing the efficiency of the simulation in contrast to the Metropolis algorithm, where the walker has the same probability of moving in every direction.

The Fokker-Planck equation yields a transition probability (the solution to the equation) given by the Green's function

$$ G(y,x,\Delta t) = \frac{1}{(4\pi D\Delta t)^{3N/2}} \exp{\left(-\frac{(y-x-D\Delta t F(x))^2}{4D\Delta t}\right)}, $$

which in turn means that our brute force Metropolis algorithm

$$ A(y,x) = \min(1,q(y,x)), \qquad q(y,x) = \frac{|\Psi_T(y)|^2}{|\Psi_T(x)|^2}, $$

is replaced by the Metropolis-Hastings algorithm with

$$ q(y,x) = \frac{G(x,y,\Delta t)|\Psi_T(y)|^2}{G(y,x,\Delta t)|\Psi_T(x)|^2}. $$
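Below is a minimal one-dimensional sketch of these importance-sampled moves (our own illustration with a simple gaussian trial function; all names and parameter values are arbitrary choices):

# Sketch: importance-sampled Metropolis-Hastings for psi_T(x) = exp(-alpha*x^2) in 1D
from math import exp, sqrt
from random import random, normalvariate, seed

alpha = 0.5
D, dt = 0.5, 0.05

def psi(x):    return exp(-alpha*x*x)
def qforce(x): return -4.0*alpha*x            # F = 2 psi'/psi
def greens(y, x):
    # G(y,x) ~ exp(-(y - x - D*dt*F(x))^2/(4*D*dt)); prefactors cancel in the ratio
    return exp(-(y - x - D*dt*qforce(x))**2/(4*D*dt))

seed()
x, n_acc, n = 0.0, 0, 100000
for _ in range(n):
    # drift-diffusion (Langevin/Euler) trial move
    y = x + D*dt*qforce(x) + normalvariate(0.0, 1.0)*sqrt(2*D*dt)
    w = greens(x, y)*psi(y)**2/(greens(y, x)*psi(x)**2)
    if random() <= w:
        x, n_acc = y, n_acc + 1
print('acceptance ratio:', n_acc/n)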
5.5. Importance sampling, program elements
The general derivative formula of the Jastrow factor is (the subscript $C$ stands for correlation)

$$ \frac{1}{\Psi_C}\frac{\partial \Psi_C}{\partial x_k} = \sum_{i=1}^{k-1}\frac{1}{g_{ik}}\frac{\partial g_{ik}}{\partial x_k} + \sum_{i=k+1}^{N}\frac{1}{g_{ki}}\frac{\partial g_{ki}}{\partial x_k}. $$

However, with our $\Psi_C$ written in a way which can be reused later as

$$ \Psi_C=\exp{\left\{\sum_{i<j}f(r_{ij})\right\}}, $$

the gradient needed for the quantum force and local energy is easy to compute. The function $f(r_{ij})$ will depend on the system under study.

In the Metropolis-Hastings algorithm, the acceptance ratio determines the probability for a particle to be accepted at a new position. The ratio of the trial wave functions evaluated at the new and current positions is given by ($OL$ for old and $NW$ for new)

$$ R \equiv \frac{\Psi_{T}^{NW}}{\Psi_{T}^{OL}} = \frac{\Psi_{D}^{NW}}{\Psi_{D}^{OL}}\frac{\Psi_{C}^{NW}}{\Psi_{C}^{OL}}. $$

Here $\Psi_{D}$ is the Slater determinant part and $\Psi_{C}$ the correlation part.

The expectation value of the kinetic energy expressed in atomic units for electron $i$ is

$$ \langle \hat{K}_i \rangle = -\frac{1}{2}\frac{\langle\Psi|\nabla_{i}^2|\Psi \rangle}{\langle\Psi|\Psi \rangle}. $$

The second derivative which enters the definition of the local energy is

$$ \frac{\nabla^2_i\Psi}{\Psi} = \frac{\nabla^2_i\Psi_D}{\Psi_D} + \frac{\nabla^2_i\Psi_C}{\Psi_C} + 2\frac{\nabla_i\Psi_D}{\Psi_D}\cdot\frac{\nabla_i\Psi_C}{\Psi_C}. $$

We discuss here how to calculate these quantities in an optimal way.
We have defined the correlated function as

$$ \Psi_C=\prod_{i<j}g(r_{ij})= \prod_{i=1}^N\prod_{j=i+1}^Ng(r_{ij}), $$

with $r_{ij}=|\mathbf{r}_i-\mathbf{r}_j|$ the relative distance between particles $i$ and $j$.

In our particular case we have

$$ \Psi_C=\prod_{i<j}g(r_{ij})=\exp{\left\{\sum_{i<j}f(r_{ij})\right\}}. $$

The total number of different relative distances $r_{ij}$ is $N(N-1)/2$. In a matrix storage format, the set of relative distances forms a strictly upper triangular matrix. This applies to $\mathbf{g} = \mathbf{g}(r_{ij})$ as well.
In our algorithm we will move one particle at the time, say the $k$-th particle. We have that the ratio between Jastrow factors $R_C$ is given by

$$ R_{C} = \frac{\Psi_{C}^\mathrm{new}}{\Psi_{C}^\mathrm{cur}} = \prod_{i=1}^{k-1}\frac{g_{ik}^\mathrm{new}}{g_{ik}^\mathrm{cur}} \prod_{i=k+1}^{N}\frac{g_{ki}^\mathrm{new}}{g_{ki}^\mathrm{cur}}. $$

For the Pade-Jastrow form

$$ R_{C} = \frac{\Psi_{C}^\mathrm{new}}{\Psi_{C}^\mathrm{cur}} = \frac{e^{U_{new}}}{e^{U_{cur}}} = e^{\Delta U}, $$

where

$$ \Delta U = \sum_{i=1}^{k-1}\big(f_{ik}^\mathrm{new} - f_{ik}^\mathrm{cur}\big) + \sum_{i=k+1}^{N}\big(f_{ki}^\mathrm{new} - f_{ki}^\mathrm{cur}\big). $$

One needs to develop a special algorithm that runs only through the elements of the upper triangular matrix $\mathbf{g}$ and has $k$ as an index.
The expression to be derived in the following is of interest when computing the quantum force and the kinetic energy. It has the form

$$ \frac{\nabla_i\Psi_C}{\Psi_C} = \frac{1}{\Psi_C}\frac{\partial \Psi_C}{\partial x_i}, $$

for all dimensions and with $i$ running over all particles. For the first derivative only $N-1$ terms survive the ratio because the $g$-terms that are not differentiated cancel with their corresponding ones in the denominator:

$$ \frac{1}{\Psi_C}\frac{\partial \Psi_C}{\partial x_k} = \sum_{i=1}^{k-1}\frac{1}{g_{ik}}\frac{\partial g_{ik}}{\partial x_k} + \sum_{i=k+1}^{N}\frac{1}{g_{ki}}\frac{\partial g_{ki}}{\partial x_k}. $$

An equivalent equation is obtained for the exponential form after replacing $g_{ij}$ by $\exp(f_{ij})$, yielding

$$ \frac{1}{\Psi_C}\frac{\partial \Psi_C}{\partial x_k} = \sum_{i=1}^{k-1}\frac{\partial f_{ik}}{\partial x_k} + \sum_{i=k+1}^{N}\frac{\partial f_{ki}}{\partial x_k}, $$

with both expressions scaling as $O(N)$.

Using the identity

$$ \frac{\partial}{\partial x_i}g_{ij} = -\frac{\partial}{\partial x_j}g_{ij}, $$

we get expressions where all the derivatives acting on the particle are represented by the second index of $g$:

$$ \frac{1}{\Psi_C}\frac{\partial \Psi_C}{\partial x_k} = \sum_{i=1}^{k-1}\frac{1}{g_{ik}}\frac{\partial g_{ik}}{\partial x_k} - \sum_{i=k+1}^{N}\frac{1}{g_{ki}}\frac{\partial g_{ki}}{\partial x_i}, $$

and for the exponential case:

$$ \frac{1}{\Psi_C}\frac{\partial \Psi_C}{\partial x_k} = \sum_{i=1}^{k-1}\frac{\partial f_{ik}}{\partial x_k} - \sum_{i=k+1}^{N}\frac{\partial f_{ki}}{\partial x_i}. $$

For correlation forms depending only on the scalar distances $r_{ij}$, we can use the chain rule. Noting that $\partial r_{ij}/\partial x_j = (x_j-x_i)/r_{ij}$, we arrive at

$$ \frac{1}{\Psi_C}\frac{\partial \Psi_C}{\partial x_k} = \sum_{i=1}^{k-1}\frac{1}{g_{ik}}\frac{\mathbf{r}_{ik}}{r_{ik}}\frac{\partial g_{ik}}{\partial r_{ik}} - \sum_{i=k+1}^{N}\frac{1}{g_{ki}}\frac{\mathbf{r}_{ki}}{r_{ki}}\frac{\partial g_{ki}}{\partial r_{ki}}. $$

Note that for the Pade-Jastrow form we can set $g_{ij} \equiv g(r_{ij}) = e^{f(r_{ij})} = e^{f_{ij}}$ and

$$ \frac{\partial g_{ij}}{\partial r_{ij}} = g_{ij}\frac{\partial f_{ij}}{\partial r_{ij}}. $$

Therefore,

$$ \frac{1}{\Psi_{C}}\frac{\partial \Psi_{C}}{\partial x_k} = \sum_{i=1}^{k-1}\frac{\mathbf{r}_{ik}}{r_{ik}}\frac{\partial f_{ik}}{\partial r_{ik}} - \sum_{i=k+1}^{N}\frac{\mathbf{r}_{ki}}{r_{ki}}\frac{\partial f_{ki}}{\partial r_{ki}}, $$

where

$$ \mathbf{r}_{ij} = (x_j-x_i)\mathbf{e}_1 + (y_j-y_i)\mathbf{e}_2 + (z_j-z_i)\mathbf{e}_3, $$

is the relative distance.
The second derivative of the Jastrow factor divided by the Jastrow factor (the way it enters the kinetic energy) is

$$ \frac{\nabla^2_k\Psi_C}{\Psi_C} = \sum_{ij\ne k}\frac{\mathbf{r}_{ki}\cdot\mathbf{r}_{kj}}{r_{ki}r_{kj}}f'(r_{ki})f'(r_{kj}) + \sum_{j\ne k}\left(f''(r_{kj})+\frac{2}{r_{kj}}f'(r_{kj})\right). $$

But we have a simple form for the function, namely

$$ f(r_{ij}) = \frac{ar_{ij}}{1+\beta r_{ij}}, $$

and it is easy to see that for particle $k$ we have

$$ f'(r_{kj}) = \frac{a}{(1+\beta r_{kj})^2} \qquad \mathrm{and} \qquad f''(r_{kj}) = -\frac{2a\beta}{(1+\beta r_{kj})^3}. $$
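These derivatives are easy to verify symbolically; a small check of $f'$ and $f''$ (our own test):

# Verify f'(r) and f''(r) for the Pade-Jastrow function f(r) = a*r/(1+beta*r)
from sympy import symbols, diff, simplify

r, a, beta = symbols('r a beta', positive=True)
f = a*r/(1 + beta*r)
print(simplify(diff(f, r) - a/(1 + beta*r)**2))          # should print 0
print(simplify(diff(f, r, 2) + 2*a*beta/(1 + beta*r)**3))  # should print 0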
5.6. Importance sampling, Fokker-Planck and Langevin equations
A stochastic process is simply a function of two variables, one is the time, the other is a stochastic variable $X$, defined by specifying

* the set $\{x\}$ of possible values for $X$;

* the probability distribution, $w_X(x)$, over this set, or briefly $w(x)$.

The set of values $\{x\}$ for $X$ may be discrete or continuous. If the set of values is continuous, then $w_X(x)$ is a probability density so that $w_X(x)dx$ is the probability that one finds the stochastic variable $X$ to have values in the range $[x, x+dx]$.

An arbitrary number of other stochastic variables may be derived from $X$. For example, any $Y$ given by a mapping of $X$ is also a stochastic variable. The mapping may also be time-dependent, that is, the mapping depends on an additional variable $t$, $Y_X(t) = f(X,t)$. The quantity $Y_X(t)$ is called a random function, or, since $t$ often is time, a stochastic process.
For many physical systems initial distributions of a stochastic variable $y$ tend to equilibrium distributions, $w(y,t)\rightarrow w_0(y)$ as $t\rightarrow\infty$. In equilibrium, detailed balance constrains the transition rates,

$$ W(y\rightarrow y')w_0(y) = W(y'\rightarrow y)w_0(y'), $$

where $W(y'\rightarrow y)$ is the probability, per unit time, that the system changes from a state $|y'\rangle$ to a state $|y\rangle$.

Note that for a system in equilibrium the transition rate $W(y'\rightarrow y)$ and the reverse $W(y\rightarrow y')$ may be very different.

Consider, for instance, a simple system that has only two energy levels, $\epsilon_0 = 0$ and $\epsilon_1 = \Delta E$. For a system governed by the Boltzmann distribution we find (the partition function has been taken out)

$$ W(0\rightarrow 1)\exp{(-\epsilon_0/kT)} = W(1\rightarrow 0)\exp{(-\epsilon_1/kT)}. $$

We get then

$$ \frac{W(0\rightarrow 1)}{W(1\rightarrow 0)} = \exp{(-\Delta E/kT)}, $$

which goes to zero when $T$ tends to zero.
If we assume a discrete set of events, our initial probability distribution function can be given by

$$ w_i(0) = \delta_{i,0}, $$

and its time-development after a given time step $\Delta t=\epsilon$ is

$$ w_i(t) = \sum_{j}W(j\rightarrow i)w_j(t=0). $$

The continuous analog to $w_i(0)$ is

$$ w(\mathbf{x})\rightarrow \delta(\mathbf{x}), $$

where we now have generalized the one-dimensional position $x$ to a generic-dimensional vector $\mathbf{x}$. The Kroenecker $\delta$ function is replaced by the $\delta$ distribution function $\delta(\mathbf{x})$ at $t=0$.

The transition from a state $j$ to a state $i$ is now replaced by a transition to a state with position $\mathbf{y}$ from a state with position $\mathbf{x}$. The discrete sum of transition probabilities can then be replaced by an integral and we obtain the new distribution at a time $t+\Delta t$ as

$$ w(\mathbf{y},t+\Delta t)= \int W(\mathbf{y},t+\Delta t|\mathbf{x},t)w(\mathbf{x},t)d\mathbf{x}, $$

and after $m$ time steps we have

$$ w(\mathbf{y},t+m\Delta t)= \int W(\mathbf{y},t+m\Delta t|\mathbf{x},t)w(\mathbf{x},t)d\mathbf{x}. $$

When equilibrium is reached we have

$$ w(\mathbf{y})= \int W(\mathbf{y}|\mathbf{x},t)w(\mathbf{x})d\mathbf{x}, $$

that is no time-dependence. Note our change of notation for $W$.

We can solve the equation for $w(\mathbf{y},t)$ by making a Fourier transform to momentum space. The PDF $w(\mathbf{x},t)$ is related to its Fourier transform $\tilde{w}(\mathbf{k},t)$ through

$$ w(\mathbf{x},t) = \int_{-\infty}^{\infty}d\mathbf{k}\,\exp{(i\mathbf{kx})}\tilde{w}(\mathbf{k},t), $$

and using the definition of the $\delta$-function

$$ \delta(\mathbf{x}) = \frac{1}{2\pi}\int_{-\infty}^{\infty}d\mathbf{k}\,\exp{(i\mathbf{kx})}, $$

we see that

$$ \tilde{w}(\mathbf{k},0)=1/2\pi. $$

We can then use the Fourier-transformed diffusion equation

$$ \frac{\partial \tilde{w}(\mathbf{k},t)}{\partial t} = -D\mathbf{k}^2\tilde{w}(\mathbf{k},t), $$

with the obvious solution

$$ \tilde{w}(\mathbf{k},t)=\tilde{w}(\mathbf{k},0)\exp{\left[-D\mathbf{k}^2t\right]}= \frac{1}{2\pi}\exp{\left[-D\mathbf{k}^2t\right]}. $$

With the Fourier transform we obtain

$$ w(\mathbf{x},t)=\int_{-\infty}^{\infty}d\mathbf{k}\,\exp{\left[i\mathbf{kx}\right]}\frac{1}{2\pi}\exp{\left[-D\mathbf{k}^2t\right]}= \frac{1}{\sqrt{4\pi Dt}}\exp{\left[-\mathbf{x}^2/4Dt\right]}, $$

with the normalization condition

$$ \int_{-\infty}^{\infty}w(\mathbf{x},t)d\mathbf{x}=1. $$

The solution represents the probability of finding our random walker at position $\mathbf{x}$ at time $t$ if the initial distribution was placed at $\mathbf{x}=0$ at $t=0$.
There is another interesting feature worth observing. The discrete transition probability $W$ itself is given by a diffusion equation. The solution after a time step $\Delta t$ is

$$ W(\mathbf{x}-\mathbf{x}_0,\Delta t)= \frac{1}{\sqrt{4\pi D\Delta t}}\exp{\left[-(\mathbf{x}-\mathbf{x}_0)^2/4D\Delta t\right]}, $$

and it satisfies the normalization condition and is itself a solution to the diffusion equation.

Let us now assume that we have three PDFs for times $t_0 < t' < t$, that is $w(\mathbf{x}_0,t_0)$, $w(\mathbf{x}',t')$ and $w(\mathbf{x},t)$. We have then

$$ w(\mathbf{x},t)= \int_{-\infty}^{\infty} W(\mathbf{x},t|\mathbf{x}',t')w(\mathbf{x}',t')d\mathbf{x}', $$

and

$$ w(\mathbf{x},t)= \int_{-\infty}^{\infty} W(\mathbf{x},t|\mathbf{x}_0,t_0)w(\mathbf{x}_0,t_0)d\mathbf{x}_0, $$

and

$$ w(\mathbf{x}',t')= \int_{-\infty}^{\infty} W(\mathbf{x}',t'|\mathbf{x}_0,t_0)w(\mathbf{x}_0,t_0)d\mathbf{x}_0. $$

We can combine these equations and arrive at the famous Einstein-Smoluchowski-Kolmogorov-Chapman (ESKC) relation

$$ W(\mathbf{x},t|\mathbf{x}_0,t_0) = \int_{-\infty}^{\infty} W(\mathbf{x},t|\mathbf{x}',t')W(\mathbf{x}',t'|\mathbf{x}_0,t_0)d\mathbf{x}'. $$

We can replace the spatial dependence with a dependence upon say the velocity (or momentum), that is we have

$$ W(\mathbf{v},t|\mathbf{v}_0,t_0) = \int_{-\infty}^{\infty} W(\mathbf{v},t|\mathbf{v}',t')W(\mathbf{v}',t'|\mathbf{v}_0,t_0)d\mathbf{v}'. $$
We will now derive the Fokker-Planck equation. We start from the ESKC equation

$$ W(\mathbf{x},t|\mathbf{x}_0,t_0) = \int_{-\infty}^{\infty} W(\mathbf{x},t|\mathbf{x}',t')W(\mathbf{x}',t'|\mathbf{x}_0,t_0)d\mathbf{x}'. $$

Define $s=t'-t_0$, $\tau=t-t'$ and $t-t_0=s+\tau$. We have then

$$ W(\mathbf{x},s+\tau|\mathbf{x}_0) = \int_{-\infty}^{\infty} W(\mathbf{x},\tau|\mathbf{x}')W(\mathbf{x}',s|\mathbf{x}_0)d\mathbf{x}'. $$

Assume now that $\tau$ is very small so that we can make an expansion in terms of a small step $\xi$, with $\mathbf{x}'=\mathbf{x}-\xi$, that is

$$ W(\mathbf{x},s|\mathbf{x}_0)+\frac{\partial W}{\partial s}\tau +O(\tau^2) = \int_{-\infty}^{\infty} W(\mathbf{x},\tau|\mathbf{x}-\xi)W(\mathbf{x}-\xi,s|\mathbf{x}_0)d\xi. $$

We assume that $W(\mathbf{x},\tau|\mathbf{x}-\xi)$ takes non-negligible values only when $\xi$ is small.

We say thus that $\mathbf{x}$ changes only by a small amount in the time interval $\tau$. This means that we can make a Taylor expansion in terms of $\xi$, that is we expand

$$ W(\mathbf{x},\tau|\mathbf{x}-\xi)W(\mathbf{x}-\xi,s|\mathbf{x}_0) = \sum_{n=0}^{\infty}\frac{(-\xi)^n}{n!}\frac{\partial^n}{\partial x^n}\left[W(\mathbf{x}+\xi,\tau|\mathbf{x})W(\mathbf{x},s|\mathbf{x}_0)\right]. $$

We can then rewrite the ESKC equation as

$$ \frac{\partial W}{\partial s}\tau = -W(\mathbf{x},s|\mathbf{x}_0)+\sum_{n=0}^{\infty}\frac{(-1)^n}{n!}\frac{\partial^n}{\partial x^n}\left[W(\mathbf{x},s|\mathbf{x}_0)\int_{-\infty}^{\infty} \xi^nW(\mathbf{x}+\xi,\tau|\mathbf{x})d\xi\right]. $$

We have neglected higher powers of $\tau$ and have used that for $n=0$ we get simply $W(\mathbf{x},s|\mathbf{x}_0)$ due to normalization.

We simplify the above by introducing the moments

$$ M_n=\frac{1}{\tau}\int_{-\infty}^{\infty} \xi^nW(\mathbf{x}+\xi,\tau|\mathbf{x})d\xi= \frac{\langle [\Delta x(\tau)]^n\rangle}{\tau}, $$

resulting in

$$ \frac{\partial W(\mathbf{x},s|\mathbf{x}_0)}{\partial s} = \sum_{n=1}^{\infty}\frac{(-1)^n}{n!}\frac{\partial^n}{\partial x^n}\left[M_nW(\mathbf{x},s|\mathbf{x}_0)\right]. $$

When $\tau \rightarrow 0$ we assume that $\langle [\Delta x(\tau)]^n\rangle \rightarrow 0$ more rapidly than $\tau$ itself if $n > 2$. When $\tau$ is much larger than the standard correlation time of the system, $M_n$ for $n > 2$ can normally be neglected. This means that fluctuations become negligible at large time scales.

If we neglect such terms we can rewrite the ESKC equation as

$$ \frac{\partial W(\mathbf{x},s|\mathbf{x}_0)}{\partial s} = -\frac{\partial M_1W(\mathbf{x},s|\mathbf{x}_0)}{\partial x}+ \frac{1}{2}\frac{\partial^2 M_2W(\mathbf{x},s|\mathbf{x}_0)}{\partial x^2}. $$

In a more compact form we have

$$ \frac{\partial W}{\partial s} = -\frac{\partial M_1W}{\partial x}+ \frac{1}{2}\frac{\partial^2 M_2W}{\partial x^2}, $$

which is the Fokker-Planck equation! It is trivial to replace position with velocity (momentum).
Consider a particle suspended in a liquid. On its path through the liquid it will continuously collide with the liquid molecules. Because on average the particle will collide more often on the front side than on the back side, it will experience a systematic force proportional with its velocity, and directed opposite to its velocity. Besides this systematic force the particle will experience a stochastic force $\mathbf{F}(t)$. The equations of motion are

* $\frac{d\mathbf{r}}{dt}=\mathbf{v}$, and

* $\frac{d\mathbf{v}}{dt}=-\xi\mathbf{v}+\mathbf{F}$.

From hydrodynamics we know that the friction constant $\xi$ is given by

$$ \xi =6\pi\eta a/m, $$

where $\eta$ is the viscosity of the solvent and $a$ is the radius of the particle.

Solving the second equation in the previous slide we get

$$ \mathbf{v}(t)=\mathbf{v}_{0}e^{-\xi t}+\int_{0}^{t}d\tau\, e^{-\xi(t-\tau)}\mathbf{F}(\tau). $$

If we want to get some useful information out of this, we have to average over all possible realizations of $\mathbf{F}(t)$, with the initial velocity as a condition. A useful quantity is then

$$ \langle \mathbf{v}(t)\cdot\mathbf{v}(t)\rangle_{\mathbf{v}_{0}} = v_{0}^{2}e^{-2\xi t} +2\int_{0}^{t}d\tau\, e^{-\xi(2t-\tau)}\mathbf{v}_{0}\cdot\langle\mathbf{F}(\tau)\rangle_{\mathbf{v}_{0}} +\int_{0}^{t}d\tau^{\prime}\int_{0}^{t}d\tau\, e^{-\xi(2t-\tau-\tau^{\prime})}\langle\mathbf{F}(\tau)\cdot\mathbf{F}(\tau^{\prime})\rangle_{\mathbf{v}_{0}}. $$

In order to continue we have to make some assumptions about the conditional averages of the stochastic forces. In view of the chaotic character of the stochastic forces the following assumptions seem to be appropriate:

$$ \langle\mathbf{F}(t)\rangle=0, $$

and

$$ \langle\mathbf{F}(t)\cdot\mathbf{F}(t^{\prime})\rangle_{\mathbf{v}_{0}}= C_{\mathbf{v}_{0}}\delta(t-t^{\prime}). $$

We omit the subscript $\mathbf{v}_{0}$ when the quantity of interest turns out to be independent of $\mathbf{v}_{0}$. Using the last three equations we get

$$ \langle\mathbf{v}(t)\cdot\mathbf{v}(t)\rangle_{\mathbf{v}_{0}}=v_{0}^{2}e^{-2\xi t}+\frac{C_{\mathbf{v}_{0}}}{2\xi}(1-e^{-2\xi t}). $$

For large $t$ this should be equal to $3kT/m$, from which it follows that

$$ \langle\mathbf{F}(t)\cdot\mathbf{F}(t^{\prime})\rangle = 6\frac{kT}{m}\xi\delta(t-t^{\prime}). $$

This result is called the fluctuation-dissipation theorem.

Integrating

$$ \mathbf{v}(t)=\mathbf{v}_{0}e^{-\xi t}+\int_{0}^{t}d\tau\, e^{-\xi(t-\tau)}\mathbf{F}(\tau), $$

we get

$$ \mathbf{r}(t)=\mathbf{r}_{0}+\mathbf{v}_{0}\frac{1}{\xi}(1-e^{-\xi t})+ \int_0^td\tau\int_0^{\tau}d\tau^{\prime}\, e^{-\xi(\tau-\tau^{\prime})}\mathbf{F}(\tau^{\prime}), $$

from which we calculate the mean square displacement

$$ \langle(\mathbf{r}(t)-\mathbf{r}_{0})^{2}\rangle_{\mathbf{v}_{0}}=\frac{v_0^2}{\xi^2}(1-e^{-\xi t})^{2}+\frac{3kT}{m\xi^{2}}(2\xi t-3+4e^{-\xi t}-e^{-2\xi t}). $$

For very large $t$ this becomes

$$ \langle(\mathbf{r}(t)-\mathbf{r}_{0})^{2}\rangle =\frac{6kT}{m\xi}t, $$

from which we get the Einstein relation

$$ D=\frac{kT}{m\xi}, $$

where we have used $\langle(\mathbf{r}(t)-\mathbf{r}_{0})^{2}\rangle = 6Dt$.
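As a small numerical illustration (our own sketch in reduced units; all names and values are arbitrary), we can integrate the Langevin equation for an ensemble of particles and verify that the mean square displacement approaches $6Dt$ with $D=kT/m\xi$:

# Numerical check of the Einstein relation D = kT/(m*xi) via the Langevin equation
import numpy as np

rng = np.random.default_rng()
n_walkers, n_steps, dt = 5000, 5000, 0.01
xi, kT_over_m = 1.0, 1.0          # friction and kT/m in reduced units
D = kT_over_m/xi                  # Einstein relation to be verified

r = np.zeros((n_walkers, 3))
v = np.zeros((n_walkers, 3))
# fluctuation-dissipation: <F(t).F(t')> = 6*(kT/m)*xi*delta(t-t'), i.e. per
# component the integrated noise over a step dt has variance 2*(kT/m)*xi*dt
sigma = np.sqrt(2.0*kT_over_m*xi*dt)
for _ in range(n_steps):
    F = sigma*rng.standard_normal((n_walkers, 3))
    v += -xi*v*dt + F
    r += v*dt
msd = (r**2).sum(axis=1).mean()
print('measured MSD:', msd, ' expected ~ 6*D*t for t >> 1/xi:', 6*D*n_steps*dt)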
5.7. Code example for two electrons in a quantum dot
# 2-electron VMC code for 2dim quantum dot with importance sampling
# Using gaussian rng for new positions and Metropolis- Hastings
# No energy minimization
from math import exp, sqrt
from random import random, seed, normalvariate
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm
from matplotlib.ticker import LinearLocator, FormatStrFormatter
import sys
from numba import jit,njit
#Read name of output file from command line
if len(sys.argv) == 2:
    outfilename = sys.argv[1]
else:
    print('\nError: Name of output file must be given as command line argument.\n')
    sys.exit(1)
outfile = open(outfilename,'w')
# Trial wave function for the 2-electron quantum dot in two dims
def WaveFunction(r,alpha,beta):
r1 = r[0,0]**2 + r[0,1]**2
r2 = r[1,0]**2 + r[1,1]**2
r12 = sqrt((r[0,0]-r[1,0])**2 + (r[0,1]-r[1,1])**2)
deno = r12/(1+beta*r12)
return exp(-0.5*alpha*(r1+r2)+deno)
# Local energy for the 2-electron quantum dot in two dims, using analytical local energy
def LocalEnergy(r,alpha,beta):
r1 = (r[0,0]**2 + r[0,1]**2)
r2 = (r[1,0]**2 + r[1,1]**2)
r12 = sqrt((r[0,0]-r[1,0])**2 + (r[0,1]-r[1,1])**2)
deno = 1.0/(1+beta*r12)
deno2 = deno*deno
return 0.5*(1-alpha*alpha)*(r1 + r2) +2.0*alpha + 1.0/r12+deno2*(alpha*r12-deno2+2*beta*deno-1.0/r12)
# Setting up the quantum force for the two-electron quantum dot, recall that it is a vector
def QuantumForce(r,alpha,beta):
qforce = np.zeros((NumberParticles,Dimension), np.double)
r12 = sqrt((r[0,0]-r[1,0])**2 + (r[0,1]-r[1,1])**2)
deno = 1.0/(1+beta*r12)
    # F = 2 grad(psi)/psi: harmonic oscillator part plus the Jastrow contribution
    qforce[0,:] = -2*r[0,:]*alpha + 2*(r[0,:]-r[1,:])*deno*deno/r12
    qforce[1,:] = -2*r[1,:]*alpha + 2*(r[1,:]-r[0,:])*deno*deno/r12
return qforce
# The Monte Carlo sampling with the Metropolis algo
# The jit decorator tells Numba to compile this function; object mode is
# forced here since the function writes to a Python file object.
@jit(forceobj=True)
def MonteCarloSampling():
NumberMCcycles= 100000
# Parameters in the Fokker-Planck simulation of the quantum force
D = 0.5
TimeStep = 0.05
# positions
PositionOld = np.zeros((NumberParticles,Dimension), np.double)
PositionNew = np.zeros((NumberParticles,Dimension), np.double)
# Quantum force
QuantumForceOld = np.zeros((NumberParticles,Dimension), np.double)
QuantumForceNew = np.zeros((NumberParticles,Dimension), np.double)
# seed for rng generator
seed()
# start variational parameter loops, two parameters here
alpha = 0.9
for ia in range(MaxVariations):
alpha += .025
AlphaValues[ia] = alpha
beta = 0.2
for jb in range(MaxVariations):
beta += .01
BetaValues[jb] = beta
energy = energy2 = 0.0
DeltaE = 0.0
#Initial position
for i in range(NumberParticles):
for j in range(Dimension):
PositionOld[i,j] = normalvariate(0.0,1.0)*sqrt(TimeStep)
wfold = WaveFunction(PositionOld,alpha,beta)
QuantumForceOld = QuantumForce(PositionOld,alpha, beta)
#Loop over MC MCcycles
for MCcycle in range(NumberMCcycles):
#Trial position moving one particle at the time
for i in range(NumberParticles):
for j in range(Dimension):
PositionNew[i,j] = PositionOld[i,j]+normalvariate(0.0,1.0)*sqrt(TimeStep)+\
QuantumForceOld[i,j]*TimeStep*D
wfnew = WaveFunction(PositionNew,alpha,beta)
QuantumForceNew = QuantumForce(PositionNew,alpha, beta)
GreensFunction = 0.0
for j in range(Dimension):
GreensFunction += 0.5*(QuantumForceOld[i,j]+QuantumForceNew[i,j])*\
(D*TimeStep*0.5*(QuantumForceOld[i,j]-QuantumForceNew[i,j])-\
PositionNew[i,j]+PositionOld[i,j])
GreensFunction = exp(GreensFunction)
ProbabilityRatio = GreensFunction*wfnew**2/wfold**2
#Metropolis-Hastings test to see whether we accept the move
if random() <= ProbabilityRatio:
for j in range(Dimension):
PositionOld[i,j] = PositionNew[i,j]
QuantumForceOld[i,j] = QuantumForceNew[i,j]
wfold = wfnew
DeltaE = LocalEnergy(PositionOld,alpha,beta)
energy += DeltaE
energy2 += DeltaE**2
# We calculate mean, variance and error (no blocking applied)
energy /= NumberMCcycles
energy2 /= NumberMCcycles
variance = energy2 - energy**2
error = sqrt(variance/NumberMCcycles)
Energies[ia,jb] = energy
outfile.write('%f %f %f %f %f\n' %(alpha,beta,energy,variance,error))
return Energies, AlphaValues, BetaValues
#Here starts the main program with variable declarations
NumberParticles = 2
Dimension = 2
MaxVariations = 10
Energies = np.zeros((MaxVariations,MaxVariations))
AlphaValues = np.zeros(MaxVariations)
BetaValues = np.zeros(MaxVariations)
(Energies, AlphaValues, BetaValues) = MonteCarloSampling()
outfile.close()
# Prepare for plots
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
# Plot the surface.
X, Y = np.meshgrid(AlphaValues, BetaValues)
surf = ax.plot_surface(X, Y, Energies,cmap=cm.coolwarm,linewidth=0, antialiased=False)
# Customize the z axis.
zmin = Energies.min()
zmax = Energies.max()
ax.set_zlim(zmin, zmax)
ax.set_xlabel(r'$\alpha$')
ax.set_ylabel(r'$\beta$')
ax.set_zlabel(r'$\langle E \rangle$')
ax.zaxis.set_major_locator(LinearLocator(10))
ax.zaxis.set_major_formatter(FormatStrFormatter('%.02f'))
# Add a color bar which maps values to colors.
fig.colorbar(surf, shrink=0.5, aspect=5)
plt.show()
5.7.1. Bringing in the gradient optimization
The simple one-particle case in a harmonic oscillator trap
# Gradient descent stepping with analytical derivative
import numpy as np
from scipy.optimize import minimize
def DerivativeE(x):
return x-1.0/(4*x*x*x);
def Energy(x):
return x*x*0.5+1.0/(8*x*x);
x0 = 1.0
eta = 0.1
Niterations = 100
for iter in range(Niterations):
gradients = DerivativeE(x0)
x0 -= eta*gradients
print(x0)
# 2-electron VMC code for 2dim quantum dot with importance sampling
# Using gaussian rng for new positions and Metropolis- Hastings
from math import exp, sqrt
from random import random, seed, normalvariate
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm
from matplotlib.ticker import LinearLocator, FormatStrFormatter
import sys
from numba import jit
# Trial wave function for the 2-electron quantum dot in two dims
def WaveFunction(r,alpha):
r1 = r[0,0]**2 + r[0,1]**2
r2 = r[1,0]**2 + r[1,1]**2
return exp(-0.5*alpha*(r1+r2))
# Local energy for the 2-electron quantum dot in two dims, using analytical local energy
def LocalEnergy(r,alpha):
r1 = (r[0,0]**2 + r[0,1]**2)
r2 = (r[1,0]**2 + r[1,1]**2)
return 0.5*(1-alpha*alpha)*(r1 + r2) +2.0*alpha
# Derivate of wave function ansatz as function of variational parameters
def DerivativeWFansatz(r,alpha):
r1 = (r[0,0]**2 + r[0,1]**2)
r2 = (r[1,0]**2 + r[1,1]**2)
WfDer = -(r1+r2)
return WfDer
# Setting up the quantum force for the two-electron quantum dot, recall that it is a vector
def QuantumForce(r,alpha):
qforce = np.zeros((NumberParticles,Dimension), np.double)
qforce[0,:] = -2*r[0,:]*alpha
qforce[1,:] = -2*r[1,:]*alpha
return qforce
# Computing the derivative of the energy and the energy
# The jit decorator tells Numba to compile this function; object mode is
# forced here since the function calls plain Python helpers and globals.
@jit(forceobj=True)
def EnergyMinimization(alpha):
NumberMCcycles= 1000
# Parameters in the Fokker-Planck simulation of the quantum force
D = 0.5
TimeStep = 0.05
# positions
PositionOld = np.zeros((NumberParticles,Dimension), np.double)
PositionNew = np.zeros((NumberParticles,Dimension), np.double)
# Quantum force
QuantumForceOld = np.zeros((NumberParticles,Dimension), np.double)
QuantumForceNew = np.zeros((NumberParticles,Dimension), np.double)
# seed for rng generator
seed()
energy = 0.0
DeltaE = 0.0
EnergyDer = 0.0
DeltaPsi = 0.0
DerivativePsiE = 0.0
#Initial position
for i in range(NumberParticles):
for j in range(Dimension):
PositionOld[i,j] = normalvariate(0.0,1.0)*sqrt(TimeStep)
wfold = WaveFunction(PositionOld,alpha)
QuantumForceOld = QuantumForce(PositionOld,alpha)
#Loop over MC MCcycles
for MCcycle in range(NumberMCcycles):
#Trial position moving one particle at the time
for i in range(NumberParticles):
for j in range(Dimension):
PositionNew[i,j] = PositionOld[i,j]+normalvariate(0.0,1.0)*sqrt(TimeStep)+\
QuantumForceOld[i,j]*TimeStep*D
wfnew = WaveFunction(PositionNew,alpha)
QuantumForceNew = QuantumForce(PositionNew,alpha)
GreensFunction = 0.0
for j in range(Dimension):
GreensFunction += 0.5*(QuantumForceOld[i,j]+QuantumForceNew[i,j])*\
(D*TimeStep*0.5*(QuantumForceOld[i,j]-QuantumForceNew[i,j])-\
PositionNew[i,j]+PositionOld[i,j])
GreensFunction = exp(GreensFunction)
ProbabilityRatio = GreensFunction*wfnew**2/wfold**2
#Metropolis-Hastings test to see whether we accept the move
if random() <= ProbabilityRatio:
for j in range(Dimension):
PositionOld[i,j] = PositionNew[i,j]
QuantumForceOld[i,j] = QuantumForceNew[i,j]
wfold = wfnew
        DeltaE = LocalEnergy(PositionOld,alpha)
        DerPsi = DerivativeWFansatz(PositionOld,alpha)
        # accumulate both <E_L> and the terms needed for the energy derivative
        DeltaPsi += DerPsi
        energy += DeltaE
        DerivativePsiE += DerPsi*DeltaE
# We calculate mean, variance and error (no blocking applied)
energy /= NumberMCcycles
DerivativePsiE /= NumberMCcycles
DeltaPsi /= NumberMCcycles
EnergyDer = 2*(DerivativePsiE-DeltaPsi*energy)
return energy, EnergyDer
#Here starts the main program with variable declarations
NumberParticles = 2
Dimension = 2
# guess for variational parameters
x0 = 1.5
# Set up iteration using stochastic gradient method
Energy =0 ; EnergyDer = 0
Energy, EnergyDer = EnergyMinimization(x0)
print(Energy, EnergyDer)
eta = 0.01
Niterations = 100
for iter in range(Niterations):
gradients = EnergyDer
x0 -= eta*gradients
Energy, EnergyDer = EnergyMinimization(x0)
print(x0)
5.8. VMC for fermions: Efficient calculation of Slater determinants
The potentially most time-consuming part is the evaluation of the gradient and the Laplacian of an $N$-particle Slater determinant. We have to differentiate the determinant with respect to all spatial coordinates of all particles. A brute force differentiation would involve $N\cdot d$ evaluations of the entire determinant, which would even worsen the already undesirable time scaling, giving $O(d\cdot N\cdot N^3)$ operations (each determinant evaluation costs $O(N^3)$).

This poses serious hindrances to the overall efficiency of our code. The efficiency can be improved however if we move only one electron at the time.

The Slater determinant matrix $\hat{D}$ is defined by the matrix elements

$$ d_{ij}=\phi_j(x_i), $$

where $\phi_j(\mathbf{r}_i)$ is a given single particle wave function and $x_i$ denotes the position of particle $i$.
What we need to realize is that when differentiating a Slater determinant with respect to some given coordinate, only one row of the corresponding Slater matrix is changed.
Therefore, by recalculating the whole determinant we risk producing redundant information. The solution turns out to be an algorithm that requires to keep track of the inverse of the Slater matrix.
Let the current position in phase space be represented by the $(N\cdot d)$-element vector $\mathbf{r}^{\mathrm{old}}$ and the new suggested position by the vector $\mathbf{r}^{\mathrm{new}}$. The inverse of $\hat{D}$ can be expressed in terms of its cofactors $C_{ij}$ and its determinant $\vert\hat{D}\vert$:

$$ d_{ij}^{-1} = \frac{C_{ji}}{\vert\hat{D}\vert}. \qquad (12) $$

Notice that the interchanged indices indicate that the matrix of cofactors is to be transposed.

If $\hat{D}$ is invertible, then we must obviously have $\hat{D}^{-1}\hat{D}=\mathbf{1}$, or explicitly in terms of the individual elements of $\hat{D}$ and $\hat{D}^{-1}$:

$$ \sum_{k=1}^N d_{ik}\,d_{kj}^{-1} = \delta_{ij}. \qquad (13) $$

Consider the ratio, which we shall call $R$, between $\vert\hat{D}(\mathbf{r}^{\mathrm{new}})\vert$ and $\vert\hat{D}(\mathbf{r}^{\mathrm{old}})\vert$. By definition, each of these determinants can individually be expressed in terms of the $i$-th row of its cofactor matrix:

$$ R\equiv\frac{\vert\hat{D}(\mathbf{r}^{\mathrm{new}})\vert}{\vert\hat{D}(\mathbf{r}^{\mathrm{old}})\vert} = \frac{\sum_{j=1}^N d_{ij}(\mathbf{r}^{\mathrm{new}})\,C_{ij}(\mathbf{r}^{\mathrm{new}})}{\sum_{j=1}^N d_{ij}(\mathbf{r}^{\mathrm{old}})\,C_{ij}(\mathbf{r}^{\mathrm{old}})}. \qquad (14) $$

Suppose now that we move only one particle at a time, meaning that $\mathbf{r}^{\mathrm{new}}$ differs from $\mathbf{r}^{\mathrm{old}}$ by the position of only one, say the $i$-th, particle. Then $\hat{D}(\mathbf{r}^{\mathrm{new}})$ and $\hat{D}(\mathbf{r}^{\mathrm{old}})$ differ only by the entries of the $i$-th row, and since the $i$-th row of a cofactor matrix is independent of the entries of the $i$-th row of its corresponding matrix,

$$ C_{ij}(\mathbf{r}^{\mathrm{new}}) = C_{ij}(\mathbf{r}^{\mathrm{old}})\quad \forall\, j. $$

Inserting this into the numerator of eq. (14) and using eq. (12) to substitute the cofactors with the elements of the inverse matrix, we get:

$$ R =\frac{\sum_{j=1}^N d_{ij}(\mathbf{r}^{\mathrm{new}})\,d_{ji}^{-1}(\mathbf{r}^{\mathrm{old}})}{\sum_{j=1}^N d_{ij}(\mathbf{r}^{\mathrm{old}})\,d_{ji}^{-1}(\mathbf{r}^{\mathrm{old}})}. $$

Now by eq. (13) the denominator of the rightmost expression must be unity, so that we finally arrive at:

$$ R =\sum_{j=1}^N d_{ij}(\mathbf{r}^{\mathrm{new}})\,d_{ji}^{-1}(\mathbf{r}^{\mathrm{old}}) = \sum_{j=1}^N \phi_j(\mathbf{r}_i^{\mathrm{new}})\,d_{ji}^{-1}(\mathbf{r}^{\mathrm{old}}). $$

What this means is that in order to get the ratio when only the $i$-th particle has been moved, we only need to calculate the dot product of the vector $\left(\phi_1(\mathbf{r}_i^{\mathrm{new}}),\dots,\phi_N(\mathbf{r}_i^{\mathrm{new}})\right)$ of single particle wave functions evaluated at the new position with the $i$-th column of the inverse matrix $\hat{D}^{-1}$ evaluated at the original position, an operation with time scaling $O(N)$.

If the new position $\mathbf{r}^{\mathrm{new}}$ is accepted, then the inverse matrix can be suitably updated by an algorithm having a time scaling of $O(N^2)$. First we update all but the $i$-th column of $\hat{D}^{-1}$. For each column $j\neq i$, we calculate the quantity

$$ S_j = \big(\hat{D}(\mathbf{r}^{\mathrm{new}})\times\hat{D}^{-1}(\mathbf{r}^{\mathrm{old}})\big)_{ij} = \sum_{l=1}^N d_{il}(\mathbf{r}^{\mathrm{new}})\,d^{-1}_{lj}(\mathbf{r}^{\mathrm{old}}). $$

The new elements of the $j$-th column of $\hat{D}^{-1}$ are then given by

$$ d_{kj}^{-1}(\mathbf{r}^{\mathrm{new}}) = d_{kj}^{-1}(\mathbf{r}^{\mathrm{old}}) - \frac{S_j}{R}\,d_{ki}^{-1}(\mathbf{r}^{\mathrm{old}}), \quad j\neq i. $$

Finally the elements of the $i$-th column of $\hat{D}^{-1}$ are updated simply as follows:

$$ d_{ki}^{-1}(\mathbf{r}^{\mathrm{new}}) = \frac{1}{R}\,d_{ki}^{-1}(\mathbf{r}^{\mathrm{old}}). $$

We see from these formulas that the time scaling of an update of $\hat{D}^{-1}$ after changing one row of $\hat{D}$ is $O(N^2)$.
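The ratio formula and the update algorithm are easy to verify numerically; the following small numpy sketch (our own illustration, with a random matrix standing in for the Slater matrix) checks both against direct computation:

# Sketch: verify the determinant ratio and the O(N^2) inverse update for a
# single-row change of a random matrix D standing in for the Slater matrix
import numpy as np

rng = np.random.default_rng(42)
N, i = 5, 2                        # matrix size and index of the moved particle
D_old = rng.standard_normal((N, N))
Dinv_old = np.linalg.inv(D_old)

# move "particle i": its row of single-particle wave functions changes
new_row = rng.standard_normal(N)
D_new = D_old.copy()
D_new[i, :] = new_row

# O(N) ratio: dot product of the new row with the i-th column of the old inverse
R = new_row @ Dinv_old[:, i]
print(np.isclose(R, np.linalg.det(D_new)/np.linalg.det(D_old)))   # True

# O(N^2) update of the inverse
Dinv_new = Dinv_old.copy()
for j in range(N):
    if j != i:
        S_j = new_row @ Dinv_old[:, j]
        Dinv_new[:, j] = Dinv_old[:, j] - (S_j/R)*Dinv_old[:, i]
Dinv_new[:, i] = Dinv_old[:, i]/R
print(np.allclose(Dinv_new, np.linalg.inv(D_new)))                # True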
The scheme is also applicable for the calculation of the ratios involving derivatives. It turns out that differentiating the Slater determinant with respect to the coordinates of a single particle $\mathbf{r}_i$ changes only the entries of the $i$-th row of the corresponding Slater matrix.
5.8.1. The gradient and the Laplacian
The gradient and the Laplacian can therefore be calculated as follows:

$$ \frac{\vec\nabla_i\vert\hat{D}(\mathbf{r})\vert}{\vert\hat{D}(\mathbf{r})\vert} = \sum_{j=1}^N \vec\nabla_i d_{ij}(\mathbf{r})\,d_{ji}^{-1}(\mathbf{r}) = \sum_{j=1}^N \vec\nabla_i\phi_j(\mathbf{r}_i)\,d_{ji}^{-1}(\mathbf{r}), $$

and

$$ \frac{\nabla^2_i\vert\hat{D}(\mathbf{r})\vert}{\vert\hat{D}(\mathbf{r})\vert} = \sum_{j=1}^N \nabla^2_i d_{ij}(\mathbf{r})\,d_{ji}^{-1}(\mathbf{r}) = \sum_{j=1}^N \nabla^2_i\phi_j(\mathbf{r}_i)\,d_{ji}^{-1}(\mathbf{r}). $$

Thus, to calculate all the derivatives of the Slater determinant, we only need the derivatives of the single particle wave functions ($\vec\nabla_i\phi_j(\mathbf{r}_i)$ and $\nabla^2_i\phi_j(\mathbf{r}_i)$) and the elements of the corresponding inverse Slater matrix ($\hat{D}^{-1}$).
Important note: In most cases you end with closed form expressions for the single-particle wave functions. It is then useful to calculate the various derivatives and make separate functions for them.
For beryllium the Slater determinant takes the form

$$ \Phi(\mathbf{r}_1,\mathbf{r}_2,\mathbf{r}_3,\mathbf{r}_4,\alpha,\beta,\gamma,\delta)= \frac{1}{\sqrt{4!}}\left|\begin{array}{cccc} \psi_{100\uparrow}(\mathbf{r}_1)& \psi_{100\uparrow}(\mathbf{r}_2)& \psi_{100\uparrow}(\mathbf{r}_3)&\psi_{100\uparrow}(\mathbf{r}_4) \\ \psi_{100\downarrow}(\mathbf{r}_1)& \psi_{100\downarrow}(\mathbf{r}_2)& \psi_{100\downarrow}(\mathbf{r}_3)&\psi_{100\downarrow}(\mathbf{r}_4) \\ \psi_{200\uparrow}(\mathbf{r}_1)& \psi_{200\uparrow}(\mathbf{r}_2)& \psi_{200\uparrow}(\mathbf{r}_3)&\psi_{200\uparrow}(\mathbf{r}_4) \\ \psi_{200\downarrow}(\mathbf{r}_1)& \psi_{200\downarrow}(\mathbf{r}_2)& \psi_{200\downarrow}(\mathbf{r}_3)&\psi_{200\downarrow}(\mathbf{r}_4) \end{array}\right|. $$

The Slater determinant as written is zero since the spatial wave functions for the spin up and spin down states are equal. But we can rewrite it as the product of two Slater determinants, one for spin up and one for spin down. We can rewrite it as

$$ \Phi(\mathbf{r}_1,\mathbf{r}_2,\mathbf{r}_3,\mathbf{r}_4,\alpha,\beta,\gamma,\delta)= \det\uparrow(1,2)\det\downarrow(3,4)-\det\uparrow(1,3)\det\downarrow(2,4) -\det\uparrow(1,4)\det\downarrow(3,2)+\det\uparrow(2,3)\det\downarrow(1,4) -\det\uparrow(2,4)\det\downarrow(1,3)+\det\uparrow(3,4)\det\downarrow(1,2), $$

where we have defined

$$ \det\uparrow(1,2)=\frac{1}{\sqrt{2}}\left|\begin{array}{cc} \psi_{100\uparrow}(\mathbf{r}_1)& \psi_{100\uparrow}(\mathbf{r}_2)\\ \psi_{200\uparrow}(\mathbf{r}_1)& \psi_{200\uparrow}(\mathbf{r}_2) \end{array}\right|, $$

and

$$ \det\downarrow(3,4)=\frac{1}{\sqrt{2}}\left|\begin{array}{cc} \psi_{100\downarrow}(\mathbf{r}_3)& \psi_{100\downarrow}(\mathbf{r}_4)\\ \psi_{200\downarrow}(\mathbf{r}_3)& \psi_{200\downarrow}(\mathbf{r}_4) \end{array}\right|. $$

We want to avoid summing over spin variables, in particular when the interaction does not depend on spin.

It can be shown, see for example Moskowitz and Kalos, Int. J. Quantum Chem. 20, 1107 (1981), that for the variational energy we can approximate the Slater determinant as

$$ \Phi(\mathbf{r}_1,\mathbf{r}_2,\mathbf{r}_3,\mathbf{r}_4,\alpha,\beta,\gamma,\delta) \propto \det\uparrow(1,2)\det\downarrow(3,4), $$

or more generally as

$$ \Phi(\mathbf{r}_1,\mathbf{r}_2,\dots,\mathbf{r}_N) \propto \det\uparrow\det\downarrow, $$
where we have the Slater determinant as the product of a spin up part involving the number of electrons with spin up only (2 for beryllium and 5 for neon) and a spin down part involving the electrons with spin down.
This ansatz is not antisymmetric under the exchange of electrons with opposite spins but it can be shown (show this) that it gives the same expectation value for the energy as the full Slater determinant.
As long as the Hamiltonian is spin independent, the above is correct. It is rather straightforward to see this if you go back to the equations for the energy discussed earlier this semester.
We will thus factorize the full determinant $\vert\hat{D}\vert$ into two smaller ones, where each can be identified with $\uparrow$ and $\downarrow$ respectively:

$$ \vert\hat{D}\vert = \vert\hat{D}\vert_\uparrow\cdot\vert\hat{D}\vert_\downarrow. $$

The combined dimensionality of the two smaller determinants equals the dimensionality of the full determinant. Such a factorization is advantageous in that it makes it possible to perform the calculation of the ratio $R$ and the updating of the inverse matrix separately for $\vert\hat{D}\vert_\uparrow$ and $\vert\hat{D}\vert_\downarrow$:

$$ \frac{\vert\hat{D}\vert^\mathrm{new}}{\vert\hat{D}\vert^\mathrm{old}} = \frac{\vert\hat{D}\vert^\mathrm{new}_\uparrow}{\vert\hat{D}\vert^\mathrm{old}_\uparrow}\cdot \frac{\vert\hat{D}\vert^\mathrm{new}_\downarrow}{\vert\hat{D}\vert^\mathrm{old}_\downarrow}. $$

This reduces the calculation time by a constant factor. The maximal time reduction happens in a system of equal numbers of $\uparrow$ and $\downarrow$ particles, so that the two factorized determinants are half the size of the original one.

Consider the case of moving only one particle at a time, which originally had the following time scaling for one transition:

$$ O_R(N)+O_\mathrm{inverse}(N^2). $$

For the factorized determinants, one of the two determinants is obviously unaffected by the change, so that it cancels from the ratio $R$.

Therefore, only one determinant of size $N/2$ is involved in each calculation of $R$ and update of the inverse matrix. The scaling of each transition then becomes

$$ O_R(N/2)+O_\mathrm{inverse}(N^2/4), $$

and the time scaling when the transitions for all $N$ particles are put together is

$$ O_R(N^2/2)+O_\mathrm{inverse}(N^3/4), $$

which gives the same reduction as in the case of moving all particles at once.
Computing the ratios discussed above requires that we maintain the inverse of the Slater matrix evaluated at the current position. Each time a trial position is accepted, the row number $i$ of the Slater matrix changes, and updating its inverse has to be carried out. Getting the inverse of an $N\times N$ matrix by Gaussian elimination has a complexity of order $O(N^3)$ operations, a luxury that we cannot afford for each time a particle move is accepted. We therefore use the updating algorithm discussed above, which for each accepted move scales as $O(N^2)$.
5.8.2. Expectation value of the kinetic energy
The expectation value of the kinetic energy expressed in atomic units for electron $i$ is

$$ \langle \hat{K}_i \rangle = -\frac{1}{2}\frac{\langle\Psi|\nabla_{i}^2|\Psi \rangle}{\langle\Psi|\Psi \rangle}, \qquad K_i = -\frac{1}{2}\frac{\nabla^2_i\Psi}{\Psi}. $$

The second derivative of the Jastrow factor divided by the Jastrow factor (the way it enters the kinetic energy) is

$$ \frac{\nabla^2_k\Psi_C}{\Psi_C} = \sum_{ij\ne k}\frac{\mathbf{r}_{ki}\cdot\mathbf{r}_{kj}}{r_{ki}r_{kj}}f'(r_{ki})f'(r_{kj}) + \sum_{j\ne k}\left(f''(r_{kj})+\frac{2}{r_{kj}}f'(r_{kj})\right). $$

But we have a simple form for the function, namely

$$ f(r_{ij}) = \frac{ar_{ij}}{1+\beta r_{ij}}, $$

and it is easy to see that for particle $k$ the derivatives are

$$ f'(r_{kj}) = \frac{a}{(1+\beta r_{kj})^2} \qquad \mathrm{and} \qquad f''(r_{kj}) = -\frac{2a\beta}{(1+\beta r_{kj})^3}. $$

Using

$$ \Psi = \Psi_D\Psi_C, $$

and

$$ \frac{\nabla^2_i\Psi}{\Psi} = \frac{\nabla^2_i\Psi_D}{\Psi_D} + \frac{\nabla^2_i\Psi_C}{\Psi_C} + 2\frac{\nabla_i\Psi_D}{\Psi_D}\cdot\frac{\nabla_i\Psi_C}{\Psi_C}, $$

the gradient and Laplacian can be calculated as follows:

$$ \frac{\vec\nabla_i\vert\hat{D}(\mathbf{r})\vert}{\vert\hat{D}(\mathbf{r})\vert} = \sum_{j=1}^N \vec\nabla_i\phi_j(\mathbf{r}_i)\,d_{ji}^{-1}(\mathbf{r}), $$

and

$$ \frac{\nabla^2_i\vert\hat{D}(\mathbf{r})\vert}{\vert\hat{D}(\mathbf{r})\vert} = \sum_{j=1}^N \nabla^2_i\phi_j(\mathbf{r}_i)\,d_{ji}^{-1}(\mathbf{r}). $$

For the correlation part we have

$$ \Psi_C=\exp{\left\{\sum_{i<j}f(r_{ij})\right\}}, $$

so the gradient needed for the quantum force and local energy is easy to compute. We get for particle $k$

$$ \frac{\nabla_k\Psi_C}{\Psi_C}= \sum_{j\ne k}\frac{\mathbf{r}_{kj}}{r_{kj}}f'(r_{kj}), $$

with $\mathbf{r}_{kj}=\mathbf{r}_k-\mathbf{r}_j$, which is rather easy to code. Remember to sum over all particles when you compute the local energy.

We also need to compute the ratio between wave functions, in particular for the Slater determinants:

$$ R =\sum_{j=1}^N \phi_j(\mathbf{r}_i^{\mathrm{new}})\,d_{ji}^{-1}(\mathbf{r}^{\mathrm{old}}). $$

What this means is that in order to get the ratio when only the $i$-th particle has been moved, we only need to calculate the dot product of the vector $\left(\phi_1(\mathbf{r}_i^{\mathrm{new}}),\dots,\phi_N(\mathbf{r}_i^{\mathrm{new}})\right)$ of single particle wave functions evaluated at the new position with the $i$-th column of the inverse matrix $\hat{D}^{-1}$ evaluated at the original position.