Processing math: 74%

Computational Physics Lectures: Random walks, Brownian motion and the Metropolis algorithm

Morten Hjorth-Jensen [1, 2]

[1] Department of Physics, University of Oslo
[2] Department of Physics and Astronomy and National Superconducting Cyclotron Laboratory, Michigan State University

Apr 18, 2018












Why Markov chains, Brownian motion and the Metropolis algorithm

We want to study a physical system which evolves towards equilibrium, from som given initial conditions. Recall the simple example of particles in a box. At an initial time t0 all particles are in the left half of the box. Thereafter they are allowed to diffuse into the two halves of the box.











Why Markov chains, Brownian motion and the Metropolis algorithm

Use Markov chain Monte Carlo

Brownian motion and Markov processes

A Markov process is a random walk with a selected probability for making a move. The new move is independent of the previous history of the system.

The Markov process is used repeatedly in Monte Carlo simulations in order to generate new random states.

The reason for choosing a Markov process is that when it is run for a long enough time starting with a random state, we will eventually reach the most likely state of the system.

In thermodynamics, this means that after a certain number of Markov processes we reach an equilibrium distribution.

This mimicks the way a real system reaches its most likely state at a given temperature of the surroundings.

Brownian motion and Markov processes, Ergodicity and Detailed balance

To reach this distribution, the Markov process needs to obey two important conditions, that of ergodicity and detailed balance. These conditions impose then constraints on our algorithms for accepting or rejecting new random states.

The Metropolis algorithm discussed here abides to both these constraints.

The Metropolis algorithm is widely used in Monte Carlo simulations and the understanding of it rests within the interpretation of random walks and Markov processes.

Brownian motion and Markov processes, jargon

In a random walk one defines a mathematical entity called a walker, whose attributes completely define the state of the system in question.

The state of the system can refer to any physical quantities, from the vibrational state of a molecule specified by a set of quantum numbers, to the brands of coffee in your favourite supermarket.

The walker moves in an appropriate state space by a combination of deterministic and random displacements from its previous position.

This sequence of steps forms a chain.

Brownian motion and Markov processes, sequence of ingredients











Applications: almost every field in science

Vt+12σ2S22VS2+rSVSrV=0. The Black and Scholes equation is a partial differential equation, which describes the price of the option over time. It is a diffusion equation with a random term.

The list of applications is endless











A simple example (close to project 4) and some more jargon

The obvious case is that of a random walker on a one-, or two- or three-dimensional lattice (dubbed coordinate space hereafter).

Consider a system whose energy is defined by the orientation of single spins. Consider the state i, with given energy Ei represented by the following N spins 123k1kk+1N1N We may be interested in the transition with one single spinflip to a new state j with energy Ej 123k1kk+1N1N This change from one microstate i (or spin configuration) to another microstate j is the configuration space analogue to a random walk on a lattice. Instead of jumping from one place to another in space, we 'jump' from one microstate to another.











Markov processes

A Markov process allows in principle for a microscopic description of Brownian motion. As with the random walk studied in the previous section, we consider a particle which moves along the x-axis in the form of a series of jumps with step length Δx=l. Time and space are discretized and the subsequent moves are statistically independent, i.e., the new move depends only on the previous step and not on the results from earlier trials. We start at a position x=jl=jΔx and move to a new position x=iΔx during a step Δt=ϵ, where i0 and j0 are integers. The original probability distribution function (PDF) of the particles is given by wi(t=0) where i refers to a specific position on the grid in

The function wi(t=0) is now the discretized version of w(x,t). We can regard the discretized PDF as a vector.











Markov processes

For the Markov process we have a transition probability from a position x=jl to a position x=il given by Wij(ϵ)=W(iljl,ϵ)={12|ij|=10else, where Wij is normally called the transition probability and we can represent it, see below, as a matrix. Here we have specialized to a case where the transition probability is known.

Our new PDF wi(t=ϵ) is now related to the PDF at t=0 through the relation wi(t=ϵ)=jW(ji)wj(t=0).

This equation represents the discretized time-development of an original PDF with equal probability of jumping left or right.











Markov processes, the probabilities

Since both W and w represent probabilities, they have to be normalized, i.e., we require that at each time step we have iwi(t)=1, and jW(ji)=1, which applies for all j-values. The further constraints are 0Wij1 and 0wj1. Note that the probability for remaining at the same place is in general not necessarily equal zero.











Markov processes

The time development of our initial PDF can now be represented through the action of the transition probability matrix applied n times. At a time tn=nϵ our initial distribution has developed into wi(tn)=jWij(tn)wj(0), and defining W(iljl,nϵ)=(Wn(ϵ))ij we obtain wi(nϵ)=j(Wn(ϵ))ijwj(0), or in matrix form ˆw(nϵ)=ˆWn(ϵ)ˆw(0).











An Illustrative Example

The following simple example may help in understanding the meaning of the transition matrix ˆW and the vector ˆw. Consider the 4×4 matrix ˆW ˆW=(1/41/93/81/32/42/901/301/93/801/45/92/81/3), and we choose our initial state as ˆw(t=0)=(1000).











An Illustrative Example

We note that both the vector and the matrix are properly normalized. Summing the vector elements gives one and summing over columns for the matrix results also in one. Furthermore, the largest eigenvalue is one. We act then on ˆw with ˆW. The first iteration is ˆw(t=ϵ)=ˆWˆw(t=0),

resulting in ˆw(t=ϵ)=(1/41/201/4).











An Illustrative Example, next step

The next iteration results in ˆw(t=2ϵ)=ˆWˆw(t=ϵ),

resulting in ˆw(t=2ϵ)=(0.2013890.3194440.0555560.423611). Note that the vector ˆw is always normalized to 1.











An Illustrative Example, the steady state

We find the steady state of the system by solving the set of equations w(t=)=Ww(t=), which is an eigenvalue problem with eigenvalue equal to one! This set of equations reads W11w1(t=)+W12w2(t=)+W13w3(t=)+W14w4(t=)=w1(t=)W21w1(t=)+W22w2(t=)+W23w3(t=)+W24w4(t=)=w2(t=)W31w1(t=)+W32w2(t=)+W33w3(t=)+W34w4(t=)=w3(t=)W41w1(t=)+W42w2(t=)+W43w3(t=)+W44w4(t=)=w4(t=) with the constraint that iwi(t=)=1, yielding as solution ˆw(t=)=(0.2443180.3196020.0568180.379261).











An Illustrative Example, iterative steps

The table here demonstrates the convergence as a function of the number of iterations or time steps. After twelve iterations we have reached the exact value with six leading digits.

Iteration w1 w2 w3 w4
0 1.000000 0.000000 0.000000 0.000000
1 0.250000 0.500000 0.000000 0.250000
2 0.201389 0.319444 0.055556 0.423611
3 0.247878 0.312886 0.056327 0.382909
4 0.245494 0.321106 0.055888 0.377513
5 0.243847 0.319941 0.056636 0.379575
6 0.244274 0.319547 0.056788 0.379391
7 0.244333 0.319611 0.056801 0.379255
8 0.244314 0.319610 0.056813 0.379264
9 0.244317 0.319603 0.056817 0.379264
10 0.244318 0.319602 0.056818 0.379262
11 0.244318 0.319602 0.056818 0.379261
12 0.244318 0.319602 0.056818 0.379261
ˆw(t=) 0.244318 0.319602 0.056818 0.379261











An Illustrative Example, what does it mean?

We have after t-steps ˆw(t)=ˆWtˆw(0), with ˆw(0) the distribution at t=0 and ˆW representing the transition probability matrix.











An Illustrative Example, understanding the basics

We can always expand ˆw(0) in terms of the right eigenvectors ˆv of ˆW as ˆw(0)=iαiˆvi, resulting in ˆw(t)=ˆWtˆw(0)=ˆWtiαiˆvi=iλtiαiˆvi, with λi the ith eigenvalue corresponding to the eigenvector ˆvi.

If we assume that λ0 is the largest eigenvector we see that in the limit t, ˆw(t) becomes proportional to the corresponding eigenvector ˆv0. This is our steady state or final distribution.











Simple c++ program to perform the above calculations

The c++ program we have included here (using Armadillo) performs the above operations for, in this case, a 5×5 matrix. The largest eigenvalue is 1.

#include <iostream>
#include "armadillo"
using namespace arma;
using namespace std;

int main()
{
  int dim = 5;
  mat W = zeros<mat>(dim,dim);
  vec wold = zeros<mat>(dim);
  vec wnew = zeros<mat>(dim);
  vec eigenvector = zeros<mat>(dim);
  // Initializing the first vector
  wold(0) = 1.0;
  // Setting up the stochastic matrix W
  W(0,0) = 0.; W(0,1) = 0.; W(0,2) = 0.25; W(0,3) = 0.0;   W(0,4) = 0.;   
  W(1,0) = 0.; W(1,1) = 0.; W(1,2) = 0.25; W(1,3) = 0.; W(1,4) = 0.0;     
  W(2,0) = 0.5; W(2,1) = 1.0; W(2,2) = 0.; W(2,3) = 0.5; W(2,4) = 0.;       
  W(3,0) = 0.0; W(3,1) = 0.; W(3,2) = 0.25; W(3,3) = 0.;  W(3,4) = 0.;      
  W(4,0) = 0.5; W(4,1) = 0.; W(4,2) = 0.25; W(4,3) = 0.5;  W(4,4) = 1.0;      
  double eps = 1.0E-10;
  W.print("W =");
  double difference  = norm(wold-wnew, 2);
  int count = 0; 
  do{
    // Multiplying the old vector with the transition probability
    count += 1; 
    wnew = W*wold;
    difference  = norm(wold-wnew, 2);
    wold = wnew;
    cout << "Iteration number = " << count << endl;
    wnew.print("New vector =");
  } while(difference > eps);

  // Getting the eigenvectors and eigenvalues of the stochastic matrix
  cx_vec eigval;
  eig_gen(eigval, W);
  eigval.print("Eigenvalues=");
  return 0;
}











Entropy and the most likely state











The Metropolis Algorithm and Detailed Balance

Let us recapitulate some of our results about Markov chains and random walks.

one time-step from t=0 is given by wi(t=ϵ)=W(ji)wj(t=0).

This equation represents the discretized time-development of an original PDF. We can rewrite this as a wi(t=ϵ)=Wijwj(t=0). with the transition matrix W for a random walk given by Wij(ϵ)=W(iljl,ϵ)={12|ij|=10else











The Metropolis Algorithm and Detailed Balance

We call Wij for the transition probability and we represent it as a matrix.

iwi(t)=1, and jW(ji)=1. Here we have written the previous matrix Wij=W(ji).











The Metropolis Algorithm and Detailed Balance

The further constraints are 0Wij1 and 0wj1.

wi(t+1)=jWijwj(t), or as vector-matrix relation ˆw(t+1)=^Wˆw(t), and if we have that ||ˆw(t+1)ˆw(t)||0, we say that we have reached the most likely state of the system, the so-called steady state or equilibrium state.











The Metropolis Algorithm and Detailed Balance

Another way of phrasing this is w(t=)=Ww(t=).











The Metropolis Algorithm and Detailed Balance

The question then is how can we model anything under such a severe lack of knowledge? The Metropolis algorithm comes to our rescue here. Since W(ji) is unknown, we model it as the product of two probabilities, a probability for accepting the proposed move from the state j to the state j, and a probability for making the transition to the state i being in the state j. We label these probabilities A(ji) and T(ji), respectively. Our total transition probability is then W(ji)=T(ji)A(ji). The algorithm can then be expressed as











The Metropolis Algorithm and Detailed Balance

We wish to derive the required properties of the probabilities T and A such that w(t)iwi, starting from any distribution, will lead us to the correct distribution.

We can now derive the dynamical process towards equilibrium. To obtain this equation we note that after t time steps the probability for being in a state i is related to the probability of being in a state j and performing a transition to the new state together with the probability of actually being in the state i and making a move to any of the possible states j from the previous time step.











The Metropolis Algorithm and Detailed Balance

We can express this as, assuming that T and A are time-independent, wi(t+1)=j[wj(t)TjiAji+wi(t)Tij(1Aij)].











The Metropolis Algorithm and Detailed Balance

All probabilities are normalized, meaning that jTij=1. Using the latter, we can rewrite the previous equation as wi(t+1)=wi(t)+j[wj(t)TjiAjiwi(t)TijAij], which can be rewritten as wi(t+1)wi(t)=j[wj(t)TjiAjiwi(t)TijAij].











The Metropolis Algorithm and Detailed Balance

The last equation is very similar to the so-called Master equation, which relates the temporal dependence of a PDF wi(t) to various transition rates. The equation can be derived from the so-called Chapman-Einstein-Enskog-Kolmogorov equation. The equation is given as dwi(t)dt=j[W(ji)wjW(ij)wi], which simply states that the rate at which the systems moves from a state j to a final state i (the first term on the right-hand side of the last equation) is balanced by the rate at which the system undergoes transitions from the state i to a state j (the second term). If we have reached the so-called steady state, then the temporal development is zero. This means that in equilibrium we have dwi(t)dt=0.











The Metropolis Algorithm and Detailed Balance

In the limit t we require that the two distributions wi(t+1)=wi and wi(t)=wi and we have jwjTjiAji=jwiTijAij, which is the condition for balance when the most likely state (or steady state) has been reached. We see also that the right-hand side can be rewritten as jwiTijAij=jwiWij, and using the property that jWij=1, we can rewrite our equation as wi=jwjTjiAji=jwjWji, which is nothing but the standard equation for a Markov chain when the steady state has been reached.











The Metropolis Algorithm and Detailed Balance

However, the condition that the rates should equal each other is in general not sufficient to guarantee that we, after many simulations, generate the correct distribution. We may risk to end up with so-called cyclic solutions. To avoid this we therefore introduce an additional condition, namely that of detailed balance W(ji)wj=W(ij)wi. These equations were derived by Lars Onsager when studying irreversible processes. At equilibrium detailed balance gives thus W(ji)W(ij)=wiwj. Rewriting the last equation in terms of our transition probabilities T and acceptance probobalities A we obtain wj(t)TjiAji=wi(t)TijAij.











The Metropolis Algorithm and Detailed Balance

Since we normally have an expression for the probability distribution functions wi, we can rewrite the last equation as TjiAjiTijAij=wiwj.











The Metropolis Algorithm and Detailed Balance

In statistical physics this condition ensures that it is e.g., the Boltzmann distribution which is generated when equilibrium is reached.

We introduce now the Boltzmann distribution wi=exp(β(Ei))Z, which states that the probability of finding the system in a state i with energy Ei at an inverse temperature β=1/kBT is wiexp(β(Ei)). The denominator Z is a normalization constant which ensures that the sum of all probabilities is normalized to one. It is defined as the sum of probabilities over all microstates j of the system Z=jexp(β(Ei)).











The Metropolis Algorithm and Detailed Balance

From the partition function we can in principle generate all interesting quantities for a given system in equilibrium with its surroundings at a temperature T.

With the probability distribution given by the Boltzmann distribution we are now in a position where we can generate expectation values for a given variable A through the definition A=jAjwj=jAjexp(β(Ej)Z. In general, most systems have an infinity of microstates making thereby the computation of Z practically impossible and a brute force Monte Carlo calculation over a given number of randomly selected microstates may therefore not yield those microstates which are important at equilibrium. To select the most important contributions we need to use the condition for detailed balance. Since this is just given by the ratios of probabilities, we never need to evaluate the partition function Z.











The Metropolis Algorithm and Detailed Balance

For the Boltzmann distribution, detailed balance results in wiwj=exp(β(EiEj)).

Let us now specialize to a system whose energy is defined by the orientation of single spins. Consider the state i, with given energy Ei represented by the following N spins 123k1kk+1N1N











The Metropolis Algorithm and Detailed Balance

We are interested in the transition with one single spinflip to a new state j with energy Ej 123k1kk+1N1N This change from one microstate i (or spin configuration) to another microstate j is the configuration space analogue to a random walk on a lattice. Instead of jumping from one place to another in space, we 'jump' from one microstate to another.











The Metropolis Algorithm and Detailed Balance

However, the selection of states has to generate a final distribution which is the Boltzmann distribution. This is again the same we saw for a random walker, for the discrete case we had always a binomial distribution, whereas for the continuous case we had a normal distribution. The way we sample configurations should result, when equilibrium is established, in the Boltzmann distribution. Else, our algorithm for selecting microstates is wrong.

As stated above, we do in general not know the closed-form expression of the transition rate and we are free to model it as W(ij)=T(ij)A(ij). Our ratio between probabilities gives us AjiAij=wiTijwjTji. The simplest form of the Metropolis algorithm (sometimes called for brute force Metropolis) assumes that the transition probability T(ij) is symmetric, implying that T(ij)=T(ji).











The Metropolis Algorithm and Detailed Balance

We obtain then (using the Boltzmann distribution) A(ji)A(ij)=exp(β(EiEj)). We are in this case interested in a new state Ej whose energy is lower than Ei, viz., ΔE=EjEi0. A simple test would then be to accept only those microstates which lower the energy. Suppose we have ten microstates with energy E0E1E2E3E9. Our desired energy is E0.











The Metropolis Algorithm and Detailed Balance

At a given temperature T we start our simulation by randomly choosing state E9. Flipping spins we may then find a path from E9E8E7E1E0. This would however lead to biased statistical averages since it would violate the ergodic hypothesis discussed in the previous section. This principle states that it should be possible for any Markov process to reach every possible state of the system from any starting point if the simulations is carried out for a long enough time.

Any state in a Boltzmann distribution has a probability different from zero and if such a state cannot be reached from a given starting point, then the system is not ergodic. This means that another possible path to E0 could be E9E7E8E9E5E0 and so forth. Even though such a path could have a negligible probability it is still a possibility, and if we simulate long enough it should be included in our computation of an expectation value.











The Metropolis Algorithm and Detailed Balance

Thus, we require that our algorithm should satisfy the principle of detailed balance and be ergodic. The problem with our ratio A(ji)A(ij)=exp(β(EiEj)), is that we do not know the acceptance probability. This equation only specifies the ratio of pairs of probabilities. Normally we want an algorithm which is as efficient as possible and maximizes the number of accepted moves. Moreover, we know that the acceptance probability has 0 as its smallest value and 1 as its largest. If we assume that the largest possible acceptance probability is 1, we adjust thereafter the other acceptance probability to this constraint.











The Metropolis Algorithm and Detailed Balance

To understand this better, assume that we have two energies, Ei and Ej, with Ei<Ej. This means that the largest acceptance value must be A(ji) since we move to a state with lower energy. It follows from also from the fact that the probability wi is larger than wj. The trick then is to fix this value to A(ji)=1. It means that the other acceptance probability has to be A(ij)=exp(β(EjEi)).











The Metropolis Algorithm and Detailed Balance

One possible way to encode this equation reads A(ji)={exp(β(EiEj))EiEj>01else, implying that if we move to a state with a lower energy, we always accept this move with acceptance probability A(ji)=1. If the energy is higher, we need to check this acceptance probability with the ratio between the probabilities from our PDF. From a practical point of view, the above ratio is compared with a random number. If the ratio is smaller than a given random number we accept the move to a higher energy, else we stay in the same state.











The Metropolis Algorithm and Detailed Balance

Nothing hinders us obviously in choosing another acceptance ratio, like a weighting of the two energies via A(ji)=exp(12β(EiEj)). However, it is easy to see that such an acceptance ratio would result in fewer accepted moves.











Two examples that illustrate the Metropolis algorithm

Let us look at two simple examples that illustrate the Metropolis algorithm.

We have the ratio wjTjiAji=wiTijAij.

Let us assume for the first example that we have two states only and that we know the likelihoods w1=1/3 and w2=2/3. Can we find A and T using the ratios w2w1=2w1w2=12?











Two examples that illustrate the Metropolis algorithm

We have the first case w2w1=2=T12A12T21A21, and using the Metropolis algorithm we have then that since the likelihood for moving to state 2 is larger than one, then we have T12A12=1, and assuming in a very democratic way that T12=T21=1/2, we have then that the transition matrix takes the values W12=T12A12=12.











The other value

For the second case w1w2=12=T21A21T12A12, we have then T21A21=12×12, since we have T12=T21=1/2 we end with W21=T21A21=14.











The transition matrix

Summing up, our total transition likelihood written in terms of a matrix reads ˆW=[W11W12W21W22]=[?1214?], and using the normalization requirement for the column elements we obtain ˆW=[34121412], and this matrix has eigenvalues λ0=1 and λ1=1/4. The most likely state is the one with the largest eigenvalue.











And the code which finds the most likely state

int main()
{
  int dim = 2;
  mat W = zeros<mat>(dim,dim);
  vec wold = zeros<mat>(dim);
  vec wnew = zeros<mat>(dim);
  vec eigenvector = zeros<mat>(dim);
  // Initializing the first vector
  wold(0) = 1.0;
  // Setting up the stochastic matrix W
  W(0,0) = 0.75; W(0,1) = 0.5;
  W(1,0) = 0.25; W(1,1) = 0.5;
  double eps = 1.0E-10;
  W.print("W =");
  double difference  = norm(wold-wnew, 2);
  int count = 0; 
  do{
    // Multiplying the old vector with the transition probability
    count += 1; 
    wnew = W*wold;
    difference  = norm(wold-wnew, 2);
    wold = wnew;
    cout << "Iteration number = " << count << endl;
    wnew.print("New vector =");
  } while(difference > eps);
  return 0;
}











And the output

W =
   0.7500   0.5000
   0.2500   0.5000
Iteration number = 1
New vector =
   0.7500
   0.2500
...
Iteration number = 16
New vector =
   0.6667
   0.3333
Iteration number = 17
New vector =
   0.6667
   0.3333

We obtain the likelihoods we started with for the states w1 and w2!! We have thus shown that the Metropolis algorithm gives the correct likelihoods for this simple example!











The next example

We are going to study one single particle in equilibrium with its surroundings, the latter modelled via a large heat bath with temperature T.

The model used to describe this particle is that of an ideal gas in one dimension and with velocity v or v . We are interested in finding P(v)dv , which expresses the probability for finding the system with a given velocity v\in [v,v+dv] . The energy for this one-dimensional system is E=\frac{1}{2}kT=\frac{1}{2}v^2, with mass m=1 .

We will use the Boltzmann distribution P(\beta)=\frac{e^{-\beta E}}{Z}, with \beta=1/kT being the inverse temperature, E is the energy of the system and Z is the partition function.











The python code











Brief Summary

The Monte Carlo approach, combined with the theory for Markov chains can be summarized as follows: A Markov chain Monte Carlo method for the simulation of a distribution w is any method producing an ergodic Markov chain of events x whose stationary distribution is w . The Metropolis algorithm can be phrased as

\begin{equation*} x^{(i+1)}= \left\{\begin{array}{cc} y_t & \mathrm{with\hspace{0.1cm}probability} = A(x^{(i)}\rightarrow y_t) \\ x^{(i)} & \mathrm{with \hspace{0.1cm}probability} = 1-A(x^{(i)}\rightarrow y_t)\end{array}\right . \end{equation*} \begin{equation*} A(x\rightarrow y)= \mathrm{min}\left\{\frac{w(y)T(x|y)}{w(x)T(y|x)},1\right\}. \end{equation*}











Diffusion

Diffusion and the diffusion equation are central topics in both Physics and Mathematics, and their ranges of applicability span from stellar dynamics to the diffusion of particles governed by Schroedinger's equation. The latter is, for a free particle, nothing but the diffusion equation in complex time!

Let us consider the one-dimensional diffusion equation. We study a large ensemble of particles performing Brownian motion along the x -axis. There is no interaction between the particles.

We define w(x,t)dx as the probability of finding a given number of particles in an interval of length dx in x\in [x, x+dx] at a time t . This quantity is our probability distribution function (PDF).











Diffusion Equation

From experiment there are strong indications that the flux of particles j(x,t) , viz., the number of particles passing x at a time t is proportional to the gradient of w(x,t) . This proportionality is expressed mathematically through \begin{equation*} j(x,t) = -D\frac{\partial w(x,t)}{\partial x}, \end{equation*} where D is the so-called diffusion constant, with dimensionality length$^2$ per time.











Diffusion Equation, continuity equation

If the number of particles is conserved, we have the continuity equation \begin{equation*} \frac{\partial j(x,t)}{\partial x} = -\frac{\partial w(x,t)}{\partial t}, \end{equation*} which leads to \begin{equation} \label{eq:diffequation1} \frac{\partial w(x,t)}{\partial t} = D\frac{\partial^2w(x,t)}{\partial x^2}, \end{equation} which is the diffusion equation in one dimension.











Diffusion Equation, expectation values

With the probability distribution function w(x,t)dx we can evaluate expectation values such as the mean distance \langle x(t)\rangle = \int_{-\infty}^{\infty}xw(x,t)dx, or \langle x^2(t)\rangle = \int_{-\infty}^{\infty}x^2w(x,t)dx, which allows for the computation of the variance \sigma^2=\langle x^2(t)\rangle-\langle x(t)\rangle^2 . Note well that these expectation values are time-dependent.











Diffusion Equation, other expectation values

In a similar way we can also define expectation values of functions f(x,t) as \begin{equation*} \langle f(x,t)\rangle = \int_{-\infty}^{\infty}f(x,t)w(x,t)dx. \end{equation*} The normalization condition \begin{equation*} \int_{-\infty}^{\infty}w(x,t)dx=1 \end{equation*} imposes significant constraints on w(x,t) .











Diffusion Equation, normalization condition

We have \begin{equation*} w(x=\pm \infty,t)=0 \hspace{1cm} \frac{\partial^{n}w(x,t)}{\partial x^n}|_{x=\pm\infty} = 0, \end{equation*} implying that when we study the time-derivative \partial\langle x(t)\rangle/\partial t , we obtain after integration by parts and using Eq. \eqref{eq:diffequation1} \frac{\partial \langle x\rangle}{\partial t} = \int_{-\infty}^{\infty}x\frac{\partial w(x,t)}{\partial t}dx= D\int_{-\infty}^{\infty}x\frac{\partial^2w(x,t)}{\partial x^2}dx, leading to \frac{\partial \langle x\rangle}{\partial t} = Dx\frac{\partial w(x,t)}{\partial x}|_{x=\pm\infty}- D\int_{-\infty}^{\infty}\frac{\partial w(x,t)}{\partial x}dx.











Diffusion Equation

The result is \frac{\partial \langle x\rangle}{\partial t} = 0. This means in turn that \langle x\rangle is independent of time. If we choose the initial position x(t=0)=0 , the average displacement \langle x\rangle= 0 . If we link this discussion to a random walk in one dimension with equal probability of jumping to the left or right and with an initial position x=0 , then our probability distribution remains centered around \langle x\rangle= 0 as function of time.











Diffusion Equation, the variance

The variance is not necessarily 0. Consider first \frac{\partial \langle x^2\rangle}{\partial t} = Dx^2\frac{\partial w(x,t)}{\partial x}|_{x=\pm\infty}- 2D\int_{-\infty}^{\infty}x\frac{\partial w(x,t)}{\partial x}dx, where we have performed an integration by parts as we did for \frac{\partial \langle x\rangle}{\partial t} .











Diffusion Equation, final expression for the variance

Integration by parts results in \frac{\partial \langle x^2\rangle}{\partial t} = -Dxw(x,t)|_{x=\pm\infty}+ 2D\int_{-\infty}^{\infty}w(x,t)dx=2D, leading to \langle x^2\rangle = 2Dt, and the variance as \begin{equation} \label{eq:variancediffeq} \langle x^2\rangle-\langle x\rangle^2 = 2Dt. \end{equation} The root mean square displacement after a time t is then \begin{equation*} \sqrt{\langle x^2\rangle-\langle x\rangle^2} = \sqrt{2Dt}. \end{equation*}











Diffusion Equation, interpretation

This should be contrasted to the displacement of a free particle with initial velocity v_0 . In that case the distance from the initial position after a time t is x(t) = vt whereas for a diffusion process the root mean square value is \sqrt{\langle x^2\rangle-\langle x\rangle^2} \propto \sqrt{t} . Since diffusion is strongly linked with random walks, we could say that a random walker escapes much more slowly from the starting point than would a free particle.











Diffusion Equation, simple illustration

\begin{equation*} w(x,t)dx = \frac{1}{\sqrt{4\pi Dt}}\exp{(-\frac{x^2}{4Dt})}dx. \end{equation*} At a time $t=2$s the new variance is $\sigma^2=4D$s, implying that the root mean square value is \sqrt{\langle x^2\rangle-\langle x\rangle^2} = 2\sqrt{D} . At a further time t=8 we have \sqrt{\langle x^2\rangle-\langle x\rangle^2} = 4\sqrt{D} . While time has elapsed by a factor of 4 , the root mean square has only changed by a factor of 2.











Random Walks

Consider \begin{equation*} \langle x(n)\rangle = \sum_{i}^{n} \Delta x_i = 0 \hspace{1cm} \Delta x_i=\pm l, \end{equation*} since we have an equal probability of jumping either to the left or to right. The value of \langle x(n)^2\rangle is \begin{equation*} \langle x(n)^2\rangle = \left(\sum_{i}^{n} \Delta x_i\right)\left(\sum_{j}^{n} \Delta x_j\right)=\sum_{i}^{n} \Delta x_i^2+ \sum_{i\ne j}^{n} \Delta x_i\Delta x_j=l^2n. \end{equation*}











Random Walks

For many enough steps the non-diagonal contribution is \begin{equation*} \sum_{i\ne j}^{N} \Delta x_i\Delta x_j=0, \end{equation*} since \Delta x_{i,j} = \pm l . The variance is then \begin{equation} \langle x(n)^2\rangle - \langle x(n)\rangle^2 = l^2n. \label{eq:rwvariance} \end{equation} It is also rather straightforward to compute the variance for L\ne R . The result is \begin{equation*} \langle x(n)^2\rangle - \langle x(n)\rangle^2 = 4LRl^2n. \end{equation*}











Random Walks

In Eq. \eqref{eq:rwvariance} the variable n represents the number of time steps. If we define n=t/\Delta t , we can then couple the variance result from a random walk in one dimension with the variance from the diffusion equation of Eq. \eqref{eq:variancediffeq} by defining the diffusion constant as \begin{equation*} D = \frac{l^2}{\Delta t}. \end{equation*}











Random Walks, simple program

The main program reads the name of the output file from screen and sets up the arrays containing the walker's position after a given number of steps. The corresponding program for a two-dimensional random walk (not listed in the main text) is found under programs/chapter12/program2.cpp

/*
  1-dim random walk program. 
  A walker makes several trials steps with
  a given number of walks per trial
*/
#include <iostream>
#include <fstream>
#include <iomanip>
#include "lib.h"
using namespace  std;

// Function to read in data from screen, note call by reference
void initialise(int&, int&, double&) ;
// The Mc sampling for random walks
void  mc_sampling(int, int, double, int *, int *);
// prints to screen the results of the calculations 
void  output(int, int, int *, int *);

int main()
{
  int max_trials, number_walks; 
  double move_probability;
  // Read in data 
  initialise(max_trials, number_walks, move_probability) ;
  int *walk_cumulative = new int [number_walks+1];
  int *walk2_cumulative = new int [number_walks+1];
  for (int walks = 1; walks <= number_walks; walks++){   
    walk_cumulative[walks] = walk2_cumulative[walks] = 0;
  } // end initialization of vectors
  // Do the mc sampling  
  mc_sampling(max_trials, number_walks, move_probability, 
              walk_cumulative, walk2_cumulative);
  // Print out results 
  output(max_trials, number_walks, walk_cumulative, 
         walk2_cumulative);
  delete [] walk_cumulative; // free memory
  delete [] walk2_cumulative; 
  return 0; 
} // end main function
\end{lstlisting}
The  input and output functions are 
\begin{lstlisting} 
void initialise(int& max_trials, int& number_walks, double& move_probability) 
{
  cout << "Number of Monte Carlo trials ="; 
  cin >> max_trials;
  cout << "Number of attempted walks=";
  cin >> number_walks;
  cout << "Move probability=";
  cin >> move_probability;
}  // end of function initialise   

Random walk

The algorithm tests the probability of moving to the left or to the right by generating a random number.

void mc_sampling(int max_trials, int number_walks, 
                 double move_probability, int *walk_cumulative, 
                 int *walk2_cumulative)
{
  long idum;
  idum=-1;  // initialise random number generator
  for (int trial=1; trial <= max_trials; trial++){
    int position = 0;
    for (int walks = 1; walks <= number_walks; walks++){   
      if (ran0(&idum) <= move_probability) {
	position += 1;
      } 
      else {
	position -= 1;
      }
      walk_cumulative[walks] += position;
      walk2_cumulative[walks] += position*position;
    }  // end of loop over walks
  } // end of loop over trials
}   // end mc_sampling function  

Simple python code with visualization of one-dimensional random walk

The python code here is just a mere rewriting of the above c++ code, with the difference that it employs matplotlib and gives the final plot.

© 1999-2018, Morten Hjorth-Jensen. Released under CC Attribution-NonCommercial 4.0 license