Multiscale Simulation (II): Monte-Carlo Methods
2026-03-23
After this lecture, you will be able to:
Monte Carlo (MC) methods describe a large family of stochastic methods for performing sampling in a (usually) high-dimensional space.

Buffon’s needle problem is one of the earliest demonstrations of estimating an unknown quantity (here \(\pi\)) via sampling. The probability of a needle of length \(l\) intersecting parallel lines spaced at \(d\) (with \(l \le d\)) is:
\[ p = \frac{2}{\pi} \frac{l}{d} \]
Simulation from Wikipedia
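The estimator can be sketched in a few lines of Python; the needle length and line spacing below are illustrative choices, not values from the original demonstration:

```python
import math
import random

def buffon_pi(n_trials, l=1.0, d=2.0):
    """Estimate pi by dropping n_trials needles of length l on a floor
    with parallel lines spaced d apart (requires l <= d and enough
    trials for at least one hit)."""
    hits = 0
    for _ in range(n_trials):
        # distance from the needle centre to the nearest line
        x = random.uniform(0.0, d / 2.0)
        # needle orientation relative to the lines
        theta = random.uniform(0.0, math.pi / 2.0)
        # the needle crosses a line if its half-projection reaches it
        if x <= (l / 2.0) * math.sin(theta):
            hits += 1
    # p = (2/pi)(l/d)  =>  pi ~ 2 l n_trials / (d * hits)
    return 2.0 * l * n_trials / (d * hits)

random.seed(0)
print(buffon_pi(1_000_000))  # converges toward pi for large n_trials
```

Inverting the hit probability \(p = \frac{2}{\pi}\frac{l}{d}\) gives the estimator in the last line; the statistical error shrinks as \(1/\sqrt{N_\text{trials}}\).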
Yes. The vacancy-mediated diffusion process (Lecture 9 & Assignment 2) shows this approach. We know that the macroscopic diffusion must follow Fick’s equation
\[ J_A = -\tilde{D}\nabla c_A \]
and we can simulate the process (i.e., without knowing \(\tilde{D}\)) while only caring about the probabilities that a vacancy exchanges with a certain type of atom.
The following are taken from Landau and Binder, A Guide to Monte Carlo Simulations in Statistical Physics, (Cambridge U. Press, Cambridge U.K., 2000).
In a Monte Carlo simulation, we attempt to follow the ‘time dependence’ of a model for which change, or growth, does not proceed in some rigorously predefined fashion (e.g., according to Newton’s equations of motion) but rather in a stochastic manner which depends on a sequence of random numbers which is generated during the simulation.
The Monte Carlo (MC) method and Molecular Dynamics (MD) are both simulation techniques in materials science but can be easily confused.
Similarity
Difference
One main topic in the kinetics course is how to link the probability of a system to a certain energy scale. For a property \(A\) of interest, in the canonical (NVT) ensemble, the distribution follows the Boltzmann distribution
\[ \frac{p(A_i)}{p(A_j)} = \exp(- \frac{U_i - U_j}{k_B T}) \]
To estimate the expectation of \(A\) in NVT ensemble, we are required to calculate:
\[ \langle A \rangle_{NVT} = \frac{\int_\Omega A \exp(- \frac{U}{k_B T}) d\mathrm{r}} {\int_\Omega \exp(- \frac{U}{k_B T}) d\mathrm{r}} \]
How do we compute the integrals?
At the heart of Monte Carlo methods, we deal with the integral of some (presumably) high-dimensional function \(f(\mathbf{x})\) on a domain \(\Omega=\{x_1, x_2, \cdots, x_n\}\), which can be transformed with another probability density function \(\rho(\mathbf{x})\)
\[\begin{align} F &= \int_{\Omega} f(\mathbf{x}) d \mathbf{x} \\ &= \int_{\Omega} \frac{f(\mathbf{x})}{\rho(\mathbf{x})} \rho(\mathbf{x}) d \mathbf{x} \end{align}\]and
\[ \rho(\mathbf{x}) \ge 0, \qquad \int_{\Omega} \rho(\mathbf{x})\, d\mathbf{x} = 1. \]
The integration problem for \(F\) now effectively becomes finding the expectation of \(f / \rho\) when we draw samples \(x\) from the distribution \(\rho(x)\):
\[\begin{align} F &= \int_{\Omega} \frac{f(\mathbf{x})}{\rho(\mathbf{x})} \rho(\mathbf{x}) d \mathbf{x} \\ &= E\left[ \frac{f(\xi)}{\rho(\xi)} \right] \end{align}\]When the distribution \(\rho(x)\) is uniform, then we have
\[\begin{align} F \approx \frac{\prod_{j=1}^{n} L_j }{N_{\text{trial}}} \sum_{i=1}^{N_{\text{trial}}} f(\xi_i) \end{align}\]where \(L_j\) is the domain size in dimension \(j\).
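The uniform-sampling estimator above can be sketched as follows; the test integrand \(\sin(x)\) on \([0, \pi]\) is an illustrative choice with the known answer 2:

```python
import math
import random

def mc_integrate_uniform(f, bounds, n_trials):
    """Uniform-sampling Monte Carlo estimate of the integral of f over
    the box bounds = [(low_1, high_1), ..., (low_n, high_n)]."""
    volume = 1.0
    for lo, hi in bounds:
        volume *= hi - lo                       # prod_j L_j
    total = 0.0
    for _ in range(n_trials):
        # draw xi uniformly inside the box
        x = [random.uniform(lo, hi) for lo, hi in bounds]
        total += f(x)
    return volume * total / n_trials            # (prod_j L_j / N) * sum_i f(xi_i)

random.seed(1)
# 1D check: the integral of sin(x) over [0, pi] is exactly 2
est = mc_integrate_uniform(lambda x: math.sin(x[0]), [(0.0, math.pi)], 100_000)
print(est)
```

The same routine works unchanged in any dimension, which is exactly why MC integration is attractive when quadrature grids become infeasible.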
Why does the Monte Carlo method work? Compare the two ways to calculate \(F\)
The expectation value for \(A\) in a statistical mechanics system using a uniform sampling strategy is now:
\[ \langle A \rangle_{NVT} = \frac{\sum_i^{N_\text{trial}} A(\xi_i) \exp(- \frac{U(\xi_i)}{k_B T})} {\sum_{i}^{N_\text{trial}} \exp(- \frac{U(\xi_i)}{k_B T})} \]
This is a powerful conclusion! Every single thermodynamic quantity in the current ensemble can be estimated by random sampling. A long-enough MC simulation will eventually reveal the property.
Is this good enough? 
The uniform-sampling MC method is not a feasible approach: the vast majority of randomly generated configurations have effectively zero Boltzmann weight due to high-energy overlaps between the particles, so almost no samples contribute to the averages.
The solution: importance sampling / Metropolis-Hastings algorithm
Originally proposed by Metropolis et al. in the 1953 paper Equation of State Calculations by Fast Computing Machines, this is the basis of (almost) all MC sampling to date:
Screenshot of 1953 paper
Let’s review the integral question again. Do we really need to use a uniform probability distribution \(\rho\) in systems that have Boltzmann behaviour?
\[ F = \int_{\Omega} \frac{f(\mathbf{x})}{\rho(\mathbf{x})} \rho(\mathbf{x}) d \mathbf{x} \]
\[ \frac{p(A_i)}{p(A_j)} = \exp(- \frac{U_i - U_j}{k_B T}) \]
Implementation of a Metropolis MC algorithm (in materials science)
1. Record the old \((x, y, z)\) coordinates of the atoms
2. Generate a new configuration, either by moving atoms a random distance or by shifting them by lattice grids
3. Calculate the energies \(U_{\text{old}}\) and \(U_{\text{new}}\)
4. Draw a random number \(RN \in [0, 1)\)
5. Accept the new configuration if \[ RN < \exp(-\frac{U_{\text{new}} - U_{\text{old}}}{k_B T}) \]
6. Go to step 2 and repeat
The example below shows a Metropolis MC 1D random walk. Observe its behaviour when inside the probability “peaks” (potential wells).
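Such a walk can be sketched in a few lines; the double-well potential, temperature, and step size below are illustrative assumptions, not the parameters of the original demo:

```python
import math
import random

def double_well(x):
    """Toy potential with two wells (minima near x = +1 and x = -1)."""
    return (x**2 - 1.0)**2

def metropolis_walk(n_steps, kT=0.3, step=0.5, x0=2.0):
    """1D Metropolis random walk sampling exp(-U(x)/kT)."""
    x, samples = x0, []
    for _ in range(n_steps):
        x_new = x + random.uniform(-step, step)          # trial move
        dU = double_well(x_new) - double_well(x)
        # always accept downhill moves; accept uphill with Boltzmann weight
        if dU <= 0 or random.random() < math.exp(-dU / kT):
            x = x_new
        samples.append(x)
    return samples

random.seed(0)
traj = metropolis_walk(50_000)
# fraction of time spent inside the wells (|x| between 0.5 and 1.5)
inside = sum(1 for x in traj if abs(abs(x) - 1.0) < 0.5) / len(traj)
print(inside)
```

Even though the walk starts at \(x_0 = 2\), away from either minimum, it quickly falls into a well and thereafter spends most of its time near the high-probability regions, occasionally hopping over the central barrier.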
The previous example shows that Metropolis MC is able to find the equilibrium state of a system, even if we start away from it.
In contrast, standard molecular dynamics simulations often struggle to find other equilibria without proper biased-sampling techniques.
Have we seen something similar in the lecture notes before?
Course materials from Lecture 12. Copyright: Vilas Winstein (UC Berkeley). Demo link. YouTube link
The Potts model (similar to the Ising model in solid-state physics) is a widely used technique to simulate the spatial composition domains in polycrystalline materials using the Monte Carlo method.
\[ H = \sum_{\langle i,j\rangle} J \left(1 - \delta_{q_i,q_j}\right) \]
In a Metropolis MC simulation using the Potts model, the Monte Carlo algorithm updates the state \(q_i\) of individual sites.
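A single-site Metropolis update scheme for this Hamiltonian can be sketched as follows; the grid size, number of states \(q\), coupling \(J\), and temperature are illustrative assumptions:

```python
import math
import random

def site_energy(grid, i, j, q_val, J=1.0):
    """Energy of site (i, j) if it held state q_val: J per unlike
    nearest neighbour, with periodic boundaries."""
    n = len(grid)
    return sum(J for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
               if grid[(i + di) % n][(j + dj) % n] != q_val)

def total_energy(grid, J=1.0):
    """Total Potts energy H = sum_<i,j> J (1 - delta), each bond counted once."""
    n = len(grid)
    e = 0.0
    for i in range(n):
        for j in range(n):
            if grid[i][j] != grid[i][(j + 1) % n]:
                e += J
            if grid[i][j] != grid[(i + 1) % n][j]:
                e += J
    return e

def potts_sweep(grid, q=8, kT=0.5):
    """One Metropolis sweep: propose a new state at n*n random sites."""
    n = len(grid)
    for _ in range(n * n):
        i, j = random.randrange(n), random.randrange(n)
        q_new = random.randrange(q)
        dE = site_energy(grid, i, j, q_new) - site_energy(grid, i, j, grid[i][j])
        if dE <= 0 or random.random() < math.exp(-dE / kT):
            grid[i][j] = q_new          # accept: the site joins the new domain

random.seed(0)
n, q = 32, 8
grid = [[random.randrange(q) for _ in range(n)] for _ in range(n)]
e0 = total_energy(grid)
for _ in range(100):
    potts_sweep(grid, q=q)
print(e0, total_energy(grid))  # energy drops as like-state grains coarsen
```

Starting from a random configuration, repeated sweeps lower the total boundary energy, which is the coarsening of composition domains (grain growth) that the Potts model is used to mimic.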
From our previous examples we clearly see the (Metropolis) MC method is a powerful tool to reveal equilibrium composition, but what about systems that evolve over time?
Solution: Kinetic Monte Carlo (KMC)
KMC modelling is a special case of Markov-chain Monte Carlo (MCMC), which tries to answer the question: how fast does the system move from one state to another?
We will introduce the direct method of KMC
Initialize the system at \(t=0\).
Find all possible transition paths for current state \(i\) either by:
For each path, identify the end state \(j\)
Calculate individual rate \(k_{ij} = k_0 \exp(-\frac{E_{ij}^a}{k_B T})\) and total transition rates \(K = \sum_j k_{ij}\)
Pick a random number \(RN_1\), select the transition pathway \(i \to j_k\) that follows
\[ \sum_{j=1}^{j_k - 1} k_{ij} < RN_1 \cdot K < \sum_{j=1}^{j_k} k_{ij} \]
6. Move the system to state \(j_k\), and repeat from step 2
If time needs to be tracked instead of the simulation steps, after selecting the pathway, the system time advances by
\[ \Delta t = -\frac{1}{K} \ln RN_2 \]
where \(RN_2\) is a second independent random number.
This is because the waiting time for ANY of the first-order \(i \to j\) processes to occur is exponentially distributed with mean \(\frac{1}{K}\); less likely individual events occur on correspondingly longer timescales.
KMC simulations can handle much larger length scales and timescales than MD.
Comput. Mater. Sci. 135, 78–89 (2017).