Table of Contents
Radial distribution functions
A radial distribution function (rdf) is, in the most general sense, a distribution function - a histogram - that is proportional to the probability of finding a given quantity within a radial coordinate. For an illustration see http://www.physics.emory.edu/faculty/weeks//idl/gofr.html. It is most commonly used as a synonym for the pair correlation function (pcf) $g(r)$ but can apply to any quantity that is radially distributed with a spatial density of $d\mathbf{r} = 4 \pi r^2 dr$. This is assuming azimuthal ($\phi$) and polar ($\theta$) symmetry - for an even more general discussion revisit spherical coordinates in your favorite vector calculus or applied maths textbook.
Pair correlation function
The most basic description of the pair correlation function (pcf) $g(r)$ is that it is a function related to the probability of finding a particle at a given distance from the center of another particle. Let us consider the different regimes of $r$ in $g(r)$, the distance argument of the pair correlation function. If one imagines the packing of hard spheres, the smallest or closest distance two hard spheres can be separated by is a single diameter $\sigma$. At arbitrarily large distances, $g(r)$ is related to the density of particles. $g(r)$ is normalized by the density such that at large distances $r\gg\sigma$, $g(r)$ tends to unity, $1$. In the small and intermediate $r\sim \sigma$ regime, the peaks in the pair correlation function correspond to areas of tighter packing and the troughs to correspondingly less dense packing. Hence $g(r)$ is a measure of local spatial ordering in a multiparticle system (which may be crystalline solid, fluid, or other). Formally,
$$g(r) = \frac{V}{4\pi r^2 N^2} \langle \sum_i \sum_{j\neq i} \delta(r-r_{ij}) \rangle, r_{ij} = |\mathbf{r}_i-\mathbf{r}_j| $$
Pseudocode for calculating the pcf from an MD simulation:
def calc_pair_corr(coords, sigma, rmax, nbins, L): '''Calculate and return the pair correlation function as a histogram. args: coords: N by 3 array of particle coordinates sigma: particle diameter rmax: maximum radial coordinate for g(r) nbins: number of histogram bins L: length of cubic box returns: g: nbins by 1 pair correlation function There are easier, more efficient ways of doing this (see np.histogram) but this simple pseudocode is written for illustrative purposes. ''' N = len(coords) # particle number dr = rmax / nbins # histogram bin width rho = N / L**3 # particle number density g = [] # pair correlation function for r in np.linspace(sigma, rmax, nbins+1)[:-1]: total = 0 for vec in coords: distances = np.linalg.norm(minimum_image(coords - vec, 1.0), axis=-1) # find distances of other particles from given particle # (remembering the minimum image convention!) total += ((r <= distances) & (distances < r + dr)).sum() # count how many are within [r, r+dr] of given particle V_shell = (4/3) * np.pi * ((r + dr)**3 - r**3) # spherical shell volume total = total / (rho * V_shell * N) # normalization g.append(total) return np.array(g)
This may be done in just two lines using the freud module - see created rdf object. The rest of the code is mostly just import statements and user-interfacing.
# typical import statements import argparse, gsd, gsd.hoomd import numpy as np from gsd.fl import GSDFile import freud import sys # typical user-input arguments, loading hoomd trajectories, etc. rdf = freud.density.RDF(bins=bins, r_max=args.max) for f in frames_to_take: # sys write out statements frame = trajectory[int(f)] points=frame.particles.position box = frame.configuration.box rdf.compute(system=(box, points), reset=False) np.savetxt(args.output_file, np.c_[rdf.bin_centers,rdf.rdf],header="# bin, RDF")
Here's an implementation using np.histogram and scipy.spatial.distance:
def calc_pair_corr_hist(coords, sigma, rmax, nbins, L): '''Same as above but uses np.histogram''' N = len(coords) # particle number dr = rmax / nbins # histogram bin width rho = N / L**3 # particle number density g = [] # pair correlation function # problem is to convert an N by 3 coordinate list into # an N(N-1)/2 by 1 list of pairwise distances. This is easily # done with scipy.spatial.distance.pdist from scipy.spatial.distance import pdist histogram = np.histogram(pdist(coords), bins=nbins, range=(0, nbins*dr)) r = (histogram[1] + dr/2)[:-1] # centers of the bins r_shell_Vols = (4 * np.pi / 3) * ((r + dr/2)**3 - (r - dr/2)**3) hi_r = rho * r_shell_Vols return 2 * histogram[0] / hi_r / (N - 1)
What explains the normalization here? Notice the histogram array gives the count of particles in a given spherical shell. So we must divide by the shell volume $V_\textrm{shell} = \frac{4}{3}\pi[(r+dr)^3-(r-dr)^3]$ to get a density. Actually the histogram counts all pairwise distances so we must then normalize by the total density of pairs $N_\textrm{pairs} / V$ where $V$ is the total volume. Since $N_\textrm{pairs} = N(N-1)/2$ this yields $(\textrm{histogram} / V_\textrm{shell}) / (N_\textrm{pairs} / V) = 2 \textrm{histogram} / [(N/V) V_\textrm{shell} (N-1)]$ seen here.
Here's what $g(r)$ looks like for different phases.
coords_gas = np.random.rand(1000, 3)-0.5 # 1000-particle random gas coords_crystal = cubic_lattice(10, 1.0) # 10x10x10 cubic crystal plt.plot(np.linspace(0.05, 0.5, 101)[:-1], calc_pair_corr(coords_crystal, 0.05, 0.5, 100, 1.0), 'b*-', label = 'cubic crystal') plt.plot(np.linspace(0.05, 0.5, 101)[:-1], calc_pair_corr(coords_gas, 0.05, 0.5, 100, 1.0), 'k*-', label = 'random gas') plt.show()
This is done using quick Python code from scratch in a Jupyter notebook. However, using freud and hoomd we can get quite accurate and smoother-looking pcf's. Here's an example of a system near crystallinity:
We see that for a crystalline system, $g(r)$ will have well-defined peaks at well-defined locations (depending on the crystal geometry , multiples and factors of the lattice constant). For a fluid, the peaks will be smoother and less well-defined, and for a gas the pcf tends to unity after a small hump quite quickly.
The pcf can be used to determine bond lengths as well. A good estimate for the mean bond length $d$ is the location of the first peak in the pcf. However, authors [3] found that an even better estimate is found by fitting the first peak in the pcf to a skew-normal distribution (SND). Increasing temperature causes the first peak in the pcf to shift left or decrease, but in reality the bond length undergoes thermal expansion (increase, shift right) and this can be accounted for by fitting the first peak in the pcf to an SND rather than simply targeting the absolute maximum.
Bond distribution functions
Assuming harmonic bond potentials $V(r) = \frac{1}{2}K(r-r_0)^2$ with resting bond length $r_0$ and spring constant $K$, we expect a Gaussian bond distribution $p(r) \sim \exp(-\frac{V(r)}{kT}) = \exp[-\frac{K}{2kT}(r-r_0)^2]$. The normalization factor is obtained by setting $\int_0^\infty p(r) 4\pi r^2 dr = 1$. One could do this standard exercise in Gaussian integration and look it up in a table of integrals or do it in Mathematica (integrals of the form $\int r^n \exp(-\alpha^2 r^2) dr$ are ubiquitous in physics) but there really is no need to reproduce the result here (edit: actually normalization is necessary since the normalization factor is not a constant shift but dependent on $\alpha$ and $r_0$). Using Mathematica, the analytical normalization prefactor is computed to be
$$\frac{1}{\int_0^\infty \exp[-\alpha^2 (r-r_0)^2]4\pi r^2 dr} = \frac{\alpha^3}{2\exp(-\alpha^2 r_0^2) \pi r_0 \alpha + \pi^{3/2} (1+2r_0^2 \alpha^2)(1+\textrm{erf}(r_0 \alpha))}$$
Where $\alpha \equiv \sqrt{\frac{K}{2kT}}$. One can see that after histogramming a bond distribution from simulation, then normalizing to unit density, and then normalizing by shell volume, this should be directly comparable to $p(r)$ which is the Boltzmann distribution normalized by the above factor. See code below.
# import statements import numpy as np import os import hoomd from hoomd import data from hoomd import md bond_dist_avg_hist = [] for f in frames_to_take: sys.stdout.write("\rCurrent frame %3d/%3d"%(f,N_frames)) sys.stdout.flush() frame = trajectory[int(f)] box = frame.configuration.box bonds = frame.bonds.group bond_distances = np.linalg.norm(my_disp_in_box(\ frame.particles.position[bonds[:,1]] - \ frame.particles.position[bonds[:,0]], \ frame.configuration.box[0:3]), axis=-1) # my_disp_in_box implements MIC bond_dist_hist, bin_edges = np.histogram(\ bond_distances, bins=args.num_bins) bond_dist_hist = bond_dist_hist / np.sum(bond_dist_hist) # normalizes density to unity bin_centers = (bin_edges[1:] + bin_edges[:-1]) / 2 shellVols = (4 * np.pi / 3) * (bin_edges[1:]**3 - bin_edges[:-1]**3) bond_dist_avg_hist.append(bond_dist_hist / shellVols) bond_dist_avg_hist = np.array(bond_dist_avg_hist) np.savetxt(args.output_file, np.c_[bin_centers,bond_dist_avg_hist.mean(axis=0)],header="# bin, bond_len_hist")
def boltz(r, alpha, r0): '''Return expected bond length distribution''' return (alpha**3 / (2 * np.exp(-r0**2 * alpha**2) * np.pi * r0 * alpha + np.pi**1.5 * (1+2*r0**2 * alpha**2)*(1+erf(r0*alpha)))) * np.exp(-alpha**2 * (r-r0)**2)
References
- D. Keffer, Structural Properties from Molecular Dynamics Simulation. MSE 614, Dept. of Materials Science & Engineering, University of Tennessee, Knoxville TN.
- Dror, Ron. Energy functions and their relationship to molecular conformation. CS/CME/BioE/Biophys/BMI 279, Stanford University, Stanford CA. Sept. 22 and 24, 2020
- Sukhomlinov, Sergey V.; Müser, Martin H. (2017). Determination of accurate, mean bond lengths from radial distribution functions. The Journal of Chemical Physics, 146(2), 024506–. doi:10.1063/1.4973804
- Allen, Michael P., and Dominic J. Tildesley. Computer Simulation of Liquids. 2017.