Radial distribution functions

A radial distribution function (rdf) is, in the most general sense, a distribution function - a histogram - that is proportional to the probability of finding a given quantity within a radial coordinate. For an illustration see http://www.physics.emory.edu/faculty/weeks//idl/gofr.html. It is most commonly used as a synonym for the pair correlation function (pcf) $g(r)$ but can apply to any quantity that is radially distributed with a spatial density of $d\mathbf{r} = 4 \pi r^2 dr$. This is assuming azimuthal ($\phi$) and polar ($\theta$) symmetry - for an even more general discussion revisit spherical coordinates in your favorite vector calculus or applied maths textbook.

Pair correlation function

The most basic description of the pair correlation function (pcf) $g(r)$ is that it is a function related to the probability of finding a particle at a given distance from the center of another particle. Let us consider the different regimes of $r$ in $g(r)$, the distance argument of the pair correlation function. If one imagines the packing of hard spheres, the smallest or closest distance two hard spheres can be separated by is a single diameter $\sigma$. At arbitrarily large distances, $g(r)$ is related to the density of particles. $g(r)$ is normalized by the density such that at large distances $r\gg\sigma$, $g(r)$ tends to unity, $1$. In the small and intermediate $r\sim \sigma$ regime, the peaks in the pair correlation function correspond to areas of tighter packing and the troughs to correspondingly less dense packing. Hence $g(r)$ is a measure of local spatial ordering in a multiparticle system (which may be crystalline solid, fluid, or other). Formally,

$$g(r) = \frac{V}{4\pi r^2 N^2} \langle \sum_i \sum_{j\neq i} \delta(r-r_{ij}) \rangle, r_{ij} = |\mathbf{r}_i-\mathbf{r}_j| $$

Pseudocode for calculating the pcf from an MD simulation:

def calc_pair_corr(coords, sigma, rmax, nbins, L):
    '''Calculate and return the pair correlation function 
       as a histogram.   
       args:
           coords: N by 3 array of particle coordinates
           sigma: particle diameter
           rmax: maximum radial coordinate for g(r)
           nbins: number of histogram bins
           L: length of cubic box
       returns:
           g: nbins by 1 pair correlation function
       There are easier, more efficient ways of doing this 
       (see np.histogram) but this simple pseudocode is
       written for illustrative purposes. 
    '''
    N = len(coords)      # particle number
    dr = rmax / nbins    # histogram bin width
    rho = N / L**3       # particle number density
    g = []               # pair correlation function
 
    for r in np.linspace(sigma, rmax, nbins+1)[:-1]:
        total = 0
        for vec in coords:
            distances = np.linalg.norm(minimum_image(coords - vec, 1.0), axis=-1) 
            # find distances of other particles from given particle 
            # (remembering the minimum image convention!)
 
            total += ((r <= distances) & (distances < r + dr)).sum()              
            # count how many are within [r, r+dr] of given particle
 
        V_shell = (4/3) * np.pi * ((r + dr)**3 - r**3) # spherical shell volume
        total = total / (rho * V_shell * N)            # normalization 
 
        g.append(total)
 
    return np.array(g)

This may be done in just two lines using the freud module - see created rdf object. The rest of the code is mostly just import statements and user-interfacing.

# typical import statements 
import argparse, gsd, gsd.hoomd
import numpy as np
from gsd.fl import GSDFile
import freud
import sys
 
# typical user-input arguments, loading hoomd trajectories, etc.
 
rdf = freud.density.RDF(bins=bins, r_max=args.max)
 
for f in frames_to_take:
    # sys write out statements 
    frame = trajectory[int(f)]
    points=frame.particles.position
    box = frame.configuration.box
    rdf.compute(system=(box, points), reset=False)
 
 
np.savetxt(args.output_file, np.c_[rdf.bin_centers,rdf.rdf],header="# bin, RDF")

Here's an implementation using np.histogram and scipy.spatial.distance:

def calc_pair_corr_hist(coords, sigma, rmax, nbins, L):
    '''Same as above but uses np.histogram'''
    N = len(coords)      # particle number
    dr = rmax / nbins    # histogram bin width
    rho = N / L**3       # particle number density
    g = []               # pair correlation function
 
    # problem is to convert an N by 3 coordinate list into
    # an N(N-1)/2 by 1 list of pairwise distances. This is easily 
    # done with scipy.spatial.distance.pdist
 
    from scipy.spatial.distance import pdist
    histogram = np.histogram(pdist(coords), bins=nbins, range=(0, nbins*dr))
    r = (histogram[1] + dr/2)[:-1] # centers of the bins
    r_shell_Vols = (4 * np.pi / 3) * ((r + dr/2)**3 - (r - dr/2)**3)
    hi_r = rho * r_shell_Vols
    return 2 * histogram[0] / hi_r / (N - 1)

What explains the normalization here? Notice the histogram array gives the count of particles in a given spherical shell. So we must divide by the shell volume $V_\textrm{shell} = \frac{4}{3}\pi[(r+dr)^3-(r-dr)^3]$ to get a density. Actually the histogram counts all pairwise distances so we must then normalize by the total density of pairs $N_\textrm{pairs} / V$ where $V$ is the total volume. Since $N_\textrm{pairs} = N(N-1)/2$ this yields $(\textrm{histogram} / V_\textrm{shell}) / (N_\textrm{pairs} / V) = 2 \textrm{histogram} / [(N/V) V_\textrm{shell} (N-1)]$ seen here.

Here's what $g(r)$ looks like for different phases.

coords_gas = np.random.rand(1000, 3)-0.5 # 1000-particle random gas
coords_crystal = cubic_lattice(10, 1.0)  # 10x10x10 cubic crystal 
plt.plot(np.linspace(0.05, 0.5, 101)[:-1], calc_pair_corr(coords_crystal, 0.05, 0.5, 100, 1.0), 'b*-', label = 'cubic crystal')
plt.plot(np.linspace(0.05, 0.5, 101)[:-1], calc_pair_corr(coords_gas, 0.05, 0.5, 100, 1.0), 'k*-', label = 'random gas')
plt.show()

$<\image>$ $<\image>$

This is done using quick Python code from scratch in a Jupyter notebook. However, using freud and hoomd we can get quite accurate and smoother-looking pcf's. Here's an example of a system near crystallinity:

We see that for a crystalline system, $g(r)$ will have well-defined peaks at well-defined locations (depending on the crystal geometry , multiples and factors of the lattice constant). For a fluid, the peaks will be smoother and less well-defined, and for a gas the pcf tends to unity after a small hump quite quickly.

The pcf can be used to determine bond lengths as well. A good estimate for the mean bond length $d$ is the location of the first peak in the pcf. However, authors [3] found that an even better estimate is found by fitting the first peak in the pcf to a skew-normal distribution (SND). Increasing temperature causes the first peak in the pcf to shift left or decrease, but in reality the bond length undergoes thermal expansion (increase, shift right) and this can be accounted for by fitting the first peak in the pcf to an SND rather than simply targeting the absolute maximum.

Bond distribution functions

Assuming harmonic bond potentials $V(r) = \frac{1}{2}K(r-r_0)^2$ with resting bond length $r_0$ and spring constant $K$, we expect a Gaussian bond distribution $p(r) \sim \exp(-\frac{V(r)}{kT}) = \exp[-\frac{K}{2kT}(r-r_0)^2]$. The normalization factor is obtained by setting $\int_0^\infty p(r) 4\pi r^2 dr = 1$. One could do this standard exercise in Gaussian integration and look it up in a table of integrals or do it in Mathematica (integrals of the form $\int r^n \exp(-\alpha^2 r^2) dr$ are ubiquitous in physics) ~~but there really is no need to reproduce the result here~~ (edit: actually normalization is necessary since the normalization factor is not a constant shift but dependent on $\alpha$ and $r_0$). Using Mathematica, the analytical normalization prefactor is computed to be

$$\frac{1}{\int_0^\infty \exp[-\alpha^2 (r-r_0)^2]4\pi r^2 dr} = \frac{\alpha^3}{2\exp(-\alpha^2 r_0^2) \pi r_0 \alpha + \pi^{3/2} (1+2r_0^2 \alpha^2)(1+\textrm{erf}(r_0 \alpha))}$$

Where $\alpha \equiv \sqrt{\frac{K}{2kT}}$. One can see that after histogramming a bond distribution from simulation, then normalizing to unit density, and then normalizing by shell volume, this should be directly comparable to $p(r)$ which is the Boltzmann distribution normalized by the above factor. See code below.

# import statements 
import numpy as np
import os
import hoomd
from hoomd import data
from hoomd import md
 
bond_dist_avg_hist = []
 
for f in frames_to_take:
 
    sys.stdout.write("\rCurrent frame %3d/%3d"%(f,N_frames))
    sys.stdout.flush()
    frame = trajectory[int(f)]
    box = frame.configuration.box
    bonds = frame.bonds.group
 
    bond_distances = np.linalg.norm(my_disp_in_box(\
                      frame.particles.position[bonds[:,1]] - \
                      frame.particles.position[bonds[:,0]], \
                      frame.configuration.box[0:3]), axis=-1) # my_disp_in_box implements MIC
    bond_dist_hist, bin_edges = np.histogram(\
                    bond_distances, bins=args.num_bins)
    bond_dist_hist = bond_dist_hist / np.sum(bond_dist_hist) # normalizes density to unity 
    bin_centers = (bin_edges[1:] + bin_edges[:-1]) / 2
    shellVols = (4 * np.pi / 3) * (bin_edges[1:]**3 - bin_edges[:-1]**3)
    bond_dist_avg_hist.append(bond_dist_hist / shellVols)
 
bond_dist_avg_hist = np.array(bond_dist_avg_hist)
 
np.savetxt(args.output_file, np.c_[bin_centers,bond_dist_avg_hist.mean(axis=0)],header="# bin, bond_len_hist")

def boltz(r, alpha, r0):
    '''Return expected bond length distribution'''
    return (alpha**3 / (2 * np.exp(-r0**2 * alpha**2) * np.pi * r0 * alpha + np.pi**1.5 * (1+2*r0**2 * alpha**2)*(1+erf(r0*alpha)))) * np.exp(-alpha**2 * (r-r0)**2)

References

D. Keffer, Structural Properties from Molecular Dynamics Simulation. MSE 614, Dept. of Materials Science & Engineering, University of Tennessee, Knoxville TN.
Dror, Ron. Energy functions and their relationship to molecular conformation. CS/CME/BioE/Biophys/BMI 279, Stanford University, Stanford CA. Sept. 22 and 24, 2020
Sukhomlinov, Sergey V.; Müser, Martin H. (2017). Determination of accurate, mean bond lengths from radial distribution functions. The Journal of Chemical Physics, 146(2), 024506–. doi:10.1063/1.4973804
Allen, Michael P., and Dominic J. Tildesley. Computer Simulation of Liquids. 2017.
http://www.physics.emory.edu/faculty/weeks//idl/gofr.html

StattLabWiki

Table of Contents

Radial distribution functions

Pair correlation function

Bond distribution functions

References