User Tools

Site Tools


simulation:analyzing_gsd_trajectories

Helpful tools

Freud and MDAnalysis are extremely useful python packages which can compute a wide range of properties from simulation data. Always check if they have a function implemented you can use.

We have some scripts in the shared GitHub repository https://github.com/astatt/analyze_scripts, check if something close to what you want to do already exists.

How to read and analyze a gsd trajectory

The binary file format gsd is used by hoomd-blue to write trajectories to the disk. You can't simply open them like text files, but there is a python package to read and write them. By using it, analysis scripts can be written easily:

import numpy as np
import gsd
import gsd.hoomd
 
# read gsd file
in_file = gsd.fl.open(name='input.gsd', mode='rb')
trajectory  = gsd.hoomd.HOOMDTrajectory(in_file)
N_frames = int(in_file.nframes)
 
for i,frame in enumerate(trajectory):
 
    # Quantities in the frame are numpy arrays
    Box = frame.configuration.box[0:3]
    pos = frame.particles.position
    vel = frame.particles.velocity   
    ids = frame.particles.typeid
 
    # Do calculation on a frame 
 
# Average, save, ... results

The hoomd-blue and gsd documentation can be used to look up how to access certain data. In general, data is stored as numpy arrays and all numpy/scipy functions are straightforward to use on the data.

Hoomd-blue stores the particle data in a wrapped format, so all coordinates are within -L/2,L/2 in each dimension and the origin of the simulation volume is at (0,0,0). The particle images are also written out. Note that the velocities are only written if dynamic =['momentum'] was specified in hoomd.dump.gsd(…).

When writing analysis scripts with multiple input/output parameters, the package argparse can be extremely helpful for processing parameters, defaults, as well as providing terminal help text output.

Calculate distances in a box with periodic boundary conditions

It is very common to have periodic boundary conditions (PBCs) in a simulation. In standard PBCs, particles that exit the simulation volume reenter on the opposite side and interact with the nearest periodic images of the other particles. PBCs mitigate surface effects from the small volume one would otherwise get.

When analyzing trajectories, distances have to be calculated by using the minimum periodic image convention. In practice, it is useful to use numpy for this task:

def dist_pbc(x0, x1, box):
    d = x0 - x1
    d_min_image = np.where(d > 0.5*box, d - box, np.where(d < -0.5*box, d + box , d))
    return d_min_image

This function is reasonably fast, works any kind of array-like input, as long as the dimensions of x0, x1 and box match. For example, if x0 and x1 are 3-D points, the function will return the distance between them. If x0 is an array of many 3-D points and x1 is a single 3-D point, the function will return an array with all distances between the points in x0 and x1.

For calculating many distances it is worthwhile to look into scipy functions and other existing code:

Calculate the center of mass in a box with periodic boundary conditions

Similar to calculating distances, the center of mass in a box with PBCs has to be calculated with care. The easiest is to apply a wrapping onto a circle/torus:

def calc_CM(points,box):
    # transform coords from -L/2,L/2 to 0,2pi range
    theta = (points + 0.5*box)/box*2.0*np.pi
    # calculate average
    si = np.sin(theta)
    co = np.cos(theta)
    theta_av = np.arctan2(-np.average(si,axis=0),-np.average(co,axis=0)) + np.pi
    # transform coords back to -L/2,L/2 range
    z_com = box*theta_av/(2.0*np.pi) -  0.5*box
    return z_com

Identify polymers in a gsd file based on their bonds

Find polymers in a gsd:

from collections import defaultdict
 
 
def connected_components(lists):
    R"""
    merges lists with common elements
 
    args: bonds from configuration in gsd file (i.e, frame.bonds.group)
 
    returns: list of connected particles by id (no specific sorting guaranteed) 
 
    Useful for finding bonded particles in a configuration, tested for
    linear polymers with consecutive bonds (e.g., 0-1-2-3-4-5, 6-7-8-9-10,..)
    and non consecutive ids (e.g., 0-5-6-8-10, 1-4-3-2-9,...). Will also work with 
    stars, combs,... (not sure about ring polymers, not tested.)
    Works with ints as well as str, and probably other types too.
 
    """
 
    neighbors = defaultdict(set)
    seen = set()
    for each in lists:
        for item in each:
            neighbors[item].update(each)
    def component(node, neighbors=neighbors, seen=seen, see=seen.add):
        nodes = set([node])
        next_node = nodes.pop
        while nodes:
            node = next_node()
            see(node)
            nodes |= neighbors[node] - seen
            yield node
    for node in neighbors:
        if node not in seen:
            yield sorted(component(node))
 

Usage is as simple as this:

 
trajectory = gsd.hoomd.open(name='test.gsd')
frame = trajectory[-1] # take last frame
bonds = frame.bonds.group
 
polymers = connected_components(bonds) 
 
for polymer in polymers:
    # do something with each polymer
    print(polymer)
simulation/analyzing_gsd_trajectories.txt · Last modified: 2023/01/05 20:51 by bj21

Donate Powered by PHP Valid HTML5 Valid CSS Driven by DokuWiki