Make plots
General remarks
There are many different programs for making figures, e.g. gnuplot, matplotlib, and Inkscape. When designing figures, keep the following in mind:
- Clarity - avoid ambiguity and confusion. Choose colors and symbols that work in both color and b/w printing and that remain distinguishable for colorblind readers. Use axis labels (with units!) and figure captions. Font size and type should match the manuscript text (although there is debate over serif vs. sans-serif fonts). Avoid excessive abbreviations whenever possible, and keep all plots/graphs consistent in style.
- Precision - present results truthfully and without distortion. Be careful with unconventional scaling and x/y axis choices, 3D representations, and y-axes that do not start at zero.
- Efficiency - no unnecessary “junk” in your graphs. Make the figure appropriate for the intended medium (talk, paper, poster).
- Resources:
- Color choices:
Histograms:
- Picking bin widths: https://onlinelibrary.wiley.com/doi/10.1002/pmrj.12145
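As a small illustration of how a bin width can be chosen from the data itself, here is a minimal sketch of the Freedman-Diaconis rule (bin width = 2 * IQR / n^(1/3)). This is one common rule of thumb and not necessarily the recommendation of the article linked above; the function name and the sample data are made up for the example.

```python
import math
import statistics

def freedman_diaconis_bins(data):
    """Return (bin_width, n_bins) according to the Freedman-Diaconis rule."""
    n = len(data)
    # statistics.quantiles with n=4 returns the three quartiles [Q1, median, Q3]
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("interquartile range is zero; choose the bin width manually")
    width = 2.0 * iqr / n ** (1.0 / 3.0)
    n_bins = max(1, math.ceil((max(data) - min(data)) / width))
    return width, n_bins

# hypothetical sample: the numbers 1..27
samples = [float(x) for x in range(1, 28)]
width, n_bins = freedman_diaconis_bins(samples)
print("bin width: %.3f, number of bins: %d" % (width, n_bins))
```

The resulting number of bins can then be passed to a histogram routine. Treat it as a starting point and check by eye whether the histogram still shows the features you care about.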
Matplotlib
Using the matplotlib package is often the easiest way of plotting data, especially if the analysis scripts are already written in Python.
    import matplotlib.pyplot as plt
    import numpy as np

    # Lennard-Jones potential, truncated (set to zero) at r_cut
    def LJ_potential(r,sigma,epsilon,r_cut):
        conds = [r < r_cut, r >= r_cut]
        funcs = [lambda x: 4*epsilon*((sigma/x)**(12)-(sigma/x)**6) , lambda x: 0]
        return np.piecewise(r, conds, funcs)

    fig, ax = plt.subplots(1, 1)

    r = np.arange(0.6,4,0.01)
    ax.plot(r, LJ_potential(r,1.0,1.0,3.0),c='k')

    ax.set_xlim([0.8,3.1])
    ax.set_ylim([-1.1,2])
    ax.set_xlabel(r'$r \ [\sigma]$')
    ax.set_ylabel(r'$U_\mathrm{LJ}(r) \ [\epsilon / k_\mathrm{B}]$')

    plt.savefig("example.pdf")
    plt.show()
This produces a pretty nice plot right away. There are many ways to customize plots. If the same settings are used over and over again, they can be put in a style sheet, as described here. Alternatively, each setting can be changed manually through rcParams, or in place through the arguments of the individual matplotlib functions:
    import numpy as np
    import matplotlib.pyplot as plt
    import matplotlib as mpl
    from mpl_toolkits.axes_grid1.inset_locator import inset_axes, mark_inset

    # fonts and font sizes
    mpl.rcParams['text.usetex'] = True
    mpl.rcParams['font.family'] = "serif"
    mpl.rcParams['font.serif'] = "Computer Modern"
    mpl.rcParams['legend.fontsize'] = 12
    mpl.rcParams['axes.labelsize'] = 12
    mpl.rcParams['font.size'] = 12
    mpl.rcParams['xtick.labelsize'] = 12
    mpl.rcParams['ytick.labelsize'] = 12

    # legend
    mpl.rcParams['legend.frameon'] = False
    mpl.rcParams['legend.shadow'] = False
    mpl.rcParams['legend.numpoints'] = 1
    mpl.rcParams['legend.scatterpoints'] = 1
    mpl.rcParams['legend.handlelength'] = 0.3
    mpl.rcParams['legend.handletextpad'] = 0.2

    # lines and markers
    mpl.rcParams['lines.linewidth'] = 1.5
    mpl.rcParams['lines.markersize']= 5

    # saving
    mpl.rcParams['savefig.bbox'] = "tight"
    mpl.rcParams['savefig.pad_inches'] = 0.1
    mpl.rcParams['savefig.transparent'] = True

    # ticks
    mpl.rcParams['xtick.major.size'] = 4
    mpl.rcParams['ytick.major.size'] = 4
    mpl.rcParams['xtick.minor.size'] = 1.5
    mpl.rcParams['ytick.minor.size'] = 1.5
    mpl.rcParams['xtick.minor.visible'] = True
    mpl.rcParams['ytick.minor.visible'] = True
    mpl.rcParams['ytick.right'] = True
    mpl.rcParams['xtick.top'] = True
    mpl.rcParams['xtick.direction'] = "in"
    mpl.rcParams['ytick.direction'] = "in"

    # Lennard-Jones potential, truncated (set to zero) at r_cut
    def LJ_potential(r,sigma,epsilon,r_cut):
        conds = [r < r_cut, r >= r_cut]
        funcs = [lambda x: 4*epsilon*((sigma/x)**(12)-(sigma/x)**6) , lambda x: 0]
        return np.piecewise(r, conds, funcs)

    fig, ax = plt.subplots(1, 1, figsize=(6,4))

    # inset axes that zoom in on the cutoff region
    axins = inset_axes(ax, width="60%", height="50%", loc=1,
                       bbox_to_anchor=(.2, .4, .6, .5),
                       bbox_transform=ax.transAxes)

    r = np.arange(0.6,4,0.001)
    ax.plot(r, LJ_potential(r,1.0,1.0,3.0),c='red',label='Lennard-Jones potential')
    axins.plot(r, LJ_potential(r,1.0,1.0,3.0),c='red')
    axins.axhline(y=0,c='k',linestyle='--')

    # draw the box and connecting lines between main plot and inset
    mark_inset(ax, axins, loc1=3, loc2=1, fc="none", ec="0.5")

    axins.set_xlim([2.95,3.05])
    axins.set_ylim([-0.01,0.005])

    ax.set_xlim([0.8,3.1])
    ax.set_ylim([-1.1,2])
    ax.set_xlabel(r'$r \ [\sigma]$')
    ax.set_ylabel(r'$U_\mathrm{LJ}(r) \ [\epsilon / k_\mathrm{B}]$')
    ax.legend(loc=4)

    plt.savefig("example.pdf")
    plt.show()
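One way to reuse a group of settings without changing the global rcParams is to collect them in a dictionary and apply them temporarily with matplotlib's rc_context. The following is only a sketch: the dictionary name and the chosen values are assumptions for illustration, and the Agg backend is selected just so that the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib as mpl
import matplotlib.pyplot as plt

# hypothetical collection of shared settings; adjust to taste
paper_style = {
    "font.size": 12,
    "axes.labelsize": 12,
    "legend.frameon": False,
    "lines.linewidth": 2.0,
    "xtick.direction": "in",
    "ytick.direction": "in",
}

linewidth_before = mpl.rcParams["lines.linewidth"]

# the settings apply only inside the with-block and are restored afterwards
with mpl.rc_context(paper_style):
    fig, ax = plt.subplots(1, 1, figsize=(6, 4))
    line, = ax.plot([0, 1, 2], [0, 1, 4], label="example data")
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    ax.legend()
    linewidth_inside = line.get_linewidth()

linewidth_after = mpl.rcParams["lines.linewidth"]
plt.close(fig)
```

For settings that should persist across scripts, the same key/value pairs can be stored in a .mplstyle file and loaded with plt.style.use().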
Format matplotlib for a two-column LaTeX document
Often, preparing figures for a manuscript means keeping the final document formatting in mind. As mentioned, there are many different design philosophies, but if you want your plot to match the look of a revtex4-1 document and fit into its columns, this should get you started:
    import matplotlib
    import matplotlib.pyplot as plt

    # set the font to be cm=Computer Modern=LaTeX default font
    matplotlib.rcParams['mathtext.fontset'] = 'cm'
    matplotlib.rcParams['font.family'] = 'STIXGeneral'

    # make the labels a bit smaller (default font size is 10)
    matplotlib.rcParams['xtick.labelsize'] = 9

    # this enables full LaTeX equation rendering
    matplotlib.rcParams['text.usetex'] = True

    # a column in a two-column revtex4-1 LaTeX article is 8.6 cm wide
    # you can also use 8 cm for a slightly smaller figure
    cm = 1/2.54
    fig, ax = plt.subplots(1,2, figsize=(8.6*cm,6*cm))

    # plot ...

    # make sure nothing overlaps
    plt.tight_layout()
    plt.savefig("my_shiny_plot.pdf")
    plt.show()
Then, in your LaTeX document, you can simply include this figure with:
    \begin{figure}[!t]
      \centering
      \includegraphics{my_shiny_plot.pdf}
      \caption{A shiny plot}
      \label{fig:shiny}
    \end{figure}
This should result in the correct font and font sizes. You can also explicitly specify the figure width with \includegraphics[width=8cm]{my_shiny_plot.pdf}, but note that scaling the figure also scales its fonts. It is always useful to render one figure and include it in the manuscript, tinker with it until it looks right, and then apply the desired formatting to all figures.
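Since matplotlib expects figure sizes in inches while journal column widths are usually quoted in centimeters, a tiny conversion helper avoids repeating the arithmetic. This is a minimal sketch; the function name figsize_cm is made up for illustration, and the 6 cm height is a free choice rather than a revtex4-1 requirement.

```python
CM_PER_INCH = 2.54

def figsize_cm(width_cm, height_cm):
    """Convert a figure size in centimeters to the (width, height) tuple in inches used by matplotlib."""
    return (width_cm / CM_PER_INCH, height_cm / CM_PER_INCH)

# single-column figure, 8.6 cm wide as quoted above
single_column = figsize_cm(8.6, 6.0)
print("figsize in inches: (%.3f, %.3f)" % single_column)
```

The returned tuple can be passed directly as the figsize argument, e.g. plt.subplots(1, 1, figsize=figsize_cm(8.6, 6.0)).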
Plot data from files
For reading data from files, use numpy.genfromtxt():
    import matplotlib.pyplot as plt
    import numpy as np

    fig, ax = plt.subplots(1, 1,figsize=(6, 4))

    # first column: density, second column: pressure
    # (raw strings keep the LaTeX backslashes in the labels intact)
    FS = np.genfromtxt('Frenkel_casestudy_09.dat')
    ax.scatter(FS[:,0], FS[:,1], label=r"$T = 0.9 \;\epsilon/ k_\mathrm{B}$", color='r')

    FS = np.genfromtxt('Frenkel_casestudy_20.dat')
    ax.scatter(FS[:,0], FS[:,1], label=r"$T = 2.0 \;\epsilon/ k_\mathrm{B}$", color='b')

    ax.set_xlim([0.0,1.0])
    ax.set_ylim([-0.5,2.5])
    ax.set_xlabel(r'$\rho \ [m/\sigma^3]$')
    ax.set_ylabel(r'$p \ [\epsilon / \sigma^3]$')
    ax.legend()

    plt.savefig("example.pdf")
    plt.show()
Plotting/Analyzing all files in a folder
If many files are all supposed to be plotted by the same script, Python can automatically find all files matching a certain pattern:
    import os
    import numpy as np
    import matplotlib.pyplot as plt

    # find all files in a folder 'out' which end on "pressure.dat"
    files_to_plot = []
    for subdir, dirs, files in os.walk('./out/'):
        for filename in files:
            # os.path.join also gives correct paths for files in nested subfolders
            filepath = os.path.join(subdir, filename)
            if filepath.endswith(".pressure.dat"):
                files_to_plot.append(filepath)

    fig, ax = plt.subplots(1, 1)

    for i,f in enumerate(files_to_plot):
        data = np.genfromtxt(f)  # open the file
        ax.plot(data[:,0],data[:,1], label="%s"%f)  # plot

    ax.set_xlabel(r'"time" $\ [\mathrm{ MC \ cycles}]$')
    ax.set_ylabel(r'$p \ [\epsilon / \sigma^3]$')

    # place the legend outside the axes and make sure it is not cut off when saving
    l = ax.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
    plt.savefig("example.pdf",bbox_extra_artists=(l,), bbox_inches='tight')
    plt.show()
If only some files need to be plotted, a simple list can be provided:
files_to_plot = ['./out/file1.dat','./out/file2.dat','./out/file3.dat']
This is also useful if the files don't follow a common pattern.
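As an alternative to walking the directory tree by hand, the pattern search can be sketched with pathlib from the standard library. The helper name find_data_files and the file names in the demonstration are hypothetical; the temporary directory only exists to make the example self-contained.

```python
import tempfile
from pathlib import Path

def find_data_files(root, suffix=".pressure.dat"):
    """Return a sorted list of all files below root whose name ends with suffix."""
    return sorted(p for p in Path(root).rglob("*" + suffix) if p.is_file())

# demonstration on a throwaway directory tree with made-up file names
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "out"
    (out / "run1").mkdir(parents=True)
    (out / "a.pressure.dat").write_text("0 1.0\n")
    (out / "run1" / "b.pressure.dat").write_text("0 2.0\n")
    (out / "run1" / "notes.txt").write_text("not data\n")
    found = [p.relative_to(out).as_posix() for p in find_data_files(out)]

print(found)
```

Sorting the result makes the plotting order reproducible, since the order in which the file system returns entries is not guaranteed.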
A more complicated example, where all pressure data is averaged and plotted:
    import os, re
    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib import cm

    # tail correction to the pressure for a Lennard-Jones potential truncated at cutoff
    def P_correction(rho,sigma,epsilon,cutoff):
        c = 16/3.*np.pi*rho**2*sigma**3*epsilon*(2/3.0*(sigma/cutoff)**9-(sigma/cutoff)**3)
        return c

    # find all files in a folder which end on "pressure.dat"
    files_to_plot = []
    L = []
    N = []
    T = []
    for subdir, dirs, files in os.walk('./out/'):
        for filename in files:
            filepath = os.path.join(subdir, filename)
            if filepath.endswith(".pressure.dat"):
                # finds all floats in filepath - expecting three, in the order T, L and N
                all_floats = re.findall(r"[-+]?\d*\.\d+|\d+", filepath)
                if len(all_floats)>=3:
                    T.append(float(all_floats[0]))
                    L.append(float(all_floats[1]))
                    N.append(float(all_floats[2]))
                    files_to_plot.append(filepath)

    # sort all lists according to the values in L, e.g. to make sure we plot in order
    # (not actually strictly necessary); using one common index order keeps the
    # lists aligned even if several files share the same L
    order = sorted(range(len(L)), key=lambda i: L[i])
    files_to_plot = [files_to_plot[i] for i in order]
    N = [N[i] for i in order]
    T = [T[i] for i in order]
    L = [L[i] for i in order]

    # Define the colors to be used using rainbow map (or any other map),
    # or any other color list, or just a single color
    colors = [cm.rainbow(i) for i in np.linspace(0, 1, len(files_to_plot))]

    # now to the actual plotting:
    fig, (ax1,ax2) = plt.subplots(1, 2)

    FS = np.genfromtxt('Frenkel_casestudy_09.dat')
    ax1.plot(FS[:,0], FS[:,1], label="FS", color='k')
    FS = np.genfromtxt('Frenkel_casestudy_20.dat')
    ax2.plot(FS[:,0], FS[:,1], label="FS", color='k')

    # constants
    sigma = 1.
    epsilon = 1.
    cutoff = 3.

    for i,f in enumerate(files_to_plot):
        data = np.genfromtxt(f)
        data = data[20:]  # discard the first 20 rows
        rho = N[i]/L[i]**3
        P = np.average(data[:,1])
        P_ex = P_correction(rho,sigma,epsilon,cutoff)
        if T[i]==0.9:
            ax1.scatter(rho, P+P_ex, label="%1.3f"%rho, color=colors[i])
        else:
            ax2.scatter(rho, P+P_ex, label="%1.3f"%rho, color=colors[i])

    ax1.set_xlabel(r'$\rho \ [m/\sigma^3]$')
    ax1.set_ylabel(r'$p \ [\epsilon / \sigma^3]$')
    ax1.legend()

    ax2.set_xlabel(r'$\rho \ [m/\sigma^3]$')
    ax2.set_ylabel(r'$p \ [\epsilon / \sigma^3]$')
    ax2.legend()

    plt.savefig("example.pdf")
    plt.show()
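Extracting parameters from file names and keeping them aligned while sorting can be sketched in isolation as follows. The naming scheme T<temperature>_L<box length>_N<particles>.pressure.dat and the file names are assumptions made for this example; the actual file names used with the script above are not specified.

```python
import re

# hypothetical file names following the assumed naming scheme
filenames = [
    "T2.0_L10.0_N500.pressure.dat",
    "T0.9_L8.0_N500.pressure.dat",
    "T0.9_L10.0_N500.pressure.dat",
]

# named groups make it explicit which number is which parameter
pattern = re.compile(r"T(?P<T>[\d.]+)_L(?P<L>[\d.]+)_N(?P<N>\d+)\.pressure\.dat$")

runs = []
for name in filenames:
    match = pattern.search(name)
    if match is None:
        continue  # skip files that do not follow the naming scheme
    runs.append({
        "file": name,
        "T": float(match.group("T")),
        "L": float(match.group("L")),
        "N": int(match.group("N")),
    })

# a single sort keeps all parameters of a run together,
# even when several runs share the same box length L
runs.sort(key=lambda run: (run["L"], run["T"]))

for run in runs:
    rho = run["N"] / run["L"] ** 3  # number density
    print(run["file"], run["T"], "%.3f" % rho)
```

Storing each run as one record, instead of in several parallel lists, means the parameters cannot get out of step with the file they belong to.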
Extracting data from plot images
Often, you want to have the data from a plot/graph in a publication or book. It can be pretty difficult to read them off by eye, especially if the axes are logarithmic. g3data is a nice, but old, tool to do this.
- install g3data
- take screenshot of the plot in question and save it as png somewhere
- open the png with g3data
- set reference points/values on the axes, set logarithmic axes if needed
- click on the data points you want to extract
- save as txt somewhere
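The arithmetic behind the reference points in the steps above can be sketched as an interpolation between two known axis positions, done in log10 space for logarithmic axes. This is an illustration of the idea only, not g3data's actual implementation; the function name and the pixel values are made up.

```python
import math

def pixel_to_data(pixel, pixel_ref, value_ref, log_scale=False):
    """Map a pixel coordinate to a data value along one axis.

    pixel_ref: (p0, p1) pixel positions of two reference points on the axis
    value_ref: (v0, v1) data values at those two reference points
    """
    p0, p1 = pixel_ref
    v0, v1 = value_ref
    fraction = (pixel - p0) / (p1 - p0)
    if log_scale:
        # on a logarithmic axis, interpolate linearly in log10(value)
        log_value = math.log10(v0) + fraction * (math.log10(v1) - math.log10(v0))
        return 10.0 ** log_value
    return v0 + fraction * (v1 - v0)

# hypothetical reference points: pixel 100 corresponds to 1, pixel 500 to 100
on_log_axis = pixel_to_data(300, (100, 500), (1.0, 100.0), log_scale=True)
on_linear_axis = pixel_to_data(300, (100, 500), (1.0, 100.0))
print(on_log_axis, on_linear_axis)
```

The comparison of the two results shows why the axis type matters: the same pixel position corresponds to very different values on a logarithmic and on a linear axis.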
Install g3data on Mac
    brew install automake autoconf gtk+
    git clone https://github.com/pn2200/g3data.git
    cd g3data/
    autoreconf -i
    ./configure
    make
    ./g3data/g3data
Install g3data on Ubuntu
    sudo apt-get install g3data