General cmake installation recipe

Many programs and packages can be installed by the same general recipe:

git clone ... # or git pull etc.
cd program-name # or foldername
mkdir -p build && cd build # go to or create a 'build' folder
cmake .. -D 'some_flags'
make
make install # and/or make test, ...

Some examples can be found here:

Hoomd-blue: install_hoomd_blue and official hoomd installation guide
LAMMPS: Install lammps and official LAMMPS installation guide
Gromacs: installation guide
Ovito: installation guide

Download source code

The first step is to download the code. Usually, this has to be done only once, especially if git is involved. If the code is hosted on a GitHub/GitLab/Bitbucket page, the command looks like this: git clone https://url-to-code. This downloads the source code and puts it in a new folder in the current directory. The folder name is usually the name of the repository or program, but it can be changed with git clone https://url-to-code <foldername>. If the code isn't hosted in a git repository, manually downloading and unpacking an archive (.tar or .zip) replaces this step. Either way, the source code has to come from somewhere.
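The two forms of git clone described above can be tried without network access. The following is only a sketch: a throwaway local repository stands in for https://url-to-code, and the names demo.git and my-folder are made up for illustration.

```shell
set -e
workdir=$(mktemp -d)            # scratch area, safe to delete afterwards
cd "$workdir"

# Stand-in for a remote repository (hypothetical name 'demo.git').
git init --quiet --bare demo.git

# Default: the new folder is named after the repository ('demo').
git clone --quiet demo.git

# With an explicit folder name as the last argument.
git clone --quiet demo.git my-folder

ls -d demo my-folder
```

Because the stand-in repository is empty, git may print a warning about cloning an empty repository. That is expected here and does not occur with a real code base.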

Compile the source code

The details of how to compile a code strongly depend on the code.

Many modern codes keep the files generated or used during compilation separate from the source files. For C++, for example, the object files (.o) generated during compilation are kept in a different folder than the .h and .cc or .cpp files. This makes cleanup easier and safer, as source files are less likely to be overwritten or deleted by accident. It also means the generated files can easily be excluded from the git repository, since they are specific to each computer/user. The canonical name for the folder that holds files generated by the compiler or during the compiling process is build. Anything in this folder can be safely deleted. So the next step after obtaining the source code is to create this folder or navigate into it. Often, this looks like this:

cd program-name # or foldername 
mkdir build # create the folder 'build'
cd build  # go to the folder 'build'
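Because everything in the build folder can be regenerated, the folder is usually kept out of version control. A minimal sketch, assuming a project folder called program-name (a placeholder) and that ignoring the folder through a .gitignore file suits the project:

```shell
set -e
workdir=$(mktemp -d)
mkdir "$workdir/program-name"   # placeholder for a freshly cloned project
cd "$workdir/program-name"

mkdir -p build                  # -p: no error if 'build' already exists
echo 'build/' >> .gitignore     # tell git to ignore everything in 'build'

cat .gitignore
```

Many projects already ship a .gitignore with such an entry, in which case the second command is not needed.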

After this, cmake, make, a compiler, or a script is used to actually compile the code. This step may differ substantially from code to code. A fairly common combination is to use cmake and then make.

To understand why this is convenient, it is useful to take a couple of steps back. Using C++ as an example, a simple code could be compiled in the terminal like this: g++ -o <name-for-executable> source1.cpp source2.cpp ... If the code needs specific flags (e.g. -g -Werror …), they have to be added to the command as well. If the code is in sub-folders, their paths have to be included. If libraries are used, their paths have to be specified too, which can get long and complicated very quickly. The command also depends on the current computer/system. Remembering this very long, complicated, system-specific command line is cumbersome.
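To see how quickly such a command line grows, here is a small sketch with a made-up project: two source files in src and one header in include (all file and folder names are hypothetical). The compile step is skipped if g++ is not installed.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p src include

# Hypothetical two-file project with a header in a separate folder.
cat > include/util.h <<'EOF'
int add(int a, int b);
EOF
cat > src/util.cpp <<'EOF'
#include "util.h"
int add(int a, int b) { return a + b; }
EOF
cat > src/main.cpp <<'EOF'
#include <iostream>
#include "util.h"
int main() { std::cout << add(1, 2) << "\n"; return 0; }
EOF

# Flags, include path, and every source file go on one command line.
if command -v g++ >/dev/null 2>&1; then
    g++ -g -Werror -I include -o demo src/main.cpp src/util.cpp
    ./demo
else
    echo "g++ not found, skipping compilation"
fi
```

Even for this tiny example the command needs two flags, one include path, and two source files. External libraries would add further -L and -l arguments.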

One way of handling this issue is using make. Make is a build system: it reads a Makefile and drives the compiler and other build tools to build the code. The Makefile contains a list of very rigid instructions on how to generate certain targets. While Make can find some libraries automatically, paths and instructions are often hard-coded into the Makefile. Makefiles can contain multiple targets, for example one with debug flags, another one with flags/library locations specific to a cluster, and so on. This way, the long command line can be saved in a Makefile, and the code gets compiled by typing make <target> in the terminal instead of the g++ … command. This is still operating system dependent and quite manual, as each target has to be added by hand depending on operating system, locations, paths, etc.
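As an illustration of targets, the sketch below writes a tiny Makefile with two hypothetical targets, hello and debug, and asks make to print the command each one would run (make -n is a dry run, so nothing is compiled and no compiler is needed). The file names and flags are assumptions chosen for the example.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"
touch hello.cpp                 # placeholder source file, never compiled here

# Recipe lines in a Makefile must start with a tab, hence printf with \t.
{
    printf 'CXX = g++\n'
    printf 'hello: hello.cpp\n'
    printf '\t$(CXX) -O2 -o hello hello.cpp\n'
    printf 'debug: hello.cpp\n'
    printf '\t$(CXX) -g -Werror -o hello_debug hello.cpp\n'
} > Makefile

debug_cmd=""
if command -v make >/dev/null 2>&1; then
    make -n hello               # dry run: print the command for 'hello'
    debug_cmd=$(make -n debug)  # same for the 'debug' target
    echo "$debug_cmd"
else
    echo "make not found, skipping"
fi
```

Typing make debug (without -n) would run the stored command instead of just printing it.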

CMake is a generator of build systems, i.e. it adds a layer above the Makefile. Depending on the code and operating system, it can produce Makefiles, Ninja build files, KDevelop or Xcode projects, or Visual Studio solutions. It uses a CMakeLists.txt file as input. This way, a code can be platform or operating system independent. CMake will automatically attempt to find the prerequisites (e.g. installed programs, libraries, files, etc.) and deduce their locations. It then generates the Makefile based on that information. Once set up properly, it also allows influencing the specifics of the compiling process through flags: -D 'some_flags'. Commonly, they change which debug options are used, which versions, which optional libraries/components/parts, whether documentation is built, whether tests are compiled, where to install to, etc. This is a very convenient way to encode all these cases in one system. CMake can also be used to pass information between git, the code, and the documentation. A good example is a version number, which CMake can get from git and forward to the code as a #define VERSION preprocessor definition, and also to the documentation. This makes maintenance easier, as the version number has to be changed in a single spot instead of three different spots. CMake has a long list of sophisticated features like this.
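One possible way to forward a version number from git to the code is sketched below. It is an assumption about how such a setup could look, not the method of any particular project: the names DEMO_VERSION, git_status, and version.h.in are made up, and the sketch falls back to "unknown" when git or a repository is not available. The project declares no language, so no compiler is needed.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"

cat > CMakeLists.txt <<'EOF'
cmake_minimum_required(VERSION 3.10)
project(demo LANGUAGES NONE)   # NONE: no compiler needed for this sketch

# Ask git for a version string; fall back if git or the repository is missing.
execute_process(
    COMMAND git describe --tags --always
    WORKING_DIRECTORY ${CMAKE_SOURCE_DIR}
    OUTPUT_VARIABLE DEMO_VERSION
    OUTPUT_STRIP_TRAILING_WHITESPACE
    ERROR_QUIET
    RESULT_VARIABLE git_status)
if(NOT git_status EQUAL 0)
    set(DEMO_VERSION "unknown")
endif()

# Fill in @DEMO_VERSION@ and write the header into the build folder.
configure_file(version.h.in version.h @ONLY)
EOF

cat > version.h.in <<'EOF'
#define VERSION "@DEMO_VERSION@"
EOF

mkdir build
cd build
if command -v cmake >/dev/null 2>&1 && command -v make >/dev/null 2>&1; then
    cmake .. > /dev/null
    cat version.h               # the generated header
else
    echo "cmake or make not found, skipping"
fi
```

The source code would then include the generated version.h, and the same variable could be passed on to the documentation configuration.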

It can be somewhat difficult to remember all the potential flags. ccmake . (executed in the build folder) is a little tool that provides a list of the flags and their current values. Both cmake and make are always executed in the build folder. Once this recipe is set up correctly, a user does not need to remember or know the details of this process (such as specific linking paths) and can simply use the cmake flags to steer the compiler from the outside.
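For a non-interactive look at the flags, cmake itself can list the cached values with -L. The sketch below defines one hypothetical project flag, BUILD_DOCS, sets it from the outside with -D, and then lists the cache. The project declares no language, so no compiler is needed.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"

cat > CMakeLists.txt <<'EOF'
cmake_minimum_required(VERSION 3.10)
project(demo LANGUAGES NONE)   # NONE: no compiler needed for this sketch
# Hypothetical project-specific switch, OFF unless requested.
option(BUILD_DOCS "Build the documentation" OFF)
message(STATUS "BUILD_DOCS is ${BUILD_DOCS}")
EOF

mkdir build
cd build
flags=""
if command -v cmake >/dev/null 2>&1 && command -v make >/dev/null 2>&1; then
    cmake .. -D BUILD_DOCS=ON > /dev/null   # set the flag from outside
    flags=$(cmake -N -L .)                  # -N: only read the cache, -L: list it
    echo "$flags"
else
    echo "cmake or make not found, skipping"
fi
```

The listing shows each flag together with its type and current value, for example BUILD_DOCS:BOOL=ON, which is the same information ccmake presents interactively.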

Testing the code

Modern code should come with tests that can be executed to check that it compiled correctly. Often, make test, make tests, or similar will run them. Here, test plays the same role as a target in a normal Makefile.
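In CMake-based projects, the test target is typically provided by CTest. A minimal sketch, assuming a single made-up test called smoke that simply echoes a word and therefore always passes:

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"

cat > CMakeLists.txt <<'EOF'
cmake_minimum_required(VERSION 3.10)
project(demo LANGUAGES NONE)   # NONE: no compiler needed for this sketch
enable_testing()               # makes the 'test' target available
# Hypothetical test: passes if the command exits with status 0.
add_test(NAME smoke COMMAND ${CMAKE_COMMAND} -E echo "hello")
EOF

mkdir build
cd build
test_output=""
if command -v cmake >/dev/null 2>&1 && command -v make >/dev/null 2>&1; then
    cmake .. > /dev/null
    test_output=$(make test)    # runs all registered tests
    echo "$test_output"
else
    echo "cmake or make not found, skipping"
fi
```

Real projects register their own test executables in the same way. Whether the target is called test, tests, or something else is up to the project.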

Installing the code

The last step, after the code compiled correctly, is installing it with make install. Again, install is the target. This is usually pretty fast, as it just copies the executable (and potentially other things) to the right folder.
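Where "the right folder" is can usually be chosen with the CMake variable CMAKE_INSTALL_PREFIX. The sketch below installs a made-up script, hello.sh, into a local install folder instead of a system location. The script and folder names are assumptions for the example.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"

cat > CMakeLists.txt <<'EOF'
cmake_minimum_required(VERSION 3.10)
project(demo LANGUAGES NONE)   # NONE: no compiler needed for this sketch
# Hypothetical "program": a script copied to <prefix>/bin on install.
install(PROGRAMS hello.sh DESTINATION bin)
EOF
printf '#!/bin/sh\necho hello\n' > hello.sh

mkdir build
cd build
if command -v cmake >/dev/null 2>&1 && command -v make >/dev/null 2>&1; then
    cmake .. -D CMAKE_INSTALL_PREFIX="$workdir/install" > /dev/null
    make install > /dev/null    # copies hello.sh to $workdir/install/bin
    ls "$workdir/install/bin"
else
    echo "cmake or make not found, skipping"
fi
```

Installing into a folder inside the home directory like this avoids the need for administrator rights, which is often relevant on clusters.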

When to execute what part of the recipe

Sometimes it can be difficult to know what to execute when.

In very general terms, downloading the git repository or source code only has to be done once. Unless the whole folder was deleted by accident or a completely different code base should be used, the initial git clone <url-location> is used exactly once. It doesn't hurt to do it multiple times, but it is unnecessary. If the code got updated remotely, a git pull anywhere in the project folder (including in the build folder!) should be sufficient. Because the git repository is already set up, git knows the url or location where to pull from.

If the repository is updated with git pull instead of being cloned again with git clone <url-location>, the build folder doesn't have to be re-created either.

Any time the flags are changed (e.g. switching debug on, or turning the documentation build on or off), cmake needs to be re-executed. The same holds if a new source code file is added, the file structure is changed, or a new version of a prerequisite is installed.

If an existing file is simply changed, only make install needs to be re-executed. This means that during normal code development, the workflow could look as simple as this:

implement new function # or feature in existing files, edit, etc,..
cd build # if not already in right folder 
make  # check that it compiles, deal with compiler errors etc,..
make test # or 'make tests'; test new code
make install # install new code

Often, lines 1 and 3 are repeated in a loop until the code compiles, then lines 1-4 until the tests pass, and then lines 1-5. It might be useful to keep a terminal open in the build folder, where lines 3-5 can be executed over and over.
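The reason a plain make is enough after editing a file is that make compares time stamps and only rebuilds what is out of date. The toy sketch below shows this with cp standing in for the compiler. The file names are made up, and make -q only reports through its exit status whether anything needs to be rebuilt.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"
echo "v1" > in.txt
# One rule: out.txt is "built" from in.txt by copying it.
printf 'out.txt: in.txt\n\tcp in.txt out.txt\n' > Makefile

first=""
second=""
if command -v make >/dev/null 2>&1; then
    make                                # first run: builds out.txt
    make -q && first="up to date"       # nothing changed, nothing to do
    sleep 1                             # make sure the new time stamp differs
    echo "v2" > in.txt                  # "edit" the source file
    make -q || second="needs rebuild"   # make notices the newer source
    make                                # rebuilds only what is out of date
    echo "$first / $second"
else
    echo "make not found, skipping"
fi
```

In a real project the same mechanism means that editing one source file recompiles only that file and relinks, rather than rebuilding everything.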

Sometimes a really invalid or odd state is created by accident, and it is not clear at all what is going wrong, especially with really odd compiler errors. It can be helpful to start from scratch to track down the problem:

cd <top-program-folder> # e.g. the one with the 'README.md' file
rm -rf build/  # remove everything the compiler generated 
mkdir build 
cd build  # now an empty clean slate!
cmake .. -D 'flags' # check flags, cmake output etc,..
make  # check that it compiles, deal with compiler errors etc,..
make test # or 'make tests'; test new code
make install # install new code

This way, the cmake output can be checked: it prints all the compilers, packages, etc. that were found.

If the source code or parts of it were deleted accidentally or messed up for some reason, git can be used to recover the file or the state of the code. Depending on the severity, it can also be really useful to re-execute the cmake .. -D 'flags' step.
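Recovering a single deleted file could look like the sketch below. It uses a throwaway repository with a made-up file main.cpp and a placeholder author identity. git checkout -- <file> brings back the last committed version of that file.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"
git init --quiet demo           # throwaway repository for this sketch
cd demo
echo 'int main() { return 0; }' > main.cpp
git add main.cpp
# Placeholder identity so the commit works without any git configuration.
git -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet -m "Add main.cpp"

rm main.cpp                     # the "accident"
git status --short              # lists main.cpp as deleted
git checkout -- main.cpp        # restore the last committed version
cat main.cpp
```

Changes that were never committed cannot be recovered this way, which is one more reason to commit often.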
