Executing STEMsalabim

STEMsalabim is executed on the command line and configured via input configuration files in libConfig syntax. To learn about the structure of the configuration files, please read Parameter files.
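As a rough illustration of the libConfig syntax only (the group and key names below are placeholders, not STEMsalabim's actual parameters; the authoritative list is given in Parameter files), a configuration file consists of named groups of key/value settings:

# illustrative sketch of libConfig syntax -- group and key names are made up
application: {
    num_threads = 32;              # integers are written without quotes
    output_file = "results.nc";    # strings are quoted
}
probe: {
    energy = 200.0;                # floating point value
}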

Note

Some of the configuration parameters can also be overridden via command line parameters, which are described in Command line arguments.
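For example, assuming the thread count set in the parameter file is one of the settings that can be overridden, the following call would replace it for a single run (using only options that appear elsewhere on this page):

$ stemsalabim --params=./my_config_file.cfg --num-threads=8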

STEMsalabim supports both threaded (shared memory) and MPI (distributed memory) parallelization. For the most efficient resource usage we recommend a hybrid approach, in which one MPI task is run per node and spawns multiple threads to parallelize the work within that node. (See Hybrid Parallelization model for more information on how STEMsalabim is parallelized.)

Thread-only parallelization

You can execute STEMsalabim on a single multi-core computer as follows:

$ stemsalabim --params=./my_config_file.cfg --num-threads=32

This will run the simulation configured in my_config_file.cfg with 32 threads, of which 31 are used as workers.
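If you are unsure how many cores are available, you can query the machine first (a generic shell sketch; nproc is part of GNU coreutils and may not exist on every system):

$ nproc                    # prints the number of available processing units
$ stemsalabim --params=./my_config_file.cfg --num-threads=$(nproc)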

MPI-only parallelization

For pure MPI parallelization without spawning additional threads, STEMsalabim must be called via mpirun or mpiexec, depending on the MPI implementation available on your machine:

$ mpirun -n 32 stemsalabim --params=./my_config_file.cfg --num-threads=1 --package-size=10

This command will run the simulation in parallel on 32 MPI processes without spawning additional threads.

Note

We chose a work package size of ten times the number of threads on each MPI process (which is 1 here), so that each thread calculates (on average) ten pixels before results are communicated via the network. This reduces management overhead but increases the amount of data sent over the network in each transfer.
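The same rule of thumb applied to the hybrid example below gives 16 threads × 10 = 160 pixels per package, which is exactly the --package-size=160 used there.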

Hybrid parallelization

Hybrid parallelization is the recommended mode to run STEMsalabim.

For hybrid parallelization, make sure that only a single MPI process is spawned on each node and that no CPU pinning is active, i.e., STEMsalabim must be able to spawn threads on different cores.

For example, to run a simulation in parallel on 32 machines using OpenMPI, with 16 cores used on each machine, we would run

$ mpirun -n 32 --bind-to none --map-by ppr:1:node:pe=16 \
    stemsalabim                                         \
        --params=./my_config_file.cfg                   \
        --num-threads=16                                \
        --package-size=160

The options --bind-to none --map-by ppr:1:node:pe=16 tell OpenMPI not to bind the processes to specific cores and to reserve 16 processing elements (cores) for each MPI process. Please refer to the manual of your MPI implementation to find out how exactly to run the software. On HPC clusters it is wise to contact the admin team about optimizing the simulation performance.
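As an illustration only, a batch script for a cluster managed by SLURM could look like the following sketch (the #SBATCH directives, resource numbers, walltime, and any module loading are placeholders that depend entirely on your cluster; they are not part of STEMsalabim itself):

#!/bin/bash
#SBATCH --nodes=32                # 32 machines
#SBATCH --ntasks-per-node=1       # one MPI process per node
#SBATCH --cpus-per-task=16        # 16 cores per MPI process
#SBATCH --time=24:00:00           # placeholder walltime

# placeholder: load whatever MPI/compiler modules your cluster provides
# module load openmpi

mpirun -n 32 --bind-to none --map-by ppr:1:node:pe=16 \
    stemsalabim                                        \
        --params=./my_config_file.cfg                  \
        --num-threads=16                               \
        --package-size=160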

Running the Si 001 example

In the source code archive you will find an examples/Si_001 folder that contains a simple example to get you started. The file Si_001.xyz describes a 2x2x36 unit cell Si sample. Please see Crystal file format for a description of the format.
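Purely to convey the flavor of such a file (the header lines and column layout shown here are assumptions for illustration, and the # annotations are explanations rather than file content; the actual layout, including any additional per-atom columns STEMsalabim expects, is defined in Crystal file format), an XYZ-style listing looks roughly like this:

2                            # number of atoms (assumed header line)
Si example cell              # comment/title line (assumed)
Si 0.0000 0.0000 0.0000      # element symbol followed by x, y, z coordinates (assumed columns)
Si 1.3578 1.3578 1.3578      # further per-atom columns may be required; see Crystal file format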

The file Si_001.cfg contains the simulation configuration and parameters. It lists all available parameters, regardless of whether they are set to their default values. We recommend always specifying a complete set of simulation parameters in the configuration files.

You can now run the simulation:

$ /path/to/stemsalabim --params Si_001.cfg --num-threads=8

After the simulation has finished (which takes about 3 hours on an 8-core Intel i7 CPU), you can analyze the results found in Si_001.nc. Please see the next page (Visualization of crystals and results) for details.
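Before moving on, you can quickly inspect the output file from the shell, assuming the .nc file is in NetCDF format (as its extension suggests) and the standard NetCDF command line utilities are installed on your system (they are not part of STEMsalabim):

$ ncdump -h Si_001.nc       # prints dimensions, variables, and attributes without the bulk data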