Some configuration parameters can be changed via command line parameters, which are described in Command line arguments.
STEMsalabim supports both threaded (shared memory) and MPI (distributed memory) parallelization. For the most efficient resource usage we recommend a hybrid approach, in which one MPI task runs per node and spawns several threads to parallelize the work within that node. (See Hybrid Parallelization model for more information on how STEMsalabim is parallelized.)
You can execute STEMsalabim on a single multi-core computer as follows:
$ stemsalabim --params=./my_config_file.cfg --num-threads=32
This will run the simulation configured in my_config_file.cfg on 32 cores, of which 31 are used as workers.
MPI-only parallelization
For pure MPI parallelization without spawning additional threads, STEMsalabim must be called via mpirun or mpiexec, depending on the MPI implementation available on your machine:
$ mpirun -n 32 stemsalabim --params=./my_config_file.cfg --num-threads=1 --package-size=10
This command will run the simulation in parallel on 32 MPI processes without spawning additional threads.
We chose a work package size of ten times the number of threads on each MPI process (which is 1 here), so that each thread calculates (on average) ten pixels before results are communicated via the network. This reduces management overhead but increases the amount of data sent via the network.
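The rule of thumb above is simple arithmetic. The following is a minimal sketch, not part of STEMsalabim: the helper name `package_size` and its `pixels_per_thread` argument are hypothetical, and the factor of ten is taken from the text.

```python
def package_size(num_threads: int, pixels_per_thread: int = 10) -> int:
    """Return a value for --package-size so that each thread computes
    `pixels_per_thread` pixels on average before results are sent.

    Hypothetical helper; the default of 10 follows the rule of thumb above.
    """
    if num_threads < 1 or pixels_per_thread < 1:
        raise ValueError("num_threads and pixels_per_thread must be positive")
    return num_threads * pixels_per_thread


# Pure MPI run: one thread per MPI process.
print(package_size(1))   # 10
# 16 threads per MPI process.
print(package_size(16))  # 160
```

The resulting number is what would be passed as --package-size on the command line.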
Hybrid parallelization is the recommended mode to run STEMsalabim.
For hybrid parallelization, make sure that on each node only a single MPI process is spawned and that there is no CPU pinning active, i.e., STEMsalabim needs to be able to spawn threads on different cores.
For example, if we wanted to run a simulation in parallel on 32 machines using OpenMPI and on each machine use 16 cores, we would run
$ mpirun -n 32 --bind-to none --map-by ppr:1:node:pe=16 \
    stemsalabim \
    --params=./my_config_file.cfg \
    --num-threads=16 \
    --package-size=160
The options --bind-to none --map-by ppr:1:node:pe=16 tell OpenMPI not to bind the process to anything and to reserve 16 threads for each instance. Please refer to the manual of your MPI implementation to figure out exactly how to run the software. On HPC clusters it is wise to ask the admin team for help with optimizing simulation performance.
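When the node count or cores per node change between jobs, the command line can be assembled programmatically. The sketch below is an illustration under stated assumptions, not an official launcher: `hybrid_command` is a hypothetical helper, it uses only the OpenMPI and STEMsalabim options shown above, and it assumes a package size of ten pixels per thread.

```python
import shlex


def hybrid_command(nodes, cores_per_node, params, pixels_per_thread=10):
    """Build the argument list for a hybrid OpenMPI run (hypothetical helper).

    One MPI process is placed on each node, and it spawns `cores_per_node`
    threads. The package size follows the ten-pixels-per-thread rule of thumb.
    """
    return [
        "mpirun", "-n", str(nodes),
        "--bind-to", "none",                             # no CPU pinning
        "--map-by", f"ppr:1:node:pe={cores_per_node}",   # one process per node
        "stemsalabim",
        f"--params={params}",
        f"--num-threads={cores_per_node}",
        f"--package-size={cores_per_node * pixels_per_thread}",
    ]


cmd = hybrid_command(32, 16, "./my_config_file.cfg")
print(shlex.join(cmd))  # shell-quoted command line for inspection
```

The list can be handed to `subprocess.run(cmd)` or pasted into a job script after printing. Other MPI implementations will need different placement options.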
Si 001 example
In the source code archive you will find an examples/Si_001 folder that contains a simple example that you can execute to get started. The file Si_001.xyz describes a 2x2x36 unit cell Si sample. Please see Crystal file format for the format description.
The file Si_001.cfg contains the simulation configuration. It lists all available parameters, regardless of whether they are set to their default values. We recommend always specifying a complete set of simulation parameters in the configuration files.
You can now run the simulation:
$ /path/to/stemsalabim --params Si_001.cfg --num-threads=8
After the simulation has finished (about 3 hours on an Intel i7 CPU with 8 cores), you can analyze the results found in Si_001.nc. Please see the next page (Visualization of crystals and results) for details.