Modules Environment, Slurm and Spack
1. Purpose of Software Environment Modules
The Environment Modules package allows us to simultaneously offer many software products in multiple versions. When we upgrade software, the previous versions will still be available to users, thus avoiding unplanned disruptions of ongoing projects.
Use the Modules environment commands to coherently set and unset all paths and environment variables (e.g. PATH, MANPATH, CPATH, LD_RUN_PATH, etc.) that are necessary to use available (and sometimes conflicting) software packages on our HPC systems, by simply loading or unloading the corresponding module files.
We use various methods to install software, in particular: manual installation, Anaconda, and Spack. The organization of the modules on our systems reflects these differences.
On a typical UIBK system, the available sections may look like
- Application-Software
- Development
- Python+R-Anaconda3
- Spack-Instances
- Spack-leo5-20230116
Most installed third-party software packages are available in a two-level structure of module files
application/version.
Example:
matlab/R2022a
For software installed with Spack, the structure is
software/version-toolchain[...]-hash.
The toolchain is the compiler used to build the given package.
Example:
zlib/1.2.13-gcc-8.5.0-xlt7jpk
is the module for the zlib library built with GCC 8.5.0. The 7-character hash makes it possible to distinguish between multiple variants of the same software.
Some module names contain additional components, such as the MPI version used with a parallel software package.
Example:
fftw/3.3.10-openmpi-4.1.4-intel-2021.7.1-ndq6d76
is the module for the FFTW library, linked with OpenMPI 4.1.4 and built with the Intel Classic toolchain.
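As an illustration of the naming scheme described above, the following sketch splits a Spack-style module name into its components using plain shell parameter expansion. The function name split_module_name and the field label "build" are hypothetical. The sketch assumes that the version string itself contains no hyphen, which holds for the examples shown here but is not guaranteed for every package.

```shell
#!/bin/bash
# Sketch only: split a module name of the form
#   software/version-toolchain[...]-hash
# Assumption: the version contains no hyphen.
split_module_name() {
    local name="$1"
    local software="${name%%/*}"   # text before the slash
    local rest="${name#*/}"        # text after the slash
    local version="${rest%%-*}"    # up to the first hyphen
    local hash="${rest##*-}"       # after the last hyphen
    local build="${rest#*-}"       # drop the version ...
    build="${build%-*}"            # ... and the hash
    printf 'software=%s version=%s build=%s hash=%s\n' \
        "$software" "$version" "$build" "$hash"
}

split_module_name "zlib/1.2.13-gcc-8.5.0-xlt7jpk"
split_module_name "fftw/3.3.10-openmpi-4.1.4-intel-2021.7.1-ndq6d76"
```

For the second name, the "build" field contains both the MPI version and the compiler (openmpi-4.1.4-intel-2021.7.1). Two-level names such as matlab/R2022a are outside the scope of this sketch.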
When you issue a module load command, e.g.
module load application/version
the module's environment variables will be set in your current shell.
Please note: All software that comes pre-installed with the operating system, such as the default GNU Compiler Collection (GCC) and many basic Linux utilities, can be accessed without loading modules. Different versions of the same software may be available via the Modules environment.
2. Setting up the Modules environment
The Modules environment will automatically be initialized when you log in.
Depending on your needs and habits, you may...
- ... manually load modules as needed (recommended if you use different packages or versions at different times), or
- ... add module load commands to your $HOME/.bashrc (recommended if you always use the same set of software packages).
2.1 Modules Environment and Batch Jobs
By default, jobs submitted with sbatch will inherit all environment variables of your current shell, including currently loaded modules.
If you want to set a job's environment variables and modules independently, do the following:
- Use the --export=NONE option on the sbatch command line or #SBATCH directive,
- Include all desired module load commands in your job script.
Example:
#!/bin/bash
#SBATCH --job-name=myjob
#SBATCH --export=NONE
module load software-a/version-a
module load software-b/version-b
my-commands
Please note: In contrast to SGE, Slurm jobs do not run your $HOME/.bashrc file, so module load commands contained in .bashrc are executed only when you log in.
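The job script pattern above can be checked mechanically before submission. The following is a minimal sketch, not a Slurm or Modules feature: check_job_script is a hypothetical helper that looks for the --export=NONE directive, for at least one module load line, and for directive lines that start like #SBATCH but are spelled differently (Slurm only recognizes lines beginning with #SBATCH; anything else is an ordinary comment).

```shell
#!/bin/bash
# Hypothetical helper: check that a job script sets up its own environment.
# Prints one message per problem and returns non-zero if any was found.
check_job_script() {
    local script="$1"
    local status=0
    # The job should not inherit the environment of the submitting shell
    if ! grep -Eq '^#SBATCH[[:space:]]+--export=NONE([[:space:]]|$)' "$script"; then
        echo "missing directive: #SBATCH --export=NONE"
        status=1
    fi
    # At least one module should be loaded inside the script
    if ! grep -Eq '^[[:space:]]*module[[:space:]]+load[[:space:]]+[^[:space:]]' "$script"; then
        echo "no module load command found"
        status=1
    fi
    # Lines such as "#SPATCH ..." are treated as comments and silently ignored
    if grep -E '^#S[A-Z]+[[:space:]]' "$script" | grep -Ev '^#SBATCH[[:space:]]' >/dev/null; then
        echo "possibly misspelled #SBATCH directive"
        status=1
    fi
    return "$status"
}
```

A possible use is check_job_script myjob.sh && sbatch myjob.sh, where myjob.sh stands for your own script.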
3. Working with the Module Environment
The module command has a number of sub-commands. In the following sections, we briefly discuss the most important ones.
3.1 module avail
Displays a list of available modules, grouped by categories as discussed above.
Example:
$ module avail
----------- /path_to_module_categories/{Application-Software|Compiler|...} -----------
application1/version1 application2/version2 application3/version3 ...
3.2 module show
The show subcommand displays all changes that a given module makes to your environment. This also tells you where application binaries, or libraries for linking your own programs, are located.
Example:
c102mf@login.leo5[0]:~$ module show matlab/R2022a
-------------------------------------------------------------------
/usr/site/hpc/modules/leo5/Application-Software/matlab/R2022a:
setenv MATLAB /usr/site/hpc/x86_64/generic/matlab/R2022a
prepend-path PATH /usr/site/hpc/x86_64/generic/matlab/R2022a/bin
prepend-path LD_LIBRARY_PATH /usr/site/hpc/x86_64/generic/matlab/R2022a/runtime/glnxa64
prepend-path LD_LIBRARY_PATH /usr/site/hpc/x86_64/generic/matlab/R2022a/bin/glnxa64
prepend-path LD_LIBRARY_PATH /usr/site/hpc/x86_64/generic/matlab/R2022a/sys/os/glnxa64
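To read such output, it helps to keep in mind that each prepend-path line puts its directory in front of the existing value, so the last prepend-path line for a variable ends up first in the search order. The sketch below only imitates this visible effect in plain shell; prepend_path and the variable names are hypothetical, and the real Modules implementation does more (for example, it can undo the change on unload).

```shell
#!/bin/bash
# Illustration only: mimic the visible effect of a prepend-path line.
# $1 = current value of the variable, $2 = directory to put in front.
prepend_path() {
    if [ -n "$1" ]; then
        printf '%s:%s\n' "$2" "$1"
    else
        printf '%s\n' "$2"
    fi
}

root=/usr/site/hpc/x86_64/generic/matlab/R2022a
libpath=""    # assumption: LD_LIBRARY_PATH is empty before loading
libpath=$(prepend_path "$libpath" "$root/runtime/glnxa64")
libpath=$(prepend_path "$libpath" "$root/bin/glnxa64")
libpath=$(prepend_path "$libpath" "$root/sys/os/glnxa64")

# Print one directory per line, in search order
printf '%s\n' "$libpath" | tr ':' '\n'
```

With the three lines from the module show output above, the sys/os/glnxa64 directory is searched first and runtime/glnxa64 last.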
3.3 module load
With the load subcommand you can load one or more of the available module files:
module load gcc/11.3.0-gcc-8.5.0-rwipohd openmpi/4.1.4-gcc-11.3.0-gfkyxua netcdf-c/4.9.0-openmpi-4.1.4-gcc-11.3.0-zvsdgrp
Please note: Traditionally, the environment variable LD_LIBRARY_PATH had to be set at run time so programs linked using libraries from installed software could locate the correct runtime objects.
In our UIBK/LEO installation, this is no longer necessary. When you load a module installed by Spack, the LD_RUN_PATH variable is set to include the location of the shared objects, so the linker records the correct library locations in the RPATH attribute of the executables you build. To run your program, you therefore only need to load the modules that provide shell-level commands. Libraries are found via the RPATH attribute of your executable and are loaded automatically as needed.
Example:
To build your program, do
module load gcc/11.3.0-gcc-8.5.0-rwipohd
module load openmpi/4.1.4-gcc-11.3.0-gfkyxua
module load netcdf-c/4.9.0-openmpi-4.1.4-gcc-11.3.0-zvsdgrp
make myprogram
To run your program, simply do
module load openmpi/4.1.4-gcc-11.3.0-gfkyxua
mpirun -np ntasks myprogram
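To check that library locations were actually recorded in an executable, its dynamic section can be inspected with readelf from GNU binutils. This is a sketch under assumptions: show_rpath is a hypothetical helper name, readelf is assumed to be installed, and depending on linker defaults the entry may be labelled RPATH or RUNPATH.

```shell
#!/bin/bash
# Hypothetical helper: print the RPATH/RUNPATH entries of an ELF executable.
# Returns non-zero if the file does not exist or has no such entry.
show_rpath() {
    local binary="$1"
    if [ ! -f "$binary" ]; then
        echo "no such file: $binary" >&2
        return 1
    fi
    # readelf -d lists the dynamic section; keep only the run path entries
    readelf -d "$binary" | grep -E '\((RPATH|RUNPATH)\)'
}
```

For the example above, show_rpath myprogram would be expected to list the library directories of the modules that were loaded when the program was linked.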
3.4 module list
With the list subcommand you get the list of all currently loaded module files:
user@login.leo5[0]:~$ module list
Currently Loaded Modulefiles:
1) gcc/11.3.0-gcc-8.5.0-rwipohd 2) openmpi/4.1.4-gcc-11.3.0-gfkyxua 3) netcdf-c/4.9.0-openmpi-4.1.4-gcc-11.3.0-zvsdgrp
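In scripts, it is often more convenient to test for a loaded module than to read the module list output. Environment Modules keeps the names of loaded modules in the colon-separated LOADEDMODULES variable. The following sketch relies on that variable; is_module_loaded is a hypothetical helper name and it matches full module names only.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the given full module name is listed in
# LOADEDMODULES (colon-separated, maintained by the module command).
is_module_loaded() {
    local wanted="$1"
    case ":${LOADEDMODULES:-}:" in
        *":${wanted}:"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, is_module_loaded openmpi/4.1.4-gcc-11.3.0-gfkyxua || module load openmpi/4.1.4-gcc-11.3.0-gfkyxua loads the module only if it is not yet present.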
3.5 module unload/purge
With the unload subcommand you can unload one or more of the loaded module files (see list of loaded modules above), for example:
$ module unload netcdf-c/4.9.0-openmpi-4.1.4-gcc-11.3.0-zvsdgrp
Similarly, with the command
$ module purge
all loaded modules are unloaded at once.
3.6 module help
The help subcommand on its own gives general information about module usage. When adding a specific module to the help subcommand, some more information about this module is displayed:
user@login.leo5[0]:~$ module help openmpi/4.1.4-gcc-11.3.0-gfkyxua
-------------------------------------------------------------------
Module Specific Help for /usr/site/hpc/modules/leo5/Spack-leo5-20230116/openmpi/4.1.4-gcc-11.3.0-gfkyxua:
openmpi@4.1.4%gcc@11.3.0~atomics~cuda~cxx~cxx_exceptions+gpfs~internal-hwloc~java+legacylaunchers~lustre~memchecker+pmi+romio+rsh~singularity+static+vt+wrapper-rpath build_system=autotools fabrics=auto schedulers=slurm arch=linux-rocky8-icelake/gfkyxua
An open source Message Passing Interface implementation. The Open MPI Project is an open source Message Passing Interface [...]
Please note: (UIBK extension) For software installed with Spack, the module help command will also display the particular set of options used to install the package. This allows you to distinguish between variants of packages whose names differ only by their hash. The syntax used for these specifications is described in section 6, Spack Specification Expression Syntax.
4. Modules and Spack
New Spack versions are released approximately once per year, supporting new versions of existing software and new functionality. We will make these versions available to users by installing new Spack release-instances. We try to keep these reasonably complete by adding software as needed.
At login, your MODULEPATH will always refer to the newest Spack release-instance.
When users request software versions more recent than those provided by the stable Spack releases, we will install additional instances based on the develop version of Spack. These will typically contain only the few software packages requested by users.
All available Spack instances may be listed by issuing
$ module avail spack
spack/v0.19-leo5-20230116-release spack/v0.20-leo5-20230124-develop
The output will indicate the Spack version, install date and release status.
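The fields of such an instance name can be picked apart with shell parameter expansion. This sketch assumes that names always follow the four-part pattern seen in the two examples above; describe_spack_instance is a hypothetical helper, and reading the second field as the system name is an interpretation, not something stated by the module output.

```shell
#!/bin/bash
# Sketch only: split a name like spack/v0.19-leo5-20230116-release.
# Assumption: exactly four hyphen-separated fields after "spack/".
describe_spack_instance() {
    local name="${1#spack/}"       # drop the leading "spack/"
    local version="${name%%-*}"    # e.g. v0.19
    local rest="${name#*-}"
    local system="${rest%%-*}"     # e.g. leo5 (assumed to be the system)
    rest="${rest#*-}"
    local date="${rest%%-*}"       # e.g. 20230116
    local status="${rest#*-}"      # e.g. release
    printf 'version=%s system=%s installed=%s status=%s\n' \
        "${version#v}" "$system" "$date" "$status"
}

describe_spack_instance "spack/v0.19-leo5-20230116-release"
describe_spack_instance "spack/v0.20-leo5-20230124-develop"
```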
To access any given Spack instance or directly use Spack commands, issue
module load spack/version
This will remove the default set of Spack modules from your MODULEPATH and add the selected instance instead. Then issue module avail to obtain an overview of the software installed in that Spack instance.
The above command will also give you access to the spack command and provide an option to activate Spack's shell integration.
5. Additional information about the Modules environment
For further information concerning the Modules environment, please see the man page (man module).
The official documentation can be found at the Environment Modules website.
6. Spack Specification Expression Syntax
The following output of the spack help --spec command should help you understand the output of the module help name command described above.
spec expression syntax:
  package [constraints] [^dependency [constraints] ...]
  package                             any package from 'spack list', or
  /hash                               unique prefix or full hash of installed package
  constraints:
    versions:
      @version                        single version
      @min:max                        version range (inclusive)
      @min:                           version <min> or higher
      @:max                           up to version <max> (inclusive)
    compilers:
      %compiler                       build with <compiler>
      %compiler@version               build with specific compiler version
      %compiler@min:max               specific version range (see above)
    compiler flags:
      cflags="flags"                  cppflags, cflags, cxxflags, fflags, ldflags, ldlibs
      cflags=="flags"                 propagate flags to package dependencies
                                      cppflags, cflags, cxxflags, fflags, ldflags, ldlibs
    variants:
      +variant                        enable <variant>
      ++variant                       propagate enable <variant>
      -variant or ~variant            disable <variant>
      --variant or ~~variant          propagate disable <variant>
      variant=value                   set non-boolean <variant> to <value>
      variant==value                  propagate non-boolean <variant> to <value>
      variant=value1,value2,value3    set multi-value <variant> values
      variant==value1,value2,value3   propagate multi-value <variant> values
    architecture variants:
      platform=platform               linux, darwin, cray, etc
      os=operating_system             specific <operating_system>
      target=target                   specific <target> processor
      arch=platform-os-target         shortcut for all three above
    cross-compiling:
      os=backend or os=be             build for compute node (backend)
      os=frontend or os=fe            build for login node (frontend)
    dependencies:
      ^dependency [constraints]       specify constraints on dependencies
      ^/hash                          build with a specific installed dependency
  examples:
    hdf5                              any hdf5 configuration
    hdf5 @1.10.1                      hdf5 version 1.10.1
    hdf5 @1.8:                        hdf5 1.8 or higher
    hdf5 @1.8: %gcc                   hdf5 1.8 or higher built with gcc
    hdf5 +mpi                         hdf5 with mpi enabled
    hdf5 ~mpi                         hdf5 with mpi disabled
    hdf5 ++mpi                        hdf5 with mpi enabled and propagates
    hdf5 ~~mpi                        hdf5 with mpi disabled and propagates
    hdf5 +mpi ^mpich                  hdf5 with mpi, using mpich
    hdf5 +mpi ^openmpi@1.7            hdf5 with mpi, using openmpi 1.7
    boxlib dim=2                      boxlib built for 2 dimensions
    libdwarf %intel ^libelf%gcc
        libdwarf, built with intel compiler, linked to libelf built with gcc
    mvapich2 %pgi fabrics=psm,mrail,sock
        mvapich2, built with pgi compiler, with support for multiple fabrics
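When comparing two variants of a package, it can help to list the boolean variants of a spec one per line. The sketch below does this for the first whitespace-separated token of a spec as printed by module help. list_variants is a hypothetical helper, the sample spec is a shortened form of the openmpi spec shown in section 3.6, and only the + and ~ markers are handled (not the alternative -variant spelling). It assumes a grep that supports the -o option, as GNU grep does.

```shell
#!/bin/bash
# Hypothetical helper: print each boolean variant of a spec on its own line,
# keeping the leading + (enabled) or ~ (disabled) marker.
list_variants() {
    # "${1%% *}" keeps only the text before the first space
    printf '%s\n' "${1%% *}" | grep -oE '[+~][A-Za-z0-9_-]+'
}

# Shortened sample spec (assumption: same layout as module help output)
spec='openmpi@4.1.4%gcc@11.3.0~atomics~cuda+gpfs~internal-hwloc+pmi+wrapper-rpath build_system=autotools'

echo "enabled:"
list_variants "$spec" | grep '^+'
echo "disabled:"
list_variants "$spec" | grep '^~'
```

Saving the output for two modules that differ only by hash and comparing the files with diff shows which variants distinguish them.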