Spack: Lmod, Core compilers, and Spack

Created on 24 Apr 2017  ·  46 Comments  ·  Source: spack/spack

We're just starting to use Lmod here at ANL and I'm testing out Spack's module support. So far I've built intel-parallel-studio with the system GCC, then I built mvapich2 with this Intel compiler. This results in:

$ module avail
...
Core/intel-parallel-studio/professional.2017.2
intel/17.0.2/mvapich2/2.2
...

Now when I try to load them:

$ module load Core/intel-parallel-studio/professional.2017.2
$ module load intel/17.0.2/mvapich2/2.2

Lmod is automatically replacing "Core/intel-parallel-studio/professional.2017.2" with
"intel/17.0.2/mvapich2/2.2".

I understand that Lmod has a nice feature that prevents you from loading multiple libraries built with different compilers, but I thought that putting things in Core would allow it to work. What am I missing here?
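As background, Lmod keeps at most one loaded module per short name and, via the family directive, one per family; loading a second candidate silently replaces the first. A toy sketch of that rule (illustrative Python, not Lmod's actual implementation; the module names are taken from the session above):

```python
# Toy model of Lmod's "one module per name, one per family" policy.

def short_name(full_name):
    """Drop the trailing version: 'a/b/1.0' -> 'a/b'."""
    return full_name.rsplit("/", 1)[0]

class ToyLmod:
    def __init__(self):
        self.loaded = {}    # short name -> full module name
        self.families = {}  # family -> short name currently providing it

    def load(self, full_name, family=None):
        sn = short_name(full_name)
        if sn in self.loaded:
            print(f"replacing {self.loaded[sn]} with {full_name}")
        if family is not None and family in self.families:
            old_sn = self.families[family]
            if old_sn != sn:  # another module already provides this family
                print(f"replacing {self.loaded.pop(old_sn)} with {full_name}")
        self.loaded[sn] = full_name
        if family is not None:
            self.families[family] = sn

env = ToyLmod()
env.load("Core/intel-parallel-studio/professional.2017.2", family="compiler")
env.load("gcc/6.3.0", family="compiler")  # auto-swaps the studio module
```

The point of the Core level in a hierarchy is that Core packages stay loadable regardless of which compiler branch is active; the swap above only happens when two modules compete for the same name or family.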

For reference, my modules.yaml looks like:

modules:
  enable::
    - lmod
    - tcl
  lmod:
    core_compilers:
      - '[email protected]'
    hash_length: 0

@alalazo?

modules

Most helpful comment

If it were GCC I'd modify the specs file to add the rpath to the link line like so, but it looks like Clang has no equivalent.


Even better solution though (don't know why I didn't think of this earlier): extend modules.yaml to have

modules:
  lmod:
    llvm:
      environment:
        prepend_path:
          'LD_RUN_PATH': '${PREFIX}/lib'
    gcc:
      environment:
        prepend_path:
          'LD_RUN_PATH': '${PREFIX}/lib:${PREFIX}/lib64'

This fixes the link issues; adding it to my custom Core/clang gets everything working :)
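For reference, prepend_path on LD_RUN_PATH just pushes the prefix's library directories onto a colon-separated variable that the linker reads and bakes into the binary as an rpath. A minimal sketch of the mechanics (the prefix below is hypothetical; `${PREFIX}` is the token Spack expands):

```python
# Sketch of what a prepend_path entry does to the build environment.

def prepend_path(env, var, value):
    """Prepend value to a colon-separated environment variable."""
    old = env.get(var, "")
    env[var] = value if not old else f"{value}:{old}"
    return env

prefix = "/fake/prefix"  # hypothetical install prefix standing in for ${PREFIX}
env = {}
prepend_path(env, "LD_RUN_PATH", f"{prefix}/lib:{prefix}/lib64")
print(env["LD_RUN_PATH"])  # /fake/prefix/lib:/fake/prefix/lib64
```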

All 46 comments

@adamjstewart There are a couple of issues here. The most important is that you need to module use a path that includes Core. For example:

$ module av

-------------------------------------------------------------- /home/mculpo/hpcac-2017/spack-modulefiles/share/spack/lmod/linux-ubuntu14-x86_64/Core --------------------------------------------------------------
   gcc/6.3.0

------------------------------------------------------------------------------------------- /usr/share/modules/versions -------------------------------------------------------------------------------------------
   3.2.10

----------------------------------------------------------------------------------------- /usr/share/modules/modulefiles ------------------------------------------------------------------------------------------
   dot    module-git    module-info    modules    null    use.own

Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".

From this configuration you can load gcc and it will add a new path to MODULEPATH:

$ module load gcc 
$ module av

----------------------------------------------------------- /home/mculpo/hpcac-2017/spack-modulefiles/share/spack/lmod/linux-ubuntu14-x86_64/gcc/6.3.0 ------------------------------------------------------------
   bzip2/1.0.6     libpciaccess/0.13.4    libxml2/2.9.4    mpich/3.2              openblas/0.2.19    perl/5.24.1          py-packaging/16.8       py-six/1.10.0    sqlite/3.8.5          zlib/1.2.11
   cmake/3.7.2     libsigsegv/2.11        m4/1.4.18        ncurses/6.0            openmpi/2.1.0      pkg-config/0.29.2    py-pyparsing/2.1.10     python/2.7.13    util-macros/1.19.1
   hwloc/1.11.6    libtool/2.4.6          mawk/1.3.4       netlib-lapack/3.6.1    openssl/1.0.2k     py-appdirs/1.4.0     py-setuptools/34.2.0    readline/7.0     xz/5.2.3

-------------------------------------------------------------- /home/mculpo/hpcac-2017/spack-modulefiles/share/spack/lmod/linux-ubuntu14-x86_64/Core --------------------------------------------------------------
   gcc/6.3.0 (L)

------------------------------------------------------------------------------------------- /usr/share/modules/versions -------------------------------------------------------------------------------------------
   3.2.10

----------------------------------------------------------------------------------------- /usr/share/modules/modulefiles ------------------------------------------------------------------------------------------
   dot    module-git    module-info    modules    null    use.own

  Where:
   L:  Module is loaded

Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".
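Under the hood, the gcc module simply prepends its own branch of the hierarchy to MODULEPATH, which is why the gcc/6.3.0 tree appears only after the load. A rough sketch of that computation (illustrative; the root path is shortened):

```python
# Illustrative sketch of a Core compiler module's effect on MODULEPATH:
# prepend <root>/<compiler>/<version> so that compiler's software tree
# becomes visible to `module av`.
import os

def load_compiler(modulepath, root, compiler, version):
    branch = os.path.join(root, compiler, version)
    entries = [p for p in modulepath.split(":") if p]
    return ":".join([branch] + entries)

root = "/share/spack/lmod/linux-ubuntu14-x86_64"  # shortened example root
core = os.path.join(root, "Core")
print(load_compiler(core, root, "gcc", "6.3.0"))
```

Loading a different compiler would prepend a different branch (and Lmod unloads the old one), which is how the context switch works.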

The second one is that I need to check if intel-parallel-studio is supported the way you want to use it. Due to the way Lmod is structured you need an Intel compiler (currently from the intel package) that is "built" with a Core compiler (the same way as gcc@6.3.0 in the example above is built with Core). After that you can install intel-parallel-studio%intel and you'll have it visible once the Intel compiler from Core is loaded.

Final remark: in case you need to use the Intel compiler with non-Intel MPI or LAPACK implementations I would recommend installing intel-mkl%intel and intel-mpi%intel separately, so that you can switch them independently. If you install intel-parallel-studio and make it provide both mpi and lapack you can't use that package selectively for just one of the two APIs it provides.

In case I am missing some information or something is not clear feel free to ask. We are also in the process of updating our software and module files layout right now from a production version of Spack that is more or less one year old. If you want to peek at the configuration files we are currently using they are here.

@alalazo I stumbled on this to help me figure out Spack + Lmod as well. Somehow I've done something odd, but maybe I'm just not set up correctly. First, I have in the past rolled my own lmod files off of builds I did by hand. That got annoying, so I decided to try out spack. I moved my old modulefile path out of the way so that lmod wouldn't see it for now. Now, in my .tcshrc, following your above advice:

  echo "Sourcing for lmod"
  source $HOME/lmod/lmod/init/cshrc
  module use -a $SPACK_ROOT/share/spack/lmod/darwin-elcapitan-x86_64/Core
  module unuse $SPACK_ROOT/share/spack/modules/darwin-elcapitan-x86_64

and I also have in .spack/modules.yaml

modules:
   enable::
      - lmod
   lmod:
      hash_length: 0
      core_compilers: 
         - 'clang@8.0.0-apple'

because I know clang is the "base" compiler, but I want to build gcc 6.3.0 and use that to build most of my stuff. Now, the first thing I did was build gcc 6.3.0, so I ended up with:

(80) $ module av

-------------------------------- /Users/mathomp4/lmod/lmod/modulefiles/Core --------------------------------
   lmod/6.5    settarg/6.5

------------------- /Users/mathomp4/spack/share/spack/lmod/darwin-elcapitan-x86_64/Core --------------------
   bzip2/1.0.6    gmp/6.1.2    libelf/0.8.13      m4/1.4.18    mpfr/3.1.5
   gcc/6.3.0      isl/0.18     libsigsegv/2.11    mpc/1.0.3    zip/3.0

Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".

Well, those are all the packages that were needed to build gcc/6.3.0. Now if I spack install zip%gcc@6.3.0 and look at the modules:

(82) $ module av

-------------------------------- /Users/mathomp4/lmod/lmod/modulefiles/Core --------------------------------
   lmod/6.5    settarg/6.5

------------------- /Users/mathomp4/spack/share/spack/lmod/darwin-elcapitan-x86_64/Core --------------------
   bzip2/1.0.6    gmp/6.1.2    libelf/0.8.13      m4/1.4.18    mpfr/3.1.5
   gcc/6.3.0      isl/0.18     libsigsegv/2.11    mpc/1.0.3    zip/3.0

Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".

(83) $ module load gcc/6.3.0
(84) $ module av

----------------- /Users/mathomp4/spack/share/spack/lmod/darwin-elcapitan-x86_64/gcc/6.3.0 -----------------
   bzip2/1.0.6 (D)    zip/3.0 (D)

-------------------------------- /Users/mathomp4/lmod/lmod/modulefiles/Core --------------------------------
   lmod/6.5    settarg/6.5

------------------- /Users/mathomp4/spack/share/spack/lmod/darwin-elcapitan-x86_64/Core --------------------
   bzip2/1.0.6        gmp/6.1.2    libelf/0.8.13      m4/1.4.18    mpfr/3.1.5
   gcc/6.3.0   (L)    isl/0.18     libsigsegv/2.11    mpc/1.0.3    zip/3.0

  Where:
   L:  Module is loaded
   D:  Default Module

Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".

But, my next question would be this: is there a way to, in this case, see just gcc/6.3.0 and, maybe, clang/8.0.0-apple? That way I can "hide" the packages built with clang that were necessary to get to gcc 6.3.0? It's a bit disconcerting to see two zip modules.

Is this a mistake on my part by putting clang in my core_compilers? I tried adding gcc to that:

      core_compilers: 
         - 'clang@8.0.0-apple'
         - 'gcc@6.3.0'

but that is a bad idea it seems:

(93) $ spack module refresh --delete-tree
==> You are about to regenerate lmod module files for:

-- darwin-elcapitan-x86_64 / clang@8.0.0-apple ------------------
kekvvku [email protected]  6sa3bg2 [email protected]  c4ox5t4 [email protected]    lzrjkgj [email protected]  4m5p6pd [email protected]
y7dh2t4 [email protected]    5wblvpm [email protected]   5rjv56u [email protected]  6dr32bc [email protected]  tu6gf6h [email protected]

-- darwin-elcapitan-x86_64 / gcc@6.3.0 --------------------------
uha46wd [email protected]  7siemkq [email protected]

==> Do you want to proceed? [y/n] y
==> Error: Name clashes detected in module files:

file: /Users/mathomp4/spack/share/spack/lmod/darwin-elcapitan-x86_64/Core/zip/3.0.lua
spec: zip@3.0%clang@8.0.0-apple arch=darwin-elcapitan-x86_64 
spec: zip@3.0%gcc@6.3.0 arch=darwin-elcapitan-x86_64 

file: /Users/mathomp4/spack/share/spack/lmod/darwin-elcapitan-x86_64/Core/bzip2/1.0.6.lua
spec: bzip2@1.0.6%clang@8.0.0-apple+shared arch=darwin-elcapitan-x86_64 
spec: bzip2@1.0.6%gcc@6.3.0+shared arch=darwin-elcapitan-x86_64 

==> Error: Operation aborted

@mathomp4 I think what you are looking for is something like:

modules:
   enable::
      - lmod
   lmod:
      hash_length: 0
      whitelist:
        - gcc
      blacklist:
        - '%clang@8.0.0-apple'
      core_compilers: 
         - 'clang@8.0.0-apple'

This will prevent the generation of the module files for specs that are compiled with clang, except gcc. If you try activating debug verbosity and:

spack -d module refresh --delete-tree

you'll see in detail which spec matches which rule.
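A hedged sketch of the filtering rule described above (not Spack's actual matching code, which works on full spec constraints): whitelist entries always win, and everything else gets a module file unless a blacklist entry matches.

```python
# Simplified model of module-file whitelist/blacklist filtering:
# a whitelisted package always gets a module file; otherwise the spec
# is skipped when its compiler matches a blacklist entry.

def generates_module(name, compiler, whitelist, blacklist):
    if name in whitelist:
        return True
    return f"%{compiler}" not in blacklist

whitelist = ["gcc"]
blacklist = ["%clang@8.0.0-apple"]

assert generates_module("gcc", "clang@8.0.0-apple", whitelist, blacklist)
assert not generates_module("zip", "clang@8.0.0-apple", whitelist, blacklist)
assert generates_module("zip", "gcc@6.3.0", whitelist, blacklist)
```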

@alalazo Hmm. No, that didn't quite work. After the module refresh:

(106) $ cat .spack/modules.yaml
modules:
   enable::
      - lmod
   lmod:
      hash_length: 0
      whitelist:
         - gcc
      black_list:
         - '%clang@8.0.0-apple'
      core_compilers: 
         - 'clang@8.0.0-apple'
(107) $ module av

-------------------------------- /Users/mathomp4/lmod/lmod/modulefiles/Core --------------------------------
   lmod/6.5    settarg/6.5

------------------- /Users/mathomp4/spack/share/spack/lmod/darwin-elcapitan-x86_64/Core --------------------
   bzip2/1.0.6    gmp/6.1.2    libelf/0.8.13      m4/1.4.18    mpfr/3.1.5
   gcc/6.3.0      isl/0.18     libsigsegv/2.11    mpc/1.0.3    zip/3.0

Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".

Note that I don't mind having the clang-built modules (they are viable), it would be nice though to see:

(107) $ module av

-------------------------------- /Users/mathomp4/lmod/lmod/modulefiles/Core --------------------------------
   lmod/6.5    settarg/6.5

------------------- /Users/mathomp4/spack/share/spack/lmod/darwin-elcapitan-x86_64/Core --------------------
   clang/8.0.0-apple    gcc/6.3.0 

Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".

The problem might literally be both the chicken and the egg. The chicken, clang/8.0.0-apple isn't built by spack so how can it create a module file; the egg, gcc/6.3.0 is built by the chicken, so algorithmically it is part of Compiler in some sense.

Perhaps I need to blow everything out and rebuild? It's just...ugh...gcc. That's a slog of a build.

Note: that latter example is how my hand-built modules are (top section):

(110) $ module av

------------------------------------- /Users/mathomp4/modulefiles/Core -------------------------------------
   Anaconda2/4.1.1        Anaconda3/4.2.0        clang-gfortran/6.1.0        pgi/16.10
   Anaconda2/4.2.0        Anaconda3/4.3.1 (D)    gcc-gfortran/6.2.0
   Anaconda2/4.3.1 (D)    StdEnv                 gcc-gfortran/6.3.0   (D)

-------------------------------- /Users/mathomp4/lmod/lmod/modulefiles/Core --------------------------------
   lmod/6.5    settarg/6.5

------------------- /Users/mathomp4/spack/share/spack/lmod/darwin-elcapitan-x86_64/Core --------------------
   bzip2/1.0.6    gmp/6.1.2    libelf/0.8.13      m4/1.4.18    mpfr/3.1.5
   gcc/6.3.0      isl/0.18     libsigsegv/2.11    mpc/1.0.3    zip/3.0

  Where:
   D:  Default Module

Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".

You need to blacklist, not black_list... and we need better warnings for errors of this kind

Perhaps I need to blow everything out and rebuild? It's just...ugh...gcc. That's a slog of a build.

Ouch, don't do that! :smile: Module files are post-install hooks and they can be re-written using:

$ spack module refresh ...

If a spec is installed correctly you never have to uninstall it just to modify a module file.

@alalazo Ah, the magic of blacklist helped. Thanks! Can you walk me through that modules.yaml? For example, why isn't gcc@6.3.0 a core compiler? I'd have thought it was.

That said, maybe I should nuke my install and reinstall. I saw the Tip at:
http://spack.readthedocs.io/en/latest/getting_started.html#build-your-own-compiler
and that might be a good thing for me?

I don't know really. I suppose for now I can keep moving on with gcc 6.3.0 and define it as my compiler in compilers.yaml so I don't have to keep adding %gcc@6.3.0, though I suppose that's not a bad thing. But eventually I'll play with PGI, maybe even clang+gfortran...eep.

I saw the Tip at:
http://spack.readthedocs.io/en/latest/getting_started.html#build-your-own-compiler
and that might be a good thing for me?

That's kind of new to me :scream: I never used Spack like that given it is pretty good at re-using the compilers it compiled. @tgamblin @becker33 @scheibelp Is this something that slipped in accidentally or is it really the way we want people to use Spack?

For example, why isn't gcc@6.3.0 a core compiler? I'd have thought it was

@mathomp4 A "core compiler" in this context means a compiler that comes with the system, and which you use to bootstrap other compilers.

<some-prefix>/Core will then be the entry point of your hierarchy, and whenever you load a compiler in Core you'll make available the software built with that compiler. If you load another compiler, you'll switch the context. If you have some time, try to take the module file tutorial: I think it should cover most of the common cases.

That looks like something that slipped in accidentally. I don't see much of a reason to keep the advice in there -- it complicates the setup process.

@mathomp4 A "core compiler" in this context means a compiler that comes with the system, and which you use to bootstrap other compilers.

@alalazo Ahh. Okay. I was thinking "core compiler" was what lmod sees as "Core" since that is what all "Compiler" based modules are then based on. That's why I created a "clang" module in my hand-created modulefiles since that way I could have Open MPI for clang+gfortran and Open MPI for gcc-gfortran and no issues of contamination. Thus, if I could have "clang" as a Core compiler in spack-lmod, I could then carve out all its builds as well.

But since I seem to have issues building many things with clang, probably best I just ignore it anyway. :)

ETA: I suppose I could define another compiler that is clang-8 + gfortran 6.3 from that build I just made a la mixed toolchains: http://spack.readthedocs.io/en/latest/getting_started.html#mixed-toolchains Since many of the things I want will have Fortran as a need, I'll never be able to go full clang anyway, so I might as well ignore all the clang builds. Is that possible? Make a "clang-gfortran" compiler and then I can build clang/gfortran packages (maybe even once more in the case of zip, say).

re: the tip about building a compiler in a separate Spack instance

I'm building all of my tools etc. via Spack and setting them up via Lmod in my .bashrc.

I've noticed that I am unable to build anything in a freshly cloned Spack instance if I'm living in a Spack-based environment.

$ git clone git@github.com:llnl/spack foo
Cloning into 'foo'...
remote: Counting objects: 63153, done.
remote: Compressing objects: 100% (110/110), done.
Receiving objects:  22% (13894/63153), 4.40 MiB | 2.15 MiB/s
remote: Total 63153 (delta 52), reused 3 (delta 3), pack-reused 63029
Receiving objects: 100% (63153/63153), 23.16 MiB | 3.77 MiB/s, done.
Resolving deltas: 100% (30146/30146), done.
Checking out files: 100% (2184/2184), done.
$ cd foo
$ export PATH=`pwd`/bin:$PATH
$ spack install [email protected]
==> Installing libsigsegv
==> Fetching https://ftp.gnu.org/gnu/libsigsegv/libsigsegv-2.11.tar.gz
######################################################################## 100.0%
==> Staging archive: /home/hartzelg/tmp/foo/var/spack/stage/libsigsegv-2.11-qpmaxx6z62df4s4hwyehdwptv6kmrfhf/libsigsegv-2.11.tar.gz
==> Created stage in /home/hartzelg/tmp/foo/var/spack/stage/libsigsegv-2.11-qpmaxx6z62df4s4hwyehdwptv6kmrfhf
==> Ran patch() for libsigsegv
==> Building libsigsegv [AutotoolsPackage]
==> Executing phase : 'autoreconf'
==> Executing phase : 'configure'
==> Error: ProcessError: Command exited with status 1:
    '/home/hartzelg/tmp/foo/var/spack/stage/libsigsegv-2.11-qpmaxx6z62df4s4hwyehdwptv6kmrfhf/libsigsegv-2.11/configure' '--prefix=/home/hartzelg/tmp/foo/opt/spack/linux-centos7-x86_64/gcc-4.8.5/libsigsegv-2.11-qpmaxx6z62df4s4hwyehdwptv6kmrfhf' '--enable-shared'
/home/hartzelg/tmp/foo/lib/spack/spack/build_systems/autotools.py:266, in configure:
     258      def configure(self, spec, prefix):
     259          """Runs configure with the arguments specified in
     260          :py:meth:`~.AutotoolsPackage.configure_args`
     261          and an appropriately set prefix.
     262          """
     263          options = ['--prefix={0}'.format(prefix)] + self.configure_args()
     264
     265          with working_dir(self.build_directory, create=True):
  >> 266              inspect.getmodule(self).configure(*options)

See build log for details:
  /tmp/hartzelg/spack-stage/spack-stage-DfvhkV/libsigsegv-2.11/spack-build.out
$ tail /tmp/hartzelg/spack-stage/spack-stage-DfvhkV/libsigsegv-2.11/spack-build.out
checking for a thread-safe mkdir -p... /usr/bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking for gcc... /home/hartzelg/tmp/foo/lib/spack/env/gcc/gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... configure: error: in `/tmp/hartzelg/spack-stage/spack-stage-DfvhkV/libsigsegv-2.11':
configure: error: cannot run C compiled programs.
$

The failure in libsigsegv's config.log is:

configure:3423: checking whether we are cross compiling
configure:3431: /home/hartzelg/tmp/foo/lib/spack/env/gcc/gcc -o conftest    conftest.c  >&5
In file included from conftest.c:11:0:
/usr/include/stdio.h:33:21: fatal error: stddef.h: No such file or directory
 # include <stddef.h>
                     ^
compilation terminated.
configure:3435: $? = 1
configure:3442: ./conftest
/home/hartzelg/tmp/foo/var/spack/stage/libsigsegv-2.11-qpmaxx6z62df4s4hwyehdwptv6kmrfhf/libsigsegv-2.11/configure

It's not just this particular package, similar failures occur if I e.g. try to build python or ...

I'll often resort to silliness like (module purge; spack install foo2000) if I don't want to abandon my environment.

I assumed that this Wasn't Supposed To Work and haven't dug into it, but the tip makes it sound as if it can/should and that I should investigate further.

Perhaps I can have my Spack and build on too?

@alalazo I just saw you linked to the tutorial for modules. That is much nicer! I was mainly looking at the other modules section of the docs, but I'm going full bore into the tutorial one! I might just steal that example one once I understand all the lines. Thanks!

Is it worth reworking the docs to look more like the tutorial?

Or we can make the tutorials more prominent. Split Basics into separate Basic Usage and Tutorials sections. That way users will find the Modules Tutorial before they find the Modules reference page.

Or both things...

Anyway, back to the original topic:

Due to the way Lmod is structured you need an Intel compiler (currently from the intel package) that is "built" with a Core compiler (the same way as gcc@6.3.0 in the example above is built with Core). After that you can install intel-parallel-studio%intel and you'll have it visible once the Intel compiler from Core is loaded.

@alalazo Are you suggesting that the only way to get Lmod working correctly is to build Intel with GCC, then build another Intel with Intel, and use that for all of the installations? That seems... less than ideal. I was hoping that by adding GCC as a Core compiler, Lmod wouldn't care about things that I build with it and I could let users load Intel and load things built with Intel.

@adamjstewart Isn't it what you need to do anyhow to build correctly with Spack? What I mean is that if you just install:

$ spack compilers
==> Available compilers
-- gcc ubuntu14-x86_64 ------------------------------------------
gcc@4.8

# This also installs an intel compiler
$ spack install --fake intel-parallel-studio@cluster.2017.2+mpi
==> Installing intel-parallel-studio
==> Building intel-parallel-studio [Package]
==> Added global license file /home/mculpo/PycharmProjects/spack/etc/spack/licenses/intel/license.lic
==> Added local symlink /home/mculpo/PycharmProjects/spack/opt/spack/linux-ubuntu14-x86_64/gcc-4.8/intel-parallel-studio-cluster.2017.2-krwlznsciged2t3rhh7hngu3lgze3x3d/Licenses/license.lic to global license file
==> Successfully installed intel-parallel-studio
  Fetch: .  Build: 3.28s.  Total: 3.28s.
[+] /home/mculpo/PycharmProjects/spack/opt/spack/linux-ubuntu14-x86_64/gcc-4.8/intel-parallel-studio-cluster.2017.2-krwlznsciged2t3rhh7hngu3lgze3x3d

$ spack find
==> 1 installed packages.
-- linux-ubuntu14-x86_64 / gcc@4.8 ------------------------------
intel-parallel-studio@cluster.2017.2

Now you have an intel-parallel-studio installed with GNU. If you add the intel compiler after this installation and:

$ spack install --fake intel-parallel-studio%intel

you'll reinstall another intel-parallel-studio%intel.

What I end up doing to avoid a 20GB waste every time is to have a single external intel-parallel-studio installation and then set up a packages.yaml that always points at the external installation for intel, intel-mkl, and intel-mpi. Does it make sense or am I missing something obvious?

Isn't it what you need to do anyhow to build correctly with Spack?

Oh, I see what you mean now. Yes, if I need to build something with MKL, Spack tries to force me to reinstall intel-parallel-studio with %intel. To get around that, I've been adding it to my packages.yaml as an external package. Although see #3787.

But what I'm saying is slightly different. Lmod and MKL aside, if I just want the intel compilers installed and I want to build things with them, I would run:

$ spack install intel  # or intel-parallel-studio
$ spack compiler add <prefix>
$ spack install mvapich2 %intel

Now if I want to load these modules, there appears to be no way to load both intel and mvapich2 at the same time. I see this as a bug. Yes, you can install intel twice to work around it, but ideally things like compilers (or anything I install with the core system compiler) wouldn't be compiler-dependent and I could load them both at the same time.

Another example of why this is important. Let's say a user needs me to install emacs or some other user application. But they want to use the Intel compilers. Do I need to install emacs with every compiler, or is there a way I can tell Lmod to always have emacs available?

P.S. This is my first time using Lmod or Environment Modules, so I have no idea what I'm doing. I'm going to read through the Lmod documentation, because I thought this was what Core packages were for.

@alalazo Okay, I found the time to read through the Lmod docs. Specifically, their documentation on How to use a Software Module Hierarchy. If you read through it, things are supposed to work the way I want. Basically, if I build all of my compilers with the Core compiler, the module files should end up in the root as:

Core/pkgName/version

When I load one of those compiler module files, it should set:

local mroot = os.getenv("MODULEPATH_ROOT")
local mdir  = pathJoin(mroot,"Compiler/intel", "17.0.2")
prepend_path("MODULEPATH", mdir)

Libraries that depend on the compiler they were built with (not Core) but don't depend on an MPI library should go in:

Compiler/compilerName/compilerVersion/pkgName/version

This includes MPI libraries, which should set:

local mroot = os.getenv("MODULEPATH_ROOT")
local mdir = pathJoin(mroot,"MPI/intel", "17.0.2","mvapich2","2.2")
prepend_path("MODULEPATH", mdir)

Packages that are dependent on both compiler and MPI should go in:

MPI/compilerName/compilerVersion/mpiName/mpiVersion/pkgName/version
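The three placement rules above can be restated compactly in code (illustrative, not Spack's implementation; hdf5 is just a hypothetical compiler-and-MPI-dependent package):

```python
# Pick the Lmod hierarchy directory for a package's module file from
# what the package depends on: nothing -> Core, a compiler -> Compiler/...,
# a compiler plus an MPI library -> MPI/...
import os

def module_dir(root, pkg, version, compiler=None, mpi=None):
    if mpi:          # depends on both compiler and MPI
        c_name, c_ver = compiler
        m_name, m_ver = mpi
        base = os.path.join(root, "MPI", c_name, c_ver, m_name, m_ver)
    elif compiler:   # depends only on the compiler it was built with
        c_name, c_ver = compiler
        base = os.path.join(root, "Compiler", c_name, c_ver)
    else:            # compiler-independent: Core
        base = os.path.join(root, "Core")
    return os.path.join(base, pkg, version)

assert module_dir("/mr", "emacs", "25.1") == "/mr/Core/emacs/25.1"
assert module_dir("/mr", "mvapich2", "2.2", compiler=("intel", "17.0.2")) \
    == "/mr/Compiler/intel/17.0.2/mvapich2/2.2"
assert module_dir("/mr", "hdf5", "1.10.0", compiler=("intel", "17.0.2"),
                  mpi=("mvapich2", "2.2")) \
    == "/mr/MPI/intel/17.0.2/mvapich2/2.2/hdf5/1.10.0"
```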

This way I can load the intel-parallel-studio package AND load things built with that package without having to install the compiler twice. I can also install things like cmake or emacs as Core packages that don't depend on a specific compiler. We may need to implement the family directive to handle this. Compilers should belong to the compiler family, while MPI libraries should belong to the MPI family. Is any of this possible with the current develop branch or in #3183?

@adamjstewart It's all already implemented. The only thing that may be missing is to instruct lmod that intel-parallel-studio provides a compiler. If you try to build intel with a core compiler it already works the way you want.

Or better: the way you want it to work is also the way we have been using it for the last year :smile:

To add on the above, the snippet that takes care of mapping installed specs to compilers is:

# If it is in the list of supported compilers family -> compiler
if self.spec.name in spack.compilers.supported_compilers():
    self.provides['compiler'] = spack.spec.CompilerSpec(str(self.spec))
# Special case for llvm
if self.spec.name == 'llvm':
    self.provides['compiler'] = spack.spec.CompilerSpec(str(self.spec))
    self.provides['compiler'].name = 'clang'

we might need to add another special case for intel-parallel-studio because the name of the package doesn't match that of the compiler.

I can submit a PR if you want. It might be less magic if this was explicit in the package.py. Honestly, I would rather build things with intel instead of intel-parallel-studio, but I think we don't have a license for the regular intel compiler standalone for some reason.

What we do is: manually install intel-parallel-studio somewhere, then use that single installation in packages.yaml for intel, intel-mkl and intel-mpi. I am currently checking if this still all works neatly with the current LLNL/develop. In case you are interested I'll keep you posted here.

Oh, interesting. I'm not sure how well that works though. The installation hierarchies are slightly different between intel-parallel-studio and intel-mkl for example.

@alalazo I'm adding a patch to get intel-parallel-studio working:

# Special case for intel-parallel-studio
if self.spec.name == 'intel-parallel-studio':
    self.provides['compiler'] = spack.spec.CompilerSpec(str(self.spec))
    self.provides['compiler'].name = 'intel'

But I just realized another problem. The version numbers are completely different for intel and intel-parallel-studio. For example, loading any of the intel-parallel-studio editions that bundle the 17.0.3 compilers, or intel@17.0.3 itself, should all add the same MODULEPATH, namely intel/17.0.3, as this is what Spack detects when the compiler is added.

How can we get this working? I'm starting to think that we need a suite=cluster variant for the intel-parallel-studio package. We also need the version directive to support a when= statement as I mentioned in #2009. Then we could change the version of intel-parallel-studio to version('17.0.3', 'hash', url='http', when='suite=cluster').
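One possible shape for the desired mapping, using the professional.2017.2 ↔ intel/17.0.2 pairing visible at the top of this thread (the cluster entry is an assumption added for illustration; this table is hypothetical, not something Spack provides):

```python
# Hypothetical lookup table: several intel-parallel-studio editions bundle
# the same compilers, so loading any of them should prepend the same
# MODULEPATH branch as loading the matching intel compiler spec.
STUDIO_TO_COMPILER = {
    "intel-parallel-studio@professional.2017.2": "intel/17.0.2",
    "intel-parallel-studio@cluster.2017.2":      "intel/17.0.2",  # assumed pairing
    "intel@17.0.2":                              "intel/17.0.2",
}

def modulepath_branch(spec):
    """Return the hierarchy branch a given compiler spec should expose."""
    return STUDIO_TO_COMPILER.get(spec)

assert modulepath_branch("intel@17.0.2") == "intel/17.0.2"
```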

My next question was going to be "how are we going to handle the fact that intel-parallel-studio can be both a compiler and MPI provider", but I see that you're already thinking about that direction in #3183. Sorry it's taken me so long to review that PR, I'll take a look now.

The version numbers are completely different for intel and intel-parallel-studio. For example, loading any of the intel-parallel-studio editions that bundle the 17.0.3 compilers, or intel@17.0.3 itself, should all add the same MODULEPATH, namely intel/17.0.3, as this is what Spack detects when the compiler is added.

and

My next question was going to be "how are we going to handle the fact that intel-parallel-studio can be both a compiler and MPI provider"

@adamjstewart Unfortunately at the moment we can't do much to alleviate the fact that intel-parallel-studio is both a compiler and an MPI provider. To treat this correctly (i.e. without workarounds or multiple "fake" installs) we need to wait until compilers are treated as build dependencies for other compilers.

In #3183 I took care of the fact that intel-parallel-studio provides mpi, lapack, blas and scalapack and that you can put more than one of those things into the hierarchy of modules.


As I think I mentioned somewhere (maybe), we take care of Intel in production using a manual installation of intel-parallel-studio and letting Spack believe that it is more than one spec (assuming rhel6):

via the packages.yaml here. This seems to work pretty well so far.


That said, even outside of modules, I was thinking that the way we deal with binary packages is really suboptimal. I was thinking to try hacking some base package for these cases that basically:

  • extracts the binaries only once in a directory rooted in a new path (to be specified in config.yaml)
  • refers always to this directory for every installation (like if it was an external package)

in this way you can have as many installations as you want of e.g. intel-parallel-studio@cluster, but you will download and extract the tarball just once. How does it sound?
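A rough sketch of that idea (hypothetical, not an existing Spack feature): key the extracted tree by the tarball's checksum, unpack on first use, and point every subsequent "installation" at the shared tree.

```python
# Extract-once cache for binary packages: the tarball is unpacked a single
# time into a checksum-keyed directory; later installs reuse that directory.
import hashlib
import os
import tarfile

def shared_extract(tarball, cache_root):
    with open(tarball, "rb") as f:
        key = hashlib.sha256(f.read()).hexdigest()
    dest = os.path.join(cache_root, key)
    if not os.path.isdir(dest):      # first caller extracts...
        os.makedirs(dest)
        with tarfile.open(tarball) as tf:
            tf.extractall(dest)
    return dest                      # ...everyone else just reuses it
```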

I agree with you that there are better ways to handle binary packages. No one likes to reinstall things with every compiler when the compiler doesn't make a difference. But let's talk about that another time.

For now, I'm just talking about module file support. Here's what I would like to achieve. I'm fine with installing intel, intel-mpi, and intel-mkl as separate packages. I want to install intel with the core compiler, and then add intel to compilers.yaml. Then I want to install intel-mkl and intel-mpi with %intel, not with the core compiler. This should make all of the current logic in Spack work, no hacks necessary, right?

However, I'm sure I'll have some users who want the entire suite installed, so I'll install intel-parallel-studio as well. I'll install this with the core compiler. This is where things break. The magic I want is that module load intel and module load intel-parallel-studio both have the same effect on the MODULEPATH. Right now, if I run module load intel-parallel-studio and module load intel/17.0.3/mvapich2/2.2, the first one gets unloaded. Don't we just need a way of telling Spack that intel-parallel-studio is part of the intel compiler family, should be named intel, and has the same version numbers as intel?

Now that I think about it, if I'm willing to concede that I'll be installing intel-parallel-studio, intel, intel-mkl, and intel-mpi anyway, there's really no reason I can't install intel with the core compiler and the other 3 with %intel. Users who want the full Intel suite would have to run:

$ module load intel
$ module load intel-parallel-studio

but that's not that bad. As long as it doesn't get unloaded later. I think this is what @alalazo originally suggested, although with more packages.yaml hacking so as to avoid unnecessary installations. It's a bit unintuitive compared to the way I would do it if Spack didn't generate module files for me, but it's the best I can get right now.

If you don't want to reinstall intel-parallel-studio and you can accept a little asymmetry you can use lmod directly and create for your users a savelist named intel-parallel-studio, which would load:

  1. intel
  2. intel-mkl
  3. intel-mpi

Your users then need to type:

module restore intel-parallel-studio

to restore that state of things.
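Concretely, that savelist (Lmod calls these "collections") would be created once and restored in a single step; the module names below follow the earlier discussion and will differ per site:

```console
$ module load intel intel-mkl intel-mpi
$ module save intel-parallel-studio     # save the loaded set as a collection

$ # later, in a fresh shell:
$ module restore intel-parallel-studio
```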

A related issue here is the treatment of MODULEPATH by share/spack/setup-env.sh. I'm going to include a little extra information because getting the cache set up for lmod was confusing, and hopefully somebody else will benefit (this thread has a lot of excellent information that helped me get here).

Skip to 2 for the actual issue with MODULEPATH; 1 details the setup we arrived at.

1. Admin Setup

Only admins use spack directly. There are some convenience scripts I set up; the one admins source from their ~/.bash_profile sets $SPACK_ROOT and then sources $SPACK_ROOT/share/spack/setup-env.sh. Among other utilities defined (such as a function that does the qsub of compile jobs), there is an lmod spider cache update utility function:

# This function is useful for updating what shows up with `module av`
function update_totient_lmod_db() {
    # Load the `lmod` module so that we have `update_lmod_system_cache_files`
    echo "---> Loading the 'TotientAdmin' module."
    module load TotientAdmin
    if (( $? != 0 )); then
        echo "CRITICAL: could not 'module load TotientAdmin'." >&2
        return
    fi

    cleanup='eval echo "---> Unloading the '"'"'TotientAdmin'"'"' module." && module unload TotientAdmin'

    # Make sure we got the update command
    if ! command -v update_lmod_system_cache_files >/dev/null 2>&1; then
        echo "CRITICAL: 'module load lmod' did not provide 'update_lmod_system_cache_files'." >&2
        echo "          You may have used the 'spack' lmod module, which is not correct."     >&2
        $cleanup
        return
    fi

    # Some simple sanity checks (TSR <- Totient Spack Root)
    TSR="/share/apps/spack/totient_spack_configs/setup/lmod/moduleData"
    if ! [[ -d "$TSR" ]]; then
        echo "CRITICAL: folder '$TSR' does not exist!" >&2
        $cleanup
        return
    fi
    if ! [[ -w "$TSR" ]]; then
        echo "CRITICAL: '$USER' cannot write to '$TSR'" >&2
        $cleanup
        return
    fi

    echo "---> Running 'update_lmod_system_cache_files', this may take a while..."
    update_lmod_system_cache_files -d "$TSR/cacheDir" -t "$TSR/system.txt" $MODULEPATH

    $cleanup
}

For reference, the TotientAdmin module simply prepends to MODULEPATH exactly where the lmod installation is (note: the spack-generated lmod module is invalid, but I don't think putting any effort there is worth it; users should be utilizing the one that came from their installation):

-- -*- lua -*-
whatis([[Name : TotientAdmin]])
whatis([[Version : 1.1]])
help([[A wrapper module to load the true lmod module.]])

-- Mark this module as something end-users should ignore
add_property("state", "ignore")

-- MODULEPATH modifications
prepend_path("MODULEPATH", "/share/apps/spack/spack_compilers/opt/spack/linux-rhel6-x86_64/gcc-4.9.2/lmod-7.4.11-ds2yqqh2l47t7crx7yvxit7vv47ns42m/lmod/lmod/modulefiles/Core")
load("lmod")

Note: the lmodrc.lua we are using is what defines the ability to use add_property("state", "ignore"). See the docs on that here (I don't know why I have to use the cache url, the official link is broken).

2. The issue with MODULEPATH

Since the admins are the ones updating the MODULEPATH using update_totient_lmod_db, the issue is that spack prepends the location of the TCL modules to (only) _our_ MODULEPATH. We don't want this; we only want the specific lmod modules we have enabled using modules.yaml.

  1. If I hard-override using config:: in config.yaml and omit the definition for module_roots: tcl, setup-env.sh will break.

  2. The hack I arrived at is in our site specific config.yaml:

    config:
        module_roots:
            # Tell spack to only use the `lmod` modules
            # I couldn't get it to work by commenting out,
            # The `setup-env.sh` sourced for the admin configs
            # will populate MODULEPATH so we just cheat and tell
            # it to look somewhere else.
            dotkit: /dev/null
            lmod: $spack/share/spack/lmod
            tcl: /dev/null
    
  3. The relevant section (near the bottom) of setup-env.sh has two problems:

    • It completely ignores lmod.
    • It assumes we want both dotkit and tcl to show up, regardless of whether we explicitly remove them from config.yaml using config::.
    #
    # Set up modules and dotkit search paths in the user environment
    #
    _sp_share_dir=$(cd "$(dirname $_sp_source_file)" && pwd)
    _sp_prefix=$(cd "$(dirname $(dirname $_sp_share_dir))" && pwd)
    _spack_pathadd PATH       "${_sp_prefix%/}/bin"
    
    _sp_sys_type=$(spack-python -c 'print(spack.architecture.sys_type())')
    _sp_dotkit_root=$(spack-python -c "print(spack.util.path.canonicalize_path(spack.config.get_config('config').get('module_roots', {}).get('dotkit')))")
    _sp_tcl_root=$(spack-python -c "print(spack.util.path.canonicalize_path(spack.config.get_config('config').get('module_roots', {}).get('tcl')))")
    _spack_pathadd DK_NODE    "${_sp_dotkit_root%/}/$_sp_sys_type"
    _spack_pathadd MODULEPATH "${_sp_tcl_root%/}/$_sp_sys_type"
    

3. Why this matters

It really just comes down to presentation. I'm not saying it's bad that setup-env.sh is populating MODULEPATH, this is an excellent feature making for a seamless experience. But since lmod is getting some love here, it's worth pointing out that the currently populated MODULEPATH means that when we update our spider cache, modules that users are not supposed to know about (that don't obey the hierarchy) will show up. Setting it to /dev/null worked surprisingly well, so maybe just add that to the docs?

Perhaps checking first get_config('modules').get('enable', {}) would be a good idea?
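A self-contained sketch of that proposed guard. The dict below stands in for what spack.config.get_config('modules') would return; the actual config call and key names are assumptions about Spack's then-current Python API:

```python
def enabled_module_systems(modules_config):
    """Return only the module systems explicitly enabled in modules.yaml."""
    return modules_config.get('enable', [])

# Stand-in for spack.config.get_config('modules'):
modules_config = {'enable': ['lmod']}

for system in ('dotkit', 'tcl'):
    if system not in enabled_module_systems(modules_config):
        pass  # setup-env.sh would skip the DK_NODE/MODULEPATH update here
```

With a check like this, setup-env.sh would only touch DK_NODE or MODULEPATH for module systems the site actually enabled, and the /dev/null workaround would be unnecessary.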

Eesh, thanks to this thread for the clarification on lmod. I was having a hell of a time because I interpreted core_compilers as the top level of the module hierarchy, rather than what compiler to leave out.

@sethrj if you are new to LMod I would highly recommend reading https://lmod.readthedocs.io/en/latest/080_hierarchy.html

@adamjstewart OK, I've got that much figured out. Now the issue is that I've been spinning my wheels with a two-compiler toolchain. I've installed [email protected] using the "core" [email protected], and installed [email protected] using [email protected]; and I added that to the compilers.yaml with spack compiler find $(spack location -i llvm). Then I extended that compilers.yaml definition to use the Fortran compilers from [email protected] and to include the GCC libraries in clang's rpath.
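The resulting compilers.yaml entry looks roughly like this; all paths and versions are placeholders, and `extra_rpaths` is the field that lets clang-built binaries find GCC's runtime libraries (libstdc++, libgfortran) at run time:

```yaml
compilers:
- compiler:
    spec: clang@7.0.1
    paths:
      cc: /path/to/llvm-7.0.1/bin/clang
      cxx: /path/to/llvm-7.0.1/bin/clang++
      f77: /path/to/gcc-8.2.0/bin/gfortran
      fc: /path/to/gcc-8.2.0/bin/gfortran
    extra_rpaths:
    - /path/to/gcc-8.2.0/lib64
    operating_system: centos7
    target: x86_64
    modules: []
```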

Now with my core compiler set to [email protected], llvm only appears in module avail when I have GCC loaded. However, loading llvm sets the compiler family to clang which seems to unload all the modules and bring me back to the initial state...

Surely this isn't _that_ unusual a situation?

@sethrj -- I'm jumping in a bit late and haven't completely digested this thread.

If you're expecting llvm to appear as a "peer" compiler to gcc in the lists of things built by "Core", then you'll need to build it with [email protected]. When I do that, I get something like this from module avail:

---------------------------- /home/ghartzell/spack/share/spack/lmod/linux-centos7-x86_64/Core -----------------------------
   gcc/8.2.0-uc6sbum    llvm/7.0.1-py2.7-6emefuc    llvm/7.0.1-py3.6-kaij4zr (D)

With this config, you'll be able to load the gcc Core to access things built with [email protected] and the llvm Core to access those built with [email protected]. You won't ever be able to have modules loaded from both of the hierarchies simultaneously, that's the point of the Lmod hierarchical modules.

One of the points of the hierarchical modules is to avoid mixing libraries built with different compilers. I'm not sure how that would play out if you're mixing llvm with gnu Fortran?

My use case is focused on providing applications to end users, not libraries to devs. To that end, I've been playing with supporting a flat module space with Lmod (e.g. #10990), as I'd prefer to stay with the more actively supported Lmod codebase than drop back to the tcl module system.

I'll admit that, at first, I wasn't seeing the llvm module in "Core"; it turned out that I hadn't whitelisted it in my modules.yaml after blacklisting everything built with [email protected] except gcc. This seems to work:

modules:
  enable::
    - lmod
  lmod:
    core_compilers:
      - '[email protected]'
      - '[email protected]'
      - '[email protected]'
    # hash_length: 0
    whitelist:
      - gcc
      - llvm
    blacklist:
      - '%[email protected]'
      - '%[email protected]'
      - '%[email protected]'
    verbose_autoload: false
    all:
      autoload: 'direct'
      suffixes:
        '+jit': jit
        '^[email protected]': 'py2.7'
        '^[email protected]': 'py3.6'
      environment:
        set:
          'SPACK_${PACKAGE}_ROOT': '${PREFIX}'
    ^python:
      autoload:  'direct'

One other thing I've stumbled on a bit: Spack names the compiler package llvm and you need to use the llvm name in modules.yaml, but the Lmod modulefiles end up in a directory named clang.

$ tree -L 2 share/spack/lmod/
share/spack/lmod/
└── linux-centos7-x86_64
    ├── clang
    ├── Core
    ├── gcc
    └── openmpi

5 directories, 0 files

Yes, as @hartzell mentioned, the simplest way to use LMod + Spack is to install _all_ compilers with your core compiler.

Thanks @hartzell , that's what I was afraid of -- it seems a bit arbitrary to be unable to bootstrap a second compiler (Clang) using a spack-installed one (GCC@8) rather than the system-provided one (GCC@4), especially if you also need to use the Fortran compiler from GCC8 in tandem with Clang.

I think that you can build your second compiler (Clang) with a "spack-installed one", but

  1. I think that would make it a third compiler (😃 ) ; and
  2. you could do so by adding the "spack-installed compiler" to the list of core compilers in your modules.yaml (untested).

At one point I thought I would be better off using the system compiler to build [email protected], then building [email protected] %[email protected] (harking back to the olden days of building a new gcc by hand, then using it to build itself). That led to a bit of a wild goose recursion chase when loading modules, discussed/confided in Lmod issue #245.

I think that using the system provided compiler as core and the other compilers as subsidiaries is just a simplifying assumption, making it easier to separate the things that were built with this from the things that were built with that.

The entire point of the hierarchical model is to prevent one from using libraries built with one assumption (e.g. a compiler, or an MPI implementation) with libraries built with another.

As I mentioned, I'm not sure how clear of a win that is with compilers. If you're able to (required to?) use GCC8's Fortran compiler (and presumably libraries?) with Clang, then you've pretty much violated its assumptions.

The hierarchy that Lmod uses is extensible, but I seem to recall that something (either Spack or Lmod) considers the compiler to be a required part of the hierarchy. Perhaps that should be relaxed?

Have you considered just using the TCL modules, without any hierarchy? Depending on what packages you're building and how they get along with each other, it might be simpler.

Yeah, I have, but for CI purposes I wanted to be able to build out equivalent toolchains for GCC8 and LLVM7 (and be able to switch back and forth seamlessly), which seems ideally suited for lmod's hierarchical ability.

I was able to kind of hack this into place by manually copying gcc/8.2.0/llvm/7.0.1.lua to Core/clang/7.0.1.lua, and merging in the needed bits of Core/gcc/8.2.0.lua. The only problem now is that clang isn't (by default) adding the -Wl,-rpath $(spack location -i llvm)/lib needed to set up the rpath for libc++...

What happens if you build llvm with the system compiler (and adjust any {black,white}lists in your modules.yaml file)? That should get you the Core/clang/... module you need.

Does Clang have a way of specifying which GCC libraries to link to? I know there is something similar for Intel: https://spack.readthedocs.io/en/latest/getting_started.html#intel-compilers.

If it were GCC I'd modify the specs file to add the rpath to the link line like so, but it looks like Clang has no equivalent.


Even better solution though (don't know why I didn't think of this earlier): extend modules.yaml to have

modules:
  lmod:
    llvm:
      environment:
        prepend_path:
          'LD_RUN_PATH': '${PREFIX}/lib'
    gcc:
      environment:
        prepend_path:
          'LD_RUN_PATH': '${PREFIX}/lib:${PREFIX}/lib64'

This fixes the link issues; adding it to my custom Core/clang gets everything working :)
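A quick way to confirm a fix like this: the GNU linker reads LD_RUN_PATH and embeds it as the binary's rpath when no explicit -rpath is passed, so the module's effect is visible in a freshly linked binary. Paths shown are placeholders:

```console
$ module load clang
$ echo $LD_RUN_PATH
/path/to/llvm-7.0.1/lib
$ echo 'int main(){return 0;}' > t.cpp
$ clang++ -stdlib=libc++ t.cpp -o t
$ readelf -d t | grep -i 'rpath\|runpath'
```

If the rpath line shows the llvm lib directory, the binary will find libc++ without any wrapper-flag hacking.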

This discussion is dead, so I'm closing this issue.
