We're using Spack on CentOS 7 systems. We build our things nightly
to stay on top of changes.
The recent changes in #3206 changed the names of the directories in
which our modules end up. We splice a spack tree into our users'
environments with a shell script that used to refer to the Core
directory with something like this:
_APPS_LMOD_CORE_DIR=${APPS_DIR}/share/spack/lmod/linux-centos7-x86_64/Core
which worked well enough since all of our systems are x86_64.
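As a stopgap (a hypothetical workaround of the editor's, not something anyone in this thread proposed), the script could glob for the single Core directory instead of hardcoding the target name. This sketch builds a throwaway tree just to demonstrate the glob:

```shell
# Hypothetical workaround: discover the Core directory by globbing the
# target component instead of hardcoding it. The mktemp/mkdir lines only
# set up a throwaway demo tree; in the real script APPS_DIR already exists.
APPS_DIR=$(mktemp -d)
mkdir -p "${APPS_DIR}/share/spack/lmod/linux-centos7-haswell/Core"
_APPS_LMOD_CORE_DIR=$(ls -d "${APPS_DIR}"/share/spack/lmod/linux-centos7-*/Core 2>/dev/null | head -n1)
echo "${_APPS_LMOD_CORE_DIR}"
```

Of course this only helps while exactly one linux-centos7-* directory exists, which is precisely the assumption that breaks once two targets appear.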
Now I have two problems (and I'm not even using regular expressions):
the Core modules end up in linux-centos7-haswell, but there's no
good way for us to guess the haswell bit; it seems to be the arch
of the system on which the system compiler was built.
I poked around at the gcc command line a bit, but don't see any
info about how it was built:
$ /usr/bin/gcc -dumpmachine
x86_64-redhat-linux
$ /usr/bin/gcc -print-multiarch
$
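One crude way to see how new a target a given gcc understands (a probing sketch of my own, not a Spack feature) is to try compiling an empty program with successively newer -march values; the newest value accepted hints at the newest microarchitecture this compiler knows about:

```shell
# Probe which -march values this compiler accepts. The arch list here is
# illustrative; the newest accepted name hints at the compiler's ceiling.
CC=${CC:-gcc}
for arch in core2 nehalem sandybridge haswell broadwell skylake skylake-avx512; do
    if echo 'int main(void){return 0;}' \
            | "$CC" -march="$arch" -x c -c -o /dev/null - 2>/dev/null; then
        echo "accepted: $arch"
    fi
done
```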
We use the system gcc (4.8.5) to build [email protected] and use that to
build the rest of the things
Unfortunately, the Core module prepends this bit onto the
MODULEPATH:
$ grep MODULEPATH share/spack/lmod/linux-centos7-haswell/Core/gcc/8.2.0-q5tss7s.lua
prepend_path("MODULEPATH", "/home/ELIDED/temp/spack/share/spack/lmod/linux-centos7-haswell/gcc/8.2.0")
but that directory doesn't exist; all the things that we've built
with the compiler that we built with the system compiler end up in
linux-centos7-skylake_avx512.
ping @odoublewen
I think I'm hitting the same issue using the develop branch (b929774). I git pull'ed develop yesterday for the first time in about a month. After that, I installed a package and noticed that the module and install path is now linux_rhel7_skylake instead of linux_rhel7_x86_64. The system I'm working on has a pre-skylake intel processor.
update:
Adding target: [x86_64] in packages.yaml under the packages: heading appears to have informed the output of spec. A subsequent install went through without any obvious issues.
packages:
  all:
    ...
    target: [x86_64]
    ...
What OS/distro are you using? What compiler?
noticed that the module and install path is now
linux_rhel7_skylake
I wonder if there's some way (presumably the same mechanism that Spack is using...) to discover what target instruction set the compiler was built for (I'm betting skylake, but...)?
@hartzell We are running RHEL7 and the spack installs were using clang 7.0.0. In compilers.yaml we pass -mtune=native -march=core2 under the flags: heading to support the various generations of intel processors we have in our systems. The machine I was running spack on has an i7-7700 CPU, which apparently uses the 'Kaby Lake' microarch.
I think I hit a similar issue on my Mac, where instead of darwin-mojave-x86_64 spack now uses darwin-mojave-broadwell. I haven't looked into it in detail, but spack now seems unable to load environment-modules in a fresh bash shell, and a spack install of py-pip gives an ImportError upon running.
@anne-glerum Are you using environment-modules with a "flat" layout or lmod hierarchical layout?
@alalazo I don't know. I installed it using $ spack bootstrap and haven't changed any settings.
Following @cwsmith's suggestion to set target in packages.yaml, I installed a new py-pip, but now I can't select the py-pip with the new specs:
$ spack load iqjjp3g [email protected]%[email protected] arch=darwin-mojave-x86_64
==> Error: the constraint '['iqjjp3g', '[email protected]%[email protected]', 'arch=darwin-mojave-x86_64']' matches multiple packages:
g25vvne [email protected]%[email protected] arch=darwin-mojave-broadwell
iqjjp3g [email protected]%[email protected] arch=darwin-mojave-x86_64
==> Error: In this context exactly one match is needed: please specify your constraints better.
@anne-glerum -- Did you clean out all of the stuff you built as darwin-mojave-broadwell before you built the new things as darwin-mojave-x86_64? If the old versions are still hanging around they might be confusing the issue.
@cwsmith
After that, [...] The system I'm working on has a pre-skylake intel processor.
+
The machine I was running spack on is a i7-7700 CPU; which is apparently using the 'Kaby Lake' microarch.
I think that as far as our modeling goes, skylake seems to be the correct detection. Could you run the executables on your host architecture? Also, we probably need to do some renaming, as everybody expects what we call skylake_avx512 to be skylake. For what it's worth, we mostly followed GCC conventions for the naming :slightly_smiling_face:
From my point of view, the problematic thing is that the arch changes depending on the compiler. As @hartzell described at the top of this issue, we're using spack to bootstrap our way up to a relatively modern gcc (8.2.0) on a CentOS 7 system, i.e. using spack to build [email protected] with the system gcc (4.8.5).
The problem for me is that the stuff built by gcc 4.8.5 goes in linux-centos7-haswell, while the stuff built with gcc 8.2.0 goes in linux-centos7-skylake_avx512. And as @hartzell noted, lmod's Core is going into linux-centos7-haswell:
$ tree -L 3 spack/opt/spack/
spack/opt/spack/
├── linux-centos7-haswell
│   ├── gcc-4.8.5
│   ├── gcc-8.2.0-q5tss7soouu5rbk2ujndl54ir3ainmiq
│   └── other gcc 8.2.0 deps
└── linux-centos7-skylake_avx512
    ├── gcc-8.2.0
    └── all my other packages
I don't think it's a big deal whether it's skylake_avx512 or skylake -- as long as it's consistent and we have a way of finding it out. The real mystery for me is where this "haswell" is even coming from. At first I thought it must be what spack arch returns when it has the system gcc in the PATH, but with a fresh clone of spack, in a fairly naive environment, spack arch still says skylake_avx512:
$ which gcc
/usr/bin/gcc
$ gcc --version
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36)
$ git clone [email protected]:spack/spack.git
$ ./spack/bin/spack arch
linux-centos7-skylake_avx512
Where is linux-centos7-haswell coming from?
@odoublewen It comes from the fact that haswell is the latest architecture GCC 4.8.5 knows about. Try to force a build with it and specify target=skylake_avx512 :wink: (docs here)
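If I understand that explanation, the effective target is the host microarchitecture capped at the newest one the build compiler supports. A toy sketch of that capping rule (illustrative only; the generation list below is a simplified linear ordering, whereas Spack's real model is richer):

```shell
# Illustrative only, not Spack's actual code: pick the older of the host
# arch and the newest arch the compiler knows, using a simplified linear
# ordering of x86_64 generations.
generations="x86_64 core2 nehalem sandybridge haswell broadwell skylake skylake_avx512"
gen_index() {
    i=0
    for g in $generations; do
        if [ "$g" = "$1" ]; then echo "$i"; return; fi
        i=$((i + 1))
    done
    echo -1
}
effective_target() {  # $1 = host arch, $2 = newest arch the compiler knows
    if [ "$(gen_index "$1")" -le "$(gen_index "$2")" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}
# skylake_avx512 host + gcc 4.8.5 (which tops out at haswell) -> haswell
effective_target skylake_avx512 haswell   # prints: haswell
```

That would explain the split tree above: everything built by the old compiler lands in linux-centos7-haswell, everything built by the new one in linux-centos7-skylake_avx512.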
@hartzell This one is definitely on my list of issues. I need some time to go "back to the whiteboard" to check what could be the best way to deal with hierarchical module files and multiple targets.
@alalazo -- sounds good.
I'll fine-tune @odoublewen's comment, though: my biggest problem with the hierarchical modules is that the Core module file adds the wrong directory to MODULEPATH, so that loading the Core module does not make the other modules accessible (point #2 in the summary).
@alalazo @odoublewen Forgot to say: let me know if you think something is unclear or should be added to the user docs or to the packager docs.
@anne-glerum -- Did you clean out all of the stuff you built as
darwin-mojave-broadwell before you built the new things as darwin-mojave-x86_64? If the old versions are still hanging around they might be confusing the issue.
@hartzell Is there a way to uninstall all packages with a certain arch?
@anne-glerum You can use:
$ spack install -a target=<name>
What you give as a spec to the uninstall command is interpreted as a constraint and the command will select anything that matches (and ask you before actually uninstalling it, of course).
Hmmm, something or someone marked @anne-glerum's question about uninstalling and @alalazo's reply as "off-topic". Was that automagic, or something one of us did? It seems relevant to the discussion....
let me know if you think something is unclear or should be added to the user docs or to the packager docs.
@alalazo -- they seem complete, in particular the user docs explain our situation.
I believe that there is a hierarchical structure to the architectures; for those of us who don't live and breathe CPUs, is there a way to print them out that shows that structure (I don't see any variant of --known-targets that does it...)?
@alalazo -- Am I correct that there's still a problem with generating Core module files when we use a newer compiler that gives Spack access to fancier architectures, or is there something that we're doing wrong in our modulefile configuration?
Hmmm, something or someone marked [...]
I did, as the error message was indicating:
==> Error: In this context exactly one match is needed: please specify your constraints better.
and spack load is not using lmod hierarchies. I can revert that if you think they are relevant.
Am I correct that there's still a problem with generating Core module files when we use a newer compiler that gives Spack access to fancier architectures, or is there something that we're doing wrong in our modulefile configuration?
I think you're correct - the unit test for lmod modules don't have any case of multiple architectures and that very likely requires some adjustment to the code.
I can revert that if you think they are relevant.
Not necessary, I just had never seen that before. I'm a firm believer in "There's no such thing as a dumb question." and sometimes being declared "off topic" can feel close to that.
It might be friendly to make a note when things are being marked off-topic, and whether they are (or aren't) worthy of a separate discussion....
I know that it's been marked "off topic" (:smiling_imp:), but I think that @alalazo's solution for @anne-glerum, which said:
@anne-glerum You can use:
$ spack install -a target=<name>
should have instead said
$ spack uninstall -a target=<name>
although the attached discussion might have made that clear.
@anne-glerum @hartzell
Not necessary, I just had never seen that before. I'm a firm believer in "There's no such thing as a dumb question." and sometimes being declared "off topic" can feel close to that.
Apologies, that was not my intent in marking them off-topic - I also replied to the questions and marked my answer off-topic as well. It was just a helper for myself, to keep open only the comments that were directly related to the lmod issue.
@hartzell Now I am uncomfortable marking this and the couple of messages above off-topic as well. Do you want to do that? :grin:
I believe that there is a hierarchical structure to the architectures; for those of us who don't live and breathe CPUs, is there a way to print them out that shows that structure (I don't see any variant of --known-targets that does it...)?
Unfortunately there's nothing like that at the moment - and I agree that a command printing the graph of relationships among microarchitectures could be very useful. If in the meantime you want to peek at the code, you can have a look at this json file where we store all the static data for the modeling of CPUs. Let me know if something is unclear there.
Now I am uncomfortable marking this and the couple of messages above off-topic as well.
Ah, the ripple effect in action. Apologies. I'll mark them off-topic, but (as usual with GUIs...) I can't find the button. I don't see it under the three dots, or anywhere else obvious.... Clue, please?
@odoublewen and I were chatting out of band.
spack arch digs into /proc/cpuinfo (on linux) to figure out what the host architecture is.
Is there some way to ask what the target is for a given compiler, e.g. spack arch %[email protected] (that doesn't actually work, but you get the idea)? That would give us a programmatic way to figure out what to expect the Lmod modulefile names to look like, etc.
And on a related note, is it always true that a particular release of gcc (e.g. [email protected]) supports up to a certain arch, or have distros "patched" their gccs to screw^H^H^H^H^Hgive us more options?
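The /proc/cpuinfo inspection mentioned above boils down to reading the flags line. Here's a sketch run against an inlined sample excerpt (the flag values are made up) so it works on any platform; on a real Linux host you'd read /proc/cpuinfo itself:

```shell
# Extract the CPU feature flags line, the kind of data spack arch inspects.
# The sample text below is a made-up excerpt; on Linux, replace the printf
# with: cat /proc/cpuinfo
sample='processor       : 0
model name      : Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
flags           : fpu sse sse2 avx avx2 bmi2'
printf '%s\n' "$sample" | awk -F':' '/^flags/ { print $2 }'
```

Mapping those flags back to a microarchitecture name is the hard part, and that's exactly the modeling Spack's json data encodes.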
And on a related note, is it always true that a particular release of gcc (e.g. [email protected]) supports up to a certain arch, or have distros "patched" their gccs to screw^H^H^H^H^Hgive us more options?
Complete answer is: distros can patch their GCC to extend the features officially supported at any given version. What we did so far was to look carefully through GCC changes and man pages and encode the official support there.
There's one case where we came across a patch that extended support for microarchitecture-specific optimizations; it's here:
Right now this emits a single warning before starting the compilation, which is suboptimal but I think it's all we can do until we have a concretizer that can digest things like "if I respect these additional constraints it's fine to build for this target with this compiler version". Does that make sense to you?
Does that make sense to you?
Totally. Thanks!