@tgamblin @becker33 @adamjstewart @mathstuf @davydden @scheibelp @eschnett @alalazo
We were having a conversation yesterday about the Spack user experience --- specifically, whether the process of incrementally adding features, independent of focus on overall design, is beneficial in the long run. Here's a funny video from Microsoft on that issue:
https://www.youtube.com/watch?v=EUXnJraKM3k
Well... just today I came across a user who is quite interested in Spack --- but after looking at it has come to believe that Spack is too complex (see below). So he's thinking of doing a roll-your-own poor man's Spack himself. This is exactly what I mean when I say we should think about Spack from the user's perspective.
Why does this person think Spack is over-complicated? We have gone to great lengths to ensure that Spack works out-of-the-box. That it auto-detects your compilers. That you don't have to do extra bootstrap stuff before you're ready to build.
But that work has not translated for this user. I can only guess as to the reason... maybe the problem is the way we're presenting things. Spack might be looking increasingly like a "look at this great feature" kind of system. Which, at least for this potential user, is off-putting.
Maybe the problem is I forgot to share the spack.io website with this potential user (maybe content should be removed from the README.md file on the Spack GitHub site, redirecting people to spack.io instead).
This person is a developer. Maybe we need to think through the "Spack documentation experience" from the developer (vs. sysadmin) point of view.
Sphinx is nice. But maybe we need to split the Tutorial / Getting Started section out into an even easier, even nicer presentation that sits front and center.
Cleaning up what happens when you first run spack --help is, I think, also an important step.
Whatever the problem is, I believe that somehow, we can do a better job of presenting essentially the same Spack that we have today, but in a simpler more streamlined way --- one that shows people that Spack is simple, easy, and can solve their problems. If people have that "wow" moment, they will start using Spack. Then, and only then, will they become motivated to learn about the complexity under the hood. And then we can solve their problem with the full, more detailed Spack documentation.
---------- Forwarded message ----------
From: ardi ardillasdelmonte@gmail.com
Date: Thu, Jan 19, 2017 at 8:57 AM
Subject: Re: [CMake] Managing a local installation of cmake-built open source packages
To: elizabeth.[email protected]
Cc: CMake MailingList cmake@cmake.org

Thanks a lot, Elisabeth, Domen, Guillaume, and Konstantin,
I believe spack is the closest to what I need. However, all these
solutions (hunter, conan, spack...) have perhaps their strongest focus
in packaging, dependencies, automatic downloads, etc... while I prefer
to do all such tasks myself. I prefer to not have packages, just
download the source in the original developer provided form, untar it,
and to even build it on my own, following the developer instructions.
In other words, I want to be as little intrusive as possible, keeping
the original distribution file as is. Once it's built, then it's the
install phase what is critical, because a previous version of the
package might need to be uninstalled, or there might even be files
with equal names across different packages, as Domen pointed out.

As I said, I think spack is the closest. However, I feel it tries to
automate the build too much. Yes, it gives you a lot of customization
options, but I'm not sure the complexity is worth the effort.

However, I think I can follow the spack design without using spack:
Install every project on a different prefix. Then just keep on the
environment CMAKE_INSTALL_PREFIX set to a colon separated list of all
prefixes of all installed projects, and that's it.

Uninstalling is trivial: delete the installation directory.
Keeping several versions of the same package is trivial too: just set
the currently used version in CMAKE_INSTALL_PREFIX
Updating is trivial as well: Install new version to a new prefix, and
either keep or delete the installation directory of the previous
version, and update CMAKE_INSTALL_PREFIX accordingly.

Of course spack does all this automatically for you, but it does a lot
more, and, as I said, I'm not sure the added complexity and automation
is worth the effort.

I think that by using this approach, I could reconsider moving to
spack in the future (I'd likely have to install all packages from
scratch if I move to spack later, but my directory hierarchy will end
up being the same, so all the work I do now -writing code and
projects- would be reusable without modification).

Thanks a lot for all your ideas!!
On Thu, Jan 19, 2017 at 12:25 AM, Elizabeth A. Fischer
elizabeth.fischer@columbia.edu wrote:

Ardi,
What you describe is pretty much what Spack does. I would take a look at
it, see if it meets your needs. Chances are, at least some of the packages
you need are already included in Spack.

-- Elizabeth
On Wed, Jan 18, 2017 at 12:39 PM, ardi ardillasdelmonte@gmail.com wrote:
Hi,
I want to install (on UNIX-like systems) a collection of open source
packages which use cmake as the build tool, but I need the
installation to be performed in a local directory (inside my home
directory), and I wish convenient updating to new versions of the
packages.

I didn't arrive at a convincing solution, so any advice will be welcome.
Here are my thoughts:
The trivial solution is of course to directly install to a non-root
prefix when invoking cmake, but this isn't well suited for
updating a previous installation of the packages (building and
installing a new version will only overwrite files that have the same
name, but it will keep old files that no longer exist in the new
version, cluttering the local installation directory with no longer
needed and mismatched files).

A possibility would be to keep a copy of install_manifest.txt whenever
I install a package, and remember to always run 'xargs rm <
install_manifest.txt' before installing a different version of a
previously installed package.

But keeping the install_manifest.txt of each installed package (and
using it before updating a package) looks like too manual a task, a
candidate for some kind of automation.

Another (perhaps wiser) possibility would be to use cpack for creating
either a RPM or DEB, and then use the corresponding package manager to
install the package. But this has problems too: most package managers
assume a / root installation directory. Also, I use several OSs: OSX,
Linux, and some BSDs, and I'm not sure that either the RPM or the DEB
package managers will work flawlessly across all the OSs I use.

What would you recommend here?
Thanks a lot!
The docs and help could definitely use a cleanup.
The particular user here sounds like a developer, and I'm guessing that in addition to installing dependencies, he wants to work on them. We are planning to discuss that in our working group on environments (which, FWIW, includes things like views, multi-versioned trees a facility might deploy, and environments like spack setup, where you are actively working on parts of the build). I am not convinced we have a simple usage model for the developer case (or some aspects of the others) yet, so this could use some work.
OT Question: what should we do with issues like this? There is no clear action to take to resolve this issue; it's an ongoing higher-level problem. I would like to be able to treat issues like todos and close them when they are done. Would discourse (e.g. discourse.brew.sh) be a better format for this?
We could decide to just discuss them on the [spack] mailing list. But that broadcasts to everyone. I like the discourse idea, but I'm not sure of the benefits of adding yet another piece of infrastructure. Maybe just tag these "discussion" and periodically close discussions that haven't seen posts in a while.
Works for me, at least until we have time to set up discourse. I think we'd replace the ML with discourse if we set it up. https://groups.io/ is another option.
Yeah, I only check the mailing list once a month anyway. Sounds good to me!
As an observer here, it is not clear to me what Ardi would want in this case; it seems manually managing packages is the closest to his ideal solution. I see this myself when trying to sell Spack to developers. I think it takes actually using Spack to appreciate what it can do for you in the HPC arena; somebody may not know it would work for them without putting some time into learning it, and just looking at the documentation can be somewhat intimidating.
I've been through this exercise myself of trying to find the best solution for building software, because ultimately I hate building software, which is why I always seem to spend most of my time trying to get software to build without my intervention. Outside of HPC, I would just be using something like Homebrew or apt-get because they are so simple. However, since those systems are so simple to use, people expect the same from Spack, and when they don't get it --- because they immediately hit a few inevitable issues on their custom machine that they need to spend time learning how to mitigate --- they leave.
I get worried myself about how Spack could become too complicated at some point, and maybe it already is, but then I remember that HPC research-type software is overwhelmingly complicated, the compilers are overwhelmingly complicated, and the machines are overwhelmingly complicated, and all so easy to break. I also think it is frankly unsustainable to manage all of this without something like Spack helping me in the future, and that's where I see Spack: it's not that I depend on it right now, but I see it as absolutely critical in the future, which is why I contribute to it and why I want my colleagues to use it. In my experience of trying to sell it to people, I don't think people believe anything could ever make managing HPC software as simple as we are trying to pretend Spack does. If I can show them in detail some things it can do, they are usually impressed, but something will always break when trying to help with their use-case, and it usually expands into a bigger issue and they get scared.
In my opinion as an actual user, I find the top-level Python package files --- essentially the 'knowledgebase' for building all the packages --- to be a fantastic experience; on the other hand, the configuration for a particular machine is the part I find most difficult to get _right_. I am not sure whether a 'correct' configuration of Spack actually exists for any one system (more specifically, for any one supercomputer), or whether that difficulty will always be inherent to the custom nature of HPC, but I've always appreciated the effort this team has put into trying to solve it.
So in the end, I appreciate the expansion of features, as I think they are necessary to demonstrate the maturity of Spack, but at some point I would also like to see an effort to prune them down and focus on a more robust user experience, so that some time in the future it would be ridiculous not to be using Spack in your HPC system management. Any effort now to make the user experience better will sell more people on Spack, and those people will contribute back and make it an even better experience that sells more people still; that sounds good to me. The documentation is large enough now that maybe it could be split into something more palatable for first users, but I do know that the amount of documentation available is one thing everybody likes about it. Maybe some of this helps gain perspective, but I fear I may just be rambling and not saying anything anybody on the team doesn't already know, so I will stop now.
The documentation is large enough now that maybe it could be split into something more palatable for first users, but I do know that the amount of documentation available is one thing everybody likes about it.
I definitely understand this. When I first started using Spack, sections like "Getting Started" were a few paragraphs of "just clone it and add it to your PATH". Now, we have a monolithic "Getting Started" section that attempts to address every possible thing that could go wrong with every compiler on every platform. I think splitting the documentation into a few high-level sections (Basics, Reference, Contributing) has helped a bit with this, as regular users can skip past the Contributing documentation if they don't plan on contributing. But we could probably do better than that. I think a "Troubleshooting" section is in order. And we could move some of the platform- or compiler-specific stuff into a reference section. Basically, the top-level "user-facing" Basics sections should be as light as possible, demonstrating ease of use and awesome features. The Reference sections should be for users who are already hooked on Spack and want to learn more about customization and power-user commands. And the Contributing sections should be for Spack developers only. I think we're getting closer, but we still have a long way to go.
I saw the user-experience label and thought to share my experiences too:

I am very enthusiastic about the project, and spack has been very useful during daily development work (I am using it on different HPC platforms: x86, OS X, BG-Q, Cray).
@jrood-nrel has already summarised it nicely! So instead of repeating that, I am summarising the difficulties/challenges that I have seen during this journey:
While working on / porting HPC codes on a variety of platforms, I have been manually installing software (and profiling tools, with different compiler configurations). This has always been time consuming, and that's how I started using spack six months ago.
I don’t have an internal understanding of spack, and I only (sparsely) read the mailing list.
Some issues during the early days were a bit annoying: for example, configuring Intel compilers for the first time with spack (which is very common in the HPC world). Going through this google group thread to get it working wasn't straightforward. Similarly, using compiler modules on our clusters, which needed environment variables, LD_LIBRARY_PATH etc. (these are now fixed)
spack setup is another feature that I found very exciting! I wrote Spack packages for our software tools and was happy with it! But spack setup was updated (is being updated) for various things, and now the development workflow that I set up 2-3 months ago is broken. (I am going to update it soon.)
In my usage of spack, I found certain issues more important than new features, considering stability: for example, #2193 related to installation failure, #2093 related to the external mpi package, #2828 related to a specific BG-Q system, etc. These might be difficult to reproduce on other systems, but I am sure that if I ask my colleagues to try spack on some of our systems, they will hit these issues, and I don't have a clear idea about the root cause or how to fix/debug them. I use a basic set of spack features, and I would like those to be stable across HPC platforms. As an early adopter on the team, if I can't figure out these issues, I can't confidently ask my colleagues to use the tool in their workflow.
So it would be great if there were an opportunity to discuss/debug the above-mentioned type of issues one-on-one (via skype/teamviewer/webex), because many times they become blockers (only if someone is available, of course --- every 2/4/6 months etc.).
Just to clarify: I love the tool and appreciate the efforts of the team! There will be issues due to the complexity of the HPC world, and there is no magic! And I am happy to be part of this journey!
I'll close #1411, but for the record see this comment.
I want to make one more comment on this that I am sure we are all aware of, but I think it's important to remember. The reason someone would call Spack complicated is a direct result of HPC being fragmented. Something like this is so much easier on OSX, e.g. Homebrew, where they can even avoid the build process and distribute binaries. How awesome would that be? I guess Docker kind of hits at this angle, but I digress. I think the effort to make Spack a better user experience can only go so far from inside the project because developers will just drop something new on it. Spack is basically trying to put out fires that developers are lighting in my opinion.
Since we're all involved in research, I think it would be interesting to somehow quantify how much trouble a developer is causing others based on their chosen project build system as an example. By mining Spack as a knowledge-base as it grows, maybe by starting with how much logic the package file requires for the application or something. Then with the right data it may be possible to provably show something like CMake requires the least amount of user-facing overhead, so we can help push for more standardization like using CMake, or just best practices, or whatever, across the community for the greater good. So not only can Spack be used to solve the build complexity problem as it exists, but also used as a justification that the complexity problem must be reduced. I have a dream, etc.
@jrood-nrel: FYI, binaries are on the roadmap for this year, and we can distribute arch-specific binaries. The CERN folks have had a PR implemented: #445, which they use locally, and we plan to work with them to make it secure (signed binaries) this year. There are likely issues to work out regarding our current architecture support and how to guarantee that a binary is arch-specific or not, but I think that will be something we can work out once this feature is in. We are also planning to have continuous build dashboards running on spack.io.
I like the idea of quantifying the impact of build systems. I will say that one of the benefits of using a good build system is amenability to good packaging, so this may have been kind of a chicken/egg problem for HPC people for a while now, especially given that the build systems don't always support the architectures we care about.
Final note: we are mostly HPC people but Spack is useful beyond HPC and it would be cool to promote it there. The ARM ecosystem is interesting; people often build for very specific architectures there, and having an OSS package manager that can handle that could be beneficial for them. Also, the line between your laptop and supercomputers is blurry -- I think that using the same tool in both places, assuming it is simple enough on the laptop, is very appealing.
@tgamblin I would expand my example of build systems to any build practices that could be quantified.
I have to say, I never thought binaries were an option. That is interesting.
I also agree on all your points, though I would distinctly consider supercomputing a superset of workstations; still, I understand the benefits of using it there as well. I also understand the benefits of promoting it outside of HPC, but for commodity computing the existing tools are obviously well-established. Maybe, in expanding beyond HPC, it would be possible to coin something like 'specialized' computing, or something even more general, to break away from the niche aspect of it in terms of marketing.
Some issues during early days were bit annoying : for example configuring Intel compilers first time with spack (which is very common in hpc world). Going through this google group thread and getting it working wasn't straightforward. Similarly, using compiler modules on our clusters which needed environmental variables, LD_LIBRARY_PATH etc. (these are not fixed)
typo -> I meant to say "_these are now fixed!_"
We should also take a closer look at the two users that we know we have lost. Let's look at what Ardi had to say about Spack:
I believe spack is the closest to what I need. However, all these solutions (hunter, conan, spack...) have perhaps their strongest focus in packaging, dependencies, automatic downloads, etc... while I prefer to do all such tasks myself. I prefer to not have packages, just download the source in the original developer provided form, untar it, and to even build it on my own, following the developer instructions.
Uninstalling is trivial: delete the installation directory.
Ardi doesn't want a package manager that downloads things for him. He doesn't want it to keep track of dependencies. He doesn't want it to extract the tarball. He doesn't want it to build the package for him. He doesn't even want it to be able to uninstall the package for him. At some point we have to step back and wonder, "Maybe Spack isn't what he's looking for?". We can try to please everyone, but not everyone is looking for a package manager at all.
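To be concrete, the manual scheme Ardi describes boils down to a handful of shell commands --- a sketch, assuming CMake-based projects; the foo/bar names and paths are made up, and the colon-separated list he keeps in the environment is presumably what CMake actually reads as CMAKE_PREFIX_PATH:

```sh
# Build and install each project into its own prefix (hypothetical project foo-1.2).
tar xf foo-1.2.tar.gz && cd foo-1.2
cmake -DCMAKE_INSTALL_PREFIX=$HOME/sw/foo/1.2 .
make && make install

# Let dependent builds find all active prefixes.
export CMAKE_PREFIX_PATH=$HOME/sw/foo/1.2:$HOME/sw/bar/2.0:$CMAKE_PREFIX_PATH

# "Uninstalling is trivial": remove (or re-point) a prefix.
rm -rf $HOME/sw/foo/1.2
```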
If we look at the other user we know we have lost for the time being (https://github.com/LLNL/spack/issues/1411#issuecomment-273822212), we see a much different story:
While I really like the concept, it was very difficult not to get it to reinstall too much stuff, or to play well with existing modules. Documentation was also a bit scarce.
It's clear that this user is looking for a package manager like Spack, but has run into problems:
This problem is also my biggest complaint about Spack. At this point I've accepted the fact that Spack will reinstall things already found on my system, although many users are surprised by this due to their experiences with other, less complicated, package managers. What I haven't accepted is that every time the hashing algorithm changes, Spack decides to reinstall the 500+ packages that my users depend on.
We need a better way for users to add external packages. #2507 aims to make a simpler interface for this. Maybe we should put more focus and attention on it? If a user could load a module and then run spack external find, much like they do with spack compiler find, I think they would be a lot happier. Automatic detection of system installations would also be very useful.
I honestly don't feel like our documentation is that scarce, at least not from a user's perspective. There are a lot of things that could be added to the Contribution Guides, but I feel like our Tutorials are pretty in-depth, maybe too much so.
@Exteris I know you no longer use Spack, but if you wouldn't mind telling us a bit more about why Spack wasn't working for you, we might be able to address those problems and win you back.
@malcook I see that you also abandoned Spack (https://github.com/LLNL/spack/issues/32#issuecomment-253643540). Would you be willing to share your experience here?
Hello,
In considering deployment across a medium-sized computational facility of shared linux boxes (mostly running bioinformatics/proteomics/imaging applications for a diverse group of analysts), I shifted preferences, in sequence, from EasyBuild to Environment Modules (lmod) to Spack, and lastly to Guix.
My switch from spack to guix was in part motivated by my perception of better documentation, a larger development community, a larger set of relevant recipes and pre-built apps, a more rigorous treatment of dependency management, and, in my (snap-judgement) opinion, a better chance of remaining a player in the long term. Also, the guile meta-language was appealing to an old lisper, though it might be considered a barrier to entry in other quarters --- YMMV.
Alas, I am running with none of them now… I spent too long deliberating and not enough deploying.
As per the issue of reinstalling things, I would gladly have suffered that consequence for the sake of integrity/consistency.
HTH,
malcook
Hi, I am and will be going through the process of learning Spack, creating my own packages, etc. I'd be happy to scribble down suggestions for improving the docs, e.g. small things that a savvy user often overlooks and considers "known".
For instance, I was just now trying to create a package for a *.jar that is part of a *.zip file. I immediately searched http://spack.readthedocs.io/en/latest/packaging_guide.html for "unzip" and "zip" and found nothing of use. Then I turned to Google, and then to grep -E "(zip|compr)" -r spack/var/spack/repos/builtin, without success. Eventually, by trial and error and by inspecting the temporary Spack directory, I realized the archive is automatically unzipped. Only later did I see that the correct search word would have been "expand" (or "extract"), which would have taken me to the section 'Skipping the expand step', which says:
Spack normally expands archives automatically after downloading them. If you want to skip this step (e.g., for self-extracting executables and other custom archive types), you can add expand=False to a version directive.
In my case, listing examples or all types of archives that are expanded would have helped, e.g.
Spack normally expands archives (e.g. tar.gz, zip, bz2, ...) automatically after downloading them.
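As an aside: the expand=False escape hatch that passage describes looks roughly like this in a package file. A minimal sketch, with a hypothetical package and a placeholder checksum:

```python
from spack import *


class Foojar(Package):
    """Hypothetical package distributed as a single .jar that must not be extracted."""
    homepage = "https://example.com/foojar"
    url = "https://example.com/foojar-1.0.jar"

    # expand=False tells Spack not to try to unpack the download;
    # the checksum below is a placeholder, not a real digest.
    version('1.0', '0123456789abcdef0123456789abcdef', expand=False)

    def install(self, spec, prefix):
        # The un-expanded .jar sits in the stage directory; just copy it in.
        mkdirp(prefix.bin)
        install('foojar-1.0.jar', prefix.bin)
```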
Now, to my point / proposal:
I'm happy to make suggestions like this as I go, and at random times, but it's not clear to me how I can give this feedback. I don't want to have to spend too much time compiling a set of suggestions and explaining each one up front (unless asked to, or where unclear). I'd like to just drop a comment as I notice something and keep on doing my work.
It doesn't really make sense to create separate issues for individual suggestions. Would it work to keep an open issue titled "Beginners: Please give feedback / suggestions here" and just allow people to scribble down comments as they notice things that can be improved? Alternatively, we can create our own, e.g. "Beginner feedback by Henrik B". The only downside I see is that "Watchers" might get lots of notifications (I don't think you can unfollow / mute a specific issue). A solution to this is to create a dummy repository LLNL/spack-feedback-beginners just for this purpose.
@HenrikBengtsson:
Thanks for the feedback!
In my case, listing examples or all types of archives that are expanded would have helped, e.g.
Spack normally expands archives (e.g. tar.gz, zip, bz2, ...) automatically after downloading them.
Want to submit a PR for this?
It doesn't really make sense to create separate issues for individual suggestions. Would it work to keep an open issue titled "Beginners: Please give feedback / suggestion here" and just allow people to scribble down comments as they notice things that can be improved?
I think this would be fine, if you want to create such an issue. Of course, if you want to keep it even simpler, you could make the changes in the docs and submit PRs 😄 . I think the suggestion above would add to the basic usage docs.
@tgamblin, I've done PR #2955 for that case, but it's a half-baked PR, because ideally it should list all extensions that are expanded, and I don't know them. I guess it can be discussed further in the PR. The point is that some of this feedback is not mature enough for individual issues or PRs. I'll go ahead and use a "scrap" issue thread for more loose feedback.
I have tried spack in two different settings. The first was my laptop, to get libraries needed to locally run our code. This failed because spack did not recognize my linux distro, and stopped me there completely. The error message was very cryptic. But whatever, bugs happen.
A few weeks later I used it to setup a module environment for users of a small cluster in our group.
The goal here was to make most software available to users that are not extremely picky about versions. Here I would like to use the system compilers, mpi libraries, m4, et cetera.
This proved to be very difficult, so we ended up compiling most of these through spack anyway.
This all worked quite okay eventually, but it took longer than expected (and leads to many modules, making the system feel much more complex to our users).
Now to make this available to our users...
I looked through the basic usage section of the guide and could find nothing about loading the specs spack has created. I found out later that it is in Workflows, and that spack load is actually discouraged because it is too slow. The only option left is to make environment modules, but these lose dependency information...
(side note: please use / in environment module names so I do not have to type the whole name)
Anyway, we made a bunch of environment modules, set every user's modulepath to this, and it gets some use now. We have a very simple setup and could also do with a few handcrafted modulefiles, of course, but I feel like something simple like this should be easy to do with spack as well.
(side note 2: Can we not have a lightweight version of spack load? Or Cythonize the thing?)
Not so many concrete things here, sorry. Hope it helps anyways.
I found out later it is in workflows, and spack load is actually discouraged because it is too slow. The only option left is to make environment modules, but these lose dependency information...
(side note: please use / in environment module names so I do not have to type the whole name)
@Exteris Did you go through the module part of the Spack tutorial? I have more or less the same problems as you have, and configuring modules.yaml and packages.yaml was the way to get a simple layout for modules.
@alalazo I see now that there is a lot written on this indeed, and it all seems very possible.

I do not recall seeing this half a year ago; it seems like the docs are getting a lot better (or I did not read all of the docs before starting, which I feel should not be necessary --- perhaps we need a reading guide).
@alalazo @Exteris Would you agree that Spack should default to using the module layout described in the tutorial:
```yaml
modules:
  tcl:
    naming_scheme: '${PACKAGE}/${VERSION}-${COMPILERNAME}-${COMPILERVER}'
```
Yes, that seems very reasonable (and corresponds to the schemes seen on many other supercomputers)
@paulhopkins :+1: for me. Doing it should be just a minor modification to the default modules.yaml shipped with spack.
This failed because spack did not recognize my linux distro, and stopped me there completely. The error message was very cryptic.
For the record, this has been fixed. We were originally using Python's platform module, which doesn't recognize linux distributions like Arch and Amazon Linux. We switched to the distro module in #1629, which has been much more stable.
Here I would like to use the system compilers, mpi libraries, m4 et cetera. This proved to be very difficult
This is being worked on in #2507. I agree that our current support for externally installed software is pretty clunky. @citibeth has talked about shipping packages.yaml files with a list of all system packages that come with, say, CentOS 7.
The only option left is to make environment modules, but these lose dependency information...
Can we modify our modules to also load dependencies as well? I thought that worked? Or is that only for Lmod?
I agree that our default modules.yaml layout isn't the greatest. I am afraid of changing it globally and changing users' layouts. The new documentation on configuring modules is very helpful. We are about to switch to modules at ANL's LCRC, so I'll be digging through the code and making things work for our users. I agree that you should be able to load a module without typing out the entire hash, so I'll have to find a way to do that.
On Jan 31, 2017 07:35, "Adam J. Stewart" notifications@github.com wrote:

I agree that our default modules.yaml layout isn't the greatest. [...]
I think detecting Lmod during install and enabling it in modules.yaml would be helpful to first comers.
I agree that you should be able to load a module without typing out the entire hash [...]
Here's how I modify modules.yaml to enable lmod and drop the hashes (still exploring, though):
```diff
diff --git a/etc/spack/defaults/modules.yaml b/etc/spack/defaults/modules.yaml
index 25fe208..86d24f3 100644
--- a/etc/spack/defaults/modules.yaml
+++ b/etc/spack/defaults/modules.yaml
@@ -16,7 +16,8 @@
 modules:
   enable:
     - tcl
-    - dotkit
+##    - dotkit
+    - lmod
   prefix_inspections:
     bin:
       - PATH
@@ -40,3 +41,9 @@ modules:
       - PKG_CONFIG_PATH
     '':
       - CMAKE_PREFIX_PATH
+  lmod:
+    hash_length: 0
+    all:
+      suffixes:
+        "+X": x11
+        "cppflags=\"-fPIC\"": pic
```
I think if the modules shipped with a configuration with more standard naming and prereq loading enabled by default, that would be way better.
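For what it's worth, a modules.yaml along these lines would combine the tutorial's naming scheme with no hashes and dependency autoloading --- a sketch assembled from the options discussed above, not a tested default:

```yaml
modules:
  tcl:
    hash_length: 0        # drop the hash suffix from module names
    naming_scheme: '${PACKAGE}/${VERSION}-${COMPILERNAME}-${COMPILERVER}'
    all:
      autoload: 'direct'  # loading a module also loads its direct dependencies
```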
I am currently trying to use spack again (on Marconi) to compile mumps, scotch and pastix in several versions. The user guide is indeed much better than it was half a year ago!
Steps I have tried for now:
Is there any way to let spack detect build-time dependencies from my system?
For many of these I do not care if they are built with gcc or intel, or if the version matches exactly.
Examples are:
```yaml
packages:
  bison:
    paths:
      [email protected]: /usr/bin/bison
    buildable: False
```
It would also be nice if spack spec could tell me if it is going to use a path or module for a package.
Recent discussion has brought up several issues. I have replied to them all on new issue threads, to leave this thread available for the core issue of user stories related to a perception of complexity. Other users should see this as a thread to post their stories as well. See #2979 #2981 #2982 #2984 #2986 for follow-ups on the recent issues raised here.
Did you go through the module part of Spack tutorial? I have more or less the same problems as you have, and configuring modules.yaml and packages.yaml was the way to get a simple layout for modules.
This is really good work. But let's not kid ourselves: as currently presented, it could contribute to the perception that Spack is overly complex.
Closing this issue as it has been inactive for a while. We can reopen it if the discussion restarts.
Please forgive me if this has already been discussed, but I couldn't find anything.
I am in the process of considering a package manager for a mid-size C++ project and have come across Spack along with a few others. Looking at the docs, Spack seems like a _large_ project with many concepts to learn before one can use it. This may not be the case, but that is the impression that I get.

Going through the _Getting Started_ section doesn't help the case either; it is too in-depth and, in my opinion, should instead cover most of what is under _Basic Usage_.
@omeid -- Can you provide a bit of information about your project, what you're looking for in a package manager and perhaps list the alternatives that you're exploring? Understanding the things that you're trying to do and what's tripping you up would be useful.
Using Spack can be really simple: clone the repository, add its bin directory to your path, run spack install this-app and spack install that-app, and so on.
It gets more complicated from there, however, because it gives you choices (which other package managers, e.g. linuxbrew, don't). You can build this-app using v1.0.0 of a library and build that-app with v1.2.1 of that library. You can turn options on or off, use different compilers (from the system or Spack), use modules to configure users' environments, etc.
Most of the complexity that I experience is inherent to what I'm trying to do, e.g. I want to generate Lmod modulefiles without hash suffixes but using various labels while skipping anything built by the system compiler and so on....
Alternatively, there is incidental complexity: things that make Spack more complicated than it needs to be (e.g. spack install --use-cache vs spack buildcache install).
It would be useful to see what bits you need to know to do what you're trying to do.
I agree that the Getting Started section could be a lot simpler. When I started using Spack, it was only a couple of paragraphs. Since then, it has morphed into "anything you could possibly need to do for one-time setup of Spack". We need to strike a good balance between complexity and thorough documentation.
Link to the Getting Started section: http://spack.readthedocs.io/en/latest/getting_started.html
Reorganization could be useful. IMO it would be good to group all the commands to get started into a single block:
```sh
git clone https://github.com/spack/spack.git
cd spack
export PATH=$PWD/bin:$PATH   # or: . share/spack/setup-env.sh
spack install boost
spack load boost
```
These commands appear in the first few sections, but a small amount of reorganization would make it clear that they get at most of the core functionality of Spack. For example, the first spack install example is in http://spack.readthedocs.io/en/latest/getting_started.html#add-spack-to-the-shell, which might be something folks skim over initially. Everything after that section should be in an "intermediate" section, but there is no clear separator.
Also: perhaps there could be a directory of sorts based on the use case? e.g. "you want to try out Spack on an Ubuntu VM" vs. "using Spack to manage installations on a Cray system"
Thanks for the follow up @hartzell,
The project is a cross-platform C++ GUI application with either a Lua or a Python runtime (still considering the options) built in for plugins.
One thing that I believe would make using Spack much more seamless is _native_ packages for various operating systems, so having at least RPM and DEB packages would be a good start.
Greetings,
I was convinced by a colleague to give Spack an honest evaluation. Thus, I spent most of today trying to get it working on a dev box. It took me a couple of hours to even get through the Getting Started guide and by the time I did, I had skipped around so much just trying to learn enough to get my first install complete that I'm not sure I fully read it...certainly not in order. Now that I'm on the other side I realize where I went wrong. Since this thread is about new users thinking Spack is too complicated (with specific regards to documentation), I thought I'd share. Hopefully it is useful. If someone thinks it is not informative and needs to go, I will happily dump it. :-)
I am working off of a Scientific Linux 7 box (minimal install, fully updated). I saw the requirements on the "Getting Started" page were "2. A C/C++ compiler", so I did a yum install gcc and went on my way. Now, it's easy and true enough that one could argue that this first issue is PEBKAC, and that there's not much to be done for users who don't fully comprehend the very start of the instruction set. But I admit it: I failed to install a C++ compiler (yum install gcc-c++).
The very first examples had issues, which set me down the path of trying to just build my own GCC from Spack, just to get /something/ to install. The heart of the issue was the error "configure: error: C++ compiler missing or in-operational." The moment I saw that, I realized my mistake and corrected it by installing gcc-c++. Spack was helpful enough to create ~/.spack/linux/compilers.yaml, but it didn't tell me that's where it was looking for this information! I spent a LOT of time trying to figure out where it was getting variables. After all, the documentation says "If auto-detection fails, you can manually configure a compiler by editing your ~/.spack/compilers.yaml file." That file didn't exist! So that couldn't have been my problem! Correct, because my problem was ~/.spack/linux/compilers.yaml! But because I didn't know that, I went through a lot of other portions of the documentation trying to figure out how to get Spack to see g++ when everything else could!
So Spack was nice enough to auto-generate the file, but apparently not nice enough to auto-update it. There was very little to tell me where I went wrong. The first actual error messages about things failing were red herrings; I had to go to the build log file to find the actual errors.
OK, finally, I got GCC installed. Next, let's set up LMod and modules. So on the left-hand side I dropped down to "Modules". I read down to the LMod hierarchical module files and went to edit the default modules.yaml file. Notice that the default modules.yaml file doesn't have core_compilers --- _and_, as a newcomer, I had just read about the optional compiler configuration in the Getting Started guide, so I completely missed that core_compilers IS REQUIRED. So then I thought "great, now how do I get modules? Do I need to rebuild GCC?" The answer is no, but since I didn't know that, I went scrolling down till I got to "spack module refresh"! Great. Except it spit out tcl files, not lua.... Now, the first hit on the search tells what went wrong: "Error raised if the key 'core_compilers' has not been specified in the configuration file." But it doesn't really help you figure out what to do about it. Which configuration file, again? (Again, newbie --- who at this point had completely forgotten about skipping that option before.)
Finally I found the Core/Compiler/MPI section and the lightbulb went off. "Oh, that's what I need. How did I miss this before?" Then I noticed that it is buried in the "tutorial 101". This super useful information wasn't in the modules section! Grr.... Whatever, it's working. Moving on.
What are the big tools on my system that users demand? The Open Source stuff is easy - I've long since written my own scripts to deal with those software packages. Let's tackle some of the hard packages. Ones that I really could use a tool like Spack to manage. Two critical ones are Matlab and the Intel tools.
Oh wow... I think I might need to open a ticket on just how bad the Matlab documentation is. I wanted it to be as simple as possible, so I had my install key and the 2016b version. I tried. I really did. For multiple hours.
I didn't have a graphical interface, so I tried to use the silent option. Great, spack info says there is something called variants. Great, there is a whole section on them, but nothing really tells me HOW to use them. So I edit my packages.yaml file, and under packages: I add

```yaml
matlab:
  variants: silent
```

Wrong.

```yaml
matlab:
  variants: [silent]
```

Wrong.

```yaml
matlab:
  mode: [silent]
```

Wrong.

```yaml
matlab:
  mode: silent
```

Wrong.
Skipping dozens of variations, as I'm sure you get the point: I could not figure out how to get the mode variant to work. So I just edited var/spack/repos/builtin/packages/matlab/package.py to force the default to silent. Then I needed to modify it again to add agreeToLicense=yes, which really messed with me, because the first few times I edited the file pointed to by the error log I did not realize that Spack overwrote that very file every single time. Then I needed to add my license key, because even though it was in the file that the (very little) documentation said it needed to be in, the fileInstallationKey field was not being populated. Finally, it starts to install! Whoo! And immediate disappointment when it says that the installer detected 18 products to install but installed 0. Why? I literally have no idea. I give up. And I haven't even gotten to the HARD part of installing Matlab yet! (Hint: it requires installing Matlab a second time on top of itself, because there are actually two installers for different products --- you can't install EVERYTHING in one go, despite what their marketing and wording say.)
Let's try the Intel tools. I need EVERYTHING in the cluster install package. Nothing I find works, and it looks like others install each product one at a time. _sigh_ OK, whatever. If I can get it installed, then it doesn't matter. Let's look at the tools: spack list intel. _sigh_ It has many, but is also missing a bunch... no Inspector or VTune... but again, whatever. The core is there. Sadly, this went about the same as Matlab. I spent a lot of time editing files that I don't think I should be editing, just to force my way to a failed install. I kept thinking "I've got to be doing this wrong." But I don't see a whole lot of guidance otherwise. I KNOW Spack works with Intel, because there are mentions of it all over the documentation. But I'm not seeing a lot on the "how to install". Did I just flat out miss it?
Fine. I need a win. Let's go for something easy: spack install r. To be honest, the moment I saw a failure, I didn't even attempt a debug. I just said "I'm done for the day". I may or may not try again.
My success rate:

GCC... painful, slow success.
Matlab... fail.
Intel... fail.
R... fail.
I don't have a super complicated environment where I'm installing the same package multiple times with multiple compilers and multiple libraries and multiple etc., etc. On the contrary, I have a fairly simple environment with only about a dozen or so applications; I just have to manage a LOT of versions of each. I also try to present a very simple module interface to my users. There is ZERO reason why they need to see binutils or zlib or any other build package in their module list. The simpler and dumber, the better.
In fact, I only have one package that is truly a time-suck, and I've yet to find anything that supports it (Spack/EasyBuild/Guix/etc.). Sadly, many of the software packages considered a "must" on my HPC don't have packages in these build environments. I can easily relate to the guy above who just wants to build his own. I just have XKCD's take on automation in the back of my head, along with past horror experiences of taking over a previous admin's "home-grown environment", all telling me that I /REALLY/ shouldn't write my own. Still, the temptation persists: "surely it can't be this hard". :-/
Anyway, that's where I'm leaving off on day 1 of Spack. Hopefully that was useful from a "this is what a newb went through" view. I'm happy to take comments/feedback/criticism. :-)
Thanks!
Thanks @cstackpole! This is a great summary of common problems that new users run into. I can't tell you the number of times I've seen issues with the missing C++ or Fortran problem.
Let me respond to a few of your comments specifically:
It took me a couple of hours to even get through the Getting Started guide and by the time I did, I had skipped around so much just trying to learn enough to get my first install complete that I'm not sure I fully read it...certainly not in order.
Completely agree. As I mentioned above, when I first started using Spack, the "Getting Started" section was 2 paragraphs, max. It was basically "clone it and go". Since then we've added a lot more information. Perhaps, too much information. I think a good compromise would be to move all of the compiler-specific stuff to the "Compilers" docs and add a link saying "if you need to set up complicated compiler stuff, go here".
Oh wow...I think I might need to open a ticket on just how bad the Matlab documentation is.
I actually wrote the Matlab package back in #2614, and to be honest, I'm not proud of it. In that PR, I noted several caveats/problems that I ran into. Unless I missed something, or things have changed since then, Matlab is just difficult to package. Perhaps we should look at how other package managers handle it.
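For the record, variants are normally selected on the spec itself rather than through ad-hoc packages.yaml keys. Assuming the mode variant discussed above, that would look something like:

```sh
spack info matlab                 # lists the package's variants and their values
spack install matlab mode=silent  # single-valued variants use name=value on the spec
```

and, to make it the default, packages.yaml accepts a variants string (a sketch of the same assumption):

```yaml
packages:
  matlab:
    variants: mode=silent
```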
As for the Intel stuff, that nightmare is mostly Intel's fault (they change the installation hierarchy with every release and on every system). #7469 will help quite a bit when it finally gets merged, but I still don't think it will be perfect with regards to LMod integration.
Unfortunately, the things that are most difficult to package are often the most important packages (_I'm looking at you, TensorFlow_). For common open source software that builds with Autotools or CMake, Spack is usually great. For packages with dozens of dependencies (like R), the chances that something goes wrong increase exponentially.
There is ZERO reason why they need to see binutils or zlib or any other build package in their module list.
I proposed a solution to this in #4400.
P.S. Welcome to Spack! If you haven't been scared off yet, feel free to open issues or ping us on our Slack channel if you get stuck and can't figure something out. Once you get the hang of things, it's much easier. The initial setup and configuration is the worst part.
@adamjstewart, Glad it was useful. I am really super appreciative of people who dedicate time to Open Source projects and was really hoping my experience was useful feedback and didn't just sound whiny. :-D
I can't tell you the number of times I've seen issues with the missing C++ or Fortran problem.
Then may I propose putting a giant WARNING box in that very first example? Something like:

If you have an error about a missing compiler, please verify you have gcc _and_ g++ in your $PATH. Then either delete the file(s) ~/.spack/<OS>/compilers.yaml or ~/.spack/compilers.yaml, or learn more about customizing that file <insert link>.

That would have saved me SO MUCH time and frustration. At that point, I didn't know enough about Spack to debug it. Especially if it is a common issue.
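In Sphinx terms, the proposed box could be as small as the following sketch (the cross-reference target is a placeholder):

```rst
.. warning::

   If a build fails complaining about a missing C++ (or Fortran) compiler,
   verify that ``g++``/``gfortran`` are in your ``$PATH``, then delete
   ``~/.spack/<platform>/compilers.yaml`` (or ``~/.spack/compilers.yaml``)
   so Spack re-detects your compilers, or see :ref:`compiler-config` to
   edit the file by hand.
```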
Since then we've added a lot more information. Perhaps, too much information.
In my experience with the Spack docs, I found the search very helpful. I just didn't always know how to associate that information. For example, on the modules issue: if the 101 information had been in the section titled "Modules", I probably would not have had that issue. Even if the information doesn't move, just having a link between the two, "learn more about LMod integration here", would have saved me a lot of time and frustration.
I actually wrote the Matlab package
I stumbled across your post when I was working on trying to get Matlab to install, but I didn't associate you as being the author. Thanks for doing it! I can take this to another issue or offline/whatever, but I really feel like I started down the wrong path, and because I didn't know any better I just brute-forced my way along until I went off the cliff. Any suggestions on where I went wrong?
As for the Intel stuff, that nightmare is mostly Intel's fault (they change the installation hierarchy with every release and on every system)
Exactly! I complain to them regularly about this. I actually have a pretty good relationship with many contacts at Intel, but this is one where I pester them a lot. The tools are good and I have a lot of users using them, but their installer is terrible and the constant change means my install scripts never work for more than a version or two. But this is also exactly what I was hoping we could solve on a community level because I know I'm not the only admin who curses the new Intel installers. :-D
#7469 will help quite a bit when it finally gets merged, but I still don't think it will be perfect with regards to LMod integration.
I just read through that. Looks like an impressive improvement! That's an awesome step forward!
Unfortunately, the things that are most difficult to package are often the most important packages (I'm looking at you, TensorFlow). For common open source software that builds with Autotools or CMake, Spack is usually great. For packages with dozens of dependencies (like R), the chances that something goes wrong increase exponentially.
Sadly, that is incredibly true. I solved the need to package multiple versions of GCC in my home-grown installer scripts many years ago, with only slight tweaks since then. Where I really struggle is with the complex-for-no-reason packages like Intel and Matlab. For Matlab, I always have to tinker with getting Java just right, then install, then re-install on top of itself to get the other bits that won't install on the first pass, then configure the right GCC compilers to match the compilers Matlab needs, then configure MDCS, then build all the add-on packages like Dynare. But I haven't cracked the install stage for Spack yet, and my attempts with EasyBuild got me more-or-less close-but-not-quite to the MDCS stage, if I manually did the second install. I haven't been able to get further.
I dunno. Maybe I'm not the right target audience and I'm trying to use the tool in a "you can do it but you probably shouldn't" kind of way. :-)
I proposed a solution to this in #4400.
Thanks! I feel better knowing I'm not the only one who has the same issue. :-)
P.S. Welcome to Spack! If you haven't been scared off yet, feel free to open issues or ping us on our Slack channel if you get stuck and can't figure something out. Once you get the hang of things, it's much easier. The initial setup and configuration is the worst part.
Thank you! I just might take you up on that. Most of my supported software is trivial, but I know that I need something to help me manage this complex stuff. Besides if I ever "win the lottery" (the positive spin on "hit by a bus"), I would rather not have my coworkers cursing me for the weird environment I left behind. I would much rather have a tool that they can use with a community to help them if needed. I'm going to give it another go tomorrow. See where I end up. :-)
Thanks for responding!
I haven't looked at the documentation much in the last few years, but as @adamjstewart said, configuring Spack for your machine is the most difficult part. Getting all the compilers and packages.yaml and modules.yaml ironed out takes a lot of time, but once that's mostly done, there are so many things I just couldn't do as easily as I'm doing now with Spack. What helps the most, in my opinion, is seeing the .yaml files other people are using, which is why you will see me offer up my main configuration pretty often at https://github.com/Exawind/build-test/tree/master/configs/machines/peregrine . There are a lot of people using it now, so you're likely not alone in anything you have encountered or will encounter. Definitely join the Slack channel to ask questions.
Also, I haven't really been able to understand others' frustration with Intel. I just install the cluster edition with every variant on, edit the license file, and it gives me everything: vtune, mpi, inspector, etc. 🙂 I've just been in it too long, and I don't know what it's like coming in as a new user. Thanks for posting your experience!
Missing Fortran compilers keep tripping up people that I've convinced to use spack for installing openfoam. The requirement comes from the MPI dependencies and can't be disabled. After this first failure, they might add gfortran to their system, but compilers.yaml may not notice it, or it is masked by an older version, etc. It gets a bit hard to explain how this is easy, but with a little perseverance they eventually get there.
@jrood-nrel said:
What helps the most in my opinion is seeing the .yaml files other people are using,
In that same spirit, here's the basic script that I use to install things. It changes in particular circumstances --- locking down versions and variants, etc.
https://gist.github.com/hartzell/d7d067e59695575c47e1c86b898b37fc
I based it on a script that someone (@healther?) shared a year or more ago, either here or on the mailing list.
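For context, the core of such a driver script is just a loop over a list of specs. A stripped-down sketch of the idea (not the actual gist; the spec list is made up):

```sh
#!/bin/sh
# Install a site's standard specs one at a time, so one failure
# doesn't abort the whole run.
specs="
gcc@7.2.0
openmpi@3.0.0%gcc@7.2.0
hdf5+mpi%gcc@7.2.0
"
for spec in $specs; do
    spack install $spec || echo "FAILED: $spec" >> failed.log
done
```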
@hartzell Oh cool, it does what I do with a setup-spack.sh script, but all in one script. We need like a Spack users group or something to see how other people are doing things in practice.
@jrood-nrel -- Glad you like it. Here's the Spack mailing list post with the script that I started from.
Turns out that it was @eschnett (Erik Schnetter) that was the original author.
If we are interested in building a set of "how do we use spack" documents, I could contribute the principal setup of our deployment chain. It might be a good addition to the Getting Started part, but that part will need to be cleaned up/reorganized first.
Let's avoid adding anything to the Getting Started guide. Perhaps we can consider adding another repo to the Spack organization that contains Spack config files organized by lab and division. For example, I could add my config files to ANL/LCRC or ANL/Bebop. @tgamblin
@adamjstewart: I went ahead and made a repo for this: https://github.com/spack/spack-configs.
All: go ahead and throw your configs in there!
Once we get Spack Environments merged, I suspect people will want to post their Environments for use by others, which includes the config files. So far I've been posting mine on a Spack branch (see below). I believe this needs more thinking through...
https://github.com/citibeth/spack/tree/efischer/giss/var/spack/environments
Like any piece of powerful software, Spack has many degrees of freedom, giving an administrator innumerable chances to make good or bad decisions that cannot easily be undone. In the light of the discussion above, what would be really helpful would be a comprehensive planning and implementation guide.
From a developer's point of view, we just want to have a spack.yaml file in the root of the repo to define a set of packages, dependencies, etc., so a collaborator can build the repo just with spack install <spec> or spack dev-build <spec>. Maybe the tutorial can cover this use case at the beginning.
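With the environments work mentioned earlier in this thread, that could look something like the following spack.yaml at the repo root --- a sketch with placeholder names, not a confirmed format for every option:

```yaml
spack:
  specs:
    - myapp@develop   # the project itself, e.g. built from local sources via dev-build
    - cmake@3.12:     # a build dependency, any version >= 3.12
  view: true          # link everything into a single view for convenience
```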