When I have python@2.7.10 installed:
(cab689):~$ spack find python
==> 1 installed packages.
-- chaos_5_x86_64_ib / gcc@4.9.2 --------------------------------
python@2.7.10
I now try to install py-twisted, explicitly setting the dependency ^python@2.7.10:
(cab689):~$ spack install -v py-twisted^python@2.7.10
==> Installing py-twisted
==> Installing python
==> bzip2 is already installed in /g/g22/gimenez1/src/spack/opt/spack/chaos_5_x86_64_ib/gcc-4.9.2/bzip2-1.0.6-wl4v7wdok42cfndertdgyxys2au2ljpz.
==> ncurses is already installed in /g/g22/gimenez1/src/spack/opt/spack/chaos_5_x86_64_ib/gcc-4.9.2/ncurses-6.0-2v7r63atwq6aw3p66bc3mkp7hxeoxgqx.
==> zlib is already installed in /g/g22/gimenez1/src/spack/opt/spack/chaos_5_x86_64_ib/gcc-4.9.2/zlib-1.2.8-mbw4kksfiiloopjcuqbwrktbxe7hq73x.
==> openssl is already installed in /g/g22/gimenez1/src/spack/opt/spack/chaos_5_x86_64_ib/gcc-4.9.2/openssl-1.0.2e-qs3iwf2rhwlck3qsyrlea7i7zbxluntg.
==> sqlite is already installed in /g/g22/gimenez1/src/spack/opt/spack/chaos_5_x86_64_ib/gcc-4.9.2/sqlite-3.8.5-2fhvbyidf72xkkazmqnng4ofp2z2hgxk.
==> readline is already installed in /g/g22/gimenez1/src/spack/opt/spack/chaos_5_x86_64_ib/gcc-4.9.2/readline-6.3-zclrirpahthnvxm2kj2qbz3rup6agcg5.
==> Already downloaded /g/g22/gimenez1/src/spack/var/spack/stage/python-2.7.10-4azwfxr6b6fddsanso7fgk5xivgdnffs/Python-2.7.10.tar.xz.
As you can see, spack tries to reinstall python 2.7.10.
I went ahead with the installation to see the dependency graph of the new python, here it is:
(cab689):~$ spack find -d python
==> 2 installed packages.
-- chaos_5_x86_64_ib / gcc@4.9.2 --------------------------------
python@2.7.10
^[email protected]
^[email protected]
^[email protected]
^[email protected]
^[email protected]
^[email protected]
^[email protected]
python@2.7.10
^[email protected]
^[email protected]
^[email protected]
^[email protected]
^[email protected]
^[email protected]
^[email protected]
^[email protected]
The newly installed python has ^zlib; everything else is the same. py-twisted does not, however, depend on python ^zlib, so I'm not sure why spack is making this new dependency requirement. To verify that py-twisted is using the new python ^zlib:
(cab689):~$ spack find -d py-twisted
==> 1 installed packages.
-- chaos_5_x86_64_ib / gcc@4.9.2 --------------------------------
[email protected]
^[email protected]
^[email protected]
^[email protected]
^[email protected]
^[email protected]
^[email protected]
^[email protected]
^[email protected]
^[email protected]
^[email protected]
^[email protected]
^[email protected]
^[email protected]
^[email protected]
^[email protected]
^[email protected]
^[email protected]
^[email protected]
^[email protected]
TL;DR: py-twisted should use the existing python, but it creates a new python ^zlib for no apparent reason.
I think I need to put this on the agenda for the telcon today. Spack's current logic is to fully concretize a spec first, and only then check whether that exact configuration is already installed.
This was intended to be conservative, but what it ends up doing is not reusing as much as it could. If we did more matching against existing installs as part of the concretization process, we would reuse more by default, at the price of maybe linking against more old stuff.
So I guess my question is: do you want Spack to "settle" for installed stuff more often? I think the answer for most people (like @trws) is "yes". I think we should change the default concretization policy to be to match installed first.
That _will_ make installs less deterministic. i.e., your install order will affect what you link against. We should probably provide a command-line option to install a "clean-slate" version of the package, where we concretize without considering installed packages and rebuild stuff that's not current. That would let you get the current behavior only if linking with the existing package doesn't work.
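The reuse-first policy being proposed could be sketched as follows. This is a toy model with made-up dict "specs" and a hypothetical `satisfies` check, not Spack's actual concretizer; it just illustrates why install order starts to matter once installed packages are consulted first:

```python
def satisfies(concrete, abstract):
    """Illustrative check: a concrete spec satisfies an abstract one if
    every constraint the abstract spec states is matched (name and,
    if given, version -- real specs carry much more)."""
    if concrete["name"] != abstract["name"]:
        return False
    want = abstract.get("version")
    return want is None or concrete["version"] == want


def concretize(abstract, installed, clean_slate=False):
    """Prefer an already-installed spec that satisfies the request;
    fall back to a clean-slate concretization otherwise."""
    if not clean_slate:
        for spec in installed:
            if satisfies(spec, abstract):
                return spec  # reuse: install order now affects the result
    # Clean-slate path: stand-in for full concretization, here just
    # "latest available" when no version was requested.
    return {"name": abstract["name"],
            "version": abstract.get("version", "latest")}
```

With `clean_slate=True` you get today's deterministic behavior; the default would settle for whatever compatible build already exists.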
@alfredo-gimenez: Thoughts?
@tgamblin This has bitten me too with GCC. I think the settling for installed stuff is good, but maybe this needs to be controlled per package. For example, rebuilding something small with a new dependency is no big deal, but if it's something like gcc or llvm it would be better to stick with what's already installed.
@davidbeckingsale: It sounds like you'd be happy with the default being to take what is already installed. You could always do an install --aggressive or something to rebuild more stuff.
I agree with @davidbeckingsale. Especially in a situation where I want multiple python libraries to all extend the same python, I'd prefer it to look for existing python.
The other weird thing is, python has zlib as a dependency, but spack first installed python with no zlib in the dependency graph...
@alfredo-gimenez: that happens when a package.py file changes and you do a git pull. Likely, you installed a prior version of the Python package that assumed a system zlib, and someone has since updated the python package.
This change is actually pretty easy to implement, so I think we can play with it. Do others have opinions on this? @eschnett, @alalazo, @nrichart, @mathstuf @mplegendre ?
The problem, as I see it, is that when adding a new dependency, existing installs don't have a flag for it and are treated as if they didn't build with it. As an example, adding a c++ flag to the GCC build to indicate that g++ is wanted would mean that existing builds "don't conform", since spack would think they don't meet the requirements (even though a gcc build _would_ have g++ already). There are at least two ways to deal with it: (1) treat existing installs as not satisfying the new flag and rebuild (or refuse); (2) add declarations to the package recording which existing builds already satisfy the new flag.
I prefer the first, simply because the second's maintenance burden is likely very high compared to the time to just rebuild the software again. (When can the declarations safely be dropped? Was the build dependent on some system state, say a system Qt that happened to be found, in which case there is no right answer?)
I have openmpi installed, and whenever I install a package that requires mpi, spack begins to build mpich. So: yes please, spack should look at installed packages and variants.
I think this is different than taking a system version and making it a package (as far as spack is concerned at least). I think the process for that case would be "tell spack that this mpi I already have installed satisfies the "mpi" requirement any package might need" (maybe with some related information about a prefix or something else). For this case, it's as if we have a package frobnitz which happens to use MPI if it finds it on its own, but spack never knew that it needed MPI in the first place. So existing builds have MPI if MPI was (at the time of that package's build) present and doesn't if it wasn't there. I think what you want is indeed something spack should support, but this is not the same problem.
The work in https://github.com/LLNL/spack/pull/120 can handle the MPI case. It uses a packages.yaml config file to set "preferred" package configurations in Spack. You could, for example, specify that you're a [email protected] and icc@15 shop, and Spack will by default concretize with that MPI and compiler. You can also start specifying locations for external packages, which Spack then uses rather than building its own versions. By pointing an external package at your local MPI installation, and setting that version as preferred, you could have default Spack builds always link against the local MPI.
This is independent of the original request to prefer already-installed packages. As a first-order we could prefer packages specified in the packages.py, and as a second-order we could prefer packages already installed.
@mathstuf:
I think we have this one covered. It's (1), but instead of throwing an error we just conservatively rebuild. The specific semantics are implemented in spec.py#323. Basically, if a spec is concrete (i.e., it is already installed and all known variants were filled in when it was concretized), then absence of a variant is treated as unsatisfiable. If the spec is _abstract_, then absence is treated as satisfiable, because you _could_ constrain it to have that variant. If we went to a model where we looked at existing installs first, we would do (1) but rebuild instead of saying sorry. This seems right to me. I suppose we could warn the user in cases where we rebuild something that is close to an already installed version. Printing out _why_ we rebuilt might actually be kind of cool.
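Those satisfiability semantics can be shown in miniature. The function name and dict representation below are hypothetical, not the actual spec.py code; the point is the asymmetry between concrete and abstract specs:

```python
def variant_satisfies(spec_variants, is_concrete, required):
    """Sketch of the semantics described above.

    spec_variants: dict mapping variant name -> bool for the spec at hand.
    is_concrete:   True if the spec is already installed/fully resolved.
    required:      variant constraints we are checking against.
    """
    for name, value in required.items():
        if name in spec_variants:
            # The variant is present: values must agree.
            if spec_variants[name] != value:
                return False
        elif is_concrete:
            # A concrete spec that lacks the variant entirely can never
            # satisfy it -> conservatively rebuild.
            return False
        # Abstract spec: absence is satisfiable, since we could still
        # constrain it to have that variant. Keep checking.
    return True
```

So an installed python that predates the zlib variant fails `+zlib` and triggers a rebuild, while an abstract `python` spec happily accepts it.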
"tell spack that this mpi I already have installed satisfies the "mpi" requirement any package might need" (maybe with some related information about a prefix or something else).
The externals stuff (#120), as @mplegendre mentioned, does this. That is getting merged along with the cray port PR #309.
I think what you want is indeed something spack should support, but this is not the same problem.
I think @eschnett meant he has OpenMPI installed via Spack. If OpenMPI is the first thing you install with Spack, it would be really nice if subsequent specs that require MPI just resolved against it.
@mplegendre
As a first-order we could prefer packages specified in the packages.py, and as a second-order we could prefer packages already installed.
Which to prefer gets a little complicated, and you might want it configurable. I think we should allow someone to decide where they rank their installed stuff compared to their own and site packages.yaml concretization orders. I could see people wanting both. If I'm the guy deploying for everyone, I probably want to stick to the stack as much as possible. If I'm building in my home directory, I might prefer my own packages to site concretization prefs (especially if they were bundled with Spack and not mine).
@mathstuf Let me clarify: I have the spack package openmpi installed, and nevertheless spack wants to install the mpich package to satisfy the virtual mpi package requirement. Instead, it should be happy with openmpi.
Based on this I think the consensus is to look at what is installed, and to add some type of precedence for file-based concretization preferences (once that is merged... I don't think anyone but me, @mplegendre, and @becker33 have seen that :)
@tgamblin Allowing users to rank packages.yaml vs site-installs vs user-installs would get complicated. I think we should just pick a simple order and enforce it.
If a user explicitly specifies a package version as preferred in packages.yaml, then they probably already have that package installed or want to install it, so first-order should be packages.yaml. Second-order should be preferring local installs, and third-order should be preferring site-installs. These are all just about picking defaults, so if Spack picks a default wrong then a user can always just be more explicit about what they want.
@mplegendre: I think picking an order within a config scope makes sense. I was thinking more in terms of "what about site scope". My preference would be something like ~/.spack/packages.py > installed packages > $spack/etc/spack/packages.py. That way we can put sensible defaults in the Spack distro, which the users can do a hard override on, but if a user likes a particular MPI better and installs it, they get the MPI they implicitly asked for.
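The ordering proposed above could be modeled roughly like this. The scope names and set-based representation are made up for illustration only:

```python
def pick_preference(user_prefs, installed, site_prefs, candidates):
    """Resolve a choice (e.g. an MPI provider) using the ordering
    user packages file > installed packages > site packages file.

    Each scope is a set of acceptable choices; candidates is the list
    of possible providers in default-preference order.
    """
    for scope in (user_prefs, installed, site_prefs):
        for choice in candidates:
            if choice in scope:
                return choice
    # No scope expresses a preference: fall back to the built-in default.
    return candidates[0]
```

With this ordering, a user who installed openmpi gets openmpi for `mpi` requests, unless their own config file explicitly prefers something else.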
I'm coming in on this late, and it seems that the overall decision matches my preferences, but I would note that a couple of things are being conflated just a bit here, in terms of options at least. It only really matters in corner cases, but I think of these as separate:
1. reusing whatever installed packages satisfy the spec,
2. updating the requested package and its direct dependencies,
3. considering the entire dependency graph, including packages that are already available, and
4. rebuilding packages whose variants have changed.
Cases 1 and 2 are, by my mind and seemingly this discussion, pretty clearly things that we want not to cause rebuilds by default. Presumably 3 and 4 are as well, but at least 2-4 probably warrant configuration options or control flags, because someone will want to do each of those. In gentoo-land, where as @tgamblin points out names are often a bit odd, these roughly translate to: 2: --update (-u), update this package and all direct dependencies; 3: --deep (-D), consider the entire dependency graph, including past packages that are already available; 4: --newuse (-N), rebuild every package seen that has had its use flags (variants) changed (by user addition or subtraction, or package-maintainer addition or subtraction) since the version available.
These can be used alone or combined to get pretty much every behavior I can think of, from nothing giving you "reuse everything" to (my usual emerge flags) -uDN for "I want this package and all of its direct and transitive dependencies to be their most up-to-date and variant-compliant version."
This also reminds me: is there a way to refer to "all packages directly requested to be installed by the user"? It occurs to me because the standard "update everything please" in emerge, and ports come to think of it, is something akin to emerge -uDN world, where world only refers to those packages installed intentionally, excluding all that were pulled in by dependency resolution.
Due to the time zone I am also entering the discussion late. Besides agreeing with basically everything that has been said, I just want to add a consideration that maybe everyone considered implicit: it seems to me that most of the points in this discussion basically ask for more meta-data associated with package entries in the db.
The python ^zlib issue from which the discussion started stems from the fact that a package.py file was updated and, if I understood it correctly, we don't store any kind of hash for the installation recipe yet. If we inspect local installs before concretization, we may very likely use outdated installed packages silently in cases like this. Adding a hash that relates only to the package installation instructions (or something similar) may permit at least marking packages as up-to-date, or out-of-date : package.py changed, or even out-of-date : dependency <name> out-of-date, and warning users about those situations. This goes in the direction of telling users why something fails (or just why something may be potentially dangerous).
@trws's consideration about being able to update every package explicitly installed by a user may be implemented by adding an explicit_request = <boolean> attribute in db entries.
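A recipe hash of that kind could be as simple as the sketch below. The function names and status strings are hypothetical, not an existing Spack API; the idea is just to hash the installation instructions independently of the version/variant hash:

```python
import hashlib


def recipe_hash(package_py_source: str) -> str:
    """Hash only the installation recipe (the package.py source), so the
    db can notice when the recipe changes after a git pull."""
    return hashlib.sha256(package_py_source.encode("utf-8")).hexdigest()[:8]


def status(installed_hash: str, current_source: str) -> str:
    """Compare the hash stored at install time with the current recipe."""
    if installed_hash == recipe_hash(current_source):
        return "up-to-date"
    return "out-of-date : package.py changed"
```

An install made from an older python recipe (the pre-zlib one) would then show up as out-of-date instead of being silently reused.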
The last example that comes to my mind is external repositories. I still have to read the code, so my concerns may have been answered there already, but in case: do we keep track of the provenance of an installed package? A use case that comes to my mind is having, in an external repository, a custom and site-specific version of a package that is also present in the built-in repository: what happens if the package.py from the external repository goes away but leaves an installed version of itself behind?
Is this basically solved given #839?
Not really. While #839 may solve this particular case with Python, the underlying problem is that if I install, let's say, hdf5+szip, and then I install netcdf, Spack will reinstall hdf5~szip because szip defaults to False. Spack does not currently take into account what is already installed. It decides exactly what should be installed, and then checks if that exact combination of versions and variants is already installed.
@adamjstewart But I think this particular issue would be solved if #839 is implemented, assuming that Spack re-uses build-only dependencies. I think that's what @mathstuf was referring to.
Yes, Python would no longer be re-installed I guess. But this problem occurs for every package in Spack, not just Python.
The second phase of #839 is to use already installed packages to resolve dependencies. That way if the current DAG is looking for python, but you have python+zlib installed, that satisfies it, so it will use that one instead, but if you need python~zlib, it will build a new one.
While #839 may solve this particular case with Python, the underlying problem is that if I install, let's say, hdf5+szip, and then I install netcdf, Spack will reinstall hdf5~szip because szip defaults to False. Spack does not currently take into account what is already installed. It decides exactly what should be installed, and then checks if that exact combination of versions and variants is already installed.
I would certainly like this to be an option, but this is not a show-stopper, as you can always do
spack install hdf5+szip
spack install netcdf ^hdf5+szip
Inconvenient but possible.
Whereas an option to treat build-only dependencies more flexibly would actually allow building a DAG where different packages need incompatible versions of python as a build-only dependency. This would also solve issues I explained here https://github.com/LLNL/spack/issues/839#issuecomment-232945208, which could be a showstopper.
The second phase of #839 is to use already installed packages to resolve dependencies.
Oh wait, seriously? I must have missed that. If that's the case then I'll be a very happy camper.
I would certainly like this to be an option, but this is not a show-stopper, as you can always do
spack install hdf5+szip
spack install netcdf ^hdf5+szip
Inconvenient but possible.
It works, but it's definitely inconvenient. I currently have to install:
spack install netcdf-fortran %pgi ^netcdf~mpi+hdf4 ^hdf+szip ^hdf5+szip ^openssl%gcc
Now that's a mouthful.
I'm coming in really late to this discussion, but I've hit this issue a few times [1] and yesterday noticed a potentially useful feature in Homebrew that could possibly serve as an analog for Spack.
The brew pin command will prevent a package "from being upgraded when issuing the brew upgrade formula command". Perhaps something like this could be implemented similarly to (and perhaps using the same infrastructure as) the external buildable: false feature in packages.yaml?
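A pin along those lines might not even need new infrastructure; a sketch of what it could look like with the existing packages.yaml keys (pairing version preference with buildable: false as a "pin" is my assumption, not a documented pattern):

```yaml
packages:
  python:
    version: [3.7.2]     # existing syntax: preferred version
    buildable: false     # existing syntax: never build; use the external/installed one
```

With buildable: false, Spack errors out rather than quietly building a different python, which is roughly what a pin should do.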
[1] For example, I have Python set to use [3.7.2] in my packages.yaml file, but spack spec py-ipython shows (truncated here):
[email protected]%[email protected]
^[email protected]%[email protected]
^[email protected]%[email protected]
even though the only dependency of py-appnope is the PythonPackage class. Changing the spec to py-ipython ^python@3.7.2 fixes the dependency. Also note that I only have Python 3 installed, not Python 2.
This _should_ be "just" a bug in the concretiser, or some other dependency restricts to Python 2; the list in spack spec does not show all dependency edges, only the "first" one encountered.
It may just be that the YAML file lists the system python but doesn't mark python as unbuildable. The only way it will take any package version lower than the "ideal" is if it's marked unbuildable.
Hi,
this issue has been open for more than three years and summarizes my frustrations with spack quite well.
Suppose I want to quickly try out some high-level simulation code, I don't want it to reinstall MPI+dependencies, hdf5, etc. I just want to type (say)
spack install --reuse
To get such behavior right now I have to maintain an extensive packages.yaml, and still it happens all the time that openmpi is being rebuilt because some irrelevant variant was added to a dependency. Are there any plans to include such an option in spack?
Are you aware of the possibility to do spack install <simulation> ^/<hashofyouropenmpi>?
In general I don't really see the point, as it does not cost _me_ time to do the rebuild. It does cost my computer, and I don't care about its opinions too much. I _think_ that is also the point of view of most of the main contributors (just judging by this issue being open for 3 years), so feel free to open a pull request that implements this functionality. I'd expect it to be merged relatively quickly, but I'd also expect it to cost significant time to implement (as it touches the inner workings of the concretiser).
thanks, the hash-based package selection looks useful, I didn't know about that. It still means I have to maintain the packages.yaml file carefully.
@healther I think most of the main contributors, including myself, would love to see Spack reuse installed dependencies as much as possible. But as you said, this completely changes how the concretizer works and would be a large overhaul, which is why this issue has been open for so long.
A non-intrusive approach would be to have a spack command to generate (part of) a packages.yaml file that prescribes a number of installed packages (or loaded modules).
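Such a generator could be quite small. The sketch below turns (name, version) pairs, e.g. scraped from `spack find` output, into packages.yaml version-preference entries; the helper name is made up, and the output only covers the version key:

```python
def packages_yaml_entries(installed):
    """Render packages.yaml preference entries from (name, version)
    pairs describing what is already installed."""
    lines = ["packages:"]
    for name, version in sorted(set(installed)):
        lines.append(f"  {name}:")
        lines.append(f"    version: [{version}]")
    return "\n".join(lines)
```

Feeding the result into a config scope would make subsequent concretizations prefer the versions you already have, without touching the concretizer itself.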
CC: @fryeguy52, @becker33
Any chance this will get fixed or spack will add an option to make spack do this?
For SNL ATDM we need to break up the building of the compiler, the "tools" packages, MPI, and the TPL packages into four separate spack install commands, and spack is rebuilding the same package with different hashes several times even though there are no differences that I can see in the specs.
We already need to generate packages.yaml files to reuse the "tools" packages and MPI in the downstream TPL package builds. I guess we could script up the generation of entries for the packages.yaml files for these other libraries that get constantly rebuilt, like 'libiconv', 'numactl', etc. We have already created the infrastructure for doing this, so it would not be that hard to do. The only challenge is having to pin down the versions of all of these packages, since that is the only safe way to find install directories of the form:
spack/opt/spack/<arch>/<pkg-compiler-name>-<pkg-compiler-ver>/<pkg-name>-<pkg-ver>-<hash>
A priori we know all of this info except for the hash <hash>, so we do a:
$ ls -d spack/opt/spack/<arch>/<pkg-compiler-name>-<pkg-compiler-ver>/<pkg-name>-<pkg-ver>-*
Currently, that can return more than one directory because of this issue (so we just return the first directory found). But if we populate package_common_<pkg-compiler-name>-<pkg-compiler-ver>.yaml files with the list of packages that we know spack is rebuilding between these different sets of packages, then I think we can guarantee only a single install of each of these packages for each compiler <pkg-compiler-name>-<pkg-compiler-ver>. Since we have a closed set of packages that we need to install with Spack for SNL ATDM, that should be tractable. But it will add to our scripting code, which is already up to:
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Bourne Shell                     2            195            153            710
YAML                             6              5             19            133
-------------------------------------------------------------------------------
SUM:                             8            200            172            843
-------------------------------------------------------------------------------
But that 800+ lines of scripting code is still less code than writing a package install system from scratch, so be it.
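The hash-suffix lookup described above can be scripted so that finding more than one match is at least visible, instead of silently taking the first directory. This is an illustrative helper, not part of the scripts quoted above:

```python
import glob
import os


def find_installs(prefix, arch, compiler, pkg, version):
    """Glob for Spack install directories when only the trailing hash is
    unknown, using the layout
    <prefix>/opt/spack/<arch>/<compiler>/<pkg>-<version>-<hash>."""
    pattern = os.path.join(prefix, "opt", "spack", arch, compiler,
                           f"{pkg}-{version}-*")
    hits = sorted(glob.glob(pattern))
    # More than one hit means the same pkg@version was rebuilt with
    # different hashes -- exactly the duplication this issue is about.
    return hits
```

Callers can then warn (or fail) on `len(hits) > 1` rather than guessing which duplicate to use.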
@bartlettroscoe this is dependent on a rewrite of the concretization algorithm that Todd is currently working on. @fryeguy52 can tell you about the demo @tgamblin gave at the workshop this week. It's not fully functional yet, but it appears to be coming along well.
Any progress on this? After much frustration I adopted a multi-stage installation similar to what @bartlettroscoe described, but it would be a lot easier if I could tell spack to reuse installed specs rather than rebuild the latest version.