Formulae are having to re-specify the customized explicit version from the spec inside the resource blocks in order to get the names in the cache correct.
If we don't specify version "2.12.0-M4" for the devel spec in scala, brew thinks the version is "4" not "2.12.0-M4", but when that's corrected by adding the explicit version, the fix doesn't make it down into the nested resource blocks, which instead are ending up with the version "4" due to the failed independent parsing of the version from their own URLs. So the formula ends up having to specify the customized corrective version in every single nested resource block, otherwise the cached resource names will be "incorrect" and will potentially collide with a future _actual_ scala 4.0's resources.
diff --git a/Formula/scala.rb b/Formula/scala.rb
index 7682d0c..ffb7080 100644
--- a/Formula/scala.rb
+++ b/Formula/scala.rb
@@ -36,13 +36,11 @@ class Scala < Formula
resource "docs" do
url "http://www.scala-lang.org/files/archive/scala-docs-2.12.0-M4.zip"
sha256 "ccfbc72c5bfd3743b08f4ecfb3dea46cb2b982eaaba5ada24378a93c351f8c77"
- version "2.12.0-M4"
end
resource "src" do
url "https://github.com/scala/scala/archive/v2.12.0-M4.tar.gz"
sha256 "2f98086788e684a3236ca96bc1b32abca499a728099c273ed4f6e150fcb8edc4"
- version "2.12.0-M4"
end
end
brew install --devel --with-docs --with-src scala leads to
iMac-TMP:Formula joe$ ls ~/Library/Caches/Homebrew/|grep scala-
scala--completion-2.9.1
scala--docs-4.zip
scala--src-4.tar.gz
scala-2.12.0-M4.tgz
I'm afraid this is not fixable and by design. Since the resources can be literally anything and their version numbers (if any) can be completely unrelated to the main resource, it wouldn't make sense to propagate the version from the main resource to the other resources.
The only “fix” I can think of is to improve our version detection logic, while making sure not to break things for current or past formulae (i.e. versions auto-detected from URLs for all formulae past and present and all their resources are identical before/after the change).
it wouldn't make sense to propagate the version from the main resource to the other resources.
Why do resources need their own version at all? The propagation would solely be to a prefix before the resource name not a suggestion that that's the resource's own intrinsic version.
@ilovezfs File naming and nothing else. I personally think we should do resource file naming and cache deduplication based on their URL alone.
Why do resources need their own version at all? The propagation would solely be to a prefix before the resource name not a suggestion that that's the resource's own intrinsic version.
Because if we prefix all resources with the version of the formula, then resources that don't change from one formula version to another (not that uncommon) will need to be downloaded again with every formula version bump, even if nothing changed. This could be fixed by using the sha256 of the resource as a file name at the expense of a solely machine-readable cache.
File naming and nothing else. I personally think we should do resource file naming and cache deduplication based on their URL alone.
Even with the current scheme we have a certain (though manageable) duplication, every time two different formulae use the same version of the exact same resource. This is also fixable by using the sha256 of the resource as a file name at the expense of a solely machine-readable cache.
@MikeMcQuaid doing it that way would be perfectly acceptable since it would mean not specifying the version over and over in each nested resource block (think of how crazy that is if a devel spec had 20 nested go resources and we're adding version "foo-beta-314159" to every one of those blocks). The risk (and recourse, rm or fetch --retry) of a future cache collision seems to pale in comparison with the goofiness of copying the version down into every nested resource block. Allowing resources to have their own independent version is overkill.
Either SHA256 or URL based deduplication makes sense to me (and more sense than trying to parse the resource versions).
This is also fixable by using the sha256 of the resource as a file name at the expense of a solely machine-readable cache.
Also acceptable. Anything that doesn't involve copying versions around over and over is fantastic.
Anything that doesn't involve copying versions around over and over is fantastic.
I think you're exaggerating the issue, but maybe I just fail to see how much of an issue this is due to my minimal presence in homebrew/core lately.
https://github.com/Homebrew/homebrew-core/blob/master/Formula/scala.rb#L29-L47
looks ludicrous.
@UniqMartin I also think this practice of adding independent versions to resource blocks, while technically supported by the DSL currently, is barely used if at all. When I saw that in the scala formula, it immediately struck me as a bug where a contributor thought that specifying those versions was the right thing to do but no one corrected him. And indeed that appears to be the case because in the original PR where they were added, it wasn't discussed at all: https://github.com/Homebrew/legacy-homebrew/pull/40210
looks ludicrous.
Looks reasonable to me. (But obviously that's just my opinion.) It's only two extra lines for the two additional resources. And more broadly speaking it's only happening for a small minority of formulae where the version isn't automatically detected from the URL. How many of them have both a version that cannot be auto-detected _and_ a notable number of resources?
I also think this practice of adding independent versions to resource blocks, while technically supported by the DSL currently, is barely used if at all.
Of course not, just like with formula versions as they are auto-detected in the majority of cases. (See also the first part of this comment.)
Of course not, just like with formula versions as they are auto-detected in the majority of cases.
For devel specs where this nesting occurs that is simply not true. A large portion of the devel specs either get the version wrong currently or get it right only by specifying the version explicitly.
What more happens is: we notice when formula version autodetection is wrong but not for resources. I'm all for reducing the amount of DSL contributors need to write a good formula when we can automate it trivially (which we can, in this case).
What more happens is: we notice when formula version autodetection is wrong but not for resources.
This.
@UniqMartin does this look acceptable to you? https://github.com/Homebrew/homebrew-core/blob/master/Formula/smlnj.rb#L16-L128
I'm happy to close the issue if you think that looks OK.
Acceptable? Yes. Could it be improved? Certainly! Does improving this necessitate a crusade against resource versions in general? I don't think so. But maybe I'm misunderstanding and you're just exaggerating a bit to get your point across …
I'm happy to close the issue if you think that looks OK.
I don't think you should be basing this decision on my opinion alone. I just tried to make clear why they exist and why they might be a good thing. I'm certainly not going to oppose any changes that improve the situation (unless they come with major drawbacks in other areas).
Acceptable? Yes.
Then it's fine with me.
But not with me 😀
An example where the formula version is detected correctly (from the :tag at the top) and the resource version is detected incorrectly from its URL and not corrected with an explicit version corresponding to the tag:
https://github.com/Homebrew/homebrew-core/blob/8ab838a376610dffb1b4c86ab34a88dd54f7cb08/Formula/gitlab-ci-multi-runner.rb#L27-L31
which means every version bump of the formula will result in a collision for Library/Caches/Homebrew/gitlab-ci-multi-runner-86.64.tar.gz.
I've read through this issue a couple of times to understand it. In summary:
So, how about we:
resource.version altogether#{formula.name}-#{formula.version}--#{resource.name}.ext. E.g. mpv-0.24--docutils.tar.gzhashes/718c0f5fb677be0f34b781e04241c4067cbd9327b66bdd8e763201130f5175be.tar.gz -> mpv-0.24--docutils.tar.gz#{formula.name}--#{resource.name}-#{resource.version} and symlinking a hash as an initial migration step. E.g. hashes/718c0f5fb677be0f34b781e04241c4067cbd9327b66bdd8e763201130f5175be.tar.gz -> mpv--docutils-0.13.1.tar.gz.This approach solves the following problems:
Downsides:
mpv--docutils-0.13.1.tar.gz -> hashes/718c0f5fb677be0f34b781e04241c4067cbd9327b66bdd8e763201130f5175be.tar.gz,mpv-0.24--docutils.tar.gz -> hashes/718c0f5fb677be0f34b781e04241c4067cbd9327b66bdd8e763201130f5175be.tar.gz,mpv-0.25--docutils.tar.gz -> hashes/718c0f5fb677be0f34b781e04241c4067cbd9327b66bdd8e763201130f5175be.tar.gzI'm not sure which to go with here.
Does this seem reasonable? Am I missing anything in the approach? Happy to submit a PR if that makes it easier to see.
(and then I find #568 which goes into this in much more detail :D)
Remove resource.version altogether
👍
Store the file as: #{formula.name}-#{formula.version}--#{resource.name}.ext. E.g. mpv-0.24--docutils.tar.gz
As mentioned in https://github.com/Homebrew/brew/issues/568 I think it's worth storing in subdirectories by formulae if/when this changes.
Create a symlink named using the hash of file pointing to the cached file. E.g.
This makes me 😭 a bit. I'd much rather see a filename that contains a hash (partial if necessary) instead.
Maintain backwards compatibility by leaving the files as is
I don't think we need backwards compatibility on file naming.
I agree. #568 subdirectories is entirely better than what I said.
I'm pretty sure once #568 is in, this issue becomes moot. Would it be reasonable to close this issue and add "remove resource.version" to #568?
@joshka I'm game for that if @ilovezfs is?
If there's a PR for #568, maybe, but until then I don't think that makes sense.
Passing on this as cache naming should be considered an implementation detail and I'm not convinced most users care about this.
@MikeMcQuaid
Given this thread, should I care if brew audit --strict --online gives me a warning about an incorrectly detected resource version? If I add a version number in the resource block of the formula, for some reason the archive is not extracted by .install but just copied (Should I open a new issue about this?).
Given this thread, should I care if brew audit --strict --online gives me a warning about an incorrectly detected resource version?
Not really, no.
If I add a version number in the resource block of the formula, for some reason the archive is not extracted by .install but just copied (Should I open a new issue about this?).
Yes, please.
If I add a version number in the resource block of the formula, for some reason the archive is not >extracted by .install but just copied (Should I open a new issue about this?).
Yes, please.
It looks like you guys fixed it in one of the latest commits, running brew update made the issue disappear.
@albertosottile Glad to hear it, thanks for letting us know!
Most helpful comment
But not with me 😀