My understanding is that flag names are supposed to be case-less, i.e. they are normalized to all-lowercase in the Parsec instance of FlagName. This feels like an awkward place to do this, however. Why doesn't that conversion take place in mkFlagName? As it is now, a construct like
mkFlagAssignment [(mkFlagName "foo", True), (mkFlagName "FOO", True), (mkFlagName "Foo", False)]
gives
fromList [(FlagName "FOO",(1,True)),(FlagName "Foo",(1,False)),(FlagName "foo",(1,True))]
instead of:
fromList [(FlagName "foo",(3,False))]
That doesn't feel right?
cc @hvr
@hvr Ping.
Caseness is tricky (it's dependent on Unicode standard), the differences are very subtle. I'd rather make flags case-sensitive in cabal-version: 3.0
(Hackage doesn't accept non-ascii flags, but that's its choice)
EDIT case-sensitive, not insensitve in 3.0
I've extended the test suite in https://github.com/haskell/cabal/pull/5082 to verify that flag names are case-less.
Personally, I believe flag names should be case-less (and they should be constrained to plain ASCII, i.e. no Unicode). A change that behavior now or in the future would be a rather nasty surprise to everyone who's relied on the current behavior in their scripts and Cabal config files.
@phadej I'd limit case-insensitivity to the ASCII subset; but everything outside the ASCII subset ought to be case-sensitive; DNS case-folding does something like that as well. However, I'd rather make flags completely case-sensitive in cabal-version:3.0 to make things more consistent with the case-sensitivity of package names, and eschew the subtle issues (e.g. broken Leibniz equality expectations) of case-folding altogether.
Why are we not doing the obvious here and limiting the flag names to ASCII? (Plus, case folding)
Is there any plausible use case for non-ASCII flags, especially given that Hackage doesn't accept them per @phadej ?
Yes, it was discussed before: https://github.com/haskell/cabal/pull/4696#issuecomment-322504299.
I don't see much of an argument there, I'm afraid.
It's just "I want to be able to do X". There doesn't seem to be any reasoning as to why the restriction would be a problem. Yes, I understand that some programming communities may want to prefer being able to use whatever characters they want, but I can't see it would actually be a problem to just say "sorry, not supported".
This seems to me to letting perfect be the enemy of good and already seems to have caused quite a bit of frustration; see e.g. https://github.com/haskell/cabal/pull/5082
Oh well. (I'm not the one who's frustrated by this, but it seems pretty absurd on the face of it to drag our collective feet for such an incredibly marginal feature where simply saying "case insensitive ascii" would suffice.)
Personally, I think the philosophizing about the design space for Cabal 3.0 is great and I am all for it. Let's go full Unicode everywhere. Looks good to me!
What I don't understand is why any of this stands in the way of fixing issues that exist today. That frustrated me. I care more about what Cabal 2.2.x is doing than I care about some other major release that may or may not come out 3 years down the road. I too think that this conversation is a pretty good example of letting perfect be the enemy of good.
Indeed, I don't see any problem with thinking ahead, but holding up working code/fixes because we might do a theoretically better version in the future is a Bad Idea(TM). If a fix needs to be reverted later, fine let's do that! It's usually a complete non-issue in any sane development process to do that. Of course one shouldn't be completely reckless in accepting anything and everything (nor if there are concrete plans for a different approach), but I mean...
I also see this type of thing being an increasingly problematic part of the Cabal development process, regardless of what particular feature we're talking about. (Hence why I brought it up.)
I confess that in this case I don't see the argument. This isn't a proposal to change or fix the behavior of cabal -- it is a proposal for a refactor as far as I understand it? If this is correct, then we should judge refactors not by what they fix "today" (because definitionally they leave behavior unchanged) but rather how they open the way or close the way to future changes and extensions. As such, speculation about future design decisions seems very much on topic?
If I'm mistaken, and there's some wrong behavior that we would like to fix sooner (and not just awkwardness in the implementation of behavior we'd like to leave intact) then please correct me.
If I'm mistaken, and there's some wrong behavior that we would like to fix sooner (and not just awkwardness in the implementation of behavior we'd like to leave intact) then please correct me.
The wrong behavior I wanted to correct is that users who parse a FlagName from String get a value that compares independent of case against other FlagNames, because the parsers normalize the flag to all lower-case. Users of the Cabal library who construct a FlagName with the mkFlagName function, however, will get a value that may contain upper-case characters because the constructor does not normalize the name.
Ah, understood! So the issue is that different Cabal specs may enforce different normalization of things like flagnames, but we want them to share the same constructors (including smart constructors). At the risk of more work, perhaps the latter assumption is what's at issue -- i.e. "smart" constructors should be, as a rule, smart. And if smart constructors can't be the same smart across different specs, then we should have a smart constructor _per spec_ in such a case?
Yes, that is exactly what I mean by letting the "perfect be the enemy of good". We cannot fix an obvious inconsistency in the API today, because we'd rather discuss a grand re-organization of the API that will materialize a decade from now, if ever, and meanwhile people who readGenericPackageDescription a Cabal file from disk and wonder whether if defines mkFlagName "InstallExamples" will get False as an answer no matter what they try even though when they look at the file it clearly does define that flag. Duh!
@peti I think you misunderstood my comment. I was suggesting a criteria under which fixing this today (to make smart constructors actually smart) would be correct, and that the way to deal with the concern if we change the spec for 3.0 would be to _at that point_ have a smart-constructor _per spec_.
I.e. I was trying to propose and approach and criterion for correctness under which making the change now makes sense, and also under which the change now would _continue_ to have made sense, even if the spec changes (because at that point we would keep the existing smart constructor, perhaps renamed, and _also_ introduce a new one).
I think #5082 should be accepted if it's changed to act only on ASCII flags, since that IMO was the main objection from @hvr and @phadej (if we forget about future spec changes for now).
@gbaz, oh, I guess I misunderstood your comment, indeed. I am sorry.