After a discussion on the Stack issue tracker, it was revealed that the Cabal flag name requirements are rather lax, and currently seem to allow both sequential separators (- and _), as well as leading and trailing separators. For example, this command line works just fine:
cabal configure -f -totally--positive__
NOTE: this is using:
cabal-install version 1.24.0.2
compiled using version 1.24.2.0 of the Cabal library
However, it's unclear how to set this from the command line to either true or false. If I add this line on its own to my cabal file:
flag -totally--positive__
I see the following in my shell session:
$ cabal configure -f -totally--positive__
Resolving dependencies...
Configuring foo-0.1.0.0...
cabal: '--exact-configuration' was given, but the following flags were not
specified: FlagName "-totally--positive__"
$ cabal configure -f +-totally--positive__
Resolving dependencies...
Configuring foo-0.1.0.0...
cabal: '--exact-configuration' was given, but the following flags were not
specified: FlagName "-totally--positive__"
$ cabal configure -f --totally--positive__
Resolving dependencies...
Configuring foo-0.1.0.0...
As you can see, ultimately the double-leading-dash was able to disable the flag, but no combination of + and - I tried enabled the flag.
My recommendation: strengthen to parser to behave like the package name parser, disallowing leading, trailing, or sequential separators. Next best thing: remove leading separators.
There is no reason to disallow sequential separators, but we will disallow (because it's confusing) and discourage non ascii characters (cabal check), i.e. effectively making flag parser [a-z0-9_][a-z0-9_-]*.
Note: flag names are case-insensitive.
"There is no reason" is not a true statement. It may be your opinion, but there are certainly reasons. One reason: the disparity between flag and package name syntaxes is confusing. You may think such confusion is warranted, but please don't imply that I have no reasons for making my request. Thank you!
I understand that it would be more elegant to have identifier parser and derive package, flag, executable, test-suite and other name parser from it, but I'm quite sure we cannot retrofit that anymore.
I'd turn this issue around, the package name is special, others not so:
foo-0 would look like package identifier (i.e. name + version)But there's no reason to disallow writing, it might be silly, but someone might want to do so.
executable funny--0
main-is: Main.hs
...
We can discourage*double dash in flag names, via cabal check, but I personally don't see that as bad as non-ascii and leading dash. (Which I check in #4687)
@phadej I'm having a difficult time discussing this with you, as you are repeatedly telling me that there's no reason for this, despite the fact that I've told you otherwise. If you're telling me that, in fact, this decision is not open for discussion and you are making a call without allowing my to have any input, then please clarify. Stating that my reasons do not exist is not appropriate.
Is there any technical reason to disallow flag--names. Yes, it's aesthetically unpleasant, and might confuse tool implementors. First one is purely subjective reason, and the second one should be addressed with clear spec and proper library support, whatever the syntax is.
Background: The issue with stack happened, because there weren't any FlagName format specification. The current implementation is confusing (I don't know where to look), e.g. https://github.com/haskell/cabal/blob/267efc85ebe52cb04ba0a4513cb28cfd18bd7d92/Cabal/Distribution/Types/GenericPackageDescription.hs#L137-L153 implies that it's [a-z0-9_-]+, disallowing leading - (check). I'm fixing this as part of #4654, by writing current behavior in the spec.
Library support: I'm working on parsec based Cabal parser, and in Cabal-2.2 there will be a parsec parser for FlagName, which stack (and other tools) could reuse.
To be clear: -totally--positive__ will be invalid soon, but totally--positive__ would be still ok.. Trailing or sequential dashes doesn't trigger any cli usage issues, only leading ones.
Fwiw, I don't even see much of a big deal either regarding leading hyphens (and that's though it was me who originally brought up the suggestion to discourage leading hyphens in flag names).
My rationale for suggesting to discourage the use of leading hyphens in flag names is not so much a technical one (there's always a way to non-ambiguously express +/- based flag constraints in the presence of leading hyphens; it's no big deal to make the grammar definitive here), but rather one of usability. But one can at least argue there's a modest bit of inconvenience where the non explicitly +/- tagged forms of the flag toggling syntax are allowed, as well as possibly some legacy compat concerns.
But as for sequences of dashes or underscores or a flag name made up of a sequence of as, such as aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa, we'd end up having to debate where to draw the line regarding a flag naming style-guide. As much fun as that may sound to some of us, I don't want to go down that road, and I'd only consider actual technical reasons (which don't involve YAML) for things beyond discouraging leading hyphens.
Is there any technical reason to disallow flag--names
Given that my reasons have been rejected until now, I'm not sure what I can say that will pass your bar of "technical reason." So let me turn it around: you're currently tightening the restrictions on flag names; what technical reasons do you have for that?
My recommendation: strengthen to parser to behave like the package name parser, disallowing leading, trailing, or sequential separators.
FTR, I'm okay with this, but we should check that there are no .cabal files on Hackage that we no longer can parse due to this change. If it is the case, then we should reject old flag name syntax only for cabal-version: >= 2.2 files, and add a warning for older ones.
@snoyberg you propose tightening the restrictions on flag names, not we.
Cabal check warning about unicode is indeed subjective (And I could actually add it for package names as well).
Leading dash is exactly what you proposed.
Quoting you above:
There is no reason to disallow sequential separators, but we will disallow (because it's confusing) and discourage non ascii characters (cabal check), i.e. effectively making flag parser
[a-z0-9_][a-z0-9_-]*.
Perhaps I'm misunderstanding you, because that comment stated you wanted to disallow non ascii characters. If you mean that you just want to add a warning for that, that's fine. If your argument is that banning previously valid flag names breaks backwards compatibility (not explicitly stated, but somewhat implied above), that's a discussion to have. As I see it:
- in https://github.com/commercialhaskell/stack/issues/3345#issuecomment-322148628cabal checkcabal check (which your comments imply strongly there is), then at the very least I would recommend cabal check have a warning.For the whole time I was talking about cabal check warning. I do mention cabal check, I don't imply changing parser.
.cabal anyway.constraints will work), there is no technical restriction to be lax.~ There is no proper reason to start allowing leading-dash flags.cabal check will cause warning if flag starts with a dash, so Hackage won't accept packages with such flags. (AFAIK there isn't any such packages).-lead-dash (and unicode) flags in their private projects, if they like so.~4687 shows that removing leading dash is a bit problematic, as it breaks things a bit.
Can you please briefly summarise what the problems are here?
EDIT: I was confused, you can declare flag -foobar, but you cannot really use it:
% cat lead-dash.cabal
name: lead-dash
version: 0.1.0.0
synopsis: Lorem
description: Lorem Ipsum
license: BSD3
license-file: LICENSE
author: Oleg Grenrus
maintainer: [email protected]
category: Testing
build-type: Simple
extra-source-files: ChangeLog.md
cabal-version: >=1.10
flag -foobar
manual: True
default: False
library
exposed-modules: LeadDash
build-depends: base >=4.9 && <4.11
hs-source-dirs: src
default-language: Haskell2010
if flag(-foobar):
build-depends: base >=4.10
[FL973] ~/mess/lead-dash % /opt/cabal/1.24/bin/cabal configure -f+-foobar
Warning: The package list for 'hackage.haskell.org' does not exist. Run 'cabal
update' to download it.
cabal: lead-dash.cabal:24: Parse of field 'if' failed.
Edited my previous~2 comment.
Hackage will be strict, but users can have
-lead-dash(and unicode) flags in their private projects, if they like
Just chiming in to say that these differences between Hackage and Cabal are annoying to deal with. It means we end up with the actual standard (although there's no standard in this particular case) as well as the de facto standard on Hackage. This has happened before with the PVP: https://github.com/haskell/pvp/issues/4#issuecomment-269637656
@tfausak note, that this issue is partly invalid. You could declare flag -foo but you couldn't use it in the .cabal file.
The only PVP cabal check is about omitted upper bound on base, so please keep that argument out of this discussion.
I brought up the PVP issue because it's an example of the actual spec (PVP) not matching the spec in practice (Hackage). I dislike situations like that and would like to avoid it here.
Currently Cabal's flag parser matches [-_\p{Letter}|\p{Number}]+. The proposed solution to this issue, #4687, tightens that parser but only for cabal check. That means the Cabal flag "spec", such as it is, allows flags that Hackage rejects.
I think this is a problem because it would be reasonable for, say, Stack to implement a Cabal flag parser that matches the de facto Hackage spec. But then a user could try to make a flag like --⓪__א__⓪-- in a private/local package, see that it fails in Stack, and complain.
Please elaborate how Hackage practices and PVP disagree?
Sent from my iPhone
On 14 Aug 2017, at 18.46, Taylor Fausak notifications@github.com wrote:
I brought up the PVP issue because it's an example of the actual spec (PVP) not matching the spec in practice (Hackage). I dislike situations like that and would like to avoid it here.
Currently Cabal's flag parser matches [-_\p{Letter}|\p{Number}]+. The proposed solution to this issue, #4687, tightens that parser but only for cabal check. That means the Cabal flag "spec", such as it is, allows flags that Hackage rejects.
I think this is a problem because it would be reasonable for, say, Stack to implement a Cabal flag parser that matches the de facto Hackage spec. But then a user could try to make a flag like --⓪__א__⓪-- in a private/local package, see that it fails in Stack, and complain.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.
So, to summarise:
-- and __) will still be allowed because there are packages on Hackage using this syntax (example). I don't think that removing support for them (which will have to be conditional on cabal-version, as mentioned above) buys us much, but I'm open to be persuaded.From the linked issue: The PVP allows package-1 and package-1.0 to co-exist. Hackage disallows those packages because the two versions are confusing to use in bounds. Specifically, package >= 1 and package >= 1.0 don't mean the same thing.
For this issue: Cabal allows flag -foo. Hackage would disallow that via cabal check.
Yeah, I've long wanted to fix the Eq instance for Version, but that didn't happen because reasons.
Only two packages have flags that run afoul of @snoyberg's proposed new rules:
bytestring--lt-0_10_4, which has two hyphens in a row.abbrev_wn_and_tn_concrete_syntax_to_number_alone__safe_only_to_depth_19 and abbrev_wn_and_tn_concrete_syntax_to_single_digit__can_only_express_down_to_depth_9, which have two underscores in a row. (And at 71 and 82 characters respectively, these are the longest flags.) No flag has any non-ASCII characters. In other words, every flag matches ^[-_a-z0-9]+$. (The Cabal parser converts flags to lowercase, so I can't say anything about the cases of flags in general. Regardless, everything would match that regex in a case-insensitive manner.)
I got this information from this script: https://gist.github.com/tfausak/009829dc73386b0779a822b4137cba18/3eb1319662f3c214c6c75057a603757347c81543
And the reasons are in https://github.com/haskell/cabal/pull/3515#issuecomment-231714012.
FWIW, if @tfausak wants to fix PVP, you should open a PR for https://github.com/haskell/pvp about what precisely _B MUST be greater_ means, when Bold is absent. I did open the original issue (https://github.com/haskell/pvp/issues/4) but I lost steam to proceed.
I fully understand the PVP problem already. I only brought it up as an example of the actual spec (PVP) not matching the de facto spec (Hackage). I don't want to derail this thread; let's stay focused on Cabal flags.
@tfausak Thanks! So we can remove support for trailing separators as well.
@hvr Can you please elaborate a little bit here about your vague future idea for using leading underscores for something, so that we could decide on whether to leave them in or remove them.
@23Skidoo To be honest, I don't see any benefit of arbitrarily limiting the lexical syntax to begin with. Usually the burden of proof lies with the party that proposes the change, and I haven't see any compelling reason to impose this arbitrary limitation so far. In fact, any arbitrary exception we add makes the regexp only more complex. Initially the regexp was assumed to be [[:alnum:]_-]+ which was elegant and simple. The only concession I have suggested and which I considered sensible is the exception to disallow leading - for the sake of CLI usability and compat concerns (as it turned out, leading - have compat issues anyway). So that leaves us with [[:alnum:]_][[:alnum:]_-]*. Every other suggestion I've heard so far has no practical benefit whatsoever; it surely doesn't facilitate parser implementations, nor does would it provide any performance benefits, nor does it help to make grammars less ambiguous. If we arbitrarily disallow trailing [_-], why not also disallow trailing numbers, or maybe while we're at it, also disallow certain characters. So please excuse me if this all sounds rather absurd to me, and most importantly since we're considering arbitrarily disallowing some trailing characters which doesn't solve any actual technical problem to begin with, but in the contrary even complicates the implementation and its specification, which ironically makes this even more confusing for users to understand what are considered valid flag names. I already regret having even mentioned the suggestion to disallow leading hyphens, given how this escalated. TLDR: Let's just agree on [[:alnum:]_][[:alnum:]_-]* and move on to another topic.
It sounds like this ship has already sailed, but I'll say it again: I think that having the actual spec (the Cabal file format) differ from the spec in practice (Hackage) is a bad idea. Obviously there's some benefit in restricting flag names, otherwise #4687 wouldn't exist. Why not codify that in the spec rather than allowing it locally but rejecting it on upload?
I also disagree with this appeal to regex simplicity. Why not match tags with \S+? What could be simpler?
To clarify, I don't feel very strongly about this stuff, and am fine with keeping [[:alnum:]_][[:alnum:]_-]*, but if someone wrote a patch tightening the allowed syntax, I wouldn't be opposed to merging it.
I also find @tfausak's point about keeping Hackage-accepted and Cabal-accepted formats as close as possible persuasive (though this is not always possible because of historical reasons, like with the Version type where we need to be able to distinguish foo-1 from foo-1.0 because there are already such packages on Hackage).
I see that @phadej merged #4687, which checks for invalid flags with cabal check but allows Cabal to parse them. Is there any way that I can either (a) appeal this decision and revert the changes to cabal check, or (b) encourage changing Cabal to outright reject flag names that cabal check would warn about?
Regardless, it seems like this issue should be closed as wontfix, based on the comments so far.
Meta: this issue is Leading dashes in flag names, so technically this issue is fixed.
Every other change would benefit from being a separate issue.
But... it's not fixed. https://github.com/haskell/cabal/pull/4687#issuecomment-322213560:
-foobaris still allowed inGenericPackageDescriptionparser, but you cannot really use it.
This is the Cabal issue tracker, not the Hackage issue tracker.
encourage changing Cabal to outright reject flag names that
cabal checkwould warn about?
I'd accept a patch that did that.
@tfausak that comment is outdated. See https://github.com/haskell/cabal/blob/master/Cabal/Distribution/Parsec/Class.hs#L129-L134
@phadej: Hopefully you can forgive the error since that code came from an entirely separate PR (#4654, via 90b848a4dea40b5f034ea073922ae4a2d9ef9899) that I didn't know about.
@23Skidoo: I'll start working on a PR.
@tfausak, yes, sometimes speed of Cabal development surprises us!
% cat lead-dash.cabal
name: lead-dash
version: 0.1.0.0
synopsis: Lorem
description: Lorem Ipsum
license: BSD3
license-file: LICENSE
author: Oleg Grenrus
maintainer: [email protected]
category: Testing
build-type: Simple
extra-source-files: ChangeLog.md
cabal-version: >=1.10
flag -foobar
manual: True
default: False
library
exposed-modules: LeadDash
build-depends: base >= 4.9 && < 4.11
hs-source-dirs: src
default-language: Haskell2010
% /home/ogre/Documents/other-haskell/cabal/dist-newstyle/build/x86_64-linux/ghc-8.0.2/cabal-install-2.1.0.0/build/cabal/cabal check
Warning: lead-dash.cabal:14:1:
unexpected "-"
expecting white space: "-foobar"
cabal: Failing parsing "./lead-dash.cabal".
btw, Could you add <?> "Flag name", that should improve error messages
OK, since the main issue here (leading dashes in flag names) has been resolved, I'm closing this as fixed; if anyone wants to also disallow leading underscores and trailing separators, please create a PR and we'll discuss it there.
Most helpful comment
It sounds like this ship has already sailed, but I'll say it again: I think that having the actual spec (the Cabal file format) differ from the spec in practice (Hackage) is a bad idea. Obviously there's some benefit in restricting flag names, otherwise #4687 wouldn't exist. Why not codify that in the spec rather than allowing it locally but rejecting it on upload?
I also disagree with this appeal to regex simplicity. Why not match tags with
\S+? What could be simpler?