Go: flag: panic if flag name begins with - or contains =

Created on 5 Oct 2020  Â·  27Comments  Â·  Source: golang/go

Proposal

Currently, the flag package allows any string to be used as the name of the flag. It only validates whether the name of the flag is already registered. (code)

But, if the name of the flag is (1) empty, (2) starts with a hyphen, or (3) contains an equal sign, the flag can not be parsed properly. If you try to use the flag as described in the usage message, it will fail. (code)

So, I suggest that the flag package validates the name of the flag.

+ https://github.com/golang/go/issues/41792#issuecomment-704401130

And these are invalid flag names:

  • Empty String:

    • If the flag name is an empty string and is used in the -flag x form, the flag starting with hyphen('-''') can be the non-flag argument and the flag starting with double hyphen('--''') can be the terminator.
    • > Flag parsing stops just before the first non-flag argument("-" is a non-flag argument) or after the terminator "--".
  • Starting with hyphen:

    • According to the document the flag should start with '-' or '--'. But, if the flag name starts with hyphen, this package can’t distinguish between '-''-flag' and '--''flag'.
  • Containing equal sign:

    • This package uses equal sign to split the flag into a name-value pair. So, if the flag name contains equal sign, this package can't split it properly.
    • e.g. If the flag name is 'foo=bar' and the flag is '-foo=bar=value', this package can't find out the actual value.

Example

playgroud

package main

import (
    "flag"
    "fmt"
)

func main() {
    fs := flag.NewFlagSet("", flag.ExitOnError)

    emptyString := fs.String("", "", "empty string")
    hyphen := fs.String("-hyphen", "", "hyphen")
    equalSign := fs.String("=equalsign", "", "equal sign")

    fs.PrintDefaults()
    // Output:
    //  - string
    //        empty string
    //  --hyphen string
    //        hyphen
    //  -=equalsign string
    //        equal sign

    _ = fs.Parse([]string{"-=foobar"}) // bad flag syntax: -=foobar
    // _ = fs.Parse([]string{"--hyphen=foobar"}) // flag provided but not defined: -hyphen
    // _ = fs.Parse([]string{"-=equalsign=foobar"}) // bad flag syntax: -=equalsign=foobar

    fmt.Println(*emptyString, *hyphen, *equalSign)
}
Proposal Proposal-Accepted Proposal-FinalCommentPeriod

Most helpful comment

OK, retitled to be about panicking for names beginning with - or containing = signs.
Does anyone object to that?

All 27 comments

Since we have to check the error at flag.Parse time, I just want to point out that this could potentially break some currently working programs. Admittedly those programs are doing something odd.

@ianlancetaylor
Besides validating at flag.Parse time, I think we have other options.

  1. Validating at flag.Var time: Since the flags are defined at this point, I think this is the best time to validate. But, there is no way to report errors except for panic.
  2. Validating at flag.Parse time: Since the flags are parsed at this point, It’s not more appropriate than the method above, but it’s still a pretty good time to validate. And it also can handle errors according to the error handling method received through flag.NewFlagSet.
  3. Create flag.Validate function, and make validation optional: I don’t think it’s a great way, but it’s feasible.


IMHO, as long as there are programs that are using this package incorecctly, some breaks are inevitable. We will have to choose between future safety and backward compatibility.

Nearly all flag names are constants, and most are even literals. Would a vet check make sense?

vet checks are supposed to meet three requirements: correctness, frequency, and precision. I think correctness and precision is not an issue here, but this doesn't seem like a frequent problem.

First we'd need to know what a valid flag name is or is not.

This seems like a solution looking for a problem.

People also use the flag package in interesting ways that never call Parse (like using vars as an easy way to go from string to any value, I have seen this multiple times in configuration systems). In these cases the name is essentially irrelevant, often either the empty string (used when the flagset is irrelevant) or some other arbitrary string based on the property name (when multiple vars are added to a single flagset), either of which can break the naming rules suggested at both compile and run time without being an error. For this reason checking at Var time or with a vet check both seem like a bad breaking change to me.

Name rules only apply to Parse, so that should be the earliest the check can happen. You would still need a more complete description of exactly what the rules are though, which presumably would be based on the actual behavior of Parse itself, any var it can never assign could probably produce an error.
Even this is still a dangerous breaking change, if there is some library out there registering a global flag with a bad name it could break everybody that depends (even indirectly) on that library with no easy workaround. Which means you probably have to add an option to enable/disable the check at the least for those cases, and now we are getting into the weeds of increasing API surface and complexity for questionable gains.
The only things it would catch are flags that nobody has ever even tried to use, which seem like a fairly low priority problem to fix.

My conclusion is if we were starting from nothing I would agree with the checks being added to Parse, but from where we are right now it does not seem like an overall beneficial change to do anything at all.

First we'd need to know what a valid flag name is or is not.

This seems like a solution looking for a problem.

@robpike
IMHO, if the flag name does not cause any problems in all forms listed in the document, it’s a valid flag name. Otherwise, it's an invalid flag name.

And these are invalid flag names:

  • Empty String:

    • If the flag name is an empty string and is used in the -flag x form, the flag starting with hyphen('-''') can be the non-flag argument and the flag starting with double hyphen('--''') can be the terminator.

    • > Flag parsing stops just before the first non-flag argument("-" is a non-flag argument) or after the terminator "--".

  • Starting with hyphen:

    • According to the document the flag should start with '-' or '--'. But, if the flag name starts with hyphen, this package can’t distinguish between '-''-flag' and '--''flag'.

  • Containing equal sign:

    • This package uses equal sign to split the flag into a name-value pair. So, if the flag name contains equal sign, this package can't split it properly.

    • e.g. If the flag name is 'foo=bar' and the flag is '-foo=bar=value', this package can't find out the actual value.

flag.FlagSet.Var, which ends up being what registers any flag no matter what you use to create the flag, already panics if the same flag name is registered multiple times. The name passed to the second call is invalid because it is already registered, so the call panics with a "flag redefined" message.

I doubt it would come up much, but it doesn't seem like a problem to me for the call to also panic if the name is (1) empty, (2) starts with a hyphen, or (3) contains an equal sign. Those are all clear mistakes.

Is there any code we can point to where this mistake was made? It seems like the only case where you could make the mistake and not notice would be some system that sets a bunch of flags automatically and the source strings happened to be invalid. Are there real world examples of that?

I too am interested in how this comes up in practice. @KimMachineGun?

@carlmjohnson @rsc
I've not seen code that defines flags in this way. I just found this problem while writing test code for my library flago and thought it was a bug that had to be fixed. (since it cannot satisfy the specification in the document!)

+ In my opinion, the behavior of the code and the document (and specification) should always be in sync, especially open source.

The empty string seems unlikely, but I could see someone unfamiliar with how the flag package works trying to use flag.Bool("-b") to add a -b flag. A panic with a clear message would be better than silently installing an inaccessible flag.

Does anyone object to adding this panic (like duplicate registrations currently panic)?

As I mentioned, https://play.golang.org/p/wcv-qYOIGst is a perfectly valid and working program that this would break, recreated from something I am sure I saw in the wild but can no longer remember where.
I don't think this would be an okay breaking change, I would be happy with a panic added to Parse.

Since this problem is caused by 'defining' a flag with an invalid flag name, panicking at the time of flag 'definition' is more appropriate than panicking at the time of flag parsing. And defining a flag with an invalid flag name is entirely the fault of the author of the program, not the user of the program. But, the error that occur during flag parsing seems like the user's fault.

@ianthehat
I understand your concerns. But, IMHO, solving the problem in a proper way is more important than keeping compatibility for the programs that use the flag package in a strange way.

@ianthehat I don't have much sympathy for a program that creates a flag named "".
But I also can't tell what your opinion is: the two halves of this line seem to say opposite things:

I don't think this would be an okay breaking change, I would be happy with a panic added to Parse.

Why not? It was reasonable as a way to get a flag.Value for things that take that as a parameter (notably things that do config file handling, a flag.Value does not have a name). I agree it is trivial to work around (pick any non empty string in it's place), but it was not against the documented spirit of the existing API so changing it is a breaking change for any code that does that right now.

My opinion is:
â›” : adding a panic to FlagSet.Var is not okay, a breaking change of questionable benefit
🆗 : adding a panic to FlagSet.Parse is fine, an unnamed flag is a hidden problem

@andybons says safehtml does this (takes a *flag.Value as a parameter), for what it's worth. Still seems weird.

@andybons says safehtml does this (takes a *flag.Value as a parameter), for what it's worth. Still seems weird.

Specifically: https://pkg.go.dev/github.com/google/safehtml/template#TrustedSourceFromFlag

Another option is to make flag.Bool etc panic but not if you're using a custom FlagSet.
That would catch essentially all the easy cases without affecting "advanced" uses.

FWIW I grepped my Go corpus for flag.(Bool|Int|String...)\("" and turned up exactly one instance of this mistake, in my own code:

github.com/rsc/[email protected]/c2go/main.go:28:  strip   = flag.String("", "", "strip from input paths when writing in output directory")

Apparently I never used that flag!

I would be fine with any call that directly ends up in flag.CommandLine.Var causing a panic.
You could make it an option on a FlagSet that is on by default for that FlagSet and off by default for ones returned by NewFlagSet
The other upside of this is that if there is a library that declares an empty flag in your dependency chain you you now have a viable recourse to still use it while you wait for it to be fixed (flip the value on the main flagset before that initializer is reached)

@ianthehat, I am still not convinced about this empty string case, but maybe we should set that aside.
I don't really want to change behavior based on _which_ set it is.
What about letting "" go through as a way to "comment out" a flag or bypass (sketchily) safehtml's API,
but panic if the flag begins with - or contains =?
Those are definitely wrong and would be confusing.
The thing I'm most worried about here is a new Go programmer debugging why flag.Bool("-b", ...) doesn't seem to be registering -b.
I think we have a clear opportunity to be helpful in that case.

I would be more okay with that, I think the empty string is the only truly special case, all other uses I have seen that were not using parse were still (I think) using valid flag names (sometimes by accident, because they were using things like valid field names for instance)

OK, retitled to be about panicking for names beginning with - or containing = signs.
Does anyone object to that?

Based on the discussion above, this (panic on names beginning with - or containing =) seems like a likely accept.

Change https://golang.org/cl/271788 mentions this issue: flag: panic if flag name begins with - or contains =

No change in consensus, so accepted (for Go 1.17).

Was this page helpful?
0 / 5 - 0 ratings