This is actually a summary of https://github.com/TeXitoi/structopt/issues/364
Current behavior:
- Vec<T> and Option<Vec<T>> are not required = true by default.
- Vec<T> is multiple = true by default, which allows not only multiple values (--foo 1 2 3) but also multiple occurrences (--foo 1 --foo 2 3).
- Option<Vec<T>> additionally allows zero values (--foo).

What's wrong:
- The fact that Vec<T> is not required by default is inconsistent with all the other types in clap_derive that are required by default unless they are wrapped in Option (except bool but it's a very special case).
- The fact that Option<Vec<T>> is different from Vec<T>, different not in "not required" sense, confuses newcomers.
- The fact that Vec<T> allows multiple occurrences along with values is misleading.

Proposal:
- min_values = 1 for both Option<Vec<T>> and Vec<T> instead of multiple = true, allowing only a non-zero number of values and disallowing multiple occurrences (--foo 1 2 but not --foo nor --foo 1 --foo 2). If a user wants to allow zero values or multiple occurrences as well, they can explicitly specify it via min_values = 0 and multiple = true respectively.
- required = true for Vec<T>.

cc @TeXitoi @Dylan-DPC @pksunkara
> The fact that Vec<T> is not required by default is inconsistent with all the other types in clap_derive that are required by default unless they are wrapped in Option (except bool but it's a very special case).
I strongly disagree on this. A vector can have 0 elements, and that's the most used case. If you need at least one, you can add min_values(1) yourself.
If we use min_values(0), an empty vec would represent --foo <no values>, but "no --foo" would be disallowed due to required.
Option<Vec<T>> would represent just what it does currently:
command | Vec<T> result | Option<Vec<T>> result
-- | -- | --
app --foo 1 2 | [1, 2] | Some([1, 2])
app --foo | [] | Some([])
app | ERROR | None
The min_values and occurrences issue should be fixed in #1026. Let's leave it out of the discussion here.
> The fact that Option<Vec<T>> is different from Vec<T>, different not in "not required" sense, confuses newcomers.
Can you explain what this means? I don't understand the sentence.
I think we should have min_values = 0 and multiple = true for both but required = true for Vec<T>.
I really disagree. Most of the time you're using a Vec, you just want the list of parameters, and you don't really care if an empty parameter was provided. Forcing using an option of vec in the most common case would impact a lot the ergonomics.
Here is a use case where I find structopt very confusing: https://github.com/paritytech/substrate/pull/5677/commits/6b0eed4b7df70a1c88c5e17df96064c122ca8671
When I read the struct I see that listen_addr is a required argument and port is optional. But the structopt parameters show me that both arguments are conflicting. Before that commit, the parameters weren't declared as conflicting and I had no way to know that listen_addr was actually optional. The only way I could know that is by reading the table.
I personally do think that Vec<_> should mean required (the argument must be provided) and Option<Vec<_>> should mean optional (the argument can be omitted).
I invite you to read the lengthy discussion here.
> Forcing using an option of vec in the most common case would impact a lot the ergonomics.
Maybe but it will be consistent with the rest of the API.
Pff, it took me some time to get back to this, sorry for the delays guys.
> The min_values and occurrences issue should be fixed in #1026. Let's leave it out of the discussion here.
That issue is related here indeed, but we can't leave it out of this discussion because this is about defaults in derive, not about distinguishing between the two.
> Forcing using an option of vec in the most common case would impact a lot the ergonomics.
@TeXitoi I see your point about Option<Vec<T>> vs Vec<T> ergonomics. I agree with you, simple Vec is much more handy than Option<Vec<T>> to work with.
But we wouldn't be forcing anything! Users would still be very much able to do #[clap(required = false)] if they want to keep a raw Vec:

```rust
struct Opts {
    // This Vec is not an Option, so it's required BY DEFAULT
    // because it's not bool and not Option
    required_vec: Vec<u32>,

    // We can override the default behavior
    // while keeping the raw Vec
    #[clap(required = false)]
    not_required_vec: Vec<u32>,

    // We can use Option<Vec> instead of overriding.
    // We lose some of the ergonomics but acquire the
    // ability to distinguish between <empty list> and <no option at all>
    opt_vec: Option<Vec<u32>>,
}
```
In other words, this is a question of good defaults, and I believe I've found the answer.
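The rule proposed above (required unless the type is bool or Option, with an explicit attribute always winning) can be sketched as a tiny decision function. This is only a model of the proposal, not code from clap_derive; `FieldKind` and `is_required` are hypothetical names introduced for illustration.

```rust
/// Coarse classification of a field's declared type (hypothetical, for illustration).
#[derive(Debug, Clone, Copy, PartialEq)]
enum FieldKind {
    /// `bool` flags are never required.
    Bool,
    /// `Option<T>`, including `Option<Vec<T>>`.
    Optional,
    /// Everything else, including a raw `Vec<T>` under the proposal.
    Other,
}

/// Returns whether the argument would be required, given the field kind and
/// an optional explicit `required = ...` value from the attribute.
fn is_required(kind: FieldKind, explicit: Option<bool>) -> bool {
    // An explicit attribute always wins over the type-derived default.
    explicit.unwrap_or(kind == FieldKind::Other)
}

fn main() {
    // Mirrors the three fields of `Opts` above, plus a bool flag.
    println!("required_vec:     {}", is_required(FieldKind::Other, None));
    println!("not_required_vec: {}", is_required(FieldKind::Other, Some(false)));
    println!("opt_vec:          {}", is_required(FieldKind::Optional, None));
    println!("some bool flag:   {}", is_required(FieldKind::Bool, None));
}
```

The point of the sketch is that the override path and the type-derived default are independent, so choosing a stricter default does not take the raw Vec away from anyone.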
I think that our willingness to change defaults should depend on what the most widespread usage is in practice. I decided to check via grep.app and this has been eye opening for me:
(IMPORTANT: this research assumes that #[structopt] attributes are one-line, which is quite frequently not true. If somebody has an idea how to cover attributes like

```
#[structopt(
    method = name,
    method2 = expr, method3 = expr)]
```

with regexes, speak up!)
- required Vec or not.
  - required = true and field is Vec - 33 matches.
  - #[structopt(...)] field: Vec - 214 matches. Therefore, the number of not required vecs is 214 - 33 = 181.
- min_values = 0 vs min_values = 1 (behavior of multiple = true)
  - "min_values" - there are only 13 results. None of them are min_values = 0. A little bonus: almost all of them are min_values = 1 which is the default behavior, lol.
  - Conclusion: nobody wants min_values = 0. Making it default would be pointless.
- number_of_values - 11 results. max_values: 1 result.
- multiple = true - 11 matches.

multiple_values = true and ask the community after the beta is out.

> I think that our willingness to change defaults should depend on what the most widespread usage is in practice.
I don't agree with this. Maybe all of them tried Option<Vec<T>> first and then decided that it didn't make a difference so they optimised it back to Vec<T>.
Also, we are not technically breaking this because this is a separate library, clap, and not structopt. I would opt for making the field types easier to understand semantically and clearly distinct from each other, rather than for preserving existing usage.
min_values should always be 0. As you saw, people saw Vec and with semantics they thought min_values might have been zero and that is why they customised the behaviour to be min_values = 1. (Related to #1682)
Note: multiple means multiple occurrences not values according to clap docs
Type | required | multiple
-- | -- | --
Vec<T> | true | false
Option<Vec<T>> | false | false
Vec<Vec<T>> | true | true
Option<Vec<Vec<T>>> | false | true
command | Vec<T> | Option<Vec<T>> | Vec<Vec<T>> | Option<Vec<Vec<T>>>
-- | -- | -- | -- | --
app -f 1 2 | [1, 2] | Some([1, 2]) | [[1, 2]] | Some([[1, 2]])
app -f | [] | Some([]) | [[]] | Some([[]])
app | ERROR | None | ERROR | None
app -f 1 2 -f 3 4 | [3, 4] | Some([3, 4]) | [[1, 2], [3, 4]] | Some([[1, 2], [3, 4]])
app -ff | [] | Some([]) | [[], []] | Some([[], []])
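To make the first four rows of the table concrete, here is a minimal model in plain Rust. It assumes the parser has already grouped values per occurrence of -f; the function names are hypothetical and none of this is clap's actual implementation. The last row (app -ff) is left out because it depends on takes_value.

```rust
/// Values grouped per occurrence of `-f`, e.g. `-f 1 2 -f 3 4` -> [[1, 2], [3, 4]].
type Occurrences = Vec<Vec<u32>>;

/// `Option<Vec<Vec<T>>>`: None when absent, otherwise every occurrence is kept.
fn as_opt_vec_vec(occ: &[Vec<u32>]) -> Option<Vec<Vec<u32>>> {
    if occ.is_empty() {
        None
    } else {
        Some(occ.to_vec())
    }
}

/// `Option<Vec<T>>`: multiple = false, so only the last occurrence survives.
fn as_opt_vec(occ: &[Vec<u32>]) -> Option<Vec<u32>> {
    occ.last().cloned()
}

/// `Vec<Vec<T>>`: same as the Option variant, but absence is an error.
fn as_vec_vec(occ: &[Vec<u32>]) -> Result<Vec<Vec<u32>>, String> {
    as_opt_vec_vec(occ).ok_or_else(|| "missing required argument".to_string())
}

/// `Vec<T>`: required, and only the last occurrence survives.
fn as_vec(occ: &[Vec<u32>]) -> Result<Vec<u32>, String> {
    as_opt_vec(occ).ok_or_else(|| "missing required argument".to_string())
}

fn main() {
    let cases: Vec<(&str, Occurrences)> = vec![
        ("app -f 1 2", vec![vec![1, 2]]),
        ("app -f", vec![vec![]]),
        ("app", vec![]),
        ("app -f 1 2 -f 3 4", vec![vec![1, 2], vec![3, 4]]),
    ];
    for (cmd, occ) in &cases {
        println!(
            "{:<18} | {:?} | {:?} | {:?} | {:?}",
            cmd,
            as_vec(occ),
            as_opt_vec(occ),
            as_vec_vec(occ),
            as_opt_vec_vec(occ)
        );
    }
}
```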
By default, if one wants to support a -vvv kind of flag, they would have to use Option<Vec<Vec<bool>>>. But they always have the option of customizing stuff just by describing it using methods like how you already proposed.
```rust
#[clap(short, multiple = true, required = false, max_values = 0)]
verbose: Vec<bool>
```
> Maybe all of them tried Option<Vec<T>> first and then decided that it didn't make a difference so they optimised it back to Vec<T>.
Sounds very unlikely, but maybe. @TeXitoi , has anybody ever told you something like "I tried Option<Vec> first, but settled with Vec because it's handy"? I remember only one such message: https://github.com/TeXitoi/structopt/issues/285 .
> Also, we are not technically breaking this because this is a separate library clap and not structopt
I agree, absolutely. But we aren't talking just about breaking changes, we're talking about good defaults and whether it's worth to change them. If we find that the current defaults are not optimal, we shall change them, otherwise we shall leave them as is.
> As you saw, people saw Vec and with semantics they thought min_values might have been zero and that is why they customised the behaviour to be min_values = 1
Eleven (min_values = 1) people out of 214 (total number of Vecs discovered). Very few people have been confused, so min_values = 1 default is preferable for Vec and min_values = 0 is preferable for Option<Vec>. In my opinion.
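For a sense of proportion, the grep.app counts quoted in this thread can be turned into shares. The numbers are the ones reported above (214 Vec fields, 33 with required = true, 11 with min_values = 1), and they inherit the stated caveat that only one-line attributes were matched; `percent` is just a helper written for this sketch.

```rust
/// Percentage of `part` in `total`, rounded to one decimal place.
fn percent(part: u32, total: u32) -> f64 {
    (f64::from(part) / f64::from(total) * 1000.0).round() / 10.0
}

fn main() {
    let total_vecs = 214; // `#[structopt(...)] field: Vec` matches
    let required = 33; // of those, with `required = true`
    let min_values_one = 11; // fields with `min_values = 1`

    let not_required = total_vecs - required;
    println!(
        "not required: {} of {} ({}%)",
        not_required,
        total_vecs,
        percent(not_required, total_vecs)
    );
    println!("required = true: {}%", percent(required, total_vecs));
    println!("min_values = 1:  {}%", percent(min_values_one, total_vecs));
}
```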
I like your Vec<Vec<T>> idea. We need to fix https://github.com/clap-rs/clap/issues/1026 first though. Also, in the light of https://github.com/clap-rs/clap/issues/1682, we may also specialize Vec<[T; N]> (arrays) and Vec<(T1, T2, ...)> (tuples) along with the Option<...> variants of them.
> By default, if one wants to support a -vvv kind of flag, they would have to use Option<Vec<Vec<bool>>>
This is exactly what parse(from_occurrences) is for, isn't it?
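For readers unfamiliar with occurrence counting, the idea is simply to store how many times a flag appeared instead of collecting values. The sketch below models that idea in plain Rust; `count_occurrences` is a hypothetical helper and does not reproduce how clap or structopt implement parse(from_occurrences).

```rust
/// Counts how many times the short flag `flag` occurs in the arguments,
/// treating `-vvv` as three occurrences and `-v -v` as two.
fn count_occurrences(args: &[&str], flag: char) -> u64 {
    args.iter()
        // Skip long options such as `--verbose`.
        .filter(|arg| !arg.starts_with("--"))
        // Keep only short-flag clusters and drop the leading dash.
        .filter_map(|arg| arg.strip_prefix('-'))
        // Count the matching characters inside each cluster.
        .map(|cluster| cluster.chars().filter(|&c| c == flag).count() as u64)
        .sum()
}

fn main() {
    println!("{}", count_occurrences(&["-vvv"], 'v'));
    println!("{}", count_occurrences(&["-v", "input.txt", "-v"], 'v'));
    println!("{}", count_occurrences(&["--verbose"], 'v'));
}
```

With a counter like this the field can be a plain integer, so no nested Vec type is needed for the -vvv case.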
> Note: multiple means multiple occurrences not values according to clap docs

This part of the reply ended up being literally drenched in sarcasm, and therefore was moved to its own comment. I'm not blaming anyone, nor do I mean to offend anybody. Just stating that the state of affairs is ridiculous.
Sure thing. According to 2.x docs. Well, let's test it! Playground
```rust
extern crate clap;
use clap::{App, Arg};

fn main() {
    let m = App::new("app")
        .arg(Arg::with_name("arg").multiple(true).takes_value(true))
        .arg(Arg::with_name("opt").long("foo").multiple(true).takes_value(true))
        .get_matches_from(&["test", "val1", "val2", "--foo", "optv1", "optv2"]);
    println!("{:?}", m.values_of("arg").unwrap().collect::<Vec<_>>());
    println!("{:?}", m.values_of("opt").unwrap().collect::<Vec<_>>());
}
```
So, we expect it to error, right?
["val1", "val2"]
["optv1", "optv2"]
Wait, what?
In practice, I've seen _a lot_ of people taking multiple as synonym for "multiple values". And this is how it does work: multiple values + multiple occurrences. This is the way it's been working for, how long? Five years? People got used to it, it's in their DNA now. We can't just change the behavior (while preserving the name of the method) and expect it to be received well. And we didn't.
Well, the master is much better! Right? Right. You'd expect it to be documented properly...
> Specifies that the argument may have an unknown number of multiple values. Without any other
> settings, this argument may appear only *once*.
> For example, `--opt val1 val2` is allowed, but `--opt val1 val2 --opt val3` is not.
Of course. "May appear only once". Crystal clear. What does the function look like?

```rust
pub fn multiple(mut self, multi: bool) -> Self {
    if multi {
        self.setb(ArgSettings::MultipleOccurrences); // <- YOU MUST BE KIDDING
        self.setting(ArgSettings::MultipleValues)
    } else {
        self.unsetb(ArgSettings::MultipleOccurrences);
        self.unset_setting(ArgSettings::MultipleValues)
    }
}
```
Sarcasm aside, I very much like how it's done in master:
```rust
// multiple values + multiple occurrences
pub fn multiple(mut self, multi: bool) -> Self;
// multiple values only
pub fn multiple_values(self, multi: bool) -> Self;
// multiple occurrences only
pub fn multiple_occurrences(self, multi: bool) -> Self;
```
Well done. Well done indeed. No sarcasm, the API is clear and tidy. Except copy-pasting docs and examples from 2.x probably wasn't the brightest idea.
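The relationship between the three methods can be modeled with two independent booleans. This is a standalone sketch, not clap's Arg type: `ArgModel` is a hypothetical struct, and it only mirrors the fact, visible in the snippet above, that multiple toggles both settings while the other two methods toggle one each.

```rust
/// Hypothetical stand-in for the two settings discussed above.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct ArgModel {
    multiple_values: bool,
    multiple_occurrences: bool,
}

impl ArgModel {
    /// Multiple values only: `--opt a b`.
    fn multiple_values(mut self, on: bool) -> Self {
        self.multiple_values = on;
        self
    }

    /// Multiple occurrences only: `--opt a --opt b`.
    fn multiple_occurrences(mut self, on: bool) -> Self {
        self.multiple_occurrences = on;
        self
    }

    /// Both at once, which is what `multiple` does in the snippet above.
    fn multiple(self, on: bool) -> Self {
        self.multiple_values(on).multiple_occurrences(on)
    }
}

fn main() {
    println!("{:?}", ArgModel::default().multiple(true));
    println!("{:?}", ArgModel::default().multiple_values(true));
    println!("{:?}", ArgModel::default().multiple_occurrences(true));
}
```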
Option<Vec> is a very recent feature. No one asked anything about Option<Vec> for several years.
I've never seen any question about Option<Vec> except from @cecton as far as I remember. Just one bug report saying that Option<Vec> was a breaking change because someone was using Option<Vec<u8>>.
Also, the last line of @pksunkara 's table is impossible unless the user explicitly specifies takes_value = false. All of the variants should be takes_value = true by default (multiple_values implies it).
app -ff | [] | Some([]) | [[], []] | Some([[], []])
-- | -- | -- | -- | --
In that case, assume that I meant multiple_occurrences only when I said multiple in my earlier comment.
What do you mean by takes_value = true by default? Do all the types we mentioned above have that as default? Then doesn't it mean that app -f is not allowed too? Vec<T> would never be allowed 0 values.
> Do all the types we mentioned above have that as default?
In clap 2: it does so explicitly, this is documented:
> Setting multiple(true) for an option with no other details, allows multiple values and multiple occurrences because it isn't possible to have more occurrences than values for options
In clap 3: not explicitly, I haven't been able to locate the place this is handled in, but I just checked this program:
```rust
use clap::{App, Arg};

fn main() {
    let m = App::new("app")
        .arg(Arg::with_name("opt").long("foo").multiple_values(true))
        .get_matches();
    println!("{:?}", m.values_of("opt").map(|v| v.collect::<Vec<_>>()));
}
```
And yes, it works just like if takes_value(true) has been set:
```
D:\workspace\probe>cargo run --
    Finished dev [unoptimized + debuginfo] target(s) in 0.13s
     Running `target\debug\probe.exe`
None

D:\workspace\probe>cargo run -- --foo val1
    Finished dev [unoptimized + debuginfo] target(s) in 0.14s
     Running `target\debug\probe.exe --foo val1`
Some(["val1"])

D:\workspace\probe>cargo run -- --foo val1 val2
    Finished dev [unoptimized + debuginfo] target(s) in 0.13s
     Running `target\debug\probe.exe --foo val1 val2`
Some(["val1", "val2"])
```
app -fff from the table would have been parsed as (also checked with master)
```
app -f   ff
    --   --
    flag value
```
@kbknapp Since we have you here, please take a look at this issue.
Actually, if you wrap your head around the terminology a bit, it becomes quite clear:
- (String is not special)
- bool is an example of a special type, it's not required by default.
- Vec, Option, arrays, tuples and such can be considered "adapters" for other types. Precise behavior depends on the adapter and the type it wraps.

It's a problem of good documentation more than anything, about the way it conveys this setup to the user. I'm amending my previous statement about required with Vec, I see no problem with it anymore. Points about number_of_values and multiple_occurrences vs multiple_values still stand.
Whew ok that was a read!
Every time I see issues pop up around multiple I cringe a little inside. In clap 1 there was only the concept of multiple = true, and I only ever really cared about -f v v -f v v (i.e. both occurrences and values) for no other reason than having not put a lot of thought into it.
Then clap 2 came around and there was confusion when people wanted multiple values, but not multiple occurrences, or vice versa. It was also overloaded with flags, for which only occurrences are possible. This spawned things like number_of_values or min_values/max_values. Again, it was super confusing for the subset of people that had these various requirements, but most just adapted and moved on.
Looking back, I wish I'd made the sane default of multiple=true and no other info being -f v -f v is allowed, -f v v is not. Which is what I attempted to make the default in 3.x because this was by far the most common request, most people expected -f v -f v (minus people mostly familiar with IRC like CLIs which was a minority). It also no longer required the number_of_values=1, which in all honesty looks kind of confusing if you're not already familiar with clap... "Why am I saying I want multiple values, AND also saying number of values is 1?!"
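The distinction between the two kinds of "multiple" can be written down as a small validity check. This is a sketch of the semantics described in this comment, not clap's validator; `Limits` and `validate` are hypothetical names, and the input is assumed to be the number of values attached to each occurrence of -f.

```rust
/// Which kinds of repetition an option accepts (hypothetical model).
#[derive(Debug, Clone, Copy)]
struct Limits {
    multiple_occurrences: bool,
    multiple_values: bool,
}

/// `values_per_occurrence[i]` is the number of values given to the i-th `-f`.
fn validate(values_per_occurrence: &[usize], limits: Limits) -> Result<(), String> {
    if values_per_occurrence.len() > 1 && !limits.multiple_occurrences {
        return Err("-f may appear only once".to_string());
    }
    if values_per_occurrence.iter().any(|&n| n > 1) && !limits.multiple_values {
        return Err("-f takes a single value per occurrence".to_string());
    }
    Ok(())
}

fn main() {
    // The default described above: occurrences yes, several values per occurrence no.
    let occurrences_only = Limits {
        multiple_occurrences: true,
        multiple_values: false,
    };
    println!("-f v -f v => {:?}", validate(&[1, 1], occurrences_only));
    println!("-f v v    => {:?}", validate(&[2], occurrences_only));
}
```

Under this model the "occurrences only" default needs no number_of_values=1 to reject -f v v, which is the confusion the comment describes.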
I also wanted to use the 3.x breaking changes to fix this confusion once and for all. I have a much better idea of what is most common. I want the defaults to be the most common case, and allow those who need something different be able to do so.
Unfortunately, before I went on hiatus I didn't finish the docs, which I'm sure has led to even more confusion around the various types of multiples, and their defaults or interactions with each other.
I see multiples as one of those things, "It sounds sooo simple at first, why can't it just work!?" Until you dig into the details and realize, "Ooooh. Yeah I can totally see how some would want X while others need Y, and you need to distinguish between them sometimes, but not always." It's kind of like Strings in that manner; such a simple concept yet so hard to do correctly. ...Or CLIs for that matter :stuck_out_tongue_winking_eye:
Ok, so back to the topic at hand:
I find myself agreeing with @CreepySkeleton here. We're talking about sane defaults. I can also totally respect @TeXitoi's sentiment of Option<Vec<T>> is much more of a pain to deal with than Vec<T>. However I disagree that all options should default to 0 values. In fact, I've come across very few CLIs that rely on --foo <no value>. They do exist for sure, but most I've run into either expect a value, or expect no argument to be used (with or without a default value).
So I think this comes down to balancing Rust ergonomics and semantics (Vec<T>, where all Vecs can have zero values by default) with clap ergonomics and semantics (Option<T> means optional argument, thus Option<Vec<T>> means an optional option which accepts multiple values, combined with the fact that, in clap speak, all options must have at least 1 value by default). I think it's perfectly reasonable to impose clap defaults/semantics here. People are building a struct specifically around clap's semantics, so although Vec<T> can totally have zero or more values by default, the fact that it does not line up with clap's 1 or more values by default is fine: people will first look to their knowledge of clap semantics when building these structs. This is in part because there is no intrinsic Rust type that means one or more values.
I would suggest simply calling it out in the docs directly: although by default Vec<T> means required = true, if you want the ergonomics of Vec<T>, just override the required attribute with false.
Also, perhaps I'm out of touch, but if I wanted to allow multiple values with occurrences via derive in clap I'd instinctively try to do Option<Vec<Vec<T>>>
| command | type |
| -- | -- |
| -f v -f v | Some([v, v]) |
| -f v v | Some([v, v]) |
| -f v v -f v v | Some([[v, v], [v, v]]) |
> app -fff from the table would have been parsed as [-f=ff]
Correct. That is intended behaviour due to how parsing is structured. I don't think there is anything we can, or would want to do about this.
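The -fff case follows from a simple rule: once a short flag that takes a value has been recognised, the rest of the token is its value. A minimal model of that rule, assuming the flag always takes a value (`split_short` is a hypothetical helper, not part of clap):

```rust
/// Splits a short argument such as `-fff` into the flag character and the
/// value attached to it, assuming the flag takes a value.
/// Returns None for long options and for anything that is not a short flag.
fn split_short(arg: &str) -> Option<(char, Option<&str>)> {
    let rest = arg.strip_prefix('-')?;
    if rest.starts_with('-') {
        return None; // `--foo` is a long option, not a short cluster
    }
    let mut chars = rest.chars();
    let flag = chars.next()?;
    // Whatever follows the flag character is treated as its value.
    let value = chars.as_str();
    Some((flag, if value.is_empty() { None } else { Some(value) }))
}

fn main() {
    println!("{:?}", split_short("-fff"));
    println!("{:?}", split_short("-f"));
    println!("{:?}", split_short("--foo"));
}
```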
> It's a problem of good documentation more than anything, about the way it conveys this setup to user.
100% agree.
Having said all that, if we settled on Vec<T> defaulting to not-required because Rust ergonomics are more important, it's not a huge issue to me so long as it's documented very clearly.
I just came here from https://github.com/TeXitoi/structopt/issues/396 and wanted to say, I really hope in a v3 breaking change, Clap can address the multiple / number_of_values confusion, both for the API and for the new derive macros.
> Looking back, I wish I'd made the sane default of multiple=true and no other info being -f v -f v is allowed, -f v v is not. Which is what I attempted to make the default in 3.x because this was by far the most common request, most people expected -f v -f v (minus people mostly familiar with IRC like CLIs which was a minority). It also no longer required the number_of_values=1, which in all honesty looks kind of confusing if you're not already familiar with clap... "Why am I saying I want multiple values, AND also saying number of values is 1?!"
+100 this bit. It really sticks out as a trap in what's generally a friendly API, and is exacerbated by the fact that you can easily not notice the bug is there.