Consider
type E struct {
	A int
}
type T struct {
	E
}
This works:
T{E: E{A: 1}}
This does not:
T{A: 1}
This makes some struct literals more verbose than they need to be, and makes them asymmetrical with their usage (where you can access the embedded struct's fields directly).
Can we allow it?
(cc @bradfitz)
One possible argument against is that it allows you to effectively specify the same field twice, for example:
T{E: E{A: 1}, A: 2}
But this case already exists
S{B: 2, B: 3}
and is disallowed, so maybe that isn't worth worrying about.
FWIW, @adg and I both tripped over this independently. We both assumed T{A: 1} would work.
I've wished to be able to do this many times (go/ast is a prime example).
I don't know that we ever thought through all the consequences, but it's perhaps worthwhile considering.
Is there any known reason it wasn't considered back in #164?
@dominikh issue #164 was about a "programmer error" or misunderstanding of the spec as written.
Usually we don't look at each such error and consider a spec change. However this has come up before (it has restricted what I wanted to do in APIs significantly) so maybe it's time to at least investigate the consequences of such a language change more thoroughly.
That said, language changes are really very low priority. We'll get to it when we get to it.
r's comment on this on golang-nuts was:
It might one day, but as it stands the requirement to provide more information is more robust against changes in the data types.
I have also written code that assumed this would work.
Considering that reading and assigning embedded fields works, it is quite unintuitive that this doesn't.
@neild and I were discussing this and there are two things that would have to be addressed (not showstoppers, just considerations to make):
1) What happens if both the anonymous field containing the embedded field and the embedded field itself are specified?
T{E: E{}, A: 42}
This is probably a compile error, much like specifying duplicate fields in struct literals today.
2) What happens if the anonymous field is a pointer type? Should setting an embedded field cause the anonymous field to be implicitly allocated?
I also suspect that this could be done before Go 2, since it makes previously invalid Go programs valid, and shouldn't alter the meaning of existing Go programs.
How much should we be consistent with assignments? If 1 is a compile error should the following also be a compile error?
https://goplay.googleplex.com/p/9CL_5Owvzi
For 2, in assignments you would get a runtime error. I feel that allocating on the fly can obscure memory allocation: with multiple levels of nested embedding, a simple int assignment would do a lot more than a user would expect, and would surprise them. So I feel it might be better to just have a runtime error, as in the case of assignment, and no hidden memory allocation:
https://goplay.googleplex.com/p/HWSFemHcFf
Also that is the behavior without embedding:
https://goplay.googleplex.com/p/ixVAW3xa-Q
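A minimal sketch of the assignment behaviour described here, written independently of the linked snippets. The type P and the helper setA are hypothetical names. Assigning to a promoted field through a nil embedded pointer panics at run time; nothing is allocated implicitly.

```go
package main

import "fmt"

type E struct {
	A int
}

// P (hypothetical) embeds E through a pointer, so the zero value of P has a nil *E.
type P struct {
	*E
}

// setA (hypothetical helper) assigns to the promoted field A and reports
// whether the assignment panicked.
func setA(p *P, v int) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	p.A = v // implicit dereference of p.E
	return false
}

func main() {
	var zero P
	fmt.Println(setA(&zero, 1)) // prints true: p.E is nil

	// Allocating the embedded pointer explicitly makes the same assignment safe.
	ok := P{E: &E{}}
	fmt.Println(setA(&ok, 1), ok.A) // prints false 1
}
```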
The assignment rules already cover the case where the same value (or part of a value) appears multiple times in an assignment.
The assignment proceeds in two phases. First, the operands of index expressions and pointer indirections (including implicit pointer indirections in selectors) on the left and the expressions on the right are all evaluated in the usual order. Second, the assignments are carried out in left-to-right order.
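A short sketch of those two phases, assuming the E and T from the top of the thread. The function names are hypothetical. The first function assigns the embedded struct and its promoted field in one statement; the second mirrors the index example from the spec.

```go
package main

import "fmt"

type E struct {
	A int
}

type T struct {
	E
}

// overlapping assigns both the embedded struct and its promoted field in a
// single statement. The assignments are carried out left to right, so the
// later write to t.A wins.
func overlapping() T {
	var t T
	t.E, t.A = E{A: 1}, 2
	return t
}

// indexFirst shows phase one: the index operand i is evaluated while it is
// still 0, before any assignment happens.
func indexFirst() (int, []int) {
	x := []int{0, 0, 0}
	i := 0
	i, x[i] = 1, 2
	return i, x
}

func main() {
	fmt.Println(overlapping().A) // prints 2
	i, x := indexFirst()
	fmt.Println(i, x) // prints 1 [2 0 0]
}
```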
What should happen for the case that @zombiezen mentions above, in which a field is embedded via a pointer? Here are some possibilities:
It seems like choice 3 is the best one. The compiler can always tell how a field is embedded, so it can tell that a field is embedded via a pointer.
Is it too confusing for struct literals to permit direct embedded fields but not fields embedded via pointers?
1 or 3 are both valid options. 2 doesn't make sense given that 3 is a valid option.
3 would incentivize non-pointer embedding to get the simpler behavior even when a pointer embedding might make more sense.
1 seems more uniform and hence easier to use. Composite literals aren't necessary. They exist to make things easier.
I'd like to mention the curious case of an embedded, unexported pointer field with exported fields of its own:
package pkg
type u struct { A int }
type S struct { *u }
Currently, outside of pkg there is no way to allocate S.u or initialize A, although s.A is legal where var s pkg.S.
If pkg.S{A: 1} allocates, then you can do something that you couldn't before.
I don't know if there's any code out there that would care (that depends for its correctness on the inability to allocate u outside the package). There might be.
Hmm, that is a strong argument for option 3
Given a struct type T with arbitrary embeddings (directly, or indirectly), if the following code is valid:
var x T
x.f1 = v1
x.f2 = v2
...
for (possibly embedded and possibly exported) struct fields f1, f2, ..., and corresponding values v1, v2, ..., it seems that it should be valid to write:
T{f1: v1, f2: v2, ...}
and vice versa.
This rule alone would take care of visibility across packages and potential field name ambiguities. It would also explain what happens with pointer indirections (it will panic).
This is perhaps the simplest and most intuitive rule and would permit code transformation from one style to the other without change of semantics (which is currently the case for the restricted form of struct literals).
If the compiler is permitted to report an error when a struct literal is known to panic (which is easy to decide) then using a struct literal would be the "safer" choice. But one would lose the ability to transform code between the two variants without semantic change. I don't know if that's a good trade-off.
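To make the rule concrete, here is a sketch under stated assumptions: the field B and the function names are hypothetical, and the literal uses the form that is valid today. Under the proposed rule, T{A: 1, B: 2} would be accepted and would mean the same thing as the assignment sequence; it does not compile today, so it appears only in a comment.

```go
package main

import "fmt"

type E struct {
	A int
}

// T embeds E directly and adds a field of its own (B is hypothetical).
type T struct {
	E
	B int
}

// viaAssignments sets fields one at a time, using the promoted field A.
func viaAssignments() T {
	var x T
	x.A = 1
	x.B = 2
	return x
}

// viaLiteral builds the same value with the literal form allowed today.
// Under the proposed rule this could be written T{A: 1, B: 2}.
func viaLiteral() T {
	return T{E: E{A: 1}, B: 2}
}

func main() {
	fmt.Println(viaAssignments() == viaLiteral()) // prints true
}
```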
You could also only implicitly create the pointer if you could explicitly create it. In @jba's example you couldn't write pkg.S{A: 1} because you couldn't write pkg.S{u: &pkg.u{A: 1}}
Possible alternatives:
T{E{A: 1}}
T{E.A: 1}
If we implicitly allocate embedded pointers and you have a long chain of embedded pointers, then it's non-obvious that:
x := Foo{Value: v}
is implicitly expanding to:
x := Foo{Bar: &Bar{Baz: &Baz{Spam: &Spam{Egg: &Egg{Value: v}}}}}
I don't think I like having non-obvious allocations like that.
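A runnable sketch of that chain as it has to be written today. The type definitions are assumptions reconstructed from the literal above, with Value assumed to be an int, and newFoo is a hypothetical helper. Each & is a separate allocation that the short form would hide.

```go
package main

import "fmt"

// Assumed definitions: each type embeds the next one through a pointer.
type Egg struct{ Value int }
type Spam struct{ *Egg }
type Baz struct{ *Spam }
type Bar struct{ *Baz }
type Foo struct{ *Bar }

// newFoo (hypothetical helper) spells out every allocation in the chain.
func newFoo(v int) Foo {
	return Foo{Bar: &Bar{Baz: &Baz{Spam: &Spam{Egg: &Egg{Value: v}}}}}
}

func main() {
	x := newFoo(7)
	// Reading the promoted field walks four pointers.
	fmt.Println(x.Value) // prints 7

	// The zero Foo has no allocations at all; its embedded pointer is nil.
	var z Foo
	fmt.Println(z.Bar == nil) // prints true
}
```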
@griesemer, your rule seems obvious (in hindsight). Could you go further and say that the sequence of assignments is actually the meaning of the struct literal? Interestingly, the spec never really gives the exact meaning of a struct literal. It strongly suggests it by saying keys are field names and values must be assignable to their respective fields. It even says omitted fields get the zero value. But it never actually says that the given values are assigned to the fields.
@ghasemloo The links you've provided are not public 🛡️ not sure if that was intentional?