Originally came up in #1329. HasField with proto3 fails on simple / singular / non-message fields.
@nathanielmanistaatgoogle Mentioned that @haberman might be able to shine some light. Josh, what should we be doing here? The current approach isn't great:
# NOTE: As of proto3, HasField() only works for message fields, not for
# singular (non-message) fields. First try to use HasField and
# if it fails (with a ValueError) we manually consult the fields.
try:
return message_pb.HasField(property_name)
except ValueError:
all_fields = set([field.name for field in message_pb._fields])
return property_name in all_fields
Should we bother with checking "has field", e.g.
if _has_field(key_pb.partition_id, 'dataset_id'):
...
or should we instead just check Truth-iness
if key_pb.partition_id.dataset_id:
...
# OR MORE STRICT
if key_pb.partition_id.dataset_id != '':
...
I think the question is: what are you trying to do? Why is it relevant to you whether a field is set or not and what do you intend to do with this information?
In proto3, field presence for scalar fields simply doesn't exist. Your mental model for proto3 should be that it's a C++ or Go struct. For integers and strings, there is no such thing as being set or not, it always has a value. For submessages, it's a pointer to the submessage instance which can be NULL, that's why you can test presence for it.
Please don't access msg._fields. It's a private variable with no defined semantics.
Thanks for the clarification (that mostly was what I assumed to be true, so that's comforting). The original reason for using msg._fields was to keep from rocking the boat on the proto2 to proto3 move.
Changing _has_field(key_pb.partition_id, 'dataset_id') to key_pb.partition_id.dataset_id != '' (and similar usage of _has_field) will be simple.
Yes, testing against 0 and the empty string is definitely one option. By the way, you can probably just use Python's truthiness testing for this (ie. if key_pb.partition_id.dataset_id:).
It still would help to know: what are you actually going to do with this information? If the dataset_id is non-empty, what will you do differently than when it's empty?
@haberman for example, when parsing a Value (stored in the datastore as a property on an Entity) a user could very well have stored boolean_value as False, double_value as 0.0, integer_value as 0, and so on.
I am trying to excise uses of _has_field where not necessary (e.g. checking if the dataset_id is set on a ParitionId) but it seems the above case still requires it (even for these simple fields).
Ah! See this is why I am glad I asked. The proper way of solving this problem is with a oneof. Oneofs will properly preserve presence even if the value being stored is 0, false, etc. See: https://developers.google.com/protocol-buffers/docs/proto#oneof
Yup. The datastore v1beta3 API uses a oneof for the case listed above, so I should be able to excise _has_field once that version is released. (The v1beta2 API is still on proto2)
proto2 supports oneof! In practice nearly all proto3 features (maps, oneofs, well-known types, etc) are all available in proto2 also. The main exception is JSON support.
Also, changing an existing protobuf to use oneof is a wire-compatible change. And in most ways it is an API-compatible change also.
Gotcha, v1beta2 was released some time ago so maybe it wasn't available to the authors at that time. Thanks for being so helpful here!
Address this in #1450.
@haberman we're using protobuf to transition from some legacy code where an enum of values is mapped to integers including a valid enum with value of 0. In my protobuf the default value is 0 even if the field is "unset". Is there a way that I can check if this field is unset without using _fields? A related issue is the value 0 isn't even encoded to the wire even if I change the default. Any suggestions?
@TheJKFever Are you using proto2 or proto3 (ie. what does the "syntax=" declaration in your .proto file say?)
If you're using proto2, then you can use the HasField() method to check whether the field is set.
If you're using proto3, then there is no way to tell, sorry. This is by design; see extensive discussion on this here: https://github.com/google/protobuf/issues/1606
Thanks @haberman. We were using proto3, but decided to switch to proto2 for this reason. It seems like if you wanted to use proto3 and have the ability to see if someone explicitly set an int enum, you would need to make sure that your enum did not start at 0 or be okay with an implicit default of 0.
Suffering the same pain with @TheJKFever .
Can't tell whether it's a ENUM value with 0 or not receiving at all.
The funny thing is that, if you convert the proto to json string, you can actually tell whether the field exists or not. So the proto3 actually knows the existance of a Field, but it refuse to provide a way to tell that.
maybe ListFields is better solution:
Returns a list of (FieldDescriptor, value) tuples for present fields.
Most helpful comment
I think the question is: what are you trying to do? Why is it relevant to you whether a field is set or not and what do you intend to do with this information?
In proto3, field presence for scalar fields simply doesn't exist. Your mental model for proto3 should be that it's a C++ or Go struct. For integers and strings, there is no such thing as being set or not, it always has a value. For submessages, it's a pointer to the submessage instance which can be
NULL, that's why you can test presence for it.Please don't access
msg._fields. It's a private variable with no defined semantics.