Protobuf: Proto3 spec is incorrect w/r/t oneofs

Created on 23 Aug 2018 · 8 Comments · Source: protocolbuffers/protobuf

What version of protobuf and what language are you using?

$ protoc --version
libprotoc 3.3.0

What operating system (Linux, Windows, ...) and version?

$ uname -a
Linux Ben-Wolfson-Tower 4.2.0-27-generic #32~14.04.1-Ubuntu SMP Fri Jan 22 15:32:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

What runtime / compiler are you using (e.g., python version or gcc version)?

whatever apt-get gave me, man

What did you do?

Created a message with a grammatically correct oneof field per the grammar given here:
https://developers.google.com/protocol-buffers/docs/reference/proto3-spec

What did you expect to see?

an exit code of 0

What did you see instead?

a non-zero exit code

$ cat oneoftest1.proto
syntax = "proto3";
message OneofTest { oneof test { ; } }
$ protoc --cpp_out=. oneoftest1.proto
oneoftest1.proto:2:34: Expected type name.
$ cat oneoftest2.proto
syntax = "proto3";
message OneofTest { oneof test {  } }
$ protoc --cpp_out=. oneoftest2.proto
oneoftest2.proto:2:35: Expected type name.

Here are the relevant snippets of the grammar from the proto3 spec:

{}  repetition (any number of times)
emptyStatement = ";"
oneof = "oneof" oneofName "{" { oneofField | emptyStatement } "}"

This states that it's legal to have, within the curly braces delimiting the fields of a oneof, zero or more semicolons. In fact this is quite false! The correct grammar, judging from the behavior of protoc, is:

oneof = "oneof" oneofName "{"  oneofField { oneofField } "}"

i.e., emptyStatement is actually not allowed at all, and there must be at least one oneofField.
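
To make the difference between the two productions concrete, here is a minimal Python sketch that checks the body of a oneof against each of them. It assumes a simplified, hypothetical token stream in which every complete oneofField has already been reduced to the token "FIELD" and every emptyStatement to ";". The function names are invented for illustration, and the second function only encodes the behavior inferred above from libprotoc 3.3.0; it is not how protoc is implemented.

```python
def matches_spec(body_tokens):
    """Published production: { oneofField | emptyStatement }.

    Any number (including zero) of fields or semicolons is accepted.
    """
    return all(tok in ("FIELD", ";") for tok in body_tokens)


def matches_protoc(body_tokens):
    """Production inferred from protoc: oneofField { oneofField }.

    At least one field is required and ";" is never accepted.
    """
    return len(body_tokens) >= 1 and all(tok == "FIELD" for tok in body_tokens)


# Hypothetical pre-tokenized oneof bodies, keyed by the source they stand for.
cases = {
    "oneof test { ; }": [";"],
    "oneof test { }": [],
    "oneof test { string a = 1; }": ["FIELD"],
}

for source, tokens in cases.items():
    print(f"{source!r}: spec={matches_spec(tokens)} protoc={matches_protoc(tokens)}")
```

The first two cases correspond to oneoftest1.proto and oneoftest2.proto: the published grammar accepts both, while the inferred one rejects both.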

P3 bug syntax specification

bwo

Most helpful comment

Having a strict specification is crucial for people implementing protocol buffers in other languages. A specification is supposed to be the authoritative truth on what is correct and what isn't. Unfortunately, people often have to turn to the C++ implementation to derive what is "correct".

dsnet on 25 Aug 2018

👍3

All 8 comments

Something similar is true of the grammar for enum.

bwo on 24 Aug 2018

Right now the syntax specification is more or less for information only. It's unlikely that we will make it fully match what protoc does. Barring that, what else can we do to help your use case? Are you using the syntax specification to create a protoc in a different language or something?

xfxyjwf on 24 Aug 2018

Not quite a protoc, but I am writing a parser for .proto files. There are some places where the spec is obviously off (e.g., according to the spec, syntax = 'proto3"; should be grammatical) and it's easy enough to fix those up, but there are others where it's less clear, as here (or as in the treatment of whitespace generally), and one has to go to protoc to see what it actually does.

If there's an official grammar other than the protoc source, that would be useful! The "spec" can't really be used for information if it is, you know, false.

bwo on 24 Aug 2018

Here's something fun. According to the spec, the following file, which protoc in fact accepts without complaint, is syntactically invalid:

syntax = "proto3";
import "google/protobuf/descriptor.proto";

package foo;

extend google.protobuf.MessageOptions {
    string my_option = 1000;
}

message T {
    option (my_option) = "3";
    string x = 1;
}

Can YOU spot why?

bwo on 26 Aug 2018

ident = letter { letter | decimalDigit | "_" }

Actually, it's valid for an ident to begin with a leading underscore.
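
As a rough illustration, the two readings of ident can be written as regular expressions in Python. This is only a sketch: SPEC_IDENT transcribes the production quoted above (with letter taken to mean ASCII A-Z and a-z, as in the spec), while OBSERVED_IDENT is an assumption based solely on the remark that a leading underscore is accepted. Both names and the helper functions are hypothetical.

```python
import re

# Published production: ident = letter { letter | decimalDigit | "_" }
SPEC_IDENT = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

# Assumed actual behavior: a leading underscore is also allowed.
OBSERVED_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def spec_accepts(name):
    """True if the whole string matches the published ident production."""
    return SPEC_IDENT.fullmatch(name) is not None


def observed_accepts(name):
    """True if the whole string matches the assumed, more permissive rule."""
    return OBSERVED_IDENT.fullmatch(name) is not None


for name in ("my_option", "_private", "9lives"):
    print(name, spec_accepts(name), observed_accepts(name))
```

The two patterns disagree only on names such as _private that start with an underscore.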

bwo on 1 Sep 2018

The incorrect grammar for ident is #4554.

dsnet on 1 Sep 2018

I suppose the failure to include any clause at all for extend should also be a separate ticket :/.

bwo on 1 Sep 2018
