I have a language with repeated instances of the same pattern, and I only care about the first symbol of each. For example:
system OBJECT IDENTIFIER ::= { mib-2 1 }
interfaces OBJECT IDENTIFIER ::= { mib-2 2 }
at OBJECT IDENTIFIER ::= { mib-2 3 }
ip OBJECT IDENTIFIER ::= { mib-2 4 }
icmp OBJECT IDENTIFIER ::= { mib-2 5 }
tcp OBJECT IDENTIFIER ::= { mib-2 6 }
udp OBJECT IDENTIFIER ::= { mib-2 7 }
egp OBJECT IDENTIFIER ::= { mib-2 8 }
This simple example could be matched by this pattern (where _ is whitespace):
identifier _ "OBJECT IDENTIFIER" _ "::=" _ "{" _ identifier _ number _ "}"
This isn't such a big deal in this case (I already typed the pattern :-). But the language has a set of other big, hairy constructs that don't warrant full parsing; I only want the initial identifier on each line for the job I have in mind.
I would like to type something like this pattern:
identifier _ "OBJECT IDENTIFIER" .*? "}"
where the ".*?" is non-greedy - it only consumes to the first occurrence of the terminal. Could this be on the list for PEG.js? Many thanks.
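For comparison, here is a rough sketch of the same lazy-matching idea with a plain JavaScript regular expression, run against two of the sample lines. The identifier character class and the helper name firstIdentifier are assumptions made for illustration; they are not taken from the language's definition or from PEG.js.

```javascript
// Two of the sample lines from the question.
const lines = [
  "system OBJECT IDENTIFIER ::= { mib-2 1 }",
  "interfaces OBJECT IDENTIFIER ::= { mib-2 2 }",
];

// Capture the leading identifier, then skip lazily (.*?) up to the first "}".
// The identifier character class is an assumption for this sketch.
const pattern = /^([A-Za-z][A-Za-z0-9-]*)\s+OBJECT IDENTIFIER.*?\}/;

// Hypothetical helper: returns the first identifier, or null if the line does not match.
function firstIdentifier(line) {
  const match = pattern.exec(line);
  return match ? match[1] : null;
}

const names = lines.map(firstIdentifier);
console.log(names); // [ 'system', 'interfaces' ]
```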
Update: This could be satisfied by a repetition count (which is a generalization of my initial thought) as suggested in Google Groups at: http://groups.google.com/group/pegjs/browse_thread/thread/2bea15581be45187
In PEG formalism, you can easily match until a terminator by using a predicate together with the . metacharacter. Something like:
"OBJECT IDENTIFIER" (!"}" .)* "}"
Is that sufficient for you?
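To make the semantics of (!"}" .)* "}" concrete, here is a minimal hand-written JavaScript sketch of what that expression does: at each position the negative predicate checks that the terminator does not start there, then one character is consumed. The function matchUntil is a hypothetical helper written for this illustration; it is not part of PEG.js or of the parser PEG.js generates.

```javascript
// Hand-written equivalent of the PEG expression  (!"}" .)* "}"
// matchUntil is a hypothetical helper, not a PEG.js API.
function matchUntil(input, start, terminator) {
  let pos = start;
  const consumed = [];
  // (!terminator .)* : while the negative lookahead succeeds, consume one character.
  while (pos < input.length && !input.startsWith(terminator, pos)) {
    consumed.push(input[pos]);
    pos += 1;
  }
  // The trailing terminator must match, otherwise the whole expression fails.
  if (!input.startsWith(terminator, pos)) {
    return null;
  }
  return { text: consumed.join(""), end: pos + terminator.length };
}

const line = "system OBJECT IDENTIFIER ::= { mib-2 1 }";
const keyword = "OBJECT IDENTIFIER";
const afterKeyword = line.indexOf(keyword) + keyword.length;

// Skips everything after the keyword up to and including the first "}".
const result = matchUntil(line, afterKeyword, "}");
console.log(result);
```

Because the predicate is tested before every character, the loop stops at the first occurrence of the terminator, which is the non-greedy behaviour asked for in the question.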
Yes, that works perfectly. Thanks!
@dmajda What's the recommended practice for stripping out the empty char returned by the !"}" expression?
For example:
= chars:(!"-suffix" .)+ "-suffix"
"foo-suffix" => [[ '', 'f' ], ['', 'o' ], ['', 'o' ]] // result
"foo-suffix" => ['f', 'o', 'o' ] // desired result
I was able to achieve this by breaking !"-suffix" . into its own rule that just returns the . result, but I'm curious if there's a better way.
I think in the meantime you can use:
= chars:(!"-suffix" c:. {return c})+ "-suffix"
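The difference between the two result shapes can be sketched in plain JavaScript. This is a hypothetical simulation of (!"-suffix" .)+ "-suffix", not the code PEG.js generates: the callback stands in for the { return c } action, and the empty string stands in for the value a successful predicate contributed in the output shown earlier in this thread.

```javascript
// Hypothetical simulation of  (!"-suffix" .)+ "-suffix"  (not PEG.js's generated parser).
// `action` plays the role of the action block: it receives the predicate's
// result and the matched character, and decides what each iteration returns.
function matchBeforeSuffix(input, suffix, action) {
  const results = [];
  let pos = 0;
  while (pos < input.length && !input.startsWith(suffix, pos)) {
    const predicateResult = ""; // the '' seen in the reported output
    results.push(action(predicateResult, input[pos]));
    pos += 1;
  }
  // "+" needs at least one iteration, and the suffix must follow.
  if (results.length === 0 || !input.startsWith(suffix, pos)) {
    return null;
  }
  return results;
}

// Without an action, each iteration yields the whole sequence as a pair.
const pairs = matchBeforeSuffix("foo-suffix", "-suffix", (p, c) => [p, c]);
// With a labeled character and an action that returns it, only the character is kept.
const chars = matchBeforeSuffix("foo-suffix", "-suffix", (p, c) => c);

console.log(pairs); // [ [ '', 'f' ], [ '', 'o' ], [ '', 'o' ] ]
console.log(chars); // [ 'f', 'o', 'o' ]
```

If a single string is wanted instead of an array of characters, chars.join("") gives "foo".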
@islandr Please don't use issues as a place to ask questions about PEG.js usage. Especially when they are closed and especially when you are asking something that other people besides me can help you with. The proper channel is the Google Group.
Sorry David. Thought this would have been a good place since it was directly related to the example you'd given.
On Wed, Jan 9, 2013 at 9:51 PM, David Majda wrote:
> @islandr Please don't use issues as a place to ask questions about PEG.js usage. Especially when they are closed and especially when you are asking something that other people besides me can help you with. The proper channel is the Google Group (http://groups.google.com/group/pegjs).