Dvc: Regular Expression support for --xpath

Created on 20 Dec 2019 · 7 Comments · Source: iterative/dvc

Hello,

I believe --xpath does not support filtering by regular expressions.

Use case:
I have the following metrics file:

{
  "groups": [
    {
      "name": "group-A-1",
      "metrics": ...
    },
    {
      "name": "group-A-2",
      "metrics": ...
    },
    {
      "name": "group-B-1",
      "metrics": ...
    }
  ]
}

and would like to get all groups from the A segment, but not the B segment.
Ideally, this would work:
dvc metrics show -x $.groups[?(@.name =~ /group-A-*/)]

Right now, only the == operator is supported.
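Until the filter syntax accepts regular expressions, one option is to post-filter the metrics file outside of DVC. The following is only a minimal sketch, assuming the file has the structure shown above; `select_groups` is a hypothetical helper, not part of DVC, and the "metrics" values are placeholders.

```python
import json
import re

# Placeholder data mirroring the structure from the issue; the "metrics"
# values are made up for illustration only.
METRICS_JSON = """
{
  "groups": [
    {"name": "group-A-1", "metrics": "m1"},
    {"name": "group-A-2", "metrics": "m2"},
    {"name": "group-B-1", "metrics": "m3"}
  ]
}
"""


def select_groups(metrics, pattern):
    """Return the groups whose "name" matches the regular expression."""
    regex = re.compile(pattern)
    return [group for group in metrics["groups"] if regex.search(group["name"])]


metrics = json.loads(METRICS_JSON)
selected = select_groups(metrics, r"^group-A-")
print([group["name"] for group in selected])
```

In practice the JSON would be read from the metrics file with `json.load` rather than from an inline string.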

Labels: awaiting response, enhancement, feature request

All 7 comments

Hi @FredericoCoelhoNunes !

Could you try running

dvc metrics show -x '$.groups[?(@.name =~ /group-A-*/)]'

, please? Notice that I've added single quotes, so that the expression doesn't get evaluated by the shell.

Hey!
I tried running that, but I still get the Unexpected character: ~ error.

Are you sure it's in quotes/escaped so the shell env isn't trying to parse the expression?

@FredericoCoelhoNunes We are using jsonpath_ng internally and when trying to run your example:

from jsonpath_ng import jsonpath, parse

jsonpath_expr = parse('$.groups[?(@.name =~ /group-A-*/)]')

data = {
  "groups": [
    {
      "name": "group-A-1",
      "metrics": "m1"
    },
    {
      "name": "group-A-2",
      "metrics": "m2"
    },
    {
      "name": "group-B-1",
      "metrics": "m3"
    },
  ]
}

results = [match.value for match in jsonpath_expr.find(data)]
print(results)

I'm getting:

Traceback (most recent call last):
  File "test_2990.py", line 3, in <module>
    jsonpath_expr = parse('$.groups[?(@.name =~ /group-A-*/)]')
  File "/Users/efiop/.pyenv/versions/python3-dvc/lib/python3.7/site-packages/jsonpath_ng/parser.py", line 14, in parse
    return JsonPathParser().parse(string)
  File "/Users/efiop/.pyenv/versions/python3-dvc/lib/python3.7/site-packages/jsonpath_ng/parser.py", line 32, in parse
    return self.parse_token_stream(lexer.tokenize(string))
  File "/Users/efiop/.pyenv/versions/python3-dvc/lib/python3.7/site-packages/jsonpath_ng/parser.py", line 55, in parse_token_stream
    return new_parser.parse(lexer = IteratorToTokenStream(token_iterator))
  File "/Users/efiop/.pyenv/versions/python3-dvc/lib/python3.7/site-packages/ply/yacc.py", line 333, in parse
    return self.parseopt_notrack(input, lexer, debug, tracking, tokenfunc)
  File "/Users/efiop/.pyenv/versions/python3-dvc/lib/python3.7/site-packages/ply/yacc.py", line 1063, in parseopt_notrack
    lookahead = get_token()     # Get the next token
  File "/Users/efiop/.pyenv/versions/python3-dvc/lib/python3.7/site-packages/jsonpath_ng/parser.py", line 179, in token
    return next(self.iterator)
  File "/Users/efiop/.pyenv/versions/python3-dvc/lib/python3.7/site-packages/jsonpath_ng/lexer.py", line 35, in tokenize
    t = new_lexer.token()
  File "/Users/efiop/.pyenv/versions/python3-dvc/lib/python3.7/site-packages/ply/lex.py", line 386, in token
    newtok = self.lexerrorf(tok)
  File "/Users/efiop/.pyenv/versions/python3-dvc/lib/python3.7/site-packages/jsonpath_ng/lexer.py", line 167, in t_error
    raise JsonPathLexerError('Error on line %s, col %s: Unexpected character: %s ' % (t.lexer.lineno, t.lexpos - t.lexer.latest_newline, t.value[0]))
jsonpath_ng.lexer.JsonPathLexerError: Error on line 1, col 9: Unexpected character: ?

So there might be something wrong with the regex, or maybe jsonpath_ng has some limitations. I can't put my finger on anything specific right now 🙁

Are you sure it's in quotes/escaped so the shell env isn't trying to parse the expression?

Yeap, I did put them in quotes.

@FredericoCoelhoNunes We are using jsonpath_ng internally and when trying to run your example:

I think you are missing a dot before the asterisk in the regular expression, but I don't think that is the problem.
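To illustrate why the missing dot is unlikely to matter for this data, here is a small check using Python's `re` module. It assumes unanchored "search" semantics, which may differ from how a given JSONPath implementation evaluates `=~`; the names are the ones from the example above.

```python
import re

names = ["group-A-1", "group-A-2", "group-B-1"]

original = re.compile(r"group-A-*")   # "-*" means zero or more hyphens
with_dot = re.compile(r"group-A-.*")  # ".*" means any trailing characters

# With an unanchored search, both patterns only need "group-A" to be present.
matched_original = [name for name in names if original.search(name)]
matched_with_dot = [name for name in names if with_dot.search(name)]

print(matched_original)
print(matched_with_dot)
```

Both patterns select the same two A-segment names here, so the parse error has to come from the filter syntax rather than from the regex itself.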

I think this is just a jsonpath limitation, but it's ok, I've found a workaround! Cheers

@FredericoCoelhoNunes Could you post the workaround, please? 🙂 Closing for now, since it is resolved.

Sorry, my bad, I wasn't clear enough. The workaround was specific to my particular use case, so it isn't relevant here.
