Jq: removal of nodes, by naming nodes to excise

Created on 25 Oct 2012 · 7 Comments · Source: stedolan/jq

Input {"foo": 42, "bar": "less interesting data", "baz": "sdkjhsdf"}
Output {"foo": 42, "baz": "sdkjhsdf"}

The jq expression would cite bar, but not foo or baz. Possibly overloading the minus char somehow.

feature request

All 7 comments

Yes, this is definitely a missing feature. Options include:

  1. . - "bar"
  2. . - ["bar"]
  3. remove("bar")
  4. .bar = empty

I think 1 or 2 might be what you suggested. I'm not sure which I prefer yet, but I'm kind of partial towards 4 since it fits with the current meanings of assignment and empty, rather than adding a new construct.

You're the domain expert; choose the most appropriate solution :)

If you build from master, you can now do this:

del(.bar)

It's not documented yet, and the exact semantics of what happens when you replace .bar with something more complicated will probably change, but that should solve the problem.

That's great, dude. Is it recursive too?

Now I'm happier with what del(complicated_thing) does. You can del() all sorts of junk:

  • del(.foo,.bar,.baz) deletes those three keys
  • del(.foo[0].bar.baz) removes the baz key from the object at .foo[0].bar
  • del(.foo[0,1,2]) deletes the first three items of the array at field "foo"
  • del(.foo[] | select(.score < 20)) deletes items in the array at field "foo" whose "score" field is below 20.
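For anyone who thinks more readily in Python, the four deletions above correspond roughly to the operations below on plain dicts and lists. This is only an analogy, not how jq implements del(); the sample document and the helper names (del_keys, del_nested_baz, del_first_three, del_low_scores) are made up for illustration.

```python
import copy

# Hypothetical sample document; field names mirror the bullets above.
doc = {
    "foo": [
        {"bar": {"baz": 1, "keep": 2}, "score": 10},
        {"score": 25},
        {"score": 5},
        {"score": 40},
    ],
    "bar": "x",
    "baz": "y",
}


def del_keys(d, *keys):
    """Rough analogue of del(.foo,.bar,.baz): drop several top-level keys."""
    return {k: v for k, v in d.items() if k not in keys}


def del_nested_baz(d):
    """Rough analogue of del(.foo[0].bar.baz): drop one deeply nested key."""
    out = copy.deepcopy(d)
    out["foo"][0]["bar"].pop("baz", None)
    return out


def del_first_three(d):
    """Rough analogue of del(.foo[0,1,2]): drop the first three array items."""
    out = copy.deepcopy(d)
    out["foo"] = out["foo"][3:]
    return out


def del_low_scores(d, threshold=20):
    """Rough analogue of del(.foo[] | select(.score < 20))."""
    out = copy.deepcopy(d)
    out["foo"] = [item for item in out["foo"] if item["score"] >= threshold]
    return out
```

Each helper returns a new value and leaves its input alone, which matches the way a jq filter produces output without modifying its input.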

Not quite sure what you meant by "recursive", though.

By recursive, I mean that it's able to remove a node that repeats throughout a document structure. For example, Reddit's articles go fairly deep, with the same element/attribute names repeating as you descend.

Ah. You can in fact do this. First you need a means to recurse down to all of the children of a node, which you can do by defining a recursive function:

def all_children: . , (.data.children | select(. != null) | .[] | all_children);

Putting a node through all_children returns the node itself and all child nodes. Then, to get all of the author fields from any node in your dataset, you'd need:

.[] | all_children | .author

You can then delete all of the author fields with:

del(.[] | all_children | .author)

Defining a new recursive function each time you want to do this is effort, so I've added a recurse function to the standard library, which is defined as:

def recurse(f): . , (f | select(. != null) | recurse(f));
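To make the traversal order concrete, here is a rough Python model of that definition as a generator: emit the node itself, then recurse into every non-null value that f produces. The tree, the "kids" field and the children accessor are hypothetical, chosen only to keep the sketch short.

```python
def recurse(node, f):
    """Yield node, then recurse into each non-None value produced by f(node)."""
    yield node
    for child in f(node):
        if child is not None:  # mirrors select(. != null)
            yield from recurse(child, f)


def children(node):
    """Hypothetical accessor: yield the items of node["kids"], or nothing if absent."""
    yield from node.get("kids") or []


# Hypothetical sample tree.
tree = {
    "name": "a",
    "kids": [
        {"name": "b", "kids": [{"name": "c"}]},
        {"name": "d"},
    ],
}

# Pre-order walk: each node is emitted before its descendants.
names = [n["name"] for n in recurse(tree, children)]
```

Because the node is emitted before its children, the walk is pre-order, so the root always comes out first.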

The program to delete all author fields anywhere in the document now becomes:

del(.[] | recurse((.data.children // []) | .[]) | .author)

(foo // [] returns [] if foo is null or false, and foo otherwise.)
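As a cross-check on what that program should produce, the same effect can be sketched in Python: visit every top-level node, follow data.children wherever it exists, and drop author at each node visited. The function name delete_authors and the sample data are assumptions for illustration; real Reddit JSON may lay its fields out differently.

```python
import copy


def delete_authors(doc):
    """Return a copy of doc (a list of nodes) with "author" removed from every node
    reachable through data.children."""
    out = copy.deepcopy(doc)
    stack = list(out)  # analogue of .[] over the top-level array
    while stack:
        node = stack.pop()
        node.pop("author", None)  # analogue of del(... | .author)
        # Analogue of (.data.children // []) | .[]: a missing or null value means no children.
        stack.extend((node.get("data") or {}).get("children") or [])
    return out


# Hypothetical sample shaped like the structure the jq program assumes.
sample = [
    {
        "author": "alice",
        "title": "t1",
        "data": {
            "children": [
                {"author": "bob", "data": {"children": [{"author": "carol"}]}},
                {"author": "dave", "data": {"children": None}},
            ]
        },
    },
    {"author": "erin"},
]

cleaned = delete_authors(sample)
```

The `or []` fallback plays the same role as `// []` in the jq version: nodes without children are simply not descended into.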
