I can't figure out how to remove null values (and corresponding keys) from the input JSON. I can get it to output only the values, but the keys are dropped. I've tried various combinations of the recurse, map, select and del functions...
For example, with this input:
[
{
"Item": "A",
"Price": 12.23,
"Qty": 123
},
{
"Item": "B",
"Price": null,
"Qty": 234
},
{
"Item": "C",
"Price": 23.2,
"Qty": null
}
]
I want to generate this output:
[
{
"Item": "A",
"Price": 12.23,
"Qty": 123
},
{
"Item": "B",
"Qty": 234
},
{
"Item": "C",
"Price": 23.2
}
]
[.[]| . as $a| [keys[]| select($a[.]!=null)| {(.): $a[.]}]| add] But I wish for a simpler way.
Awesome! Thank you!
My pleasure! Simpler: delpaths([path(.[][]| select(.==null))]) (could generalize .[][] to recursion)
Great - that's a little easier to interpret for a first-time user!
You can also do:
del(.[][] | select(. == null))
I should really get around to documenting del, it's quite handy :)
Even shorter!
del(.[][] | nulls)
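For reference, a minimal sketch of running this shortest form against the sample input from the question. It assumes jq is on the PATH; the shell variable names `input` and `result` are only illustrative, and `-c` is there just to keep the output on one line.

```shell
# Sample input from the question, compacted to one line (illustrative variable name).
input='[{"Item":"A","Price":12.23,"Qty":123},{"Item":"B","Price":null,"Qty":234},{"Item":"C","Price":23.2,"Qty":null}]'

# .[][] visits every value of every object in the array; nulls keeps only
# the null ones, and del removes those paths, so the keys go with them.
result=$(printf '%s' "$input" | jq -c 'del(.[][] | nulls)')
echo "$result"
```

Items B and C should come back without their Price and Qty keys respectively, while item A is left untouched.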
Can this be combined to remove also entries that do not have their value .DMC == 0? I tried
del(.[][] | select(. == "" or .DMC==0))
But it gives me invalid path expression with result true
@pke what version of jq are you using? Can you give us a sample input?
I am using the newest win32 version just downloaded.
Sample input is
[
{
"name": "foo",
"FIELD1": "",
"FIELD2": "value2",
"DMC": 1
},
{
"name": "bar",
"FIELD1": "",
"FIELD2": "value2",
"DMC": 0
}
]
Expected result:
[
{
"name": "foo",
"FIELD2": "value2",
"DMC": 1
}
]
@pke Ah, I'm not sure what the .=="" part of your program is meant to match, but what you want is something like this: del(.[] | select(.DMC==0)). Here the input to select() will be an object where you'd find .DMC and so on. That object is what you want deleted, so it has to be the input to select() and also the output of the whole expression passed to del(). If you have a second condition like .name=="foo", add it like so: del(.[] | select(.DMC==0 or .name=="foo")).
@nicowilliams The .=="" part should remove all keys that have empty values. I adapted it from the solution posted shortly above my comment: https://github.com/stedolan/jq/issues/104#issuecomment-167396874
Your del(.[] | select(.DMC==0)) gives me:
[
{
"name": "foo",
"FIELD1": "",
"FIELD2": "value2",
"DMC": 1
}
]
But I'd like it to also not include the FIELD1 key, because its value is empty.
I came up with del(.[] | select(.DMC==0)) | del(.[][] | select(. == "")) but I'm sure there is a more efficient way.
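A small sketch of why two steps are needed here, assuming jq is on the PATH: the DMC test applies to each object (.[]), while the empty-string test applies to each value inside an object (.[][]), so the two conditions live at different depths. The variable names `records` and `cleaned` are illustrative only.

```shell
# Sample input from the comment above, compacted to one line of valid JSON.
records='[{"name":"foo","FIELD1":"","FIELD2":"value2","DMC":1},{"name":"bar","FIELD1":"","FIELD2":"value2","DMC":0}]'

# Step 1: delete whole objects whose DMC is 0.
# Step 2: delete empty-string values (and their keys) from the objects that remain.
cleaned=$(printf '%s' "$records" | jq -c 'del(.[] | select(.DMC == 0)) | del(.[][] | select(. == ""))')
echo "$cleaned"
```

Only the "foo" object should remain, without its FIELD1 key.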
@pke
Your solution is correct.
Very well then ;) thanks
Here's an alternative that might be slightly faster:
map( select(.DMC != 0) | with_entries( select( .value != "" )) )
@pkoppstein That's what I was thinking about ;) Thanks. Possible to remove the .DMC keys in the same go also?
@pke - How about:
map( select(.DMC != 0) | with_entries( select( .value != "" and .key != "DMC" ) ))
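A quick sketch of the single-pass map/with_entries approach on the same sample input, assuming jq is on the PATH. It assumes the goal is still the one stated earlier in the thread (drop objects with DMC equal to 0, drop empty-string values, and drop the DMC key itself), so the value is compared against "". The variable names `records` and `trimmed` are illustrative.

```shell
# Sample input from earlier in the thread, compacted to one line of valid JSON.
records='[{"name":"foo","FIELD1":"","FIELD2":"value2","DMC":1},{"name":"bar","FIELD1":"","FIELD2":"value2","DMC":0}]'

# select(.DMC != 0) keeps only the wanted objects; with_entries then rebuilds
# each object from the key/value pairs that pass the inner select.
trimmed=$(printf '%s' "$records" | jq -c 'map(select(.DMC != 0) | with_entries(select(.value != "" and .key != "DMC")))')
echo "$trimmed"
```

The result should be a single object holding just name and FIELD2.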
@pkoppstein perfect! Thanks and sorry for hijacking the OP thread.
@pke - Yes, in future it would be better to ask usage questions at stackoverflow.com with the jq tag.
BTW, all of these solutions do some amount of allocation. In principle we could optimize down the amount of allocation needed for the first approach (the one using del/1), but we can't really reduce the amount of allocation needed for the map/1 approach. So if you want the most efficient solution, then the first is it, at least in principle, and until we optimize it further, you could measure the two.
The only reason del/1 absolutely must allocate some memory is for the paths constructed by path/1, but that's OK because that array can be reused over and over. It also allocates because delpaths/1 wants an array of paths, but we could add a delpath/1 that doesn't. Then, provided . has a refcount of only one, no allocation would be needed.
But map/1 and with_entries/1 must allocate in all cases.
So in principle the del/1-using program can be more efficient, at least as to memory allocations. It still might not be in run-time anyways, naturally, but I think it would be if we further optimize it.
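Before measuring the two approaches, it may help to confirm they agree on the output. The sketch below, assuming jq is on the PATH, compares the del/1 form with a map/with_entries form on a small input; the variable names are illustrative, and the input is a trimmed variant of the question's sample. Wrapping each jq call in the shell's `time` on a large file would be one way to compare run time, though no figures are claimed here.

```shell
# Small illustrative input with one null per object.
input='[{"Item":"B","Price":null,"Qty":234},{"Item":"C","Price":23,"Qty":null}]'

# Approach 1: delete the null-valued paths in place.
via_del=$(printf '%s' "$input" | jq -c 'del(.[][] | nulls)')

# Approach 2: rebuild each object from its non-null entries.
via_map=$(printf '%s' "$input" | jq -c 'map(with_entries(select(.value != null)))')

# Both approaches are expected to produce the same JSON text.
if [ "$via_del" = "$via_map" ]; then echo "same output: $via_del"; fi
```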
With newer versions of jq (1.6 and later per https://github.com/stedolan/jq/issues/963) you can use this expression to remove null-valued keys recursively:
'walk( if type == "object" then with_entries(select(.value != null)) else . end)'
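A short sketch of what the walk-based expression does on nested data, assuming a jq version that ships walk/1 as a builtin. The input and the variable names `nested` and `pruned` are made up for illustration.

```shell
# Illustrative nested input: nulls appear as object values and as an array element.
nested='{"a":{"b":null,"c":1},"d":[{"e":null,"f":[null]}]}'

# walk visits every component; only objects are rewritten, so null-valued
# keys disappear at any depth while nulls inside arrays are left alone.
pruned=$(printf '%s' "$nested" | jq -c 'walk(if type == "object" then with_entries(select(.value != null)) else . end)')
echo "$pruned"
```

Note that the null inside the array under f survives, since this expression only filters object entries.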
I've tweaked this a bit to remove all empty items, and added this to my ~/.jq:
def remove_empty:
. | walk(
if type == "object" then
with_entries(
select(
.value != null and
.value != "" and
.value != [] and
.value != {}
)
)
else .
end
);
Here's a more general one without the redundant . |:
def remove_empty:
walk(
if type == "array" then
map(select(. != null))
elif type == "object" then
with_entries(
select(
.value != null and
.value != "" and
.value != [] and
.value != {}
)
)
else
.
end
);
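As a rough check of the more general definition, the sketch below inlines it in a jq program instead of loading it from ~/.jq. It assumes jq with a builtin walk/1 is on the PATH; the input and the variable names `messy` and `tidy` are illustrative.

```shell
# Illustrative input: a null in an array, an object that becomes empty,
# an array that becomes empty, and one value that should be kept.
messy='{"a":[null,1],"b":{"c":""},"d":[null],"e":"keep"}'

tidy=$(printf '%s' "$messy" | jq -c '
  def remove_empty:
    walk(
      if type == "array" then
        map(select(. != null))            # drop null array elements
      elif type == "object" then
        with_entries(                     # drop keys whose value is empty
          select(.value != null and .value != "" and .value != [] and .value != {})
        )
      else .
      end
    );
  remove_empty')
echo "$tidy"
```

Because walk processes children before their parent, b and d are already {} and [] by the time the top-level object is filtered, so both keys are removed.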
One-liner: delpaths([path(..?) as $p | select(getpath($p) == null) | $p]):
$ jq -n '{a:{b:null},c:{d:1}} | delpaths([path(..?) as $p | select(getpath($p) == null) | $p])'
{
"a": {},
"c": {
"d": 1
}
}
A more online but slightly more complicated version:
$ jq -n '{a:{b:null},c:{d:1}} | reduce (path(..?) as $p | select(getpath($p) == null) | $p) as $p (.; delpaths([$p]))'
{
"a": {},
"c": {
"d": 1
}
}
Explanation:
path(..?) outputs all paths in .
path(..?) as $p | saves the path in $p and leaves the input alone
getpath($p) extracts the value in . at path $p
select(getpath($p) == null) filters out cases where the value is not null
select(getpath($p) == null) | $p then outputs just the paths that have null values
then we either collect these paths into an array for delpaths/1, or else we reduce over these paths and delpath/1 each one by one
@nicowilliams Thank you for the great example! Note that it requires jq 1.6 to use it.
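The intermediate steps in the explanation above can be inspected on their own. This sketch, assuming jq is on the PATH, prints first every path and then only the paths whose value is null, using the same input as the one-liner; the variable names `all_paths` and `null_paths` are illustrative.

```shell
# Every path in the input, starting with the empty path for the root.
all_paths=$(jq -c -n '{a:{b:null},c:{d:1}} | [path(..?)]')
echo "$all_paths"

# Only the paths whose value is null; this array is what delpaths/1 receives.
null_paths=$(jq -c -n '{a:{b:null},c:{d:1}} | [path(..?) as $p | select(getpath($p) == null) | $p]')
echo "$null_paths"
```

Only the path to .a.b should appear in the second array, which is why the one-liner leaves "a" behind as an empty object.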
@nicowilliams What's the benefit to your second version? Is there any reason it can't be put in ~/.jq and used as a simple passthrough function, ie jq '. | delnulls | …' …?
@davidfetter Does this catch ""/[]/{}, or nested empties like [{"":[]},[]]?
if type == "array" then map(select(. != null)) elif type == "object" then
IME most of my cruft comes in from format conversions, eg CSV → JSON, or from external sources that don't know how to use proper null and instead use either empty string or various signal values.
So my version tries to catch all of those.
This should include nested ones (for which depth-first processing is necessary). E.g. [[""],[""]] definitely !=null (ask a number theorist 😉), but depth-first conversion will turn it into [[],[]] → [] → null.
Here's an updated version of mine:
~/.jq
# https://github.com/stedolan/jq/blob/master/src/builtin.jq#L284
def walk(f):
. as $in
| if type == "object" then
reduce keys_unsorted[] as $key
( {}; . + { ($key): ($in[$key] | walk(f)) } ) | f
elif type == "array" then map( walk(f) ) | f
else f
end;
def isempty(v):
(v == null or v == "" or v == [] or v == {});
def isnotempty(v):
(isempty(v) | not);
def remove_empty:
walk(
if type == "array" then
map(select(isnotempty(.)))
elif type == "object" then
with_entries(select(isnotempty(.value))) # Note: this will remove keys with empty values
else .
end
);
Usage:
$ echo '{"x":[[["",""],["a",""]],"","b"],"y":null,"z":[{},{"1":2}]}' | jq -c '. | remove_empty'
{"x":[[["a"]],"b"],"z":[{"1":2}]}
$ echo '""' | jq '. | remove_empty'
""
$ echo 'null' | jq '. | remove_empty'
null
@saizai - Please note that isempty/1 is already a built-in function. It checks whether its argument is an empty stream, so it would be helpful if you chose a different name for your def.
For future reference, one can easily check whether jq defines a NAME/ARITY filter using: jq -n builtins, e.g.
$ jq -n builtins | grep isempty
"isempty/1"
$ jq -n '"isempty/1" | IN(builtins[])'
true