I'm doing something like this in a shell script:
API_URL=http://whatever
while [ "$API_URL" ]; do
#....stuff....
API_URL=$(jq -r .pagination.next_url $JSON)
done
When the key doesn't exist, it returns the string "null". Of course this is easy enough to work around, but it makes more sense to me that this would just return an empty string.
Empty string? Why? The value is not there, therefore it's null. I might consider an argument for it to return empty, but empty string makes no sense. Why not 0 or [] or {}?
It's common in shell programming to get an empty string when there's no match. For instance grep blahblah file doesn't output "not found", it outputs nothing, so capturing the output gives you an empty string. (Grep also exits with a different status when there's no match.)
It just seems weird to have a "magic value" in the shell script world.
I could write a "jq-grep" wrapper that has the semantics I'm looking for (display nothing and exit non-zero if there's no match, otherwise display the match).
Is there a better way to do this (looking for a single value in a shell script)?
Oh, you're seeing it from a shell script perspective. I usually use jq as a way to transform JSON documents into other JSON documents or explore them, hence my insistence on null as the appropriate value.
What you probably want, then, is for it to return empty, which, shell-wise, would be represented as an empty string. The problem with returning empty is that it's not a value at all. Right now, jq returns null, which you can turn into empty; the reverse operation can't be done, because empty is just... well, it isn't.
So, you have two work-arounds for this:
1. jq -e, which makes false and null report special exit statuses which you can detect in your shell script.
2. Turning null into empty using the alternative operator: jq '.pagination.next_url // empty'
Oh okay, now I get what empty means. That works, thanks! (My copy doesn't have -e yet, but that's not as important). I wanted to be able to replace a giant mess of sed/awk JSON parsing with an equivalent jq invocation.
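To make both workarounds concrete, here is a small sketch (sample input is mine; it assumes jq is installed and recent enough to support -e and //):

```shell
#!/bin/sh
# Workaround 1: -e sets jq's exit status to 1 when the last output is
# false or null, so the shell can branch on it.
echo '{"pagination": {}}' | jq -e '.pagination.next_url' >/dev/null \
  && echo "has next page" \
  || echo "no next page"

# Workaround 2: the alternative operator // replaces null/false with its
# right-hand side; "empty" produces no output at all, i.e. an empty string
# when captured by the shell.
url=$(echo '{"pagination": {}}' | jq -r '.pagination.next_url // empty')
[ -z "$url" ] && echo "done paging"
```

With a real next_url present, the first command prints "has next page" and the capture yields the URL instead of the string "null".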
Though it does make sense to basically do "// empty" by default with "-r" because in that mode, the output has to always be treated as a string:
$ echo '{"a":false}' | jq -r '.a'
false
$ echo '{"a":"false"}' | jq -r '.a'
false
There's no way to tell the difference between false and "false".
Maybe have a "scripting" mode that's equivalent to adding "// empty" and -e and -r ?
Maybe have a "scripting" mode that's equivalent to adding "// empty" and -e and -r ?
That would make sense to me, although I'm not sure what the expected behaviour should be on -s (slurp) mode. Should it // empty each of the slurped values, or // empty the slurping array?
_[with -r]_ There's no way to tell the difference between false and "false".
Is that intended behaviour, or a bug? You can do | tojson, but it seems to me that the whole point of -r is to avoid quoting, isn't it?
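For the record, piping through tojson does recover the distinction, at the cost of re-quoting strings (a small illustration, example input mine):

```shell
# tojson re-encodes the value as JSON text before -r prints it, so the
# boolean false and the string "false" are no longer conflated:
echo '{"a": false}'   | jq -r '.a | tojson'   # prints: false
echo '{"a": "false"}' | jq -r '.a | tojson'   # prints: "false" (with quotes)
```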
:+1: I have a lot of code that checks for null in the raw output and converts it to an empty string. In the end, "null" could be a valid string value, and the empty string is just safer, based on over a year of using jq.
The easy thing to do is to add select (.!=null) at the end of your jq
program.
:+1:
IMHO, this should be the default for raw-output mode. To consider the counterexample, when would it be useful for raw-output to output "null" to stdout instead of an empty string?
To give my use case, I'm using raw-output mode to extract command line arguments for bash scripts. "null" is never helpful, so I'll have to append all my queries with "//empty". It feels unnecessary given the -r and jq's goal of being a shell tool.
--raw-output arguably should produce an error if any value other than string is output, say, or any value other than numeric, or other than either string or numeric. What about boolean? Fine, so any value other than string, numeric, or boolean. Wait, null too is a scalar...
You see how this ends :)
Now, we could have --raw-output-string and --raw-output-number and --raw-output-boolean. (There's no point in --raw-output-{array, object, null}.) But it's just easier to add this to the program: | select(type == "string"), or | if type != "string" then error("must output strings!") else . end.
Perhaps we should have --prefix-program EXP and --suffix-program EXP options, so that one could:
$ jq -f someprogram.jq --raw-output --suffix-program 'select(type=="string")'
Which, actually, would be super convenient, since then one could also mix --stream, --raw-input, and so on with programs that don't expect them by adding a prefix that massages the input back to whatever the program expects.
Of course, in a way we have this now: you can just pipe a jq program's output to another's input.
I don't really want to add lots of command-line options because @stedolan doesn't and because he doesn't for good reasons that I agree with.
I should also add that --raw-output is what it is, and making backwards-incompatible changes to it is unappealing. I'd sooner consider new command-line options, if well-designed.
E.g., if you can't modify the program then you can add prefixes/suffixes like this:
$ jq "$prefix" | jq -f "$main" | jq --raw-output 'select(type == "string")'
It's what the shell is good at :)
Keeping jq's command-line simple is important. It's already grown too many options, IMO, and I'd like to add builtins instead wherever possible.
Now, I can see that using shell pipelines has some problems (e.g., much redundant encoding and parsing), so I think prefix/suffix options might be sensible. I'll have to think about it.
Comments?
I can appreciate resisting command-line-option bloat :)
That said, as a shell JSON tool, there could be more explicit support for shell scripting conventions, e.g. a --shell-output mode, where "null" is output as an empty string, arrays are output as space-delimited strings, and -c is enabled by default. Objects would have to stay as JSON though.
Defaulting to --compact-output when stdout is not a tty would be sensible, yes. I'll probably take that (though there must have been a reason that @stedolan made --color-output/--monochrome-output work that way but not -c, so I'll have to think about it).
I completely agree that it's unfortunate that null doesn't map to empty when --raw-output, I do. I really want to make this change, but...
Let's think of it this way: is there any plausible breakage that would result from making this change?
(I think "no, not really", but would like to hear and think about it more.)
@nicolwilliams asked:
Comments?
It seems to me that trying to fix this molehill is likely to raise a mountain of complications.
As has already been pointed out, if the user has control over the jq program, then using //empty is most probably acceptable; conversely, if the user has control over the script but not the jq program, then it's easy enough to test for "null".
As for complications ... first as already mentioned, there's the issue of backward compatibility. How realistic is that? If I have n JSON entities as input, I might reasonably expect (or want) n outputs, especially in a pipeline.
Next, should a stream of n consecutive nulls become a stream of n new-lines or simply nothing-at-all?
As for a new command-line option -- my understanding is that at present, the general rule is that "if it can be done in jq, or if it could be done in an enhanced jq, then it shouldn't be a command-line option".
@nicowilliams wrote:
Defaulting to --compact-output when stdout is not a tty would be sensible, yes.
I believe the proposal was not this, but that a new "--shell-output" would imply -c (unless overridden by some other option).
In any case, in my opinion, this kind of backwards-incompatible change needs stronger justification than seems to exist here.
On Mon, Jan 12, 2015 at 04:01:00PM -0800, pkoppstein wrote:
@nicowilliams wrote:
Defaulting to --compact-output when stdout is not a tty would be sensible, yes.
I believe the proposal was not this, but that a new "--shell-output" would imply -c (unless overridden by some other option).
Ah, thanks, yes, I see now.
In any case, in my opinion, this kind of backwards-incompatible change needs stronger justification than seems to exist here.
Yes.
@nicowilliams wrote:
E.g., if you can't modify the program then you can add prefixes/suffixes like this:
$ jq "$prefix" | jq -f "$main" | jq --raw-output 'select(type == "string")'
It's what the shell is good at :)
This is how I've done it. In some cases I have three separate functions doing this, in others I have separate shell scripts. Most of my use of jq is in pipes from JSON files to other JSON files, though. Either way, a flag for // empty or similar seems unnecessary.
In any case, in my opinion, this kind of backwards-incompatible change needs stronger justification than seems to exist here.
I don't particularly mind the outcome, but I'll try to justify this. Comments & criticism welcome.
1) What is raw output? The description:
"With this option, if the filter's result is a string then it will be written directly to standard output rather than being formatted as a JSON string with quotes. This can be useful for making jq filters talk to non-JSON-based systems"
Since it does not describe behaviour for non-string values, it is effectively undefined behaviour. Changing the output from "null" to empty string does not break any contract.
2) What is default behaviour for? If it is to continue the existing behaviour for the benefit of backwards compatibility, then only new tools can replace behaviour. IMHO, default settings are most beneficial to new users. Existing users who upgrade jq can always add an option to their existing scripts to use 1.4 style outputs if they need the old format behaviour.
On Tue, Jan 13, 2015 at 10:19:20AM -0800, Richard Geary wrote:
In any case, in my opinion, this kind of backwards-incompatible change needs stronger justification than seems to exist here.
I don't particularly mind the outcome, but I'll try to justify this. Comments & criticism welcome.
- What is raw output? The description:
"With this option, if the filter's result is a string then it will be written directly to standard output rather than being formatted as a JSON string with quotes. This can be useful for making jq filters talk to non-JSON-based systems"
Since it does not describe behaviour for non-string values, it is effectively undefined behaviour. Changing the output from "null" to empty string does not break any contract.
Thanks for checking the description. Yes, it says that. Therefore I could change the behavior to match the docs. It still might break existing code. Hmmm.
We'd have to tighten up the description as to non-string output types. What should be done about them?
If an array/object/number/boolean should be output anyways (as a JSON text), then it's a bit odd to treat null as special.
OTOH, if null is not to be output, then surely neither should array/object/boolean, but it's not like //empty -- should those raise errors?? And then what about numbers?
- What is default behaviour for? If it is to continue the existing behaviour for the benefit of backwards compatibility, then only new tools can replace behaviour. IMHO, default settings are most beneficial to new users. Existing users who upgrade jq can always add an option to their existing scripts to use 1.4 style outputs if they need the old format behaviour.
I don't know the answer to that question. @stedolan might. I don't think being more obvious/less-surprising to new users is a _sufficient_ justification for a backwards-incompatible change.
@rgeary1 wrote:
I don't particularly mind the outcome ...
Good, because apart from all the other considerations that have been mentioned, there is another reason for leaving things the way things are. Sometimes (or in my case, quite often), one wants to add comments in the form of raw strings to JSON output consisting of non-strings.
Changing the output from "null" to empty string does not break any contract.
Of course I understand that if the jq manual had been written for pettifogging lawyers or logicians, then your point would have some formal validity, but the manual was written (brilliantly) with succinctness and human intelligibility in mind. (More generally, it seems to me that in ordinary English at least, "if ... then ..." is not always the same as the logician's _modus ponens_. Of course, if you've been taught that it is, then it might seem so.)
@nicowilliams wrote:
Yes, it says that
It also says (of .):
This is a filter that takes its input and produces it unchanged as output.
As has often been lamented, the current behavior of jq w.r.t. some integers and floats clearly violates both the letter and spirit of the "contract" -- my point being that rather than spending time on trying to fix what isn't broken, it would be far better for everyone if we spent the available time trying to fix what is broken (or missing :-)
Just wanted to chime in - for scripters, the lack of a shell mode in jq is a huge UX wart. You don't want to have to suffix Every Single Command with //empty (cognitive overhead, maintainability problems, etc.) and this isn't something that's trivially fixable with an alias.
I really do think there's a use case for a CLI mode with roughly this behaviour:
null returns nothing
strings are returned unquoted
numbers are returned as-is
arrays are returned as \n-separated lists
objects are returned as KEY=VAL \n-separated lists (the behaviour of nested objects/arrays isn't that important, since the entire point of jq is drilling down into JSON structures; KEY1.KEY2.ARRAYINDEX1=VAL would work, but anything sane is fine)
Such a mode would let *nix scripters use jq with test, grep, sed... This is a _very_ common workflow, and supporting it would give jq serious mass appeal.
(In the best of worlds, this shell mode would be re-entrant... But perfect is the enemy of the good, etc.)
@lilred This is true. Given that jq is most widely used as a CLI tool in *sh scripts, it's a pity that some features are designed around a different use case. In fact, I started to use a lot more JMESPath recently due to some of these disappointments.
@lilred
I understand wanting nulls filtered out when using --raw-output. I understand wanting strings and numbers to be treated as they are today in --raw-output. I don't understand the other suggestions you make. That is, I don't understand what the fourth and fifth bullet items in your list mean. Can you describe what you want to see with some actual examples?
FYI and FWIW, I use jq in shell scripts all the time.
@lilred - There are several aspects of your posting which are very puzzling, so
let me begin by emphasizing that in theory as well as in practice, jq
is very easy to use in a pipeline. This is indeed one of the reasons
for keeping jq focused on JSON and line-at-a-time text processing.
For example, there is no real need for jq to include a parser for CSV
or hjson, since one can easily use any-json or hjson as a pre-filter
for jq.
Your posting also suggests you may not know about the utility of jq
builtins such as select/1, which often make it unnecessary to use the
'//empty' idiom.
For example, if you have a stream of strings and you want to filter
out the empty string completely, you could write 'select(length>0)'.
Since the length of null is also 0, the same filter can be used on a
stream of strings-or-nulls. There are other builtins which make
selection a breeze.
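To illustrate the point about select/1 (example stream is mine): since length is 0 for both null and the empty string, a single filter cleans up a mixed stream.

```shell
# select(length > 0) drops both nulls and empty strings from a stream
# of raw values:
printf '"a"\nnull\n""\n"b"\n' | jq -r 'select(length > 0)'
# prints:
# a
# b
```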
If you were to provide examples of some of the tasks which you think fall
within the purview of jq but which cannot be conveniently accomplished
using jq, we may be able to address your concerns more effectively.
By the way, if your point about numbers being "returned as-is" is an allusion to the issues surrounding the conversion to IEEE 754 64-bit numbers, please be assured that the jq developers and maintainers are well aware of them.
Toy examples, but much closer than you'd think to real-world use cases.
echo "{foo: "bar"} | jq [...] ".qux" | test # should return a falsey value
echo "["a", "b", "lorem ipsum"] | jq [...] | grep "lorem" # should return lorem ipsum (unquoted)
echo "{foo: bar}" | jq [...] | grep "foo" # should return something like foo,bar or foo=bar
@lilred -
EXAMPLE 1:
echo "{foo: "bar"} | jq [...] ".qux" | test # should return a falsey value
I assume you want to be able to tell whether the input object contains a field named "qux".
One approach would be to test directly, e.g.
echo '{"foo": "bar"}' | jq 'has("qux")'
This returns a JSON boolean value (i.e. true or false), so one can
either capture the output, or set a return code. The former
is easy to do, and the latter requires using the "-e" option:
$ echo '{"foo": "bar"}' | jq -e 'has("qux")' && echo yes
false
$ echo '{"foo": "bar", "qux":1}' | jq -e 'has("qux")' && echo yes
true
yes
EXAMPLE 2:
echo "["a", "b", "lorem ipsum"] | jq [...] | grep "lorem" # should return lorem ipsum (unquoted)
In this case, it looks as though you want to print out all the elements in the input array that contain the string "lorem", but without quotation marks. The key here is to let jq perform the selections:
echo '["a", "b", "lorem ipsum"]' | jq -r '.[] | select(test("lorem"))'
lorem ipsum
EXAMPLE 3:
echo "{foo: bar}" | jq [...] | grep "foo" # should return something like foo,bar or foo=bar
One approach here would be based on with_entries/1 or to_entries/0. Consider:
$ echo '{"foo": "bar", "foobar": 0, "baz": 1}' | jq 'with_entries(select( .key | test("foo") ))'
{
"foo": "bar",
"foobar": 0
}
The output can of course be flattened as desired, e.g.
$ echo '{"foo": "bar", "foobar": 0, "baz": 1}' |
jq -r 'to_entries[] | select( .key | test("foo") ) | "\(.key)=\(.value)"'
foo=bar
foobar=0
This returns a JSON boolean value (i.e. true or false), so one can
either capture the output, or set a return code. The former
is easy to do, and the latter requires using the "-e" option
The idiomatic way to denote falsity in shell scripting is by outputting nothing. Exit codes denote execution errors.
The key here is to let jq perform the selections
The Unix way is about building small, composable tools. No one wants to learn the jq way of grepping when they already know grep.
@lilred - The documentation about jq is quite upfront that it's like awk. Both support shell scripting, and both require some learning. Neither is pedestrian.
I sure don't see a lot of awk scripts these days.
The idiomatic way to denote falsity in shell scripting is by outputting nothing. Exit codes denote execution errors.
Eh? No, that's not how grep(1) signals failure, it uses error codes.
Most Unix tools that I can think of use exit codes to denote failures,
not zero-length output.
The key here is to let jq perform the selections
The Unix way is about building small, composable tools. I don't want to have to learn more
jqsyntax than I need to if my tools already cover this use case.
The jq philosophy is no different, but you have to build those small
composable tools in jq itself.
I understand that you're not committed to supporting shell scripters as first-class users, but I would have appreciated if you'd been upfront about that instead of recommending half-baked workarounds.
I use jq in scripts all the time. I have been for a couple of years.
I don't know where this "not committed to supporting shell scripters"
assertion comes from. On the contrary, I've added command-line options
to make scripting easier. That assertion is offensive; please be kind.
echo "{foo: "bar"} | jq [...] ".qux" | test # should return a falsey value
The test(1) command on Unix does not check its input.
echo "["a", "b", "lorem ipsum"] | jq [...] | grep "lorem" # should return lorem ipsum (unquoted)
But how would the structure of the input (in this case it's an array) be denoted? Sure, it's --raw-output that you want, but still, what should be done with anything (not numbers and not strings) for which there's no obvious raw text representation? Why is one answer better than another?
But jq lets you provide those different answers yourself! For example:
$ echo '["a", "b", "lorem ipsum"]' | jq -r '..|scalars'
a
b
lorem ipsum
$
and now you can grep for lorem if you like and get exactly what you're looking for.
Someone else might want this instead:
$ echo '["a", "b", {"test":"lorem ipsum"}]' | jq -cr 'tostream|@text "/\([.[0][]|tostring]|join("/"))=\(.[1])"'
/0=a
/1=b
/2/test=lorem ipsum
/2/test=null
/2=null
$
And someone else might want... We can't predict what all they might want, and it's not clear what is the Right Thing To Do here. But because jq is a programming language, _you_ get to decide. We can add command-line options that make jq do what you think is right, but it won't help others, and if we add one such option for everyone who has an opinion as to what is right here, well, we'll end up with a huge number of options. Because jq is a programming language we can avoid this problem. But you may have to invest some time (probably not much) in learning the jq programming language.
echo "{foo: bar}" | jq [...] | grep "foo" # should return something like foo,bar or foo=bar
Ah, see, already you have more than one option. See above though.
For what is worth... ;-)
sample usage: SPACE_UUID=$(jq_grep '.SpaceFields.GUID' < ~/.cf/config.json)
jq_grep(){
    local jq_query=$1
    local pipe_data=""
    # accumulate stdin (IFS= and -r keep whitespace and backslashes intact)
    while IFS= read -r line || [ -n "$line" ]; do
        pipe_data="$pipe_data$line"
    done
    #echo "jq_query: ${jq_query}"
    #echo "pipe: ${pipe_data}"
    local result
    result=$(printf '%s' "${pipe_data}" | jq -r "${jq_query}")
    if [ "$result" = "null" ]; then
        echo "-"
    else
        echo "${result}"
    fi
}
Since this issue was opened in 2015, it's probably worth mentioning that gron is now a thing that exists, and you can use it.
I asked for something like this on the Gron issue tracker and @tomnomnom wrote gron2shell. A+++ what a guy.
I use gron2shell in my build automation. Works well with Linux core tools.
<filter> | values will output nothing if the result of <filter> is null.
Additionally, adding @sh to the filter should be considered when working with the shell.
Your jq invocation would look like this:
API_URL=$(jq -r '.pagination.next_url | values | @sh' $JSON)
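A quick illustration of the values part (sample inputs are mine; note that @sh additionally wraps strings in single quotes, which you may or may not want in a plain variable assignment):

```shell
# values is equivalent to select(. != null), so a missing or null key
# produces no output instead of the string "null":
echo '{"pagination": {}}' | jq -r '.pagination.next_url | values'
# (no output)

echo '{"pagination": {"next_url": "http://example/p2"}}' \
  | jq -r '.pagination.next_url | values'
# prints: http://example/p2
```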
@nicowilliams I read all of this thread, and I don't understand how the suggestion about --shell-output got shutdown? It seems to me that the discussion got distracted at some point.
This (slightly altered) behavior:
null returns nothing
strings are returned unquoted
numbers are returned as-is
top-level arrays are returned as NUL-separated lists
top-level objects are returned as KEY\nVAL NUL-separated lists
the behaviour of nested objects/arrays isn't that important, since the entire point of jq is drilling down into JSON structures.
makes a lot of sense to me, who am an inexperienced user of jq but a somewhat experienced user of zsh. It is easy to work with NUL, \n and the empty string in the shell.
I know this mode might seem useless to you guys who know jq very well, but I certainly find it very user-friendly. I'll have to create wrappers around jq to add this behavior myself, which I will as use cases arise, but wrappers are generally bad for portability and readability of code.
Having lots of command-line options doesn't deter inexperienced users; we just search the manpage for options we might find useful. One of the hallmarks of the mature CLI tools I love is their wealth of options that let you accomplish common use cases with a simple search of the manpage.
BTW, my current wrapper:
jqm () {
jq -re "$@[1,-2]" "$@[-1] // empty"
}
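A rough sketch of the proposed conventions, expressed in today's jq (the KEY=VAL flattening via to_entries is my own approximation, not an existing option; sample data is made up):

```shell
#!/bin/sh
json='{"name": "web-1", "port": 8080, "note": null}'

# null -> empty string, top-level object -> KEY=VAL lines; nested values
# would come out as compact JSON, since jq is usually used to drill
# further down anyway:
echo "$json" | jq -r '
  to_entries[]
  | "\(.key)=\(if .value == null then "" else .value end | tostring)"'
# prints:
# name=web-1
# port=8080
# note=
```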
The idiomatic way to denote falsity in shell scripting is by outputting nothing. Exit codes denote execution errors.
Eh? No, that's not how grep(1) signals failure, it uses error codes.
Most Unix tools that I can think of use exit codes to denote failures,
not zero-length output.
you seem to be confused as to what the person initially said, as you just repeated their statement
jq really needs to output an empty string for null and false. not only for bash but for many linux tools. it's not like this is a huge request
this is a gaping hole in jq, a tool which should absolutely be centered around shell usage.
"jq is like sed for JSON data" -- so says the first statement on jq's website
no it isn't.
looping over results and piping results into other tools is needlessly annoying and needlessly contrary to common unix standards. jq is an amazing tool that for some reason doesn't seem to understand it's being used in an environment with lots of bulk data handling. it's going to be used for tasks like sending API results to mysqlimport, scraping logs to create cronjobs, filtering queues to generate command sequences, etc
moreover, listen to your users. don't make them beg you for four years only to be moderated into obscurity. cmon.
a combination of the -e option and adding | select (.!=null) at the end of the command does a pretty good job when piping stuff into each other :)
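Concretely, that combination looks like this (illustrative values; per the jq manual, -e exits with status 4 when the program produces no output at all):

```shell
# select(. != null) suppresses the "null" text entirely, and -e turns
# the empty result into a non-zero exit status the shell can branch on:
echo '{"a": null}' | jq -re '.a | select(. != null)' \
  || echo "no value"
# prints: no value
```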
hi! I have this and it returns null... why?
while IFS= read -r line; do
array[i]=$line
let "i++"
done < $fe
for line in "${array[@]}"
do
rjq=$(jq -re --raw-output --arg nlu $line '.[] | select(._ref==$nlu) | .name' $fx)
echo $rjq
done
@Eddie2k06 quote the $line bash variable in --arg nlu $line.
Without quotes, $line will expand to multiple arguments when it has whitespace.
In any case, you should check that script with ShellCheck.
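To make the fix concrete (sample value is mine):

```shell
# Unquoted, a value containing spaces splits into several arguments and
# jq misparses the command line; quoted, it arrives as one value:
line='foo bar'
jq -rn --arg nlu "$line" '$nlu'    # prints: foo bar
```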
@Eddie2k06 quote the $line bash variable in --arg nlu $line.
Without quotes, $line will expand to multiple arguments when it has whitespace. In any case, you should check that script with ShellCheck.
Thank u!
I just started working with a lot of json on the commandline a few weeks ago and jq is a great tool to use, but of course I hit the 'null' in the output problem.. It's a pity this issue was closed as wontfix :-(
I wanted to exclude null, and the highest-ranked comment does that:
The easy thing to do is to add select(. != null) at the end of your jq program.