Describe the bug
This is _possibly_ related to #1433 and #1110, and peripherally to #105. If jq's input file is the same as its output file (via shell redirection), jq will emit nothing, or perhaps even make a partial write (more below). I have only noticed this on write operations, though it is possible a filter could produce it as well.
To Reproduce
This is the expected scenario, where the input and output files are separate:
echo "with separate out file"
echo '{"foo": 1, "bar": ["baz", "qux"]}' > in.json
jq '.foo=2' in.json > out.json
cat out.json
The output of the expected scenario will be the entire document with foo set to 2.
Here is the breaking scenario, where jq reads from in.json and also redirects its output there:
echo "jq pipe to same file with set from file"
echo '{"foo": 1, "bar": ["baz", "qux"]}' > in.json
jq '.foo=2' in.json > in.json
cat in.json
In this breaking scenario, in.json is simply left as an empty file. Nothing is written to stderr either.
Environment (please complete the following information):
sysctl kern.version: kern.version: Darwin Kernel Version 18.7.0: Mon Feb 10 21:08:45 PST 2020; root:xnu-4903.278.28~1/RELEASE_X86_64

Additional context
This was a head-scratcher for me. In hindsight, I'm not even sure this should be a bug. If jq streams (I didn't think you could get away with that in a JSON document), then I could easily understand how this breaks, because the output is being written as the input is being read. What should the outcome be in that case? Undefined, I imagine.
For #1433 and #1110, this initially appeared in my tests as an issue with isatty, but that rabbit hole led nowhere. I was only flipping between watching stdout and writing to my file (which was also my input).
I did mention this is peripherally related to the dead horse that is #106. I am not in a position to put together such a pull request, but I can record my findings, and hopefully this report will help someone else who tries to work around it as I did. The workarounds listed in #106 are numerous; I went with writing to a temporary file and then moving it into place (sketched below). No biggie.
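For anyone searching, that temp-file workaround looks something like this (tmp.json is just an illustrative name):

jq '.foo=2' in.json > tmp.json && mv tmp.json in.json

With &&, the move only happens if jq exits successfully, so a failed run won't clobber the original file.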
If there were one ask from this ticket (I hesitate to call it a bug), it might be to either allow jq to buffer everything with some kind of flag (-s doesn't seem to do that, or I misunderstand it), or produce an error when the input gets funky. Though such a case seems very niche, and I would not press the maintainers to pursue a fix. I will leave the maintainers to do as they wish with this, and close it. It'll be indexed by web crawlers, and that's my primary goal here :)
Thanks for maintaining such a delightful, self-standing tool! I've enjoyed jq for many years, and I've recently come across oq which adds YAML and XML to the mix.
So this isn't actually a jq bug; it's a function of how shells work. The output redirection you used (> in.json) causes your shell to open the output file and truncate it before running the command. This means that by the time jq gets to look at its input, the file is already empty.
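You can watch the truncation happen without jq in the picture at all; any tool that reads its input after the shell has already opened the output will see an empty file (sed here is just an arbitrary stand-in):

echo '{"foo": 1}' > in.json
sed 's/1/2/' in.json > in.json
cat in.json   # prints nothing: the shell truncated in.json before sed ever read it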
In general, we recommend using a tool like sponge (https://linux.die.net/man/1/sponge) to handle this: it consumes the entire input stream before writing to the output file, which avoids the truncation issue above. In your case, the command would become jq '.foo=2' in.json | sponge in.json. On macOS, you can find sponge in the moreutils brew package.
That said, re: your comments about streaming:
jq "streams" by default in that input files need not be a _single_ JSON value, but could be multiple JSON values (imagine a file with multiple JSON objects in it- structured logging is a common example). So it passes each of those JSON values one-at-a-time to your jq program.
The -s flag modifies this default behavior: it slurps up all the JSON values in the input into a single array and passes that whole array to your jq program.
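Continuing the example above (-c just compacts the output onto one line):

jq -sc '.' logs.json   # prints [{"a":1},{"a":2}]: the whole input is now one array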
jq also has a streaming mode (--stream), which is mainly meant for when you have _very large_ objects that won't fit well in memory. In that case, the parser emits values to your jq program as it parses through the JSON. This mode is admittedly a bit complicated and tricky to use, however.
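A small taste of --stream, using the object from the reproduction above; it emits [path, value] pairs as the parser walks the document:

echo '{"foo": 1, "bar": ["baz", "qux"]}' | jq -c --stream '.'
# [["foo"],1]
# [["bar",0],"baz"]
# [["bar",1],"qux"]
# [["bar",1]]
# [["bar"]]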
> So this isn't actually a jq bug; it's a function of how shells work. The output redirection you used (> in.json) causes your shell to open the output file and truncate it before running the command. This means that by the time jq gets to look at its input, the file is already empty.
TIL. Thanks for the clarification!
> The -s flag modifies this default behavior: it slurps up all the JSON values in the input into a single array and passes that whole array to your jq program.
> jq also has a streaming mode (--stream), which is mainly meant for when you have very large objects that won't fit well in memory. In that case, the parser emits values to your jq program as it parses through the JSON. This mode is admittedly a bit complicated and tricky to use, however.
Thankfully the documentation does a good job of explaining these nuances, which differ from what I had assumed jq was doing.
Thanks again! I'll close this out now.