jq: Merge JSON objects/files

Created on 23 Oct 2013 · 8 comments · Source: stedolan/jq

Is there a way to merge two JSON inputs?

test.json:

{"title": "Frist psot", "author": "Anonymous Coward"}

test2.json:

{"title2": "A well-written article", "author2": "Person McPherson"}

Command:

$ jq "." test.json test2.json

Output:

{
  "author": "Anonymous Coward",
  "title": "Frist psot"
}
{
  "author2": "Person McPherson",
  "title2": "A well-written article"
}

Expected Output:

{
  "author": "Anonymous Coward",
  "title": "Frist psot",
  "author2": "Person McPherson",
  "title2": "A well-written article"
}
Labels: feature request, support


All 8 comments

:+1:

If there were a counter of inputs it'd be easier. Anyway, something like
this worked for me:

jq -s --argfile b b '. as $a|range(0;$a|length)|$a[.] + $b[.]'
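
For instance, a sketch of that pairwise merge (assuming hypothetical files a.json and b.json that each contain a stream of objects, and assuming --argfile wraps a multi-text file in an array):

$ cat a.json
{"id": 1, "x": 1}
{"id": 2, "x": 2}
$ cat b.json
{"y": 10}
{"y": 20}
$ jq -c -s --argfile b b.json '. as $a | range(0; $a|length) | $a[.] + $b[.]' a.json
{"id":1,"x":1,"y":10}
{"id":2,"x":2,"y":20}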

If there was a way to count inputs from stdin then only the second file
would have to be slurped first. It'd be even better if there was a function
to read one input at a time from each of N streams and output an array of
those inputs. Hmmm, probably not too difficult to add.
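
For the record, jq's input builtin reads the next value from the regular input stream, which covers the two-file case from this issue without slurping (a sketch):

$ jq -n 'input + input' test.json test2.json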

We've talked about this in past issues.

I think the right thing to do would be to have a built-in for opening files which binds the opened file handle to a given handle name. Add to this built-ins for reading one text from N handles... and you're good to go. This should be easy enough to implement.

The same pattern would work for using, say, SQLite3 databases.

Something like:

jq --null-input --file-handle /tmp/foo fooh --file-handle /tmp/bar barh 'read_all("fooh", "barh")|.'

and

jq --null-input 'open("/tmp/foo") as $fooh|open("/tmp/bar") as $barh|read_all($fooh, $barh)|.'

which should output arrays of up to two JSON values: one from /tmp/foo and one from /tmp/bar.

And

jq --load-extension jqsqlite.so jsqlite_init --null-input 'sqlite3("/tmp/foo") as $db|....|sqlite_exec_fetch_one($db, "SELECT ...;")|...'

with query parameters bound to the exec functions' . input values.

If you use --slurp or its short form -s, jq will read all of the inputs into one array:

./jq -s . test.json test2.json 
[
  {
    "author": "Anonymous Coward",
    "title": "Frist psot"
  },
  {
    "author2": "Person McPherson",
    "title2": "A well-written article"
  }
]

From there, you can use add to get what you want:

./jq -s add test.json test2.json 
{
  "title2": "A well-written article",
  "author2": "Person McPherson",
  "author": "Anonymous Coward",
  "title": "Frist psot"
}

Can somebody close this issue? https://github.com/stedolan/jq/issues/200#issuecomment-28058485 shows how to solve the given problem.

This doesn't recursively merge though, because add uses + rules, not *?
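
A deep merge can be written with * via reduce instead (a sketch, untested):

$ jq -s 'reduce .[] as $x ({}; . * $x)' test.json test2.json

For flat objects like the ones in this issue the result is the same as add; the difference only shows up when both inputs share a key whose value is itself an object, where * merges recursively while + just takes the right-hand value.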

@chancez,

It works at the root level of the object; in the case of arrays, it takes the objects from each array and returns a new array with all of the objects.
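
In miniature (a quick sketch), feeding two small arrays through -s and add concatenates them:

$ printf '[1, 2]\n[3, 4]\n' | jq -c -s 'add'
[1,2,3,4]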

Real-life example: I have 8 files and each are arrays of objects like:

[
    {object},
    {object},
    {object},
    {object},
    ...
]

Each has a different number of objects:

$ ls -1 *.json | while read f; do echo "=> $f"; jq '. | length' $f; done
=> A.json
5899
=> B.json
100000
=> C.json
2919
=> D.json
55015
=> E.json
17432
=> F.json
100000
=> G.json
5534
=> H.json
100000

Merging took less than 60 seconds on my busy MacBook Pro to read all of the input streams, merge the arrays, and pretty-print sorted output:

$ time jq -Ss 'add' signalflow_metadata_raw-ish_objects.* > merged.json
jq -Ss 'add' signalflow_metadata_raw-ish_objects.* > merged.json  48.40s user 2.54s system 98% cpu 51.758 total

The merged JSON is a single array with all of the objects:

$ jq '. | length' merged.json
386799

Slurping all of the input streams without add just collects each file's array as an element of one outer array -- that results in an array of 8 arrays:

$ jq -Ss '.' signalflow_metadata_raw-ish_objects.* > merged.json
$ jq '. | length' merged.json
8

And that would then look like:

[
    [
        {object},
        {object},
        ...
    ],
    [
        {object},
        {object},
        ...
    ],
    [
        {object},
        {object},
        ...
    ],
    [
        {object},
        {object},
        ...
    ]
]

Can the jq gods assist me in merging a slurp?

Say, for example, merging deps and dev deps across package.json files. I can do it over multiple runs:

    jq -s 'map_values(.devDependencies) | add | { devDependencies: . }' \
        ./packages/*/package.json > .docker/devDependencies.json
    jq -s 'map_values(.dependencies) | add | { dependencies: . }' \
        ./packages/*/package.json > .docker/dependencies.json
    jq -s '. | add' .docker/*.json > .docker/package.json
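
A single-pass variant could look something like this (a sketch, untested, assuming the same ./packages/*/package.json layout):

    jq -s '{ dependencies: (map(.dependencies) | add),
             devDependencies: (map(.devDependencies) | add) }' \
        ./packages/*/package.json > .docker/package.json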