jq: Merge arrays in two JSON files

Created on 24 Jul 2014 · 8 Comments · Source: stedolan/jq

Hi,

Is there a way to merge two JSON files containing arrays so that the arrays are merged instead of overwritten? I.e.:

File1:

{  
   "network":{  
      "servers":[  
         "logstash:5000"
      ],
      "timeout":15,
      "ssl ca":"/etc/pki/tls/certs/logstash-forwarder.crt"
   },
   "files":[  
      {  
         "paths":[  
            "/var/log/syslog",
            "/var/log/auth.log"
         ],
         "fields":{  
            "type":"syslog"
         }
      }
   ]
}

file2:

{
  "files": [
    {
      "paths": [
          "/var/log/nginx/kibana.access.log"
         ],
        "fields": { "type": "nginx-access" }
    },
    {
      "paths": [
          "/var/log/nginx/error.log"
         ],
        "fields": { "type": "nginx-error" }
    }
   ]
}

When I run

jq -s add file1 file2

I get

{
  "files": [
    {
      "fields": {
        "type": "nginx-access"
      },
      "paths": [
        "/var/log/nginx/kibana.access.log"
      ]
    },
    {
      "fields": {
        "type": "nginx-error"
      },
      "paths": [
        "/var/log/nginx/error.log"
      ]
    }
  ],
  "network": {
    "ssl ca": "/etc/pki/tls/certs/logstash-forwarder.crt",
    "timeout": 15,
    "servers": [
      "localhost:5000"
    ]
  }
}

when what I want is

{
  "files": [
 {
      "paths": [
          "/var/log/nginx/kibana.access.log"
         ],
        "fields": { "type": "nginx-access" }
    },
    {
      "paths": [
          "/var/log/nginx/error.log"
         ],
        "fields": { "type": "nginx-error" }
    },
    {
      "fields": {
        "type": "nginx-access"
      },
      "paths": [
        "/var/log/nginx/kibana.access.log"
      ]
    },
    {
      "fields": {
        "type": "nginx-error"
      },
      "paths": [
        "/var/log/nginx/error.log"
      ]
    }
  ],
  "network": {
    "ssl ca": "/etc/pki/tls/certs/logstash-forwarder.crt",
    "timeout": 15,
    "servers": [
      "localhost:5000"
    ]
  }
}

Can't find a way to accomplish this. Thanks in advance.

support

All 8 comments

It looks to me like the output you say you want doesn't quite make sense. You appear to have the same items in files twice. Did you mean

{
  "network": {
    "servers": [
      "logstash:5000"
    ],
    "timeout": 15,
    "ssl ca": "/etc/pki/tls/certs/logstash-forwarder.crt"
  },
  "files": [
    {
      "paths": [
        "/var/log/syslog",
        "/var/log/auth.log"
      ],
      "fields": {
        "type": "syslog"
      }
    },
    {
      "paths": [
        "/var/log/nginx/kibana.access.log"
      ],
      "fields": {
        "type": "nginx-access"
      }
    },
    {
      "paths": [
        "/var/log/nginx/error.log"
      ],
      "fields": {
        "type": "nginx-error"
      }
    }
  ]
}

Either way, what you're looking for is likely reduce.
The command I came up with for this is jq -s '[.[] | to_entries] | flatten | reduce .[] as $dot ({}; .[$dot.key] += $dot.value)' file1 file2.
This has the side effect of adding together the values of any keys that share a name across the files. If more than one of them has a network key, those values get added too. Since only one of the samples you posted has a network key, I'm assuming that isn't a problem. If you're okay with only joining the files arrays, then jq -s 'reduce .[] as $dot ({}; .files += $dot.files)' file1 file2 is the best way to do it. Note that this one outputs an object containing only the files key.
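To make the side effect described above concrete, here is a minimal sketch using two made-up inline objects (not the files from the question) that both carry a network key. It assumes jq 1.5 or later, since it relies on the builtin flatten. Arrays under the same key are concatenated, while objects under the same key are merged shallowly, with the later input winning on conflicting fields.

```shell
# Two hypothetical inputs; both have "network" and "files" keys.
a='{"network":{"timeout":15},"files":[{"type":"syslog"}]}'
b='{"network":{"timeout":30,"retries":2},"files":[{"type":"nginx"}]}'

# Slurp both objects into one array, turn each into key/value entries,
# then add the values together key by key.
# -S sorts object keys in the output, -c prints it on a single line.
merged=$(printf '%s\n' "$a" "$b" | jq -s -S -c '
  [.[] | to_entries] | flatten
  | reduce .[] as $dot ({}; .[$dot.key] += $dot.value)')

echo "$merged"
```

With these inputs, files ends up holding both entries, and network keeps retries from the second object while its timeout is replaced by the second object's value.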

That's exactly what I mean, thank you! Spent far too many hours trying to figure this out and the solution (as usual) is simple. Am I correct in assuming that flatten isn't supported by 1.4?

@alrayyes -- If you just want to add the two "files" arrays together (as seems to be the case here), you could write:

jq -s '.[0] as $o1 | .[1] as $o2 | ($o1 + $o2) | .files = ($o1.files + $o2.files)'  file1 file2

Here, the two objects are "slurped" into an array, which we extract as $o1 and $o2. The expression ".files = ($o1.files + $o2.files)" then ensures the "files" property is properly set.

p.s. You're right. flatten was added after version 1.4 was released.
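The reason the explicit .files assignment is needed can be seen in a small sketch. The two inline objects below are made up for illustration. Object addition in jq is shallow, so on its own the right-hand files array replaces the left-hand one; the assignment then rebuilds files from both inputs.

```shell
# Hypothetical stand-ins for file1 and file2.
o1='{"network":{"timeout":15},"files":["a.log"]}'
o2='{"files":["b.log","c.log"]}'

# Plain object addition: the right-hand "files" array wins outright.
plain=$(printf '%s\n' "$o1" "$o2" | jq -s -c '.[0] + .[1] | .files')

# Same addition, followed by an explicit concatenation of both "files" arrays.
fixed=$(printf '%s\n' "$o1" "$o2" | jq -s -c '
  .[0] as $o1 | .[1] as $o2
  | ($o1 + $o2) | .files = ($o1.files + $o2.files) | .files')

echo "plain: $plain"
echo "fixed: $fixed"
```

The first result contains only the entries from the second object, while the second contains the entries from both, in input order.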

@alrayyes No, apparently it wasn't. My bad for writing something using the bleeding-edge version.
@pkoppstein's solution will also work, but if you're trying to work with more than two files, it will rapidly get unwieldy.
If you want to use the version with flatten, then

jq -s 'def flatten: reduce .[] as $i([]; if $i | type == "array" then . + ($i | flatten) else . + [$i] end);
 [.[] | to_entries] | flatten | reduce .[] as $dot ({}; .[$dot.key] += $dot.value)' file1 file2

Flatten is written entirely in jq, so we can just stick it in the command-line program.
The benefit of the reduce-based versions is that they will take an arbitrary number of input files. Either way, options abound.
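As a rough illustration of the "arbitrary number of inputs" point, the sketch below feeds three made-up inline objects through the same program. The flatten definition is the one from the comment above, written inline so the program does not depend on the builtin that jq 1.4 lacks.

```shell
# Three hypothetical inputs standing in for file1, file2 and file3.
in1='{"network":{"timeout":15},"files":["a.log"]}'
in2='{"files":["b.log"]}'
in3='{"files":["c.log","d.log"]}'

# The program is unchanged no matter how many objects are slurped in.
combined=$(printf '%s\n' "$in1" "$in2" "$in3" | jq -s -c '
  def flatten: reduce .[] as $i ([];
    if $i | type == "array" then . + ($i | flatten) else . + [$i] end);
  [.[] | to_entries] | flatten
  | reduce .[] as $dot ({}; .[$dot.key] += $dot.value)')

echo "$combined"
```

On jq versions that ship a builtin flatten, the inline definition simply shadows it for this program.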

Also, @pkoppstein, another solution in the same vein as yours is to use --argfile.
jq --argfile f1 file1 --argfile f2 file2 -n '$f1 + $f2 | .files = $f1.files + $f2.files'
Skips some of the assignments.
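For readers on newer jq releases: the jq 1.6 manual advises using --slurpfile in place of --argfile. A possible equivalent is sketched below, assuming jq 1.5 or later. --slurpfile always binds the variable to an array of the values found in the file, hence the [0] index. The temporary files and their contents are made up for illustration.

```shell
# Create two small hypothetical input files in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' '{"network":{"timeout":15},"files":["a.log"]}' > "$tmp/file1"
printf '%s\n' '{"files":["b.log"]}' > "$tmp/file2"

# $f1 and $f2 are arrays of the JSON values in each file, so take element 0.
out=$(jq -n -c --slurpfile f1 "$tmp/file1" --slurpfile f2 "$tmp/file2" '
  $f1[0] as $a | $f2[0] as $b
  | ($a + $b) | .files = ($a.files + $b.files)')

rm -rf "$tmp"
echo "$out"
```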

Thanks. Using the last one where flatten is defined works perfectly. Maybe you should put a couple of these sorts of examples in the tutorial? It would make life a lot easier for people using jq for the first time.

I agree that the documentation should probably have some more detailed tutorials. Maybe we put them on the wiki?

@wtlangford wrote:

Maybe we put them on the wiki?

Yes. It wouldn't take much to reorganize the GitHub wiki so that it had two sections: "Internals" and something like "Examples" or "Snippets".

@wtlangford my hero
