Elasticsearch: SSH-based snapshot repository type

Created on 30 Sep 2014 · 11 Comments · Source: elastic/elasticsearch

Sometimes a shared filesystem is too much work to set up. The most convenient way to transfer files securely between remote nodes is usually SCP.

A snapshot repository type that used SSH to transfer files to a remote host would be a great addition to Elasticsearch, and would make snapshots more useful out of the box.

Configuration could be something like this:

{
    "type": "ssh",
    "settings": {
        "location": "[email protected]:/backups/elasticsearch",
        "ssh_key": "/home/elasticsearch/.ssh/id_rsa"
    }
}
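To make the proposed `location` format concrete, here is a minimal sketch of how an scp-style `user@host:/path` string could be split into its parts. The `SshLocation` type, the `parse_location` helper, and the example address are all hypothetical and are not part of any Elasticsearch API; bracketed IPv6 hosts are not handled.

```python
from typing import NamedTuple


class SshLocation(NamedTuple):
    """Parts of an scp-style location (hypothetical helper type)."""
    user: str
    host: str
    path: str


def parse_location(location: str) -> SshLocation:
    """Split an scp-style 'user@host:/path' string into user, host and path."""
    # Everything before the first ':' is 'user@host', the rest is the remote path.
    user_host, sep, path = location.partition(":")
    if not sep or not path:
        raise ValueError(f"expected 'user@host:/path', got {location!r}")
    # Split on the last '@' so the host is whatever follows it.
    user, at, host = user_host.rpartition("@")
    if not at or not user or not host:
        raise ValueError(f"expected 'user@host:/path', got {location!r}")
    return SshLocation(user, host, path)
```

For example, `parse_location("backup@backup.example.com:/backups/elasticsearch")` would yield the user `backup`, the host `backup.example.com`, and the path `/backups/elasticsearch`.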

What do you think @imotov?

:Distributed/Snapshot/Restore discuss help wanted

All 11 comments

I'm glad someone else needs this feature as well! :+1:

I created an SSH repository for Elasticsearch 1.4:
https://github.com/marevol/elasticsearch/commit/bf0348c732d3d35e2634c8db3f42c2112d76e4da
If there is a chance of merging it, I'll send a PR.
The configuration to create an SSH repository is below.

curl -XPUT 'http://localhost:9200/_snapshot/backup_ssh' -d '{
    "type": "ssh",
    "settings": {
        "location": "/somewhere/snapshot_dir",
        "host": "123.123.123.123",
        "port": 22,
        "username": "taro",
        "password": "xxxxxxxx",
        "compress": true
    }
}'

I like the idea. Just wondering if this should come as a built-in feature or as a plugin.
@imotov WDYT?

Isn't it easy enough to mount a remote filesystem using sshfs?

@jpountz It could be enough. But I think it requires more administrative work, as you need to mount it on every single node. With SSH, you just have to use it! :)
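For comparison, the sshfs route ends in an ordinary shared filesystem (`fs`) repository pointed at the mount point. The sketch below only builds the registration request. It assumes the remote directory is already mounted at the same path on every node; the repository name `backup_fs` and the path `/mnt/es_backups` used in the example are hypothetical.

```python
import json
import urllib.request


def fs_repository_request(name, mount_point, base_url="http://localhost:9200"):
    """Build (but do not send) the PUT request that registers an fs repository."""
    body = {
        "type": "fs",
        "settings": {
            "location": mount_point,  # assumed to be the same path on every node
            "compress": True,
        },
    }
    return urllib.request.Request(
        url=f"{base_url}/_snapshot/{name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

Passing the result of `fs_repository_request("backup_fs", "/mnt/es_backups")` to `urllib.request.urlopen` would send it to the cluster.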

But then that work is done in the right place, by the user.

Otherwise, Elasticsearch has to interact with SSH directly, like here:

  • I don't like private keys being read in as strings.
  • What if I want ssh-agent support?
  • What if my private key requires a passphrase?
  • Why are options like StrictHostKeyChecking automatically turned off?!
  • What about security vulnerabilities or problems in this 'jsch' SSH implementation, now or in the future?

This is too scary IMO.
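One way to read these concerns is that the system OpenSSH client already handles them. As a rough sketch, assuming `scp` is installed on the node, a transfer could be delegated to it with host key checking left on and no key material passing through the application. The `build_scp_command` helper is hypothetical; the flags are standard OpenSSH options.

```python
def build_scp_command(source, user, host, dest, key_file=None, port=22):
    """Build an argument list for the OpenSSH scp client (nothing is executed)."""
    cmd = [
        "scp",
        "-P", str(port),
        "-o", "BatchMode=yes",              # never prompt for a password or passphrase
        "-o", "StrictHostKeyChecking=yes",  # refuse hosts whose key is not already known
    ]
    if key_file is not None:
        # Only the path is passed on; the key itself is read by scp, not by us.
        cmd += ["-i", key_file]
    cmd += [source, f"{user}@{host}:{dest}"]
    return cmd
```

When `key_file` is omitted, the client falls back to its usual identity lookup, which includes a running ssh-agent, so passphrase-protected keys can stay in the agent.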

I agree we should not deal with credentials here. I think we should rather make it super simple to configure this and/or make it simpler to build your own plugin if you really want to do that as a built-in option.

@jpountz sshfs requires fuse support and is unreliable.

A more general solution is a command executor that spawns one or more processes with arguments similar to scp. The command returns success if the files are successfully handled.

{
    "type": "process",
    "settings": {
        "command": "/usr/local/bin/scp_to_remote"
    }
}

There are a lot of details to work out (safely spawning processes, timeouts, retries, argument formatting) but if it worked it could be useful for integrating with existing backup solutions.
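A minimal sketch of the timeout and retry part of such an executor, written in Python for brevity rather than in Java. The `run_with_retries` helper and its defaults are hypothetical; the only contract assumed is the one described above, namely that the command exits with status 0 when the files were handled successfully.

```python
import subprocess
import time


def run_with_retries(argv, attempts=3, timeout=60.0, backoff=1.0):
    """Run argv without a shell; return True as soon as it exits with status 0."""
    for attempt in range(1, attempts + 1):
        try:
            # A list argv avoids shell interpolation of file names.
            result = subprocess.run(argv, timeout=timeout)
            if result.returncode == 0:
                return True
        except subprocess.TimeoutExpired:
            pass  # subprocess.run has already killed the child process
        except OSError:
            return False  # command missing or not executable; retrying will not help
        if attempt < attempts:
            time.sleep(backoff * attempt)  # wait a little longer after each failure
    return False
```

A caller would pass the configured command plus the file to transfer, for example `run_with_retries(["/usr/local/bin/scp_to_remote", path])`, and treat a `False` result as a failed snapshot write.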

I am not a huge fan of external processes. They are so painful in Java that I don't think this will be an option here, to be honest.

Thank you for your comments and feedback.
I think it's better to avoid any security concerns in Elasticsearch.
I'll modify it and provide it as a plugin at https://github.com/codelibs

Ok. So we can close this thread. When you're done, feel free to update the plugins page. Thanks!
