Restic: Using rclone as a backend

Created on 17 Jan 2018 · 63 Comments · Source: restic/restic

This issue is there to discuss prototype implementations which enable restic to use rclone as a backend. This makes a lot of sense since rclone already has all the cloud backends implemented (including user configuration and so on). I've had a very productive discussion with @ncw, who is the author of rclone, about how to make restic use rclone to access data, and we've decided to continue our discussion in a public issue here on GitHub.

My idea was to add a new backend to restic which talks to a second process via stdin/stdout, using a protocol that is still to be defined. I read nice things about https://github.com/hashicorp/yamux, which allows mixing several streams in parallel over a single "connection" (very similar to what HTTP2 does). We'd need to define a protocol for accessing files on top of that, with just the basic operations that restic needs.

This backend can then be implemented in rclone, so that restic just starts rclone serve restic-backend (or whatever command line we need), talks to rclone, and rclone takes care of saving/loading/removing the files somewhere.

Taking this a step further, we could also implement a server side for this backend in restic itself, which can then be started e.g. via SSH on a remote server, so restic runs ssh user@host restic serve backend, and use that connection, which will be way more efficient than sftp. If we make the protocol extendable, we could also add features such as remote repacking (if supported by the "server"), so that data repacked during prune does not need to be downloaded and re-uploaded.

The protocol could then be also implemented in other programs, so we can use them as "plugins" to access data stored anywhere.

A slightly different approach would be to speak HTTP2 over stdin/stdout with a process like rclone, or use something like https://github.com/hashicorp/go-plugin and implement the backend in Go as a plugin.

So, this issue is to advance the discussion further and play with a few sample implementations, so we can get a feel for what works best.

All 63 comments

What's been said about the redundancy that will arise with the existing backends, e.g. S3, Swift, etc?

There hasn't been a decision yet. I don't see the point in duplicating all the work and supporting all the thousands of cloud services, but for the existing backends I also don't see any point in removing them. The existing backends have the nice property of not requiring the setup of a third-party tool, you can get away with restic alone, which is also desirable.

For now this is just an experiment, we'll see how it goes.

I made a little experiment in the serve-restic branch for rclone. I took the restic/rest-server repo and mashed it into rclone with great force and little subtlety ;-)

Run rclone serve restic remote: where remote: can be any rclone remote, or a local directory.

I've tested it and it seems to work - I tried a few operations. It doesn't have any tests. Note I didn't vendor the dependencies either.

It isn't ideal as we have the whole listening socket thing to contend with, ie it isn't secure from other applications. It would be relatively easy to pass a password in the environment when it is started and require that. Choosing a random port would probably be necessary also.

Thoughts?

I've just tried it: Needs some polishing ;)

I've just pushed a few commits to the ext-rest-backend-test branch, which allows you to run our backend tests against a REST server specified in an environment variable, like this:

$ RESTIC_TEST_REST_REPOSITORY=rest:http://localhost:8000/foo go test ./internal/backend/rest

There seems to be an issue with listing files:

FAIL |         --- FAIL: TestBackendRESTExternalServer/TestList/max-11 (1.48s)
     |          tests.go:294: loaded 39 IDs from backend
     |          tests.go:297: lists are not equal, list1 38 entries, list2 39 entries

And with byte ranges:

FAIL |     --- FAIL: TestBackendRESTExternalServer/TestLoad (81.05s)
     |      tests.go:28: rand initialized with seed 1516439868654186529
     |      tests.go:142: saved 13138440 bytes as <data/39caa259bf>
     |      tests.go:206: Load, l 2668197, o 8157054, len(d) 2668197, getlen 2668197
     |      tests.go:207: Load(2668197, 8157054) wrong number of bytes read: want 2668197, got 13138440

From our conversation via email:

  • We can easily set a Content-Length header, restic uploads temp files (so it can retry) and knows the size beforehand
  • I'd prefer if rclone would not retry an upload or buffer in memory, that's already done by restic

I've just tried it: Needs some polishing ;)

Hehe!

Thanks for giving instructions on how to run the tests.

There seems to be an issue with listing files:

Not sure what that is about. I'm sure it will become clear though!

And with byte ranges:

I didn't implement byte ranges as I didn't think they were needed. A mistake I now see!
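For reference, the failing Load call above maps onto a standard HTTP Range header. A minimal sketch in Go, assuming (as the test log suggests) that Load takes a length and an offset, and assuming that a length of 0 means "read to the end". The name rangeHeader is a hypothetical helper, not a function from restic or rclone:

```go
package main

import "fmt"

// rangeHeader builds the value of an HTTP Range header for a read of
// `length` bytes starting at `offset`. Hypothetical helper for illustration.
// Assumption: length == 0 means "from offset to the end of the file".
func rangeHeader(length int, offset int64) string {
	if length > 0 {
		// Range end positions are inclusive, hence the -1.
		return fmt.Sprintf("bytes=%d-%d", offset, offset+int64(length)-1)
	}
	return fmt.Sprintf("bytes=%d-", offset)
}

func main() {
	// Values taken from the failing test output above.
	fmt.Println(rangeHeader(2668197, 8157054)) // bytes=8157054-10825250
	fmt.Println(rangeHeader(0, 100))           // bytes=100-
}
```

A server that ignores this header answers with the whole object (13138440 bytes in the log), which is the symptom the test reports.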

We can easily set a Content-Length header, restic uploads temp files (so it can retry) and knows the size beforehand

I think that is probably a good idea. Most of the backends rclone supports need to know the length of the file in advance of the upload, so that will make rclone's life easier. It will also change the format of the POST from chunked transfer encoding to a straightforward POST (which probably makes no difference). It will allow rclone to check that the right amount of data arrived in the POST, which is good.

I'd prefer if rclone would not retry an upload or buffer in memory, that's already done by restic

I was going to ask you about that - whether restic did retries or not. rclone will still do low level retries when it tries to open the objects for read, doing listings etc, but I'll make it so it doesn't buffer or retry the data.

There is one other thing that I've been thinking about... I'd quite like to cache the listings that restic does. Rclone has a VFS layer which will do this - it will make the pattern of listing a directory and then fetching objects much more efficient, as rclone won't have to look up each object again. The cost is a small amount of memory. The VFS layer will make implementing Range requests trivial too.

If in the future restic uses the rclone backends directly then I'd expect restic to cache the listings itself and we could stop using the VFS layer.

I'll have a go and create v2 in the next day or two and I'll post an update here.

Have you considered embedding rclone as a library into restic?

As a home user I am unlikely to benefit from an rclone server, but it will be more difficult for me to install, configure and maintain two pieces of software. I do understand how a separate rclone server can be attractive in an enterprise environment, so I guess it really depends on who your target audience is.

@ifedorenko wrote:

Have you considered embedding rclone as a library into restic?

I'm sure we'll get to that point eventually - we are just experimenting at the moment :-)

Have you considered embedding rclone as a library into restic?
As a home user I am unlikely to benefit from an rclone server, but it will be more difficult for me to install, configure and maintain two pieces of software.

Yes, but at the moment, restic doesn't support all the services rclone does, which even has a nice dialogue system for configuration.

From my perspective it's also about using the resources (mainly developer time) wisely, using rclone for accessing services we don't support and providing a nice way of configuration is great, even if it comes with the additional work of having to install rclone. :)

@fd0 wrote

I've just pushed a few commits to the ext-rest-backend-test branch

I couldn't find that branch on the restic repo? Am I looking in the wrong place?

Oh, sorry, I've already merged the patches into master in #1569, so you can just use the master branch.

I see what was happening with my previous test... I was running the command you gave which was attempting to start rest-server on the same port that rclone was running hence causing confusion!

I've settled on this as a test routine

Run this in one terminal window

rclone -vv serve restic /tmp/restic-test

And run this to test

RESTIC_TEST_REST_REPOSITORY=rest:http://localhost:8000/ go test ./internal/backend/rest -run TestBackendRESTExternalServer

Does that seem sensible?

I'll attempt to make that pass!

I'd prefer if rclone would not retry an upload or buffer in memory, that's already done by restic

I've realised I'm going to need restic to supply Content-Length headers to completely remove the buffering.

I've updated the branch with some code which now passes all the tests :-)

I've also sent a PR which would have saved me loads of time had it been in place!

Awesome, I just had a quick look and it works very well! Thank you!

So, what's the next step? From my point of view it'd be best to try out the go-plugin library by hashicorp, so starting/stopping rclone can be integrated into restic. Thoughts?

Or maybe go-plugin is not the right choice, it is not designed to work over e.g. ssh. Hm. Maybe build something ourselves?

I've played around with gRPC over yamux today, works quite well. The pros are:

  • Machine-readable interface specification
  • Code generation for most languages, including Go
  • Can be spoken over stdin/stdout (only with a stream muxer like yamux)

Cons:

  • Depends on yamux, which is a battle-tested but uncommon protocol
  • Sending large amounts of data requires sending the data in chunks, which need to be acknowledged from the other side, so it's very latency-dependent. Saturating a high-latency link requires large chunks or huge parallelism.
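The latency argument in the last point can be made concrete with a rough upper bound: if every chunk must be acknowledged before the next one is sent on the same stream, throughput is at most (chunk size × chunks in flight) / round-trip time. A back-of-the-envelope sketch in Go; the function name and the example numbers (64 KiB chunks, 100 ms round trip) are illustrative assumptions, not measurements from this experiment:

```go
package main

import "fmt"

// maxThroughput returns an upper bound in bytes per second for a
// stop-and-wait style transfer: each chunk must be acknowledged before
// the next chunk on the same stream is sent.
// All parameters are illustrative assumptions.
func maxThroughput(chunkBytes, inFlight, rttMillis int) int {
	return chunkBytes * inFlight * 1000 / rttMillis
}

func main() {
	// One 64 KiB chunk in flight over a 100 ms round trip.
	fmt.Println(maxThroughput(64*1024, 1, 100)) // 655360 (640 KiB/s)
	// Sixteen parallel streams over the same link.
	fmt.Println(maxThroughput(64*1024, 16, 100)) // 10485760 (10 MiB/s)
}
```

The bound grows linearly with either the chunk size or the number of parallel streams, which is why a high-latency link needs one of the two to be large.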

We could ditch yamux and use HTTP2 (without TLS) instead, we could then use either HTTP directly or use gRPC again. I wonder if there's a simpler solution. Hm. Not very satisfying.

From what I understand, Yamux is inspired by SPDY (the predecessor of HTTP2, IIRC) but incompatible with it. What would be the advantage of using a custom protocol over a standard one?

It seems to me HTTP2 would have a lot of advantages in terms of interoperability, design, portability and security (why no TLS?). Could you expand on why you are hesitant to use it?

Regarding latency, Google is working on QUIC to resolve those problems (LWN has a good series about it, although APNIC has a caveat as well), so there's a standard way out there as well.

I've played around with gRPC over HTTP2 via stdin/stdout (without TLS) today, also works quite well. Although I had to fiddle with it a bit to get HTTP2 over stdin/stdout working, but that's done now. It requires golang.org/x/net/http2, but we've vendored that anyway. It makes the backend implementation even more flexible: We could have a backend via HTTP2 via stdin/stdout (e.g. local rclone process, or running a process via ssh) or even use the HTTP2 backend directly. Neat.

Also, saturating a high-latency link is better when using HTTP2 (compared to yamux)

@ncw do you have any preferences or experiences in regards to an RPC or stream muxing framework?

I've been working on integrating the REST API server into rclone and I've now got something I'm happy with! I had to make a common HTTP serving layer for the two existing http servers in rclone (!) before I added another one! I ended up re-writing the restic http server almost from scratch.

You can find the code in the serve-restic branch on the rclone repo. It passes the restic backend tests and a bit of light manual testing - feedback would be appreciated :-)

If you have the time I'd appreciate some feedback on:

  • the docs - do they make sense restic-wise?
  • the backend tests - is that a sensible integration test (it will only run on the integration server, not as a unit test)?

I re-wrote the mux from first principles and managed to make it almost policy free. The only bit of policy is here where config objects can be overwritten but nothing else can. That could maybe be replaced with a parameter allow_overwrite...

The REST API for restic seems to do the job well. The only thing that would make rclone's life easier would be to have the Content-Length on the call that POSTs a new object. I looked at patching restic to add that but my conclusion was that it was more difficult than I first thought, as the backend Save() call doesn't know the length of the object it is POSTing. A hint here would be helpful :-)

The consequences of rclone not knowing the length of the thing it is saving is that on some remotes (any which don't support stream upload), rclone will have to spool it to memory/disk before uploading it.

I'd quite like to merge the serve-restic branch into master so our users have got a version 1.0 to play with - what do you think?

@fd0 wrote

I've played around with gRPC over HTTP2 via stdin/stdout (without TLS) today, also works quite well. Although I had to fiddle with it a bit to get HTTP2 over stdin/stdout working, but that's done now. It requires golang.org/x/net/http2, but we've vendored that anyway. It makes the backend implementation even more flexible: We could have a backend via HTTP2 via stdin/stdout (e.g. local rclone process, or running a process via ssh) or even use the HTTP2 backend directly. Neat.

Serving HTTP2 over stdin/stdout sounds very interesting... Do you have some code I could look at so I can implement that in rclone? The next evolution for rclone serve restic might be to make it serve over HTTP2 and via stdin/stdout. That would then let restic start and stop rclone.

I don't have a strong opinion on gRPC, having not used it in earnest. It certainly looks industrial strength though :-) That said, do we need it if HTTP2 is doing the job for us? As I understand it HTTP2 will multiplex connections over a single socket, so I would have thought it would work quite well without gRPC and we could get the existing REST backend to use it?

@ncw I'm not very familiar with gRPC either, but from what I gather, it's basically HTTP2 + protobufs + magic sauce (wikipedia specifically says "authentication, bidirectional streaming and flow control, blocking or nonblocking bindings, and cancellation and timeouts")... "REST" doesn't say much: what's the actual serialization format? JSON? XML-RPC? how are actual RPC calls made?

gRPC has the advantage of standardizing all of this with protobufs and a predefined way of calling functions and so on.

good job!

Hey Nick, that's great news! I'll have a look now. Could you maybe open a Pull Request in the rclone repo so we can attach comments to the code and iterate on it (if necessary)? Or what's your preferred way of communication here? Shall I send you patches? ;)

Random issues I see while scrolling through the code:

  • Maybe we should add a link to https://restic.net in the help? So that users can find what this is all about?
  • Is authentication for the HTTP server supported? If so, shall we maybe force that for listening on something else than localhost (eventually)?
  • The minimal restic version needed is probably 0.8.0, it may even work with earlier versions. Or is there a specific reason why the help says 0.8.2 is required?
  • What's the reason that the URL passed to restic must end with a slash? Is that a limitation restic has (we should correct that...)?
  • We've relaxed the requirement for new files, returning an error when writing existing files isn't necessary any more, see https://github.com/restic/restic/pull/1623
  • The integration tests sound good so far, although I cannot guarantee that the file and path names stay that way :)
  • I'll help adding the Content-Length header short-term, we need to rework the backend interface in restic a bit. I have plans for that, but no code yet. It's awesome that the code in rclone already handles both cases, good work!

During restic development, we had several different repository layouts, which are described here: https://restic.readthedocs.io/en/latest/100_references.html#repository-layout We've now settled on the default layout, which means the files in data/ are saved in subdirs, e.g. data/21/2159dd48f8a24f33c307b750592773f8b71ff8d11452132a7b2e2a6a01611be1. But this file is accessed via REST as http://localhost:8080/repo/path/data/2159dd48f8a24f33c307b750592773f8b71ff8d11452132a7b2e2a6a01611be1.

At the moment, rclone will put those files into data/ directly. That is still supported transparently should restic ever access the directory directly, but it will cause problems when a repo is created on s3 or b2 via rclone and then accessed with restic directly at the service.

This is one of the reasons I'm not really a fan of implementing the REST backend in rclone, but rather build something else, e.g. based on gRPC (see below). But now that we have it, we should fix it and get it working. This is the only critical issue I can see. I'd like to move to using the default repo layout everywhere, that's the only sensible way IMHO.

What needs to happen from my point of view is that rclone needs to implement (or copy) the so-called DefaultLayout, which assigns a path to a file name accessed via HTTP: https://github.com/restic/restic/blob/master/internal/backend/layout_default.go The intermediate dirs need to be created on demand, if they don't exist.
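To illustrate the mapping described here, a minimal sketch in Go of how a file name requested via REST could be translated into a path in the default layout. This is not the code from layout_default.go; layoutPath is a hypothetical name, and the handling of "config" and of types other than "data" is an assumption based on the repository-layout documentation linked above:

```go
package main

import (
	"fmt"
	"path"
)

// layoutPath maps a (type, name) pair as seen over REST to a path below
// base. Hypothetical helper, not the actual restic implementation.
func layoutPath(base, fileType, name string) string {
	switch {
	case fileType == "config":
		// Assumption: the config is a single file at the top level.
		return path.Join(base, "config")
	case fileType == "data" && len(name) > 2:
		// Data files live in a subdir named after the first two characters.
		return path.Join(base, "data", name[:2], name)
	default:
		// Assumption: all other types are stored flat in their directory.
		return path.Join(base, fileType, name)
	}
}

func main() {
	id := "2159dd48f8a24f33c307b750592773f8b71ff8d11452132a7b2e2a6a01611be1"
	fmt.Println(layoutPath("repo/path", "data", id))
	// repo/path/data/21/2159dd48f8a24f33c307b750592773f8b71ff8d11452132a7b2e2a6a01611be1
}
```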

When we add a backend which uses e.g. gRPC via HTTP2 via stdin/stdout, there are user interface design issues to consider. How would the new backend be called? How would users specify which command is to be run (like ssh user@host restic serve-backend or rclone serve restic-http2-grpc)? The new backend would then also not care about any file paths at all, so restic would take care of the data/ subdirs and supply the backend with a complete path (e.g. save data to file /foo/bar/restic-repo/data/21/21123123123[...]). This isn't possible for the REST backend, since we've already defined the protocol and there are implementations in use in the wild...

So much for now :)

@fd0 sorry for another long delay - swapping badly at the moment ;-)

Hey Nick, that's great news! I'll have a look now. Could you maybe open a Pull Request in the rclone repo so we can attach comments to the code and iterate on it (if necessary)? Or what's your preferred way of communication here? Shall I send you patches? ;)

Patches - how quaint! I'll open a PR with the next iteration and we can see how that works! You can send PRs against a branch too (though if you want to do that I'll need to stop rebasing the branch!)

Maybe we should add a link to https://restic.net in the help? So that users can find what this is all about?

I've done that :-)

Is authentication for the HTTP server supported? If so, shall we maybe force that for listening on something else than localhost (eventually)?

Yes, you can set lots of exciting stuff for the server! Not keen on forcing the user to set a password, but suggesting strongly in the docs is a good idea! I've put a note in the docs about that.

Flags:
      --addr string                     IPaddress:Port or :Port to bind server to. (default "localhost:8080")
      --cert string                     SSL PEM key (concatenation of certificate and CA certificate)
      --client-ca string                Client certificate authority to verify clients with
      --htpasswd string                 htpasswd file - if not provided no authentication is done
      --key string                      SSL PEM Private key
      --max-header-bytes int            Maximum size of request header (default 4096)
      --pass string                     Password for authentication.
      --realm string                    realm for authentication (default "rclone")
      --server-read-timeout duration    Timeout for server reading data (default 1h0m0s)
      --server-write-timeout duration   Timeout for server writing data (default 1h0m0s)
      --user string                     User name for authentication.

The minimal restic version needed is probably 0.8.0, it may even work with earlier versions. Or is there a specific reason why the help says 0.8.2 is required?

Err, I used the v2 REST API 7e6bfdae7909da7a1f9da76e1be063001c8b34c3 which was only released in v0.8.2 I think.

What's the reason that the URL passed to restic must end with a slash? Is that a limitation restic has (we should correct that...)?

It is more of a limitation of the rclone backend. It tells files and directories apart as to whether they end in a / or not.

We've relaxed the requirement for new files, returning an error when writing existing files isn't necessary any more, see #1623

Done!

The integration tests sound good so far, although I cannot guarantee that the file and path names stay that way :)

Sure! Happy to adjust them when they break!

I'll help adding the Content-Length header short-term, we need to rework the backend interface in restic a bit. I have plans for that, but no code yet. It's awesome that the code in rclone already handles both cases, good work!

:-)

At the moment, rclone will put those files into data/ ...

Ah. I missed that bit... Easy to fix though

The intermediate dirs need to be created on demand, if they don't exist.

rclone will create intermediate directories as it goes along

Ideally I'd like to merge this in time for the next rclone release which should be in a couple of weeks...

I put the next version in the serve-restic branch and I made a pull request this time for easier commenting: https://github.com/ncw/rclone/pull/2116

Cool, I'll have a look! Btw, with the recent change in #1639, now the restic backend will set the content-length header. :)

Err, I used the v2 REST API 7e6bfda which was only released in v0.8.2 I think.

Ah, okay. Does the code return an error when the old REST protocol is used? That'd be a nice thing to have, so people don't run into issues, we spend time debugging this, and it turns out to be just an outdated version of restic...

It is more of a limitation of the rclone backend. It tells files and directories apart as to whether they end in a / or not.

Hm, we can just do this in restic, the rest-server is compatible with it, so users don't need to care about that. Thoughts?

Ideally I'd like to merge this in time for the next rclone release which should be in a couple of weeks...

Fine with me, I'll work with you to get it merged!

I put the next version in the serve-restic branch and I made a pull request this time for easier commenting: ncw/rclone#2116

I'll check it out!

Can we add code to run the server on stdin/stdout with HTTP2? I've got code ready for that, shouldn't be hard to do. This way, we could just start rclone from restic, and tear it down properly, without TCP...

I'll put up some sample code.

Cheers!

- Alex

I've played around with it, looks great already!

I've added two commits here: https://github.com/fd0/rclone/tree/serve-restic, the second one adds an http2/server mode on stdin/stdout. It can be used with my branch https://github.com/restic/restic/tree/rclone-backend like this:

$ ./restic -r rclone:b2:restic-dev-an/path/to/repo snapshots

It'll automatically start rclone in http2/server mode with --stdin, you can also pass the command line as an option:

$ ./restic \
  -r rclone:b2:restic-dev-an/path/to/repo \
  -o rclone.command="rclone serve restic --stdin --bwlimit 1M --verbose b2:restic-test-an/foo" \
  backup .

Internally, I'm using the REST backend and just add a bit of glue to start the rclone process and tear it down correctly.

This can, by the way, easily be started via ssh on a remote host, just set -o rclone.command='ssh user@host rclone serve restic --stdin b2:restic-test-an/foo'.

Please let me know what you think!

Ah, okay. Does the code return an error when the old REST protocol is used? That'd be a nice thing to have, so people don't run into issues, we spend time debugging this, and it turns out to be just an outdated version of restic...

I think the only difference is what format the list command returns. I could easily return the old format too - I just had a squint at the rest-server source and it uses an Accept header to do that. Is it worth doing that do you think or just returning an error if the old format is detected?

Hm, we can just do this in restic, the rest-server is compatible with it, so users don't need to care about that. Thoughts?

If you wanted to fix it in restic (and I guess the API specs) then I could take that sentence out of the docs which might be neater.

I've added two commits here: https://github.com/fd0/rclone/tree/serve-restic, the second one adds an http2/server mode on stdin/stdout. It can be used with my branch https://github.com/restic/restic/tree/rclone-backend like this:

Lovely :-) I'll merge these into my branch shortly and have a go with them!

I might at some point move the http2 serving and the stdin into cmd/serve/httplib but I think leaving it where you've put it for the moment is a good idea.

I think the only difference is what format the list command returns. I could easily return the old format too - I just had a squint at the rest-server source and it uses an Accept header to do that. Is it worth doing that do you think or just returning an error if the old format is detected?

Ah, hm. I think it's better to return an error, we don't want the old format anyway. It requires way more HTTP round trips (that's why we introduced the new one).

I might at some point move the http2 serving and the stdin into cmd/serve/httplib but I think leaving it where you've put it for the moment is a good idea.

Sure, please do move it to the right place. It was just an experiment. I'm glad that you like it. Surprisingly, it was really easy to implement once I had the StdioConn thing in place.
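For readers wondering what such a "StdioConn thing" might look like: the idea is to wrap a reader and a writer (stdin and stdout in the real case) so that together they satisfy Go's net.Conn interface. The sketch below is an illustration under that assumption and is not the code from the branches mentioned in this thread; the names stdioConn, stdioAddr and newLoopbackConn are hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"os"
	"time"
)

// stdioConn presents a reader/writer pair as a single net.Conn.
// Hypothetical sketch, not the implementation discussed in the thread.
type stdioConn struct {
	in  io.ReadCloser
	out io.WriteCloser
}

// stdioAddr is a placeholder address, since there is no real socket.
type stdioAddr struct{}

func (stdioAddr) Network() string { return "stdio" }
func (stdioAddr) String() string  { return "stdio" }

func (c *stdioConn) Read(p []byte) (int, error)  { return c.in.Read(p) }
func (c *stdioConn) Write(p []byte) (int, error) { return c.out.Write(p) }

// Close closes both directions and reports the first error, if any.
func (c *stdioConn) Close() error {
	errIn, errOut := c.in.Close(), c.out.Close()
	if errIn != nil {
		return errIn
	}
	return errOut
}

func (c *stdioConn) LocalAddr() net.Addr  { return stdioAddr{} }
func (c *stdioConn) RemoteAddr() net.Addr { return stdioAddr{} }

// Deadlines are not supported in this sketch; the calls are no-ops.
func (c *stdioConn) SetDeadline(time.Time) error      { return nil }
func (c *stdioConn) SetReadDeadline(time.Time) error  { return nil }
func (c *stdioConn) SetWriteDeadline(time.Time) error { return nil }

// Compile-time check that stdioConn satisfies net.Conn.
var _ net.Conn = (*stdioConn)(nil)

// newLoopbackConn connects the write side to the read side through an
// in-memory pipe; useful only for trying the type out.
func newLoopbackConn() *stdioConn {
	pr, pw := io.Pipe()
	return &stdioConn{in: pr, out: pw}
}

func main() {
	// In a server process, stdin and stdout together form the connection.
	conn := &stdioConn{in: os.Stdin, out: os.Stdout}
	fmt.Fprintln(os.Stderr, "connection address:", conn.LocalAddr())
}
```

A connection like this could then be handed to an HTTP2 server that accepts a net.Conn directly, for example the ServeConn method of http2.Server in golang.org/x/net/http2. Whether the branches above do exactly that is not shown in this thread.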

If you wanted to fix it in restic (and I guess the API specs) then I could take that sentence out of the docs which might be neater.

That sounds like a good plan, I'd like to do that. What do we need to do? Specify that the base URL for the repo always ends with a slash?

We can write this into the REST protocol, and have restic automatically append a slash if there's none. Anything else I didn't think about?
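Appending the slash automatically could look roughly like this in Go. It is a sketch only: withTrailingSlash is a hypothetical helper, and it assumes the "rest:" prefix has already been stripped from the repository specification.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// withTrailingSlash parses a repository URL and makes sure its path ends
// with "/". Hypothetical helper for illustration.
func withTrailingSlash(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	if !strings.HasSuffix(u.Path, "/") {
		u.Path += "/"
	}
	return u.String(), nil
}

func main() {
	for _, s := range []string{
		"http://localhost:8000/foo",
		"http://localhost:8000/foo/",
		"http://localhost:8000",
	} {
		fixed, err := withTrailingSlash(s)
		if err != nil {
			fmt.Println("invalid URL:", err)
			continue
		}
		fmt.Println(fixed)
	}
}
```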

I've played around with the stdin HTTP2 server, and I really like that it's working so well. What I don't like is how clumsy it feels starting it. Do you have a better idea what to do instead of --stdin? I thought about setting an environment variable to activate stdin/stdout mode, but that isn't easily transmitted over SSH, meh.

For a local rclone process, it's not so bad:

$ restic \
  -r rclone:b2:restic-dev-an/path/to/repo \
  backup .

Starting a remote rclone process on the other hand is really clumsy, I don't like the UI at all:

$ restic \
  -r rclone:b2:restic-dev-an/path/to/repo \
  -o rclone.command="rclone serve restic --stdin --bwlimit 1M --verbose b2:restic-test-an/foo" \
  backup .

The remote specification given in the repo spec (rclone:b2:restic-dev-an/path/to/repo) isn't actually passed to rclone because the command-line is overwritten by -o rclone.command. Hm. I don't have a better idea on how to solve this right now.

I've opened PR #1657 for discussion, it adds the rclone backend and also proper tests.

One odd issue I came across while running the integration tests is this one:

$ go test -v -run /TestLoad
=== RUN   TestParseConfig
--- PASS: TestParseConfig (0.00s)
=== RUN   TestBackendRclone
2018/03/07 22:57:49 Server.Close()
=== RUN   TestBackendRclone/TestLoad
2018/03/07 22:57:49 ERROR : data/90/900a687bc2669f36e960c860f1e506daeb5753dd960b062b99b5716e2354d19c: Didn't finish writing GET request: http2: stream closed
2018/03/07 22:57:49 Server.Close()
--- PASS: TestBackendRclone (0.46s)
    backend_test.go:18: create new backend at /tmp/restic-test-822204889
    --- PASS: TestBackendRclone/TestLoad (0.45s)
        tests.go:27: rand initialized with seed 1520459869531256925
        tests.go:142: saved 12335322 bytes as <data/900a687bc2>
    backend_test.go:39: cleanup dir /tmp/restic-test-822204889
PASS
ok      github.com/restic/restic/internal/backend/rclone    0.462s

On the first test of Load() in the test suite (here), rclone complains that the stream for an HTTP GET request was closed before the data was finished. That is very odd, I don't have any idea why that happens. For the subsequent requests in the same function it succeeds.

Also a bit annoying is the error output from rclone, mangled with the usual restic output...

Ah, hm. I think it's better to return an error, we don't want the old format anyway. It requires way more HTTP round trips (that's why we introduced the new one).

OK, I've done that. I'm returning an HTTP error 400. I found a small bug in restic doing this - see #1660
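A minimal sketch of what rejecting the old list format could look like in a Go HTTP handler. This is not rclone's code: listHandler and statusFor are hypothetical names, the media type is the one I recall from the restic REST API documentation (please verify it there), and the empty JSON array is a placeholder for a real listing.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Media type announcing the v2 list format (assumption: as given in the
// restic REST API documentation).
const restV2 = "application/vnd.x.restic.rest.v2"

// listHandler answers list requests only when the client asks for the
// v2 format and returns 400 Bad Request otherwise.
func listHandler(w http.ResponseWriter, r *http.Request) {
	if r.Header.Get("Accept") != restV2 {
		http.Error(w, "old list format not supported, please upgrade restic",
			http.StatusBadRequest)
		return
	}
	w.Header().Set("Content-Type", restV2)
	fmt.Fprint(w, "[]") // placeholder for the real listing
}

// statusFor runs the handler in-process and returns the status code it
// produces for a given Accept header.
func statusFor(accept string) int {
	req := httptest.NewRequest("GET", "/data/", nil)
	if accept != "" {
		req.Header.Set("Accept", accept)
	}
	rec := httptest.NewRecorder()
	listHandler(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor(restV2)) // 200
	fmt.Println(statusFor(""))     // 400
}
```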

That sounds like a good plan, I'd like to do that. What do we need to do? Specify that the base URL for the repo always ends with a slash?
We can write this into the REST protocol, and have restic automatically append a slash if there's none. Anything else I didn't think about?

I think that would be perfect!

I've played around with the stdin HTTP2 server, and I really like that it's working so well. What I don't like is how clumsy it feels starting it. Do you have a better idea what to do instead of --stdin? I thought about setting an environment variable to activate stdin/stdout mode, but that isn't easily transmitted over SSH, meh.

You can set any rclone command line params with environment variables, but I take your point about ssh.

I'm OK with the --stdin option - I think it says exactly what we are trying to do here!

Starting a remote rclone process on the other hand is really clumsy, I don't like the UI at all:

I see what you mean... How easy would it be to process the command line to add the remote:path on the end of it so you don't have to specify it twice?

Cool, I'll have a look! Btw, with the recent change in #1639, now the restic backend will set the content-length header. :)

I'm not seeing that in rclone... If you uncomment the debug in the postObject function it always seems to be -1, ie unspecified

One odd issue I came across while running the integration tests is this one:

I will investigate that in a bit!

Hm, the content-length isn't set, you're right. But I set them in the code: https://github.com/restic/restic/blob/master/internal/backend/rest/rest.go#L122

And I've just verified that it really is something > 0. Do you have any idea what's going on?

Ha, got it, see #1661 :)

Yes, that's the one! I've made that mistake before :-)
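For readers who hit the same symptom (the server sees a length of -1): one well-known Go pitfall that produces it is putting Content-Length into the header map of an outgoing request. net/http ignores that entry and only honours the ContentLength field of the request. Whether this is exactly what #1661 changed is not stated in this thread, so treat the following as a sketch of the general pitfall; the URL and helper names are made up for the example.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// opaqueReader hides the concrete type of the body so that net/http
// cannot work out its length on its own.
type opaqueReader struct{ io.Reader }

// contentLengthOnWire serialises a POST request and reports whether a
// Content-Length header appears in the bytes that would be sent.
func contentLengthOnWire(setField bool) bool {
	payload := "hello"
	req, err := http.NewRequest("POST", "http://localhost:8000/data/abc",
		opaqueReader{strings.NewReader(payload)})
	if err != nil {
		panic(err)
	}
	// This header entry is ignored for outgoing requests.
	req.Header.Set("Content-Length", "5")
	if setField {
		// This field is what net/http actually uses.
		req.ContentLength = int64(len(payload))
	}
	var buf bytes.Buffer
	if err := req.Write(&buf); err != nil {
		panic(err)
	}
	return strings.Contains(buf.String(), "\r\nContent-Length: 5\r\n")
}

func main() {
	fmt.Println(contentLengthOnWire(false)) // false: sent with chunked encoding
	fmt.Println(contentLengthOnWire(true))  // true
}
```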

I've reworked the backend and now the UI is much better. There are two options now, rclone.command, which contains the command used to run the rclone process itself, and rclone.args, which by default is serve restic --stdio. The command and arguments are built by first splitting rclone.command and rclone.args into "shell words" (respecting quoting), and then joining all together, and appending the remote name from the repository specification.

Local example:

$ restic \
  -r rclone:b2:restic-dev-an/path/to/repo \
  -o rclone.args="serve restic --stdin --bwlimit 1M --verbose" \
  backup .

Running rclone via SSH:

$ restic \
  -r rclone:b2:restic-dev-an/path/to/repo \
  -o rclone.command="ssh user@foo '/path/with spaces/rclone'" \
  -o rclone.args="serve restic --stdin --bwlimit 1M --verbose" \
  backup .

That's much better IMHO.

Did you find out anything about the strange error message during the tests?

What I find a bit odd is that rclone prints a message during teardown which isn't even disabled with -q:

2018/03/13 22:39:03 Server.Close()

On 2018-03-13 21:44:20, Alexander Neumann wrote:

I've reworked the backend and now the UI is much better. There are two options now, rclone.command, which contains the command used to run the rclone process itself, and rclone.args, which by default is serve restic --stdio. The command and arguments are built by first splitting rclone.command and rclone.args into "shell words" (respecting quoting), and then joining all together, and appending the remote name from the repository specification.

[...]

Running rclone via SSH:

$ restic \
  -r rclone:b2:restic-dev-an/path/to/repo \
  -o rclone.command="ssh user@foo rclone" \
  -o rclone.args="serve restic --stdin --bwlimit 1M --verbose" \
  backup .

Be careful when passing shell-quoted long string arguments to SSH. I've
always been surprised by what happens on the other side.. For example,
instead of just passing arguments to execve(2) as golang would probably
do if it calls just "rclone serve restic --stdin --bwlimit 1M --verbose"
locally, you would actually call sh -c on the other end of that ssh
pipe, which means your shell characters need to be quoted again.

I'm not sure what the impact of this is, but "rclone" and "ssh foo
rclone" is not the same, unless you call "rclone" (and not "ssh foo
rclone") with "sh -c rclone".

Confused yet? :)

Ah, thanks for the reminder. The code doesn't pass the long string to ssh as a single argument, but rather parses it correctly and passes it to ssh as separate arguments.

So, for the command

$ restic \
    -r rclone:b2:restic-dev-an/path/to/repo \
    -o rclone.command="ssh user@foo '/path/with spaces/to/rclone'" \
    -o rclone.args="serve restic --stdin --bwlimit 1M --verbose" \
    backup .

restic will call (in pseudo-code):

exec("ssh", "user@foo", "/path/with spaces/to/rclone", "serve", "restic", "--stdin", "--bwlimit", "1M", "--verbose", "b2:restic-dev-an/path/to/repo")

instead of:

exec("ssh user@foo '/path/with spaces/to/rclone'", [...])

Passing the strings as separate arguments to ssh has served me very well in other projects, so ssh handles passing them correctly to the other side.

There are a few tests which show that the code also handles e.g. escaped string delimiters correctly, e.g.:
https://github.com/restic/restic/blob/0279fd72123adc52686156332ed7c00a3a07d6e9/internal/backend/sftp/split_test.go#L39-L44
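
To make the splitting and joining described above concrete, here is a minimal sketch in Go. It is not restic's actual splitter (that lives in the sftp package linked above); `splitShellWords` and `buildCommand` are hypothetical names, and the quoting rules are deliberately simplified (single quotes, double quotes, backslash escapes, no variable expansion).

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitShellWords splits s on unquoted whitespace, honouring single quotes,
// double quotes and backslash escapes. Simplified, hypothetical helper; it
// does not implement full POSIX shell quoting.
func splitShellWords(s string) ([]string, error) {
	var (
		words   []string
		cur     strings.Builder
		quote   rune // 0 outside quotes, otherwise the opening quote character
		escaped bool // true right after a backslash
		inWord  bool // true once the current word has started (even if empty)
	)
	for _, r := range s {
		switch {
		case escaped:
			cur.WriteRune(r)
			escaped = false
		case r == '\\' && quote != '\'':
			// A backslash is literal only inside single quotes.
			escaped = true
			inWord = true
		case quote != 0:
			if r == quote {
				quote = 0
			} else {
				cur.WriteRune(r)
			}
		case r == '\'' || r == '"':
			quote = r
			inWord = true
		case r == ' ' || r == '\t' || r == '\n':
			if inWord {
				words = append(words, cur.String())
				cur.Reset()
				inWord = false
			}
		default:
			cur.WriteRune(r)
			inWord = true
		}
	}
	if quote != 0 || escaped {
		return nil, errors.New("unterminated quote or escape")
	}
	if inWord {
		words = append(words, cur.String())
	}
	return words, nil
}

// buildCommand splits both option strings into words, joins them and appends
// the remote from the repository specification as the last argument.
func buildCommand(command, args, remote string) ([]string, error) {
	cmdWords, err := splitShellWords(command)
	if err != nil {
		return nil, fmt.Errorf("rclone.command: %w", err)
	}
	argWords, err := splitShellWords(args)
	if err != nil {
		return nil, fmt.Errorf("rclone.args: %w", err)
	}
	argv := append(cmdWords, argWords...)
	return append(argv, remote), nil
}

func main() {
	argv, err := buildCommand(
		`ssh user@foo '/path/with spaces/rclone'`,
		"serve restic --stdin --bwlimit 1M --verbose",
		"b2:restic-dev-an/path/to/repo",
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// argv[0] is the program, the rest are passed as separate arguments,
	// so no local shell is involved.
	fmt.Printf("%q\n", argv)
}
```

The resulting slice can be handed to something like exec.Command(argv[0], argv[1:]...). Note that ssh itself still joins its arguments and runs them through the remote user's shell, which is the caveat raised earlier in this thread.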

interesting... i've always struggled with this like crazy, i guess it's because i'm calling ssh on the shell commandline... nevermind me, you seem to have things in order. :)

I can relate to that, string handling becomes so much more complicated when shells are involved, so I'm trying hard to not include any shells if not absolutely necessary ;)

@fd0 wrote:

I've reworked the backend and now the UI is much better.

Very nice :-)

Did you find out anything about the strange error message during the tests?

What it looks like is that rclone is in the middle of serving a GET request when restic closes the connection. I added a bit more debugging to see how far it gets.

I also ran this with a static rclone (not started by rclone's tests) and saw the same thing.

2018/03/14 15:10:04 ERROR : data/39/39bb8b2bacd307fab053ef721278630aa1c94ce716ce4236cbbab4c225998ec9: Didn't finish writing GET request (wrote 32768/13575812 bytes): write tcp 127.0.0.1:8080->127.0.0.1:56214: write: broken pipe

It looks like restic is closing the connection without reading all the data

After a bit of experimentation I found that if I commented out this code the error goes away. I wasn't quite sure how to fix it, or even if it is fixable, but the cause is that restic doesn't read the whole response.

diff --git a/internal/backend/test/tests.go b/internal/backend/test/tests.go
index caf0a9ac..643f4d1a 100644
--- a/internal/backend/test/tests.go
+++ b/internal/backend/test/tests.go
@@ -146,15 +146,15 @@ func (s *Suite) TestLoad(t *testing.T) {
        t.Fatalf("Load() returned no error for negative offset!")
    }

-   err = b.Load(context.TODO(), handle, 0, 0, func(rd io.Reader) error {
-       return errors.Errorf("deliberate error")
-   })
-   if err == nil {
-       t.Fatalf("Load() did not propagate consumer error!")
-   }
-   if err.Error() != "deliberate error" {
-       t.Fatalf("Load() did not correctly propagate consumer error!")
-   }
+   // err = b.Load(context.TODO(), handle, 0, 0, func(rd io.Reader) error {
+   //  return errors.Errorf("deliberate error")
+   // })
+   // if err == nil {
+   //  t.Fatalf("Load() did not propagate consumer error!")
+   // }
+   // if err.Error() != "deliberate error" {
+   //  t.Fatalf("Load() did not correctly propagate consumer error!")
+   // }

    loadTests := 50
    if s.MinimalData {

I think this is a bug in backend.DefaultLoad, it does not fully drain rd io.Reader in case of error. Need to add io.Copy(ioutil.Discard, rd) there as per https://stackoverflow.com/questions/17948827/reusing-http-connections-in-golang.

@ifedorenko wrote:

I think this is a bug in backend.DefaultLoad, it does not fully drain rd io.Reader in case of error. Need to add io.Copy(ioutil.Discard, rd) there as per https://stackoverflow.com/questions/17948827/reusing-http-connections-in-golang.

Hmm, possibly, though draining a non-HTTP reader (e.g. a disk file) doesn't make sense.

I don't think this will happen in normal operation though and if it does rclone prints an error and we have to remake the persistent http connection which shouldn't be a big deal.

I think this is more of a cosmetic issue for the tests.

Yeah, was thinking about this too. Individual HTTP-based backends should return readers that drain streams on close.
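
A rough sketch of what "a reader that drains on close" could look like, assuming a backend hands out an io.ReadCloser. `drainOnClose`, `countingBody` and `newCountingBody` are made-up names for illustration only, not restic or rclone code; io.Discard needs Go 1.16 or newer (the snippets in this thread use the older ioutil.Discard).

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// drainOnClose wraps a ReadCloser and discards any unread data before
// closing it, so an HTTP/1.1 connection behind it could be reused.
type drainOnClose struct {
	io.ReadCloser
}

// Close reads the remaining data, then closes the underlying reader.
// A close error takes precedence over a drain error.
func (d drainOnClose) Close() error {
	_, drainErr := io.Copy(io.Discard, d.ReadCloser)
	if closeErr := d.ReadCloser.Close(); closeErr != nil {
		return closeErr
	}
	return drainErr
}

// countingBody is a demo stand-in for a response body. It records how many
// bytes were read from it and whether it was closed.
type countingBody struct {
	r      io.Reader
	read   int
	closed bool
}

func newCountingBody(s string) *countingBody {
	return &countingBody{r: strings.NewReader(s)}
}

func (c *countingBody) Read(p []byte) (int, error) {
	n, err := c.r.Read(p)
	c.read += n
	return n, err
}

func (c *countingBody) Close() error {
	c.closed = true
	return nil
}

func main() {
	body := newCountingBody("0123456789")
	rd := drainOnClose{body}

	// The consumer only looks at the first few bytes and then gives up.
	buf := make([]byte, 4)
	n, _ := rd.Read(buf)
	fmt.Printf("consumer read %d bytes\n", n)

	if err := rd.Close(); err != nil {
		fmt.Println("close failed:", err)
		return
	}
	fmt.Printf("underlying body saw %d bytes, closed=%v\n", body.read, body.closed)
}
```

The obvious cost is that Close may have to pull the whole remaining body over the wire, which matters for large pack files.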

Also, I just realized rclone backend is using http2, which supports cancellation of in-progress streams via RST_STREAM frames. Wonder if Go http2 client/server implementations actually take advantage of that.

I think this is a bug in backend.DefaultLoad, it does not fully drain rd io.Reader in case of error.

Ah, I used to think that this is the right way, but now I think we should allow this and not drain the reader completely in DefaultLoad. The net/http code has something which tries to read up to 2KiB so that a connection can be reused, but if more data could be read, it discards the connection:

https://github.com/golang/go/blob/aff222cd185d10400b9177fe26ec06eb647b092d/src/net/http/client.go#L590-L598

// Close the previous response's body. But
// read at least some of the body so if it's
// small the underlying TCP connection will be
// re-used. No need to check for errors: if it
// fails, the Transport won't reuse it anyway.
const maxBodySlurpSize = 2 << 10
if resp.ContentLength == -1 || resp.ContentLength <= maxBodySlurpSize {
    io.CopyN(ioutil.Discard, resp.Body, maxBodySlurpSize)
}

In HTTP 1.1, closing a connection is the only reliable way to tell the server to stop sending data, so depending on the amount of data it's oftentimes more efficient to close the connection and create a new one instead of loading megabytes of data just to be able to reuse the connection.
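
For illustration, the "read a little, otherwise give up" trade-off could be sketched like this. `slurpAndClose` and `stringBody` are hypothetical helpers, not part of net/http or restic, and in a real client the Transport decides about connection reuse on its own; the sketch only reports whether the body ended within the limit.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// slurpAndClose discards at most limit bytes from body and then closes it.
// It reports true if the body ended within the limit, which is the case
// where an HTTP/1.1 connection could be kept. A body of exactly limit bytes
// is conservatively reported as not finished.
func slurpAndClose(body io.ReadCloser, limit int64) (bool, error) {
	// io.CopyN returns io.EOF when the source ends before limit bytes.
	_, copyErr := io.CopyN(io.Discard, body, limit)
	finished := copyErr == io.EOF
	return finished, body.Close()
}

// stringBody is a demo helper that turns a string into a response-like body.
func stringBody(s string) io.ReadCloser {
	return io.NopCloser(strings.NewReader(s))
}

func main() {
	const maxBodySlurpSize = 2 << 10 // same 2 KiB limit as in the net/http excerpt

	small, err := slurpAndClose(stringBody("short error message"), maxBodySlurpSize)
	fmt.Println("small body finished within limit:", small, err)

	large, err := slurpAndClose(stringBody(strings.Repeat("x", 1<<20)), maxBodySlurpSize)
	fmt.Println("large body finished within limit:", large, err)
}
```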

So, I've fixed the test to first drain the reader and then return the custom error, that was really easy. In retrospect it's also easy to see what's happening ;)

--- a/internal/backend/test/tests.go
+++ b/internal/backend/test/tests.go
@@ -147,6 +147,10 @@ func (s *Suite) TestLoad(t *testing.T) {
    }

    err = b.Load(context.TODO(), handle, 0, 0, func(rd io.Reader) error {
+       _, err := io.Copy(ioutil.Discard, rd)
+       if err != nil {
+           t.Fatal(err)
+       }
        return errors.Errorf("deliberate error")
    })
    if err == nil {

I found the cause for the strange message printed on server close, I left a call to log.Printf() in the StdioConn code :)

https://github.com/ncw/rclone/pull/2139

Thanks for the fix :-) I had a go with your draining fix and that works fine too :-)

I'm just going to have a quick whizz through the rclone-backend branch and comment on anything I see there.

So, the code is basically done, it just needs some docs. I could use some help testing the backend (in the branch rclone-backend) :)

Also, I've amended the REST protocol and the REST backend, so that the base URL always ends in a slash.

Docs are done, please give it a try! Ping @mholt

Hi @fd0! thank you very much for putting in the work to integrate rclone.

I have an issue with a restic repository on Google Drive. Creating a new repository works fine, but when I try to access an existing repository (with around 7 TiB of data), restic fails with an error:

./restic backup -o rclone.program="path/to/rclone" -r rclone:gdrive:restic/test --tag initial /path/to/data
Fatal: unable to open repo at rclone:gdrive:restic/test: error talking HTTP to rclone: Get http://localhost/: net/http: request canceled (Client.Timeout exceeded while awaiting headers)

This is a local repository that has been synced to Google Drive. Using a custom restic build I can access the repository just fine.

@fd0 Thanks for the ping! I've been getting the emails about this and am _super_ excited to try it out! Just came at a bad time for me as I'm really busy wrapping up ACMEv2 + wildcard support in Caddy right now. But I'll get back to this ASAP as I am anxious to see how to use it, especially the proxying to avoid giving cloud credentials to backup clients, as we talked about in the other issue. Keep up the great work!

@mathiasnagler

You can get rclone to log much more stuff for debugging

export RCLONE_VERBOSE=2

before the restic run.

It might be that the timeouts in restic for the rest backend are too short - what do you think @fd0?

Also note that drive can be really slow! Have you got your own credentials or are you using rclone's? If the latter, then I recommend making your own.

Ah, that's a special timeout: restic starts rclone in the foreground process group (so that things like password prompts work), and in the background it tries to establish the HTTP2 connection. Once that is done, restic moves rclone into the background. At the moment, the timeout for the first HTTP request to complete is quite low (5s), I'll update this to 60s. After 60s, the process should have booted.

It might be that the timeouts in restic for the rest backend are too short - what do you think @fd0?

Probably, but not this one: This one is restic's internal timeout ;)

I'll add hints for debugging rclone to the docs. Being able to configure rclone indirectly via inherited environment variables is awesome :)

@mathiasnagler can you retry please?

Thank you for your feedback!

@ncw I am using my own credentials. Using the build from the commit I linked above, restic can fully saturate my upload bandwidth with those credentials (about 40Mbit/s).

@fd0 Will do asap.

Update:

Unfortunately, even with 60 seconds the same issue occurs. I think this is caused by the repository size / amount of files in the repository. I even increased the timeout to 180 seconds, but still no luck.

What actually happens during the bootup period? The rclone debug output suggests that an ls is executed:

rclone: 2018/03/18 14:14:49 DEBUG : : list request

Can I manually check how long this takes by running rclone ls gdrive:restic/test?

Update:

I increased the timeout to 1800 seconds (30 minutes) to see how long it will take. After 7 minutes, restic prompted for the repository password. From there on, listing the snapshots worked as expected.

@mathiasnagler

Can I manually check how long this takes by running rclone ls gdrive:restic/test?

Yes that should do it. Note that listing in drive is relatively expensive :-(

I'm interested in testing (using Google Drive as a backend), but I'll need precompiled win64 editions of both restic and rclone to do so..

Currently I'm getting random timeouts on restic check --read-data runs with a few repositories using the Drive FS client (as a paying user for GSuite for Business)

rclone ls gdrive:restic/test takes between 4 and 7 minutes to finish for me. The repository size is around 7 TiB, so this is probably not the norm.
Still I feel restic should not fail here. I am ok with it taking some time.

Oh, that's my fault. I thought it'd be a good idea (as a test when rclone is ready to accept HTTP requests) to issue an HTTP GET / request. Which, as I've just discovered, yields a list of all files in the remote. That's the reason startup took so long. I'll change that, please retry!

I've changed the code so it'll try to request a random file name, which does not exist. We just want to make sure rclone is ready to respond to HTTP requests.
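
Roughly, such a readiness probe could look like the following sketch. It is not the code in the rclone-backend branch: `waitUntilReady`, `randomName` and `demo` are made-up names, the polling interval is arbitrary, and the demo talks to a local TCP test server instead of HTTP2 over stdin/stdout.

```go
package main

import (
	"context"
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// randomName returns a random hex string that is very unlikely to be the
// name of an existing file in the repository.
func randomName() (string, error) {
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

// waitUntilReady requests a random file name below baseURL (which must end
// in a slash) until the server sends any HTTP response or ctx expires.
// Unlike "GET /", this does not trigger a listing of the whole remote.
func waitUntilReady(ctx context.Context, client *http.Client, baseURL string) error {
	name, err := randomName()
	if err != nil {
		return err
	}
	ticker := time.NewTicker(100 * time.Millisecond)
	defer ticker.Stop()
	for {
		req, err := http.NewRequestWithContext(ctx, http.MethodGet, baseURL+name, nil)
		if err != nil {
			return err
		}
		resp, err := client.Do(req)
		if err == nil {
			// Any status code, including 404, shows the server is answering.
			resp.Body.Close()
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			// Try again on the next tick.
		}
	}
}

// demo runs the probe against a test server that answers 404 for everything.
func demo() error {
	srv := httptest.NewServer(http.NotFoundHandler())
	defer srv.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	return waitUntilReady(ctx, srv.Client(), srv.URL+"/")
}

func main() {
	if err := demo(); err != nil {
		fmt.Println("server not ready:", err)
		return
	}
	fmt.Println("server is ready")
}
```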

Startup is quick now. Thanks for the update!

I started a new backup to test some more.

@fd0 Upon further inspection, I have another issue. To test the new feature I created a fresh repository on gdrive (Google Drive). Creating the repository worked as expected and I can see the files and folders using the Google Drive web interface.
I then started my first backup to the new repository:

restic -o rclone.program="/path/to/rclone" -r rclone:gdrive:test backup --tag initial /path/to/data

restic reports that the backup is running and seems to have backed up some amount of data:

[4:00] 0.12%  99.099 MiB / 82.369 GiB  49 / 13646 items  0 errors  ETA 56:40:29 

The issue is that the backup will never actually continue. I can observe that no data leaves the machine. There is no outgoing traffic at all. Using export RCLONE_VERBOSE=2 I see some errors:

rclone: 2018/03/19 17:59:39 ERROR : locks: error listing: directory not found
rclone: 2018/03/19 17:59:39 DEBUG : Google drive root 'test': POST /locks/a23337ca935c44307b36f1f7146da9e152dbc3d919f5a254c78921fa986fae1a
rclone: 2018/03/19 17:59:42 DEBUG : Google drive root 'test': GET /locks/
rclone: 2018/03/19 17:59:42 DEBUG : locks: list request
rclone: 2018/03/19 17:59:42 DEBUG : Google drive root 'test': GET /index/
rclone: 2018/03/19 17:59:42 DEBUG : index: list request
rclone: 2018/03/19 17:59:42 ERROR : index: error listing: directory not found
rclone: 2018/03/19 17:59:42 DEBUG : Google drive root 'test': GET /snapshots/
rclone: 2018/03/19 17:59:42 DEBUG : snapshots: list request
rclone: 2018/03/19 17:59:42 ERROR : snapshots: error listing: directory not found

I think those are fine because the folders will be created when the first backup happens and not during repo creation, but I am not entirely sure.
Apart from those 3 errors, there is nothing that looks wrong in the rclone output. Also, rclone seems to be able to talk to google drive because the basic repo structure has been created just fine:

rclone ls gdrive:test
      155 config
      452 keys/452257875003e39aed0b95f7f75ead868dcc2dc0c14577dd24fd144d6895e2ce

Is anybody else able to reproduce this?

@fd0 Any chance for a restic beta with rclone support, so I can do some testing?

Another comment - not directly related to restic (but it might be interesting) and could possibly lead to issues/support:

As I am using G Suite for business I am using a service account with rclone, and in that regard it is extremely important to remember to use rclone's --drive-impersonate option if you want to be able to see the files that restic/rclone uploads through the normal web UI in a browser.

It's described in detail here: https://rclone.org/drive/#use-case-google-apps-g-suite-account-and-individual-drive

Without using the --drive-impersonate option with rclone, all files are invisible to my regular user - however, this might be considered a security enhancement, as it makes it impossible to damage or delete restic repos through other means.

@mathiasnagler I'll have a look

@naffit I can upload a build later today, but it's really easy for you to build restic yourself once you've installed Go >= 1.8:

git clone https://github.com/restic/restic
cd restic
git checkout rclone-backend
go run build.go

Then you have a working restic binary in the current directory.

@fd0 Sorry if I'm blind, but where are the docs? 😅 I'm ready to try this out. Got it cloned down and everything.

There's documentation in the manual in doc/030_preparing_a_new_repo.rst, in the section about rclone:
https://github.com/restic/restic/blob/2d756ce7278cdd598ff08fc696cabded5f1630bf/doc/030_preparing_a_new_repo.rst#other-services-via-rclone

This helps you get started with rclone as run by restic. You can also just call rclone serve restic and then use restic's REST backend to access the server, like restic -r rest:http://localhost:8080/foo init

Or did you expect something else?

That's perfect, thanks!
