Client: Never ever abort sync runs / check that the connection is indeed broken before aborting on a timeout

Created on 26 Jun 2017  ·  25Comments  ·  Source: owncloud/client

Some server side errors cause the client to abort the current sync run. If the ownCloud admin or user is not able to fix the error, EVERY sync run will abort at the same position, and following files that don't have an error will NEVER sync because of this.

@SamuAlfageme We discussed this a time ago. Please link existing issues here or open new ones that could could cause this behaviour. Please also elaborate on the server improvements.

Discussion ReadyToTest Server Involved p3-medium research technical debt

All 25 comments

Please report only if this happens with latest client version as this is something which changed a lot back and forth :-)
(CC @ckamm @ogoffart )

WIP

Sync aborted - on server/network error

Sync aborted - on client/protocol error

  • [ ] csync errors - #5888

Related Fixes

Incomplete list of potential sync run aborts during propagation stage, by looking at FatalError in the code

  • can't write to database
  • network error > NoError and <= UnknownProxyError
  • network error 503
  • timeout on download
  • critical disk space error on download

I believe the current behavior is the correct one: We only abort the sync run if we experience fatal error (network down, server on maintainnence, ...)

@michaelstingl Do you have any concrete example of error that causes the sync to abort while it should not?

I believe the current behavior is the correct one: We only abort the sync run if we experience fatal error (network down, the server on maintenance, ...)

Wenn a user has some shares and/or has some external storages and/or federated shares and/or some inconsistency in the oc_filecache because the database has a problem (in some cases is normally fixed with occ files:scan to the user, the user that shares, etc) the synclient just stops on the error and does not sync the new files and folders (even if they are in a different path)

IMO, the expected behavior, the sync-client should mark those files, and try to upload the new files (because those files are probably needed for someone else) and at the end send a summary the main errors.

Like:
files x, y, z in folder w could not be downloaded because the server reported: connection timeout. ->Contact your administrator
files a, b, c could not be uploaded because chunks weren't received correctly, we will try to upload later. -> if the problem persist please contact your administrator.
files m,n,o were successfully uploaded. -> Synced after the errors occurred.

files x, y, z in folder w could not be downloaded because the server reported: connection timeout. ->Contact your administrator

The problem is that "Connection timeout" is also the error we get when the network is down.
Maybe when we get a "connection timeout", we should run the connection validator once more and only report a FatalError if the connection is indeed broken. otherwise mark it as a NormalError. That's some work and it may not play well with the parallelism, but that's an option.

The client get the error that is delivered from the client server, but as you said, is not necessarily disconnected, only for some files or folder that could have problems finishing the download/upload

@SamuAlfageme Should we add https://github.com/owncloud/client/issues/4622#issuecomment-301005803 to this meta-issue?

The current user experience is bad: in this case the client shows 403 error and aborts current sync run. The it restarts a new run which shows 403 and aborts. And so on.

@guruz my initial focus wasn't about client side problems like flaky wifi, more about error states on the server side that abort the sync run every time at the same position, so that later files will never be synced (like in @cdamken 's example)

At ownCloud Conference 2017 a user even proposed to randomize the order to get a bigger chance that later files get synced, but I of course, I would prefer to fix the root cause of the problem.

@michaelstingl

@SamuAlfageme Another one More: #6049 #6050 – could you verify?

Those are not sync aborts but crashes. Will look into them to see what introduced 'em.

Those are not sync aborts but crashes.

Pain is the same: Later files will never get synced. oC desktop sync client need better strategies to deal with oC server or infrastructure fuckup.

At ownCloud Conference 2017 a user even proposed to randomize the order to get a bigger chance that later files get synced

This is something for @mrow4a ;-) SCNR

oC desktop sync client need better strategies to deal with oC server or infrastructure fuckup.

But if the local storage or the network connection go away then there is only so much you can do.. :-)

In general we all agree with the goal of making the discovery less costly and therefore earlier going to sync. 2.4 should already be better in this.

But if the local storage or the network connection go away then there is only so much you can do.. :-)

@guruz In the cases above it's not about local storage and network connection. It's about mounted storages that the oC server can't access temporarily or permanently. My expectation is, that oC client continues sync with the next available file/folder. And it seems we have enough cases where this isn't the case.

@michaelstingl I will be careful with this. I cannot find now a case for generating some inconsistency when the folder gets available again (if it wont start randomly downloading/deleting files). But maybe it is possible.

@guruz @SamuAlfageme Sorry, I understood wrong. #6049 & #6050 are about local storage that becomes unavailable. I was thinking about server-mounted storage that becomes unavailable. Sorry for the noise.

The solution is what i write in
https://github.com/owncloud/client/issues/5859#issuecomment-319409953

When a propagator jobs end in a connection timeout, we assume it is because the network is down and we abort the sync. But before aborting the sync, we should do a connection validator to check if the network is down or if it is a problem with that file.

@ogoffar while it make sense for single file, when whole folder gets unavailable it is a signal that sth is wrong with the server. It could mean it is flaky network drive (user should manually remove it in server UI and client should halt for that period not to allow moving files across folders and materialize the state on faulty server) or federated share (which might lead to even worse because multiple users do changes and patent node of folder is down or faulty). What do you think ?

@ogoffart @michaelstingl what exactly is ready to test here? Only abortion on error 503?

Right, when the server returns an error 503 for a file, (but the server is not in maintainence mode) it should still sync all other files.
I don't know exactly how to get error 503 on the server, this is usually because of a problem in the storage backend.
However, when the server is in maintainence mode, the error is also 503, but in this case, the sync needs to stop. (because otherwise all the files would be 503)

Maybe we can just close this issue.

Closing this aggregating issue. All individual issues will be/were tested separately. 503 error will be tested in https://github.com/owncloud/client/issues/5088

Closing this aggregating issue. All individual issues will be/were tested separately.

👍 😍

Was this page helpful?
0 / 5 - 0 ratings