Client: Local file discovery should be done at most once per run, use file watcher events to seed Discovery

Created on 10 Oct 2014 · 65 comments · Source: owncloud/client

Currently the client runs a full local discovery every five minutes due to limitations in our file watching code (see #2268 for previous discussion). These limitations should be removed to allow the client to be reliable with only a single run of the local file discovery or without it entirely.

On Mac: We don't get notifications for changes our own process caused, so a single run should be enough. We may be able to hook into the filesystem change journal to even skip that.

On Linux: We get full path information about changes and that could be used to filter out the changes on files we cause ourselves.

On Windows: Similar to Linux (have to use ReadDirectoryChanges instead of FindFirstChangeNotification) and could even possibly skip the single discovery by using "Change Journals".

Enhancement Performance ReadyToTest technical debt

All 65 comments

This kind of thing becomes important as we see people with 50K+ files in their sync folder. If you manage to make this work at the scale of a few 100K files in a sync directory, you will have an extreme competitive advantage. But that's some way to go, still....

Linking https://github.com/owncloud/mirall/issues/880 since it's related.
I still have some ideas for Windows directory traversal..

@moscicki points out in #2451:

Please also consider that in some cases an (advanced) user may need to keep the ability to run periodic full scans of the local sync folder (on a per-folder-pair basis). This applies if someone has their local folders on mounted filesystems that do not always reliably provide notifications (e.g. AFS in some cases). In that case a setting in the folders config file could be sufficient.

The current state in master is:

  • Force sync interval changed from 5 minutes to 2 hours
  • The file watchers report changes with full paths
  • The propagator keeps track of the files the sync touches, and folders then ignore change notifications for those files for 15 seconds

This should make full discovery runs much more rare!
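The "ignore changes we caused ourselves for 15 seconds" idea can be sketched in a few lines. This is an illustration only, not the client's actual C++ implementation; the class and method names are made up:

```python
import time

class TouchedFilesFilter:
    """Sketch: remember paths the sync engine itself wrote, and ignore
    watcher notifications for them during a short grace period."""

    def __init__(self, grace_seconds=15.0, clock=time.monotonic):
        self.grace = grace_seconds
        self.clock = clock
        self._touched = {}  # path -> timestamp of the last sync-side write

    def note_touched(self, path):
        # Called by the propagator after it creates/updates a local file.
        self._touched[path] = self.clock()

    def should_ignore(self, path):
        # Called when the file watcher reports a change on this path.
        ts = self._touched.get(path)
        if ts is None:
            return False
        if self.clock() - ts < self.grace:
            return True
        del self._touched[path]  # grace period over; future changes are real
        return False
```

A watcher would call should_ignore() on every event and only schedule a sync for paths where it returns False.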

It seems you have almost perfectly solved the problem. Is there a simple solution for a beginner with a standard installation? ownCloud server 7.0.3 + Debian 7 + nginx + PHP 5 + MySQL
Thanks

@reylon You can compile the mirall master branch yourself if you want :) That would be 1.8.

Documentation for the change journal stuff:
OS X: https://developer.apple.com/library/mac/documentation/Darwin/Conceptual/FSEvents_ProgGuide/UsingtheFSEventsFramework/UsingtheFSEventsFramework.html#//apple_ref/doc/uid/TP40005289-CH4-SW10
Windows: https://msdn.microsoft.com/en-us/library/windows/desktop/aa363798(v=vs.85).aspx

FYI @dragotin @danimo

Note that this bug is about running local discovery only once; change journals are about avoiding discovery entirely.

This is done.

Internal misunderstanding; this is not done.

I'm experiencing pretty much constant discovery, interrupted only by the error described in owncloud/core#18206. I'm guessing that's related to this issue?

The idea is to build up only a very limited set of changed files, not the whole set that we currently get from the discovery phase. We want to rely on filesystem notifications for that. Of course, the client still has to be able to do a full sync as well.
@ogoffart please add your thoughts.

The idea would be to run the reconcile phase only on files coming from the file system watcher, and then have a way to do lazy discovery of the files we need.

@dragotin: I remember you said that you'd need to keep the whole tree in memory. I am not really sure you need that. For simplicity, imagine you only have local changes to propagate (and no remote changes to reconcile). You do not need the whole tree, just the part of the tree where you have seen the modifications. It would get a bit trickier if you also have remote changes, but maybe one can come up with some smart algorithm that requires you to traverse only the union of the local and remote subtrees. Of course, the prerequisite is that you have reliable notifications, or that you can efficiently detect when notifications are lost or become less reliable (due to load, internal buffer size, rate of events or whatever other factors).

We talked about exactly this at lunch :-) you should have stayed longer at the conf :-)

Given @ogoffart 's recent discovery improvements, I'm moving this to 2.2.

Is there any update on this?

Can we somehow make this happen?

We have some people in our institution who are avid users, and are considering to drop ownCloud because of this.

Thanks,

@Trefex Do you have any metrics on this? How long does the discovery take?
Which OSes?
How many files/folders?

Doesn't the sync client produce something like that in the logs? However, collecting it from a larger user base could be a painful process. Is there a way to automate client-side monitoring metrics like this one, to put them at the disposal of the admin of an instance? I would like to know that too for my users.

@moscicki We don't send such information back to the server.

But you can estimate it by looking at your logs:
We first do a PROPFIND on the root folder to know if the etag has changed. If something has changed, we do the local discovery, then the server discovery, starting with a PROPFIND on the root folder. You should be able to distinguish these two PROPFINDs by the difference in the Depth header and in the property list, and so you can estimate the time taken by a local discovery.
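Based on that, a small script could estimate the local discovery duration from a log. Everything below is a sketch: the sample log lines, the regex and the function name are invented, since the actual client log format varies by version, so adapt the pattern to your own logs. The idea is that the gap between the etag-check PROPFIND (Depth: 0) and the server-discovery PROPFIND (Depth: 1) brackets the local discovery phase:

```python
import re
from datetime import datetime

# Hypothetical log excerpt; real client logs look different.
LOG = """\
2018-10-25 08:00:01.100 PROPFIND "/" Depth: 0
2018-10-25 08:00:13.400 PROPFIND "/" Depth: 1
"""

PAT = re.compile(r'^(\S+ \S+) PROPFIND .* Depth: (\d+)$', re.M)

def local_discovery_estimate(log):
    """Seconds between the first Depth:0 and the first Depth:1 PROPFIND."""
    times = {}
    for stamp, depth in PAT.findall(log):
        times.setdefault(depth, datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f"))
    if "0" in times and "1" in times:
        return (times["1"] - times["0"]).total_seconds()
    return None
```

For the sample log above this yields about 12.3 seconds of local discovery.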

Having big issues with sync as well. For the last day, my ownCloud has been "Checking for xxx", and it never seems to do anything.

It does contain some git repositories where .git is included in the sync, but all scrap files like node_modules are ignored.

Still, if Dropbox and Google Drive manage to sync these files, ownCloud should be able to do it as well?

I am the user that @Trefex was referring to. I have 418,526 items, totalling 264.3 GB, in that folder. I am using version 2.2.2 of the Linux desktop client.

Part of the advantage of ownCloud is not having to pay to store a large set of files shared in subsets with a variable set of users (~15 in my case), but this does not work if I can't keep the whole set in sync just like each smaller subset.

@perara Do those failing files have a hardlink count > 1 ?

@guruz please see @rmtfleming comment for the answer to your question

@guruz no, hardlink count is <= 1 :)

I think I am being hit by this on my Linux box at the moment. It seems to spend hours "checking for changes", and I have been waiting for ages for a particular file to update; in the end I just downloaded a copy from the web interface to a different folder and used that file in the meantime.

  • I have my Dropbox, my Google Drive, and my work ownCloud connected to my ownCloud server. On my Linux box I sync all of this through my ownCloud server (totalling something like 34GB)
  • I also have a macbook running OSX at work, however there I do not sync Dropbox and Google Drive. This seems to run much more smoothly.
  • I do not think I have had the Linux box open for long enough for this "check for changes" to finish, so perhaps it starts all over every time I start the machine.

Another (probably ignorant) question: can't you have a look at what unison does and steal some ideas from there? It seems to run a lot faster, and I am synchronizing a lot more GB through that system (though I guess it is the number of files/folders rather than the GB that is the primary challenge).

Edit: Should mention that this is using version 2.2.2 on Ubuntu 16.10.

I tested Owncloud on a server (using Xubuntu 16.04) with three clients (Kubuntu 16.04 and Windows 7). I have ~ 300 GB that I keep sync'd. Everything seemed to be working fine until I noticed the laptop never did sync anything but just kept churning away, "looking for changes". Since this is the first time the laptop had gone through a cycle (no need to download anything as all the files are already there) I thought I'd be patient and wait until it was done indexing (if that's what it was doing). The deal-breaker is my work computer, where I use selective sync. I sync one folder for work that has subfolders. Owncloud proceeded to remove all of the subfolders from their parent folder! Talk about unexpected behavior. I pay Dropbox $100/year and I'd gladly pay $100 ONE TIME for software that can replace them. Maybe someday...

@scottbomb I hear your frustration.

About your files being removed: There typically is a "Unchecked folders will be removed from your local file system" warning in the selective sync UI. So I assume you used the account wizard to point to an existing folder and used the wizard's selective sync option at the same time? There is indeed a warning missing there, and I'll address that.

No, that was not the problem. Here is an example of the bug:

Let's say the server has just one folder and we'll call it Parent. This folder has sub-folders Child1, Child2, and Child3. If I only want to see Child1 and Child2 on a particular client, I tell that client to selectively sync just those two folders. It does this fine. However, on the server, Child1 and Child2 are REMOVED from the folder Parent. Instead of the server having just one directory (Parent), it now has three (Parent, Child1 and Child2). Child3 was never moved because it was not part of the selected sync. That's good, but Child1 and Child2 should not have been moved either (on the server).

@scottbomb Can you walk me through how you set up the client to sync only Child1 and Child2? I can't reproduce it yet.

Owncloud keeps checking for changes for hours, whereas rsync does the same in a minute.
Slowness, along with unreliability, forced me to let go of Owncloud in the end.

With #6087 merged, the client will now do a full local discovery only on startup and once per hour. Incremental syncs in between will be fed from file watcher events only and will be much faster.

Do you detect if you have reached the inotify watch limit on Linux (and if yes, what do you do in that case? Is there an error message?)

Some refs:
https://unix.stackexchange.com/questions/13751/kernel-inotify-watch-limit-reached#13757
https://www.dropbox.com/help/syncing-uploads/files-not-syncing -> "Monitoring more than 10,000 folders on Linux"

BTW. Does this problem exist on Windows / BSD ?

Also, we would like to measure the impact of this change with owncloudcmd (essentially integrating it into recurrent smashbox runs). Is this possible with owncloudcmd somehow?

For the record, the ownCloud packages install this sysctl parameter:

fs.inotify.max_user_watches = 524288

at least for rpm packages.

@jnweiger @dschmidt I am not sure about the deb packages, can you double check?

@moscicki If some inotify descriptors can't be created the folder watcher notifies that it's now unreliable and the client switches back to old behavior. There's a somewhat different problem on Windows where the change-buffer can overrun and items can be missed. The solution is similar. See 66f0ce6616edcb6e44d235ae779170eb28c9c7c0

owncloudcmd: No. This behavior requires monitoring the filesystem between sync runs - owncloudcmd just triggers a single isolated sync run.
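The combined behaviour (hourly full discovery, incremental syncs from watcher events, and a fallback to full discovery when the watcher becomes unreliable) can be pictured with a small scheduler sketch. This is illustrative Python, not the actual client code; the names are made up, and the hourly interval is taken from this thread:

```python
import time

FULL_SCAN_INTERVAL = 3600.0  # seconds; 2.5 does a full local discovery hourly

class DiscoveryScheduler:
    """Sketch: pick full or incremental local discovery based on
    watcher health and time elapsed since the last full scan."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.watcher_reliable = True   # cleared on inotify/buffer overrun
        self.last_full_scan = None
        self.pending_paths = set()     # fed by file watcher events

    def on_watcher_event(self, path):
        self.pending_paths.add(path)

    def on_watcher_overrun(self):
        # inotify watch exhaustion / Windows change-buffer overrun:
        # events may have been missed, so fall back to full scans.
        self.watcher_reliable = False

    def next_discovery(self):
        now = self.clock()
        due = (self.last_full_scan is None
               or now - self.last_full_scan >= FULL_SCAN_INTERVAL)
        if not self.watcher_reliable or due:
            self.last_full_scan = now
            self.pending_paths.clear()
            return ("full", None)
        paths, self.pending_paths = self.pending_paths, set()
        return ("incremental", paths)
```

This also shows why owncloudcmd cannot benefit: the scheduler only pays off when a long-lived process keeps collecting watcher events between sync runs.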

Nice strategy with switching to the old behaviour. Could there be some warning displayed about this (as for low disk space, for example)?

But maybe it should not re-scan too frequently in that case? It would make the process consume loads of CPU, drain a laptop's battery etc. So a warning would also be good, to inform the user that the limits have been exceeded.

@moscicki Currently there's no warning. We could add one. I can make a ticket for that.

@dragotin Same for all Debian-based packages. But there is no sysctl call in the post-install scripts, so first-time ownClouders don't experience the benefit.

People interested in testing this should have a look at the recent daily builds of 2.5: https://download.owncloud.com/desktop/daily/

This is part of 2.5.0 alpha1, did anyone give it a try? :) https://github.com/owncloud/client/releases

Tried to install it via the MSI and got a ton of .dll errors and registry issues.

@coreyman And how is the MSI for a daily build? (There were fixes in meanwhile)
https://download.owncloud.com/desktop/daily/ownCloud-2.6.0.315-daily20180708.343.msi

@guruz the daily build you linked seems to be working well. The local detection is a lot faster. I have probably a little more than 80k files and can notice the difference. The only issue now is that the remote detection is slow :)

2.5.0 beta1 is out :-)
https://central.owncloud.org/t/desktop-sync-client-2-5-0-beta1-released/14667
Everyone, please comment here if we can close this issue. Thank you.

Test report from @coreyman indicates we are good here.
With small test sets under 2000 files, there is not much speed difference to be seen between 2.4.1 and 2.5.0 beta1.
For what it is worth: For a long time I have not seen lengthy discovery phases at all.

This speedup should be announced with 2.5.0

OK: 'Force sync now' with 1700 previously synced files is instant; it was 1..2 sec with 2.4.2.
SKIP: I cannot think of a good local test setup for this. Please reopen if there is an easy one.

This speedup should be announced with 2.5.0

Yep it should. I hope @michaelstingl @lefherz @JKawohl can somehow pack this in a blogpost with graphs?

How do you measure the improvement here? We have some quite large file sets
we can test with.

@jnweiger please share your test setup and procedure.

As mentioned above, I have not yet found a good way to test.

Maybe generate

  • 33 folders with 3300 files
  • 330 folders with 330 files
  • 3300 folders with 33 files
  • 10/10/10/10 foldertree with 10 files

then measure the wall-clock time of the discovery phase for

  • initial sync up
  • 1000 added files, scattered
  • 1000 added files in one folder
  • 1000 files changed in a few folders
  • 1000 files changed in 1000 folders

all this with 2.4.1, 2.4.2 and 2.5.0

@guruz should that reveal the speedup?

The speedup is only visible for consecutive syncs (i.e. not the first sync after setting up the client).

Test case: in a very large sync folder (by large I mean a huge number of directories and files), make a small modification to a reasonably small file. Measure how long it takes for that file to arrive on the server. It should be much faster with 2.5.
(Note: this should be especially true if the HD is slow and the fs cache is empty.)
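Timing "until the file is on the server" can be automated with a small poll-until-true helper. This is a hypothetical sketch; the predicate would be supplied by the tester, e.g. a function that checks whether the file's etag on the server has changed:

```python
import time

def time_until(predicate, timeout=600.0, poll=0.5, clock=time.monotonic):
    """Poll predicate() until it returns True; return the elapsed seconds,
    or None if the timeout expires first."""
    start = clock()
    while clock() - start < timeout:
        if predicate():
            return clock() - start
        time.sleep(poll)
    return None
```

Running this right after touching the local file gives a comparable end-to-end number for 2.4.x vs 2.5.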

@ogoffart I have difficulties measuring the speedup. Here are my results:

Testing on a slow USB-2 SSD, server running on the same machine on a SATA SSD.
10000 files of 6 bytes each, scattered in 100 folders.

for i in $(seq 1 100); do echo $i; mkdir -p "t$i"; for j in $(seq 1 100); do echo hello > t$i/f$j; done; done

client 2.5.0daily20181014

  • initial sync up: 7:04.4 sec
  • first force resync: 8.9 sec
  • 2nd, 3rd force resync: 0.5 sec
  • client stop; start: 1.5 sec
  • small file change: 0.5 sec
  • client stop; rm db; client start: 20.8 sec

client 2.4.3

  • initial sync up: 8:01.0 sec
  • first force resync: 1.3 sec
  • 2nd, 3rd force resync: 0.5 sec
  • client stop; start: 1.5 sec
  • small file change: 0.5 sec
  • client stop; rm db; client start: 22.8 sec

client 2.4.1

  • initial sync up: 8:06.0 sec
  • first force resync: 12.2 sec
  • 2nd, 3rd force resync: 0.5 sec
  • client stop; start: 1.5 sec
  • small file change: 0.5 sec
  • client stop; rm db; client start: 22.0 sec

Is it possible to try with 1 million files of 2-20 KB scattered in 10000 folders?

@Trefex I'll need to avoid the initial sync for that, otherwise we get results next month. Will try.

The issue one of our users had on Linux was that the sync would not finish before the next sync started. This was with around 1 million files, and it broke sync completely for these use cases.

Hopefully the full sync every hour will not have too much impact in those cases?

Why is 2.4.3 so much faster on 'first force resync:' compared to the other two? Typo?

What about inotify watch limits? Also, with 1M files it will take around 1 GB of kernel memory according to: https://unix.stackexchange.com/questions/13751/kernel-inotify-watch-limit-reached

Will the sync client actually warn the user that there are more files to watch than the current limit?

Also, it would be good to check/document the practical limits for the oc sync client per platform (OSX, Win, Linux).

@dragotin Not exactly sure, but I noticed that with 2.4.3, when the initial sync finished, the icon was green for only one or two seconds, then it became blue again and was busy for another 15 seconds. My guess is that it then already did what the other two would only do on the first resync. Tried twice with the same behaviour.

@moscicki yes, we warn the user when we run out of inotify listeners. See #6119 for details.
I am curious what exact limit is needed for 1M files, and how much kmem that adds.
My 10k-files test did not hit the limit of 8000.

200k files - a demonstrator that shows good speedup in 2.5.0

Summary: 200,000 files scattered in 2,000 folders on the server are in sync with a 2.4.1 and a 2.5.0 desktop client. When editing, adding or deleting a single file, the 2.4.1 client needs roughly 25 to 45 seconds, while the 2.5.0 client needs less than 10 seconds.

for ii in $(seq 1 100); do for i in $(seq 1 100); do echo "$i/$ii"; mkdir -p "t$i/$ii"; for j in $(seq 1 100); do cp /etc/samba/smb.conf t$i/$ii/f$j; done; done; done

Have enough inotify watches for 1M files:
echo 200000 | sudo tee -a /proc/sys/fs/inotify/max_user_watches

client 2.5.0daily20181014

  • vmstat-m.20180815-13:03:56.txt
  • selective sync with t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 (100k files).
  • initial 'sync' with sqlite removed: 2:05.1 sec
  • enabling folders t11 t12 t13 t14 t15 t16 t17 t18 t19 t20 (new total 200k files)
    (a false warning is printed that unchecked folders will be removed; they are not removed.)
  • second 'sync' with more folders added:

    • Thousands of false messages scroll through the activity log: 'Datei ist seit der Entdeckung geändert worden' ('File has changed since discovery')

    • Files are being downloaded from the server although identical files exist locally.

    • The client process grows to 1.8GB memory footprint.

    • Duration: 38:34.0 sec (all 4 CPUs of a quadcore busy with client and server activity)

    • 5 sec idle, then another sync: 10:28.3 sec (one CPU busy, no server activity seen)

    • The client process drops to 1.1GB memory footprint.

  • vmstat-m.20180815-14:28:33.txt
  • client stop; start: 45.5 sec
  • small local file change: 6.6 sec, 5.1 sec (including 3 sec before the sync starts)
  • small remote file change: 3.4 sec, 4.4 sec

client quit: vmstat-m.20180815-14:29:55.txt

client 2.4.1

  • selective sync with t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 (100k files).
  • initial 'sync' with sqlite removed: 3:20.2 sec
  • enabling folders t11 t12 t13 t14 t15 t16 t17 t18 t19 t20 (new total 200k files)
  • second 'sync' with more folders added:

    • The client process grows to 2.4GB memory footprint.

    • 80 messages correctly name the remaining unsynced folder. (Nothing scrolls wildly.)

    • Duration: 10:15.6 sec

    • The client process remains at a 2.4GB memory footprint.

  • first force resync: 48.2 sec
  • 2nd, 3rd force resync: 46.5 sec, 35.2 sec
  • client stop; start: 28.5 sec
  • small local file change: 33.7 sec, 45.1 sec
  • small remote file change: 25.1 sec, 31.3 sec

1M files - a demonstrator that shows moderate speedup in 2.5.0

Summary: 1,000,000 files scattered in 10,000 folders on the server are in sync with a 2.4.1 and a 2.5.0 desktop client. When editing, adding or deleting a single file, the 2.4.1 client needs between 1 and 2 minutes, but 2.5.0 only needs 20 to 30 seconds to sync to the server.

client 2.5.0daily20181014

  • client stop; rm db; client start: ca 15 min (1M files are now included)
  • small local file change: 30.1 sec, 22.0 sec, 20.3 sec
  • small remote file change: 1:05.5 sec, 27.4 sec, 51.9 sec

client 2.4.1

  • client stop; rm db; client start: ca 15 min (1M files are now included)
  • small local file change: 2:48.8 sec, 1:04.7 sec, 1:02.6 sec
  • small remote file change: 1:02.1 sec, 1:45.3 sec, 1:03.2 sec

@moscicki Good news about inotify watch limits:
With a test tree of exactly 1010010 files in 10103 folders, the required number of watches appears to be 10390.
This number was found by bisecting /proc/sys/fs/inotify/max_user_watches settings.
Compared with earlier tests, this confirms the impression that we need one watch per folder plus ca. 200; the number of files seems irrelevant.

Linux slabinfo (via vmstat -s and slabtop, using echo 3 > /proc/sys/vm/drop_caches) did not seem to be accurate enough to actually measure additional kernel memory consumption for the watches. In some cases kernel memory footprint even appeared to be smaller while the client was watching the tree.
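The "one watch per folder" observation suggests a quick way to check whether a given tree fits under the current kernel limit. A rough sketch (the per-folder rule is this thread's empirical finding; the "plus ca. 200" headroom is deliberately ignored here):

```python
import os

def inotify_watches_needed(root):
    """Rough estimate: recursive watchers need about one inotify watch
    per directory; plain files do not get their own watch."""
    count = 1  # the root directory itself
    for _dirpath, dirnames, _filenames in os.walk(root):
        count += len(dirnames)
    return count

def current_limit(path="/proc/sys/fs/inotify/max_user_watches"):
    """Read the kernel limit; returns None on non-Linux systems."""
    try:
        with open(path) as f:
            return int(f.read())
    except OSError:
        return None
```

Comparing inotify_watches_needed(sync_folder) against current_limit() tells you in advance whether the client will have to fall back to periodic full scans.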

@jnweiger I am considering switching from 2.4.3 to a 2.5 client on Ubuntu 14.04.5. While checking /proc/sys/fs/inotify/max_user_watches, the value is 65536.

So I looked into /etc/sysctl.d and found these settings with grep inotify.max_user_watches *.conf:

100-owncloud-inotify.conf:fs.inotify.max_user_watches = 524288
100-sync-inotify.conf:fs.inotify.max_user_watches = 524288
30-tracker.conf:fs.inotify.max_user_watches = 65536

AGAIN, the current value is 65536. This confirms my assumption that the usual load-order specification has two digits for a reason. According to the docs:

All configuration files are sorted by their filename in lexicographic order, regardless of which of the directories they reside in. If multiple files specify the same option, the entry in the file with the lexicographically latest name will take precedence. It is recommended to prefix all filenames with a two-digit number and a dash, to simplify the ordering of the files.

The suggestion in https://github.com/owncloud/client/issues/4107#issuecomment-169472763 to set the owncloud-inotify.conf load order to 100, and the resulting commit https://github.com/owncloud/administration/commit/8218eb3b282862b8c801ef643a4661926e6f5955, created a bug: "100-" sorts lexicographically before "30-", so the tracker file's lower value takes precedence.
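The precedence rule is easy to sanity-check. A tiny sketch using the filenames and values from the grep output above:

```python
# sysctl.d applies files in lexicographic filename order; when the same key
# appears in several files, the lexicographically latest filename wins.
# "100-" sorts before "30-" (the character '1' < '3'), so the tracker
# file's lower limit overrides the ownCloud one:
confs = {
    "100-owncloud-inotify.conf": 524288,
    "100-sync-inotify.conf": 524288,
    "30-tracker.conf": 65536,
}
winner = sorted(confs)[-1]    # lexicographically latest filename takes effect
effective = confs[winner]     # 65536, matching the observed value
```

Renaming the ownCloud file to something sorting after 30-tracker.conf, e.g. the 99-*-inotify.conf suggested below, would let the higher value win.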

@gnanet Thanks for debugging. Your suggestion would be 99-*-inotify.conf?
