oc_storages inside the root storage-0 https://github.com/nextcloud/server/blob/9e884567680e359365a74f2b1039ce4e919b8400/lib/private/legacy/OC_Util.php#L159
@kesselb @rullzer @icewind1991 Feedback welcome
Hi @MorrisJobke, as far as I can see, the idea is to split the preview folder into a smaller number of folders than the default (one per fileId) using the md5-first-7-letters approach.
Up to that point there is no big problem; the "error" this solves is the load of directory listings (mostly in local filesystem environments).
My concern with this ticket is related to the bucket configuration and the limits on them.
The idea is to have a configuration (like the current user bucket preference) that saves the previews pseudo-randomly across the buckets, using the first seven letters of the md5. Even so, an "on the fly" approach could perhaps be used instead, which would avoid some database inner joins.
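A math approach can be:
- First, check the multi bucket config and get the number of buckets
- Take the first 7 letters of the md5 and convert them to a number (from hexadecimal to decimal)
- Divide the decimal result by the number of buckets and take the remainder of the division
- Dynamically set the bucket number for uploading and downloading the preview, based on that calculation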
This will indeed distribute the preview storage across the buckets.
One thing I would mention is that this only works with a static number of buckets. If the number of buckets is increased, it becomes mandatory to save the calculated bucket per id in an additional table.
Regards
> Hi @MorrisJobke, as far as I can see, the idea is to split the preview folder into a smaller number of folders than the default (one per fileId) using the md5-first-7-letters approach.

This is already implemented. And since it is done as a layer, we can reuse that layer to also handle the distribution across multiple buckets.
> A math approach can be:
> - First, check the multi bucket config and get the number of buckets
> - Take the first 7 letters of the md5 and convert them to a number (from hexadecimal to decimal)
> - Divide the decimal result by the number of buckets and take the remainder of the division
> - Dynamically set the bucket number for uploading and downloading the preview, based on that calculation
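
For illustration, the steps listed above could be sketched roughly as follows. This is only a minimal sketch in Python, not the Nextcloud implementation: it assumes the md5 is taken over the file id as a string, and `preview_bucket` and `num_buckets` are hypothetical names.

```python
import hashlib


def preview_bucket(file_id: int, num_buckets: int) -> int:
    """Map a file id to a bucket index in the range [0, num_buckets)."""
    if num_buckets < 1:
        raise ValueError("num_buckets must be at least 1")
    # Assumption: the hash input is the file id rendered as a string.
    digest = hashlib.md5(str(file_id).encode("utf-8")).hexdigest()
    # First seven hex letters, converted from hexadecimal to decimal.
    prefix_value = int(digest[:7], 16)
    # The remainder of the division selects the bucket.
    return prefix_value % num_buckets
```

Since the result depends only on the file id and the number of buckets, upload and download arrive at the same bucket without a database lookup, as long as the number of buckets does not change.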
Yep, that would also be our naive approach. We could maybe combine the efforts here with the ideas of #22039 and make this a bit more permanent. Otherwise, changing the logic or the number of buckets will lead to wrongly calculated bucket numbers. Storing the result of the formula together with the preview makes it possible to change the number of buckets, or the formula itself, later on. But maybe this is also something we could do afterwards, so that we first have a solution for the initial problem.
> One thing I would mention is that this only works with a static number of buckets. If the number of buckets is increased, it becomes mandatory to save the calculated bucket per id in an additional table.

For that we will most likely use the filecache_extended table.
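
To illustrate why storing the result helps, here is a minimal sketch, again in Python and assuming the md5 is taken over the file id as a string. The in-memory dictionary is only a hypothetical stand-in for a persistent table; nothing here reflects the actual filecache_extended schema, and `computed_bucket`, `BucketAssignments` and `bucket_for` are made-up names.

```python
import hashlib


def computed_bucket(file_id: int, num_buckets: int) -> int:
    """Bucket derived purely from the formula (md5 prefix modulo bucket count)."""
    digest = hashlib.md5(str(file_id).encode("utf-8")).hexdigest()
    return int(digest[:7], 16) % num_buckets


class BucketAssignments:
    """Remembers the bucket chosen for each file id when it was first seen."""

    def __init__(self) -> None:
        # Hypothetical stand-in for a database table keyed by file id.
        self._stored = {}

    def bucket_for(self, file_id: int, num_buckets: int) -> int:
        # Only compute a bucket for ids that have no stored assignment yet.
        if file_id not in self._stored:
            self._stored[file_id] = computed_bucket(file_id, num_buckets)
        return self._stored[file_id]
```

With this, an existing preview keeps pointing at the bucket it was written to even after the configured number of buckets grows, while the bare formula would start returning different buckets for some ids.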
Thanks for the feedback.
A first proof of concept is in #22063. It introduces the storages and already stores new previews there.
Implemented in #22063. There is also a migration tool that migrates pre-Nextcloud 19 previews to the new preview folders that #19214 introduced (this also works for non-multibucket setups): #22135. If there are already previews in the folder structure of #19214, there is no migration path yet.