Some files are being unnecessarily re-uploaded to S3 every time I run sync.
My understanding of the cause so far:
- The problem occurs when NextMarker (the key of the last object returned in a single S3 ListObjects request) contains a space.
- The key "a 1000.txt" is encoded by AWS to "a+1000.txt" in the S3 ListObjects response XML.
- When that value is passed back as the marker parameter, rclone encodes it again, so it becomes marker=a%2B1000.txt.
- S3 then resumes the listing after "a+1000.txt" rather than the expected "a 1000.txt", so any subsequent objects that start with "a" and a space are omitted.
- This matters most when --fast-list is used, as it could affect any sync of >1000 files in total.

rclone v1.50.2-086-ga186284b-beta (also tested with v1.50.2)
macOS 10.14.6, 64-bit
AWS S3
This is my minimal reproduction case:
mkdir files
for i in {0001..1100}; do touch "files/a $i.txt"; done
rclone sync files/ "s3:<bucket name>" --config rclone.conf --use-server-modtime --update --log-level DEBUG --dump headers,bodies
rclone sync files/ "s3:<bucket name>" --config rclone.conf --use-server-modtime --update --log-level DEBUG --dump headers,bodies
rclone.conf:
[s3]
type = s3
provider = AWS
region = us-west-2
env_auth = true
On the second run (and all subsequent runs), the last 100 files are re-uploaded unnecessarily.
In my case, this was causing tens of GBs of photos to be re-uploaded every time I ran a sync.
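The double encoding described above can be reproduced outside rclone. The following is a minimal Python sketch, not rclone's code: it assumes the standard library's form-style encoding (quote_plus) behaves like the encoding seen in the ListObjects response, and the variable names are made up for illustration.

```python
from urllib.parse import quote_plus, unquote_plus

last_key = "a 1000.txt"  # last key on the first listing page

# Step 1: the listing response carries the key in URL-encoded form.
next_marker = quote_plus(last_key)                 # "a+1000.txt"

# Step 2: the client sends NextMarker back without decoding it first,
# so the query string encoder escapes the "+" as well.
query_value = quote_plus(next_marker)              # "a%2B1000.txt"

# Step 3: the server decodes the query string once and gets a literal "+".
marker_seen_by_server = unquote_plus(query_value)  # "a+1000.txt"

# Listing resumes strictly after the marker in byte order. A space (0x20)
# sorts before "+" (0x2B), so every remaining "a NNNN.txt" key is skipped.
remaining = [f"a {i:04d}.txt" for i in range(1001, 1101)]
skipped = [k for k in remaining if k <= marker_seen_by_server]

print(query_value)   # a%2B1000.txt
print(len(skipped))  # 100
```

The 100 skipped keys correspond to the 100 files in the reproduction case that look missing from the destination and are uploaded again.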
Log from the second run: https://paste.ee/p/FQu37
Excellent debugging :-) And thank you for the repro and log - both of which were very useful.
I managed to replicate this after setting the provider correctly in my config (yes that is code for a 30 minute trip down a rabbit hole ;-)
This bug was introduced when we added URL encoding to the listings to fix listings with non XML representable characters (eg control characters).
Try this - it should fix it hopefully!
https://beta.rclone.org/branch/v1.50.2-085-g2dddcc52-fix-3799-s3-nextmarker-beta/ (uploaded in 15-30 mins)
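For readers following along, here is a toy Python simulation of a paginated listing, with and without decoding the marker before reuse. It assumes the fix amounts to URL-decoding NextMarker before sending it back. The actual patch is not reproduced here, and list_page, list_all and decode_marker are hypothetical names, not rclone or AWS APIs.

```python
from urllib.parse import quote_plus, unquote_plus


def list_page(keys, marker="", max_keys=1000):
    """Toy stand-in for one ListObjects call with URL-encoded results."""
    after = sorted(k for k in keys if k > marker)
    page = after[:max_keys]
    truncated = len(after) > max_keys
    next_marker = quote_plus(page[-1]) if truncated else None
    return [quote_plus(k) for k in page], next_marker


def list_all(keys, decode_marker):
    """Follow NextMarker until the listing is exhausted."""
    seen, marker = [], ""
    while True:
        page, next_marker = list_page(keys, marker)
        seen.extend(unquote_plus(k) for k in page)
        if next_marker is None:
            return seen
        # The HTTP layer encodes the marker once on the way out and the server
        # decodes it once, so the server sees exactly the string chosen here.
        marker = unquote_plus(next_marker) if decode_marker else next_marker


bucket = [f"a {i:04d}.txt" for i in range(1, 1101)]
print(len(list_all(bucket, decode_marker=False)))  # 1000
print(len(list_all(bucket, decode_marker=True)))   # 1100
```

Without decoding, the second page comes back empty and only 1000 of the 1100 keys are seen, which matches the reported behaviour.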
@ncw Thanks for looking into this so quickly! I can confirm that your fix resolves the issue for me.
Thanks for testing.
I've merged this to master now, which means it will be in the latest beta in 15-30 mins and released in v1.51.