Meteor-files: S3 uploaded & unlinked but the 'meta' data was not updated after s3 upload.

Created on 3 Nov 2020 · 10 Comments · Source: veliovgroup/Meteor-Files

Hello guys-

I have a question.

I have 350,000 images stored through the S3 integration, uploaded from a variety of mobile devices, and I found some irregular results:

  • 368 files exist on the server file system, but not in S3 and not in the DB

    • file upload error

    • normal case

  • 2 files exist on the server file system and in the DB, but not in S3

    • file upload success

    • S3 upload error

    • normal case

  • 1 file exists on the server file system, in S3, and in the DB (but the DB record has no 'meta' data for the S3 link)

    • file upload success

    • S3 upload success

    • ???

The first two cases are fine, but the last one looks like the collection update failed after the S3 upload, so the S3 'meta' data was lost.

I couldn't find a specific error and couldn't work it out from the code.
The code only unlinks the file when the collection update succeeds,
so I don't understand why the file would be removed if collection.update had failed.

  this.collection.update({
    _id: fileRef._id
  }, upd, (updError) => {
    if (updError) {
      console.error(updError);
    } else {
      // Unlink original files from FS after successful upload to AWS:S3
      this.unlink(this.collection.findOne(fileRef._id), version);
    }
  });

Does anyone have an idea?

question

All 10 comments

Hello @kakadais ,

Thank you for the insights on using this package. I'm glad it serves your project's needs at mid-scale.

For no.1

These files shouldn't exist either, as unfinished uploads should get cleaned up within continueUploadTTL, which defaults to 3 hours (10800 seconds). Could you check what's inside the __pre* collection?

For no.2

Use a "job" that checks whether the upload to S3 failed and attempts to re-upload later.

For no.3

This is a question for your MongoDB setup. Are you using a real replicaSet?

For no.1

There is nothing in the __pre_files collection.

My S3 config is below.
Did I miss something regarding the timeout and file removal?

  const s3 = new S3({
    secretAccessKey: s3Conf.secret,
    accessKeyId: s3Conf.key,
    region: s3Conf.region,
    // sslEnabled: true, // optional
    httpOptions: {
      timeout: 300000,
      agent: false
    }
  });

  Files = new FilesCollection({
    debug: false, // Change to `true` for debugging
    storagePath: `${process.env.HOME}/hdd/files`,
    // storagePath: '/Users/kakadais/hdd/files',
    collectionName: 'files',
    // Disallow Client to execute remove, use the Meteor.method
    allowClientCode: true,
    onBeforeUpload(file) {
    ...

For no.2

Could you point me to a more specific reference for the 'job' function? (I can't find it in the Wiki. Is it a 3rd-party one?)
I plan to write some code that checks the whole collection for records without the S3 'meta' reference and uploads the file if it still exists, for full recovery.
That would just be a final safety net if the re-upload works well.
I'll post some examples of this code if it's helpful to others.
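
A minimal sketch of such a recovery check, not part of the package's API. It assumes each file document keeps its S3 reference at `versions.<name>.meta.pipePath` and its local path at `versions.<name>.path`; both field names are assumptions to adjust to the actual schema. `findVersionsMissingS3` and the `fileExists` callback are hypothetical names introduced for illustration.

```javascript
// Returns one entry per file version that has no S3 reference in the DB
// but still has its original file on the local disk.
// Assumption: the S3 key lives at versions.<name>.meta.pipePath and the
// local path at versions.<name>.path.
// `fileExists` is a caller-supplied predicate, e.g. fs.existsSync.
function findVersionsMissingS3(fileDocs, fileExists) {
  const pending = [];
  for (const doc of fileDocs) {
    for (const [version, vRef] of Object.entries(doc.versions || {})) {
      if (!vRef || !vRef.path) continue;
      const hasS3Ref = Boolean(vRef.meta && vRef.meta.pipePath);
      // Only versions still present on disk can be re-uploaded.
      if (!hasS3Ref && fileExists(vRef.path)) {
        pending.push({ fileId: doc._id, version, path: vRef.path });
      }
    }
  }
  return pending;
}
```

Each returned entry could then be handed to the same upload routine that normally runs after an upload. With hundreds of thousands of records it is probably better to feed the function documents in batches than to load the whole collection at once.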

For no.3

I'm using MongoDB 4.2.7 Community edition,
and it is just a single replica set,
so I don't think it's a write concern problem.

So you suspect that the MongoDB driver doesn't return an error even when the update is not performed successfully?

I should add some logging for this case, because it has happened just once in 350,000 files,
and I'm starting to doubt my own file management more than the MongoDB driver or this package ^^

I'll let you know if this happens again, so please disregard it for now.

Thanks

I've used Meteor-Files for several projects and it has always worked better than I expected, even in CollectionFS and File mode as well.
I'm sure this approach is great and will inspire other platforms and packages sooner or later.

So thanks for your effort @dr-dimitru

For no.1

I sometimes restart the server for maintenance,
so if the 10800 seconds are counted in memory, then this could happen when I restart the server.

Is that a possible scenario?

@kakadais

Did I miss something regarding the timeout and file removal?

Please see the recently refactored and updated demo app source code, although there are not many changes in this part of the code. It will become the new official usage recommendation. A docs update is in progress.

Could you point me to a more specific reference for the 'job' function? (I can't find it in the Wiki. Is it a 3rd-party one?)

Very similar to what you have described. I see two options:

  • Handle the error and retry, for example, a minute later (won't survive server reboots)
  • Write a periodic task (Job/CRON), as you described, that checks for leftover files and files not pushed to AWS:S3. We are planning to update this app with a josk-powered Job solution sometime in the future; it's in our backlog
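
The first option can be sketched as an in-memory retry with exponential backoff. This is only an illustration under stated assumptions: `backoffDelay` and `retryUpload` are hypothetical helpers, not Meteor-Files or josk APIs, and `task` stands for whatever function performs the S3 upload and returns a Promise. As noted above, the retry state lives in memory and is lost on a server reboot.

```javascript
// Delay before the retry that follows failed attempt number `attempt`
// (1-based): baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
function backoffDelay(attempt, baseMs = 60000, maxMs = 3600000) {
  return Math.min(baseMs * 2 ** (attempt - 1), maxMs);
}

// Runs `task` (a function returning a Promise) up to `maxAttempts` times.
// Resolves with the first successful result, or rejects with the last error.
// `sleep` can be injected for testing; by default it uses setTimeout.
async function retryUpload(task, { maxAttempts = 5, baseMs = 60000, sleep } = {}) {
  const wait = sleep || ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task(attempt);
    } catch (err) {
      lastError = err;
      // No point in waiting after the final attempt.
      if (attempt < maxAttempts) await wait(backoffDelay(attempt, baseMs));
    }
  }
  throw lastError;
}
```

Because nothing here is persisted, a periodic job that re-scans the collection (the second option) is still needed as a fallback for uploads interrupted by a restart.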

So you suspect that the MongoDB driver doesn't return an error even when the update is not performed successfully?

Only if you are on a distributed replicaSet. Since the file exists and was not removed, yes, mongo returned an unsuccessful update. I assume the connection to mongo is the main reason.

@kakadais

I sometimes restart the server for maintenance,
so if the 10800 seconds are counted in memory, then this could happen when I restart the server.

Is that a possible scenario?

Yes, very much. Always implement graceful app shutdown, although this is a difficult case. I'd write a script that checks the __pre* collection for records. If .count({}) === 0, there are no active uploads at that moment and it's safe to reboot.
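
A rough sketch of that pre-reboot check, assuming the caller supplies `countPending`, a function that returns (or resolves to) the number of records currently in the __pre* collection. `waitForNoActiveUploads` is a hypothetical helper name, and how the count is obtained (Meteor collection, raw MongoDB driver, shell script) is left open.

```javascript
// Polls `countPending` until it reports 0 or the time budget is used up.
// Resolves to true when no uploads are pending, false on timeout.
// `sleep` can be injected for testing; by default it uses setTimeout.
async function waitForNoActiveUploads(countPending, { pollMs = 5000, timeoutMs = 600000, sleep } = {}) {
  const wait = sleep || ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  let waited = 0;
  while (true) {
    if ((await countPending()) === 0) return true;
    if (waited >= timeoutMs) return false;
    await wait(pollMs);
    waited += pollMs;
  }
}
```

A maintenance script could call this before restarting and abort the reboot when it resolves to `false`. It only covers the client-to-server uploads that the __pre* collection tracks; server-to-S3 transfers still in flight would presumably need a separate check.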

Please see the recently refactored and updated demo app source code, although there are not many changes in this part of the code. It will become the new official usage recommendation. A docs update is in progress.

Okay. I'll follow your update and comment after trying it.

Write a periodic task (Job/CRON), as you described, that checks for leftover files and files not pushed to AWS:S3. We are planning to update this app with a josk-powered Job solution sometime in the future; it's in our backlog

josk is exactly the package I need. God, you Veliov guys always know what the world needs.
I'll write some code later and comment on it as well.

Yes, very much. Always implement graceful app shutdown, although this is a difficult case. I'd write a script that checks the __pre* collection for records. If .count({}) === 0, there are no active uploads at that moment and it's safe to reboot.

  • Is the __pre* collection cleared entirely after a server restart?

I don't think all my __pre* collections being empty is just a coincidence.
Maybe I have to write some safe-termination code or a script for this.

Is the __pre* collection cleared entirely after a server restart?

No, it is designed to be able to resume uploads after the Client or Server connection is disrupted, including Server reboots.

God, you Veliov guys always know what the world needs.

😀 ❤️

Okay. I'll follow your update and comment after trying it.

👍 Keep us updated

Get my tiny patron par. ;)

I think it's going to take a long time to figure out problem #3.
I'll reopen this if I get some evidence or a hint.

Thanks.

@kakadais Open a new issue or update this one in the future. Hope I was able to help.
