Hello everyone.
Today I would like to ask community to give a right path for Meteor-Files 2.0. Any feature requests, suggestion and ideas is highly appreciated.
Lately this lib gained an attention, thank you to everyone who has a chance to use is. Lots of contributors have made this lib much better.
Now it's time for v2.0, time to move from CoffeeScript to ES6, time to remove all obsolete code.
Here is some of my thoughts and questions to you:
upload method, and on Server you receive everything you need at afterUpload hook. This one may also help to move this lib to NPM. Another good option is to split this lib into two - one which manages files as collections, and another which handles uploads.Authorization header with current Meteor token for uploads. For file requests (download, display, play) it's still unclear. As discussed here we can not rely on cookies as they are not accessible at various circumstances, especially on production with mobile apps. Authentication headers is not under control when you displaying file in HTML. Get query is not safe, HTTPS (SSL) partly solves get query parameters security. public files? Like when it served by proxy-server nginx, apache, etc.As always main idea to keep this project flexible as possible, by providing thin API to work with files and uploads without overhead.
@dr-dimitru this looks like a very solid feature list and we are looking forward to see 2.0 already.
POST sounds great of course, but web worker sound like a potential legacy-support problem case.metadataHandler object/interface. You could have a class that accepts constructor arguments like the database name and some of the current 'global' constructor arguments and creates a Mongodb-backend.This reminds me of something i would like to mention: the current API is rather opaque and inconsistent in naming and parameters. It is workable because the API is documented well enough, but I think structuring the API into different parts would greatly help this: MeteorFiles.Collection(storage, metadata, options), MeteorFiles.StorageBackend and MeteorFiles.MetadataBackend or something like that would make it much easier to grasp and extend in single places by grouping the callbacks and configuration options by purpose and defining clear interfaces.
If 2.0 is going to break backwards compatibility anyway, I would suggest fixing some of the naming issues in the same sweep. For example there is no clear style for callback options in the configuration, there are some names like on(Before/After/Initiate)*, then there are downloadCallback and namingFunction and finally some properties that can be values or functions creating them: protected, interceptDownload... This is very confusing because all of them have different structures; some are called function or callback and obviously should be functions, some are verbs and don't even sound like they should be set by the user (interceptDownload) and some are adjectives and don't sound like they could be scripted.
public currently because all of our files are controlled with rather strict permission tests (or should be, the authentication issue is relevant here). I think it is still a useful feature. In my proposal above it could very easily be made possible by an alternative StorageBackend implementation that returns a link built according to a user function and saves files at a path supplied like that. @s-ol thank you for feedback
Agree about mess with naming through library. This project started more than two yeas ago, when it was less than 200 lines. Both v1.5.0 and v1.6.0 was attempts to rewrite lib from the scratch, but due to backward compatibility all naming was saved. You can participate in reviewing 2.0 once its branch will be pushed.
And thanks for all contributions from your end, during this year.
Please add a TypeScript definitions file. This will also make it easier - even for non-TypeScript users - to better understand the structure and usage of the package. Thanks in advance!
@Avijobo Thank you.
Here is something you can start with: https://github.com/VeliovGroup/Meteor-Files/issues/226
Just found the package (yep, looking to get rid of CFS :P).
As the (extreme) modularity fan I am, I don't see the point in splitting the lib in tow (files VS handling uploads).
I mean: does anyone on Earth need a file collection if his app is not allowing to upload files?
PS: May I understand from issue #51 that cluster is supported? That's the main reason we're moving away from CFS (although we'll set a files microservice in another machine this time). Nop, we don't have any experienced devop in the team :P
@luixal I'm not sure I understand what you mean:
As the (extreme) modularity fan I am, I don't see the point in splitting the lib in tow (files VS handling uploads).
files vs handling uploads? I don't understand what the 'file' part of that separation would be. What I am proposing is to split between
Not all of these things need to be strictly seperate, and not all of them need to be very extensible (I didn't suggest to abstract the Data Transfer layer, though it would be possible), but I think especially the two types of storage are important to be patchable conveniently, any average-sized app will have to touch them eventually.
I mean: does anyone on Earth need a file collection if his app is not allowing to upload files?
no, but there are tons of use cases for different storage and metadata-storage backends (and the separation in between).
For example:
I think that this is heading into the very right direction. After about one year with this great library we'd love to see a more modularized solution written in ES6. From a developer's perspective this project should primarily aim on the heavy lifting e.g. getting data from client to server and vice versa, this must of course include authorization and later on maybe sophisticated solutions for different protocols like webRTC etc.
In addition all we need are pre- and post-hooks, where context must contain the meta data, every hook can cancel further progress and data consistency is automatically maintained while transfers are in progress (to avoid orphaned data left on storage).
@menelike
In addition all we need are pre- and post-hooks, where context must contain the meta data, every hook can cancel further progress and data consistency is automatically maintained while transfers are in progress (to avoid orphaned data left on storage).
Right, agree what all hooks, callbacks, events should share same context with .abort() method and meta-data.
Btw what API do you prefer/most use?:
@dr-dimitru we are using react client-side and try to use a data-driven approach throughout the whole project, so currently we use promises. When I get back to work tomorrow I might be able to provide you with a brief overview of how we currently use MF1.0, that might give you a good idea of our view and use cases.
I don't think events are appropriate for what is a very linear process and callback chains are mostly just 'dumb' promises so I do believe they might be the objectively best here.
@s-ol thank you, waiting for more info from your end.
I guess you were talking about client. As on server it is not a linear process it's just events of incoming data, requests, etc...
@dr-dimitru no, i was actually meaning the server, but I didn't explain well what I meant. Meteor Files so far does not have a process like I described, but we added that to it in our usage idiom and I hope I can show you how and why soon. But basically the process encompasses data validation, processing (thumbnails, storing size metadata) and kicking of database work in other collections.
I think callbacks may be more _standard_ (that's why I would go with callbacks. That and supporting wrapAsync as in the issue you have referenced above), but we are using promises more often lately. They are quite similar (IMHO), and your projects provide good documentation (thanks for that!), so any of them would be a good solution.
Maybe you could choose one way to go at first and add the others (or ask for PR) later on :)
So here's a couple of "problems" we faced in our up/download api programming with meteor files:
most of these things are fairly straightforward to do and documented in the meteor-files wiki, but it becomes a lot of boilerplate if you have a couple of different collections.
1., 2. and 3. we do via graphicsmagic and originally plainly in the onAfterUpload like it is recommended on the wiki too. However using the asynchronous gm library to autorotate, write back, loop through all sizes, write back etc. becomes a lot of callbacks with a lot of things that can go wrong. In the face of this we started using promises. Our intermediate solution looked something like this:
// because it's tedious to collect all those 'err' arguments
export const rejectErrArg = (reject, callback) => (err, result) => {
if (err) reject(err);
else callback(result);
};
export const autoOrient = docRef => new Promise((resolve, reject) => {
if (!docRef.isImage) resolve(docRef);
gm(docRef.path).autoOrient().write(docRef.path, rejectErrArg(reject, () => resolve(docRef)));
});
export const resize = (newSize, quality = 85) => docRef => new Promise((resolve, reject) => {
if (!docRef.isImage) resolve(docRef);
if (docRef.versions.original.meta.width <= newSize && docRef.versions.original.meta.height <= newSize) {
resolve(docRef);
return;
}
gm(docRef.path)
.format(rejectErrArg(reject, format => {
let { type, extension } = docRef;
const newPath = `${docRef.path}_${newSize}.${docRef.extension}`;
gm(docRef.path)
.resize(newSize, newSize, '>')
.quality(quality)
.write(newPath, rejectErrArg(reject, () => {
gm(newPath).size(rejectErrArg(reject, dimensions => {
const { width, height } = dimensions;
// store metadata in docRef.versions
resolve(docRef);
}));
}));
}));
});
function onAfterUpload(fileRef) {
autoOrient(fileRef)
.then(size)
.then(createSketch)
.then(resize(docRef, 64))
.then(resize(docRef, 128))
.then(resize(docRef, 256))
.then(resize(docRef, 512))
.then(docRef => {
this.collection.update(
{ _id: docRef._id },
{
$set: {
versions: docRef.versions,
// other metadata we need to store
},
}
);
})
.catch(err => {
console.error(err);
throw new Meteor.Error(err);
});
// some other post hooks: insert into referenced collection
}
this is how we used it for quite a while, but we couldn't really reuse as much of this code as we wanted to and when we started to work on 5 (azure cloud storage integration) we wanted to make the whole thing more modular for various reasons. For one we wanted to be able to reuse the azure handling code in multiple collections, but we also needed the ability to fall back to an equally supported local filesystem solution onDemand (for debugging / local development purposes).
Integrating azure was very straightforward by following along the lines of the examples in the wiki. What we found great is having access to the node HTTP response / stream object. For some implementation reasons we do not want to use public CDN URLs to our files and instead loop the response through our meteor instance. Access to the http.response stream object means we dont have to deal with chunking and other I/O ourselves and can let the nodejs implementation take care of it:
function interceptDownloadAzure(http, fileRef, name) {
const { versions, meta: { azureContainer } } = fileRef;
const { azureBlobId } = versions[name] || {};
if (containerName && blobId) {
blobService.getBlobToStream(azureContainer, azureBlobId, http.response, logError);
return true;
}
return false;
};
In the development of all this we realized that all of our data processing followed the same process pattern, with different steps according to the type of data:
media: upload -> fix EXIF rotation -> store size metadata -> select first frame for gifs -> generate thumbnails -> [upload to azure -> delete local files -> save azure access metadata to collection] -> post message
avatars: store size metadata -> check size against avatar conventions -> generate thumbnails -> [upload to azure -> delete local files -> save azure access metadata to collection]-> update profile
in the end we came up with this concept:
/*
* creates a Meteor-Files Collection that transforms documents using a Chain of callbacks
*
* additional Options:
* - preChain: an Array of callbacks [(docRef) => docRefOrPromise] to run before files are stored
* - postChain: an Array of callbacks [(docRef) => docRefOrPromise] to run after files are stored
* - onTransformFailed: a Callback [(docRef, err) =>] to run when a Chain is rejected
*
* the docRef is supposed to be modified (preferrably) in-place in the preChain. After the preChain,
* the document's 'versions' field will automatically be updated in the Collection.
* If the document is changed in the postChain, updates must be applied manually.
*
* NOTE: onAfterUpload must only be overwritted with caution, needs to run pre/postChain manually
*/
export class LocalTransformCollection extends FilesCollection {
constructor(options) {
const { preChain = [], postChain = [] } = options;
// eslint-disable-next-line no-param-reassign
options.onAfterUpload = options.onAfterUpload || function onAfterUploadStorage(fileRef) {
if (Meteor.isServer) {
const chain = [...preChain];
chain.push(function updateDoc(docRef) {
const { _id, versions } = docRef;
this.collection.update(
{ _id },
{ $set: { versions } }
);
return docRef;
});
chain.push(...postChain);
reducePromise(fileRef, chain.map(f => f.bind(this))).catch(err => {
logError(err, 'in Storage chain');
if (options.onTransformFailed) options.onTransformFailed(fileRef, err);
throw new Meteor.Error(err);
});
}
};
super(options);
}
}
and an alternate version:
/*
* like LocalTransformCollection, but:
* - uploads to azure blob storage
* - automatically saves 'meta.containerName' aswell
* - removes 'path' from all versions
* - calls 'unlink' on document
*/
export class AzureTransformCollection extends TransformCollection {
...
}
what I like about this approach:
things im not quite happy with:
docRef needs to either update the collection directly or be followed by some other step that collects changes and updates them and such crossdependencies are pretty hard to see in the code.these both fall into point 4 from my list above I guess, and I hope you can address them in MF 2.0, as you have already mention them in your original questions in this thread.
Also note that I don't want to say this TransformCollection is the way to go. It is our solution to our personal set of problems with the limitations and interfaces of Meteor Files.
I hope this helps with your design considerations and I am looking forward to see what you come up with :)
@s-ol reviewing the comments here, I realize I didn't write back to yours (as I thought I'd done). Your case is much more complicated than ours (we're basically using a microservice that receives, stores and serves files, nothing else, no transformations and no external integrations for now), but I totally agree with one of your points:
embedding the metadata inside another collection (very strong point considering mongo's limitations in joins etc.)
That would be great. Also, I would keep it transparent to anyone who doesn't need to work with metadata separately (kind of a get helper that would return all info when asking for a file).
I can't tell about other points as we're not facing those problems right now.
@s-ol , @luixal thank you for additional and very detailed info.
I'm started API planning here any thoughts, comments and suggestions is highly appreciated. Naming was an issue in v1, here is a chance to make it better :)
All callbacks has camelCased names, and starts with on prefix, like onEventName, while events has no prefix .on('eventName', () => {/*...*/});.
Also here is idea to have global hooks on client (on Server all hooks is global by default), like from second column "Client Upload Instance Hooks", - here
The problem between events and callbacks, as you can't really return something from event, like onBeforeRemove is callback as you need to have possibility to return false to abort removal.
.insert() to .upload()?onUploaded or onEnd (for events - uploaded, end) ?As we're going to omit Mongo Collection dependency in v2, the FileCursor and FilesCursor will gone too. The basic idea - is to return path, URI to file and other metadata to onUploded hook and let to you decide what you're going to do with this data. You can store it to separate collection as it was in v1, or you may just to extend existing docs in other collections.
Like if you upload Avatar for user's profile, you will just update user's document. What do you think?
From my nooby point of view (I've just play around a little bit with meteor-files):
insert keeps the standard name with mongo collections but, as you're removing that dependency, maybe upload is a better description (in the end, what you're really doing from the client is _uploading_ a file). I'll vote for upload and maybe keeping insert as kind of an alias.onUploaded (if you finally keep the insert, I'd call it onInserted.If there is no real use-case for DDP uploads, it can be dropped
I am eager to that. I only want to upload my files.
驴what about signed urls like in nginx? with DDP you get a signed url that allows you to get the file. It is valid both, upload and download.
I think it is not necessary, if you can copy your file elsewhere in the FS, you can place it in a public served directory.
callbaks and events should be the standard in node.
Thanks.
Hello @hacknlove ,
Thank you for your point of view.
Here is my comments:
@dr-dimitru just saw that it seems like I didn't reply to point 5 above; Promises are definitely the direction we are headed.
As we're going to omit Mongo Collection dependency in v2, the FileCursor and FilesCursor will gone too. The basic idea - is to return path, URI to file and other metadata to onUploded hook and let to you decide what you're going to do with this data. You can store it to separate collection as it was in v1, or you may just to extend existing docs in other collections.
While this is great for our - and most more complex - use cases, i think there should at least be some optional utility layer that has similar functionality to MF now, since many projects just need a very basic API to store files for them and don't want to worry about figuring out the whole data schema again I would believe.
Hi @s-ol ,
Thank you for update and opinion on the Promises.
.rawCollection() to pass Collection, but I'm still not sure how this will work on the Client... I guess it will be plain Meteor.Mongo instance, which will be updated/managed on server and updated synced with Client@dr-dimitru
Sometimes I will copy the files to some folder served with NGINX and sometimes I am going to upload the files to S3
Bothways I am using signed urls to get protected files
To get the signed url I will call some meteor method to get the url server-side signed.
nginx configuration example:
location ~ /private/ {
secure_link $arg_md5,$arg_expires;
secure_link_md5 "$secure_link_expires$uri some secret";
if ($secure_link = "") {
return 403;
}
if ($secure_link = "0") {
return 410;
}
try_files $uri =404;
}
urls signation example
const secret = 'some secret'
const crypto = require('crypto')
const generateSecurePathHash = function (url, expires, secret) {
if (!url || !expires || !secret) {
return undefined
}
return (new Buffer(crypto.createHash('md5').update(expires + url + ' ' + secret).digest()).toString('base64')).replace(/=/g, '').replace(/\+/g, '-').replace(/\//g, '_')
}
var nginxsign = function (url, secret, timeout) {
var t = Date.now() + timeout
return url + '?m=' + generateSecurePathHash(url, t, secret) + '&t=' + t
}
var s3sing = function (path) {
return s3.getSignedUrl('getObject', {
Bucket: 'dakapp',
Key: path,
Expires: 60 * 60 * 4
})
}
Apropos of signed urls:
Yes, the signature goes in the url. But they have an expiration date.
In NGINX you can improve the security adding the IP or some header into the signature just do secure_link_md5 "$some_nginx_variable$secure_link_expires$uri some secret";
Sadly it is not easy get the client ip in a Meteor callback
There are lots of ways to deal with url creation.
I think each developer should choose the one that fits best with their needs.
Sometimes you have enough knowing the _id of the related object (a post for instance) to get the main image, doing https://example.com/images/post/{{_id}}.jpg
Some times, you need to store the filename in a field of the document you are showing, or the fullurls if you allow to link external resorces...
Just a comment: if dropping Mongo integration is in the roadmap, maybe another package (meteor-files-mongo-collections or similar) could be the options. Adding both packages you get the full solution if you're working with mongo.
Main problem here is avoiding to be the package hell CollectionFS is right now.
I'm currently trying to understand cookie usage in an app which uses Meteor and Ostrio files.
The cookie aspect looks problematic, as it reuses the same security token as is kept in localStorage for Meteor, so exposing that token via failings of both localStorage and cookies, so removing the use of cookies would seem a big security gain, as hardly any deployments secure cookies correctly. Is there some specific reason for doing file uploads this way, I'm presuming old versions of IE or some such.
Old wisdom: https://blog.meteor.com/why-meteor-doesnt-use-session-cookies-e988544f52c9
@SimonWaters I don't know about uploads, but we had issues with downloads on Cordova that prompted some investigation. In the end, for downloads, it comes down to the fact that cookies are just about the only data you can transmit in an <img> tag (without changing the URL for each user, which is a viable path IMO but complicates things a bit between data layer and front-end).
I think the authentication runs piggy-backs off the meteor tokens so that the leases don't have to be manually controlled by the app and you can't have weird situations where you are logged in but mf thinks you aren't or the other way around.
Had quick discussion with JS guru here. We are using data URLs in authenticated requests to deliver images in non-Meteor app over web sockets, but that is largely because we have other constraints on how images are handled rather than a preference for data URLs. However feels to me if you have to use cookies then perhaps the localStorage route was a mistake. Cookies have a horrid security model, but simply sticking the same value in both places feels like the wrong solution.
Hi @SimonWaters ,
localStorage used internally by MeteorlocalStorage to have access to token in HTTP headersCookies want a secure flag and/or HSTS with include subdomains directive (pragmatically you want both because without HSTS users will click through warnings). I have plenty of practice securing them (and stealing them, and changing them). But the point of Meteor localStorage was to avoid the machinations of securing cookies and get a mechanism that uses the URI scheme in the same origin policy by default. I thought the proposal above was to lose the cookies, or has that gone?
I thought the proposal above was to lose the cookies, or has that gone?
Yes, by appending session token to the uri, which looks less secure, but by the fact is the same, will be token in a Header or URI Query - doesn't changing situation much.
WDYT?
I vote to keep ddp.
Most helpful comment
I vote to keep ddp.