we need a way to exclude stickerpacks from being nuked by quarantine_media somehow. suggestions welcome
Probably the easiest way to do this is to maintain a list of users (or a community?) in the config of users who are immune to quarantining, on the assumption that they are trustworthy and not going to start uploading virii or whatever.
And we can't just prevent removal of "m.sticker" events?
I don't think we even create events, it looks like the media is just uploaded and a manifest bundled with the referring mxc as url. So maybe first the script should exclude certain users while waiting for stickerpacks to be done some other way.
This has popped up again on matrix.org. Seems like you're proposing that we add a config option featuring a list of trusted users whom we won't quarantine media of. This would work for our local stickerpacks.
However, this would not work for remote users (as it stands) as we don't know who uploaded a piece of media without asking the homeserver that's hosting it.
Thinking in terms of community sticker packs, and custom emoji, this may become a problem in the future.
However, for now, this should solve the issue of a homeserver operator's own stickerpacks from being quarantined accidentally.
the config option could also just be a list of mxc uris to never destroy, which makes it agnostic to stickers and agnostic to users.
the config option could also just be a list of mxc uris to never destroy, which makes it agnostic to stickers and agnostic to users.
Yes, and I think this would be useful in addition to users. Having a user which we can upload all of our stickerpacks through and knowing they won't be quarantined is quite practical.
the config option could also just be a list of mxc uris to never destroy, which makes it agnostic to stickers and agnostic to users.
How about a new column on the local_media and remote_media_cache tables, rather than having a big list in the config (which would presumably require us to restart the server each time we wanted to add a new item?). Initially this could be updated manually: if it became useful, we could add an admin API later.
One problem with that approach would be that it would be difficult to quarantine-lock media from remote servers before we had cached a copy of it. Is that likely to be a problem in practice?
You could take the other ideas and put them in a database table instead, forgoing the need for a restart.
For those idea, doing it by user would still be quite practical for the stickerpack use-case. If we had a good relationship with another homeserver operator who uploads stickerpacks, we could whitelist a user from their server which uploads them at any time - rather than servicing their requests to do so each time they uploaded a new piece of media.
However, this breaks down when you want to prevent quarantine of a legitimate stickerpack someone uploads, but not other images they generally upload in their spare time.
Perhaps we could have a hybrid approach of being able to mark specific media via @richvdh's proposed method, but additionally have a table of users that marks any media they upload at time of upload. We could then add and remove users from this list at will.
For those idea, doing it by user would still be quite practical for the stickerpack use-case. If we had a good relationship with another homeserver operator who uploads stickerpacks, we could whitelist a user from their server which uploads them at any time - rather than servicing their requests to do so each time they uploaded a new piece of media.
I don't think we know the uploader for remote media, so I don't think that works.
I suggest that for now we either a) add a column in the tables to say media shouldn't get quarantined and/or b) add a table for users whose images should not be quarantined.
I don't think we can do any better for now while the mxc's are so divorced from rooms, so we should just do something sub optimal to unblock the specific case of sticker packs on matrix.org. Given this won't be a public API we don't really need to worry about backwards compat and can change our minds down the line
add a column in the tables to say media shouldn't get quarantined
I seem to recall a conversation where it was decided this ^ is the way forward. Am I misremembering?
Otherwise, any hints on how to decide between the two approaches? Adding the column seems like it would have less impact overall, but not sure it would meet the requirements.
@clokep I think all involved parties are happy with the database column solution. I was rooting for the config option for remote server, but considering we don't know who uploaded a piece of media from a remote server, it loses its advantage.
I think adding a database column for media will solve this problem for now. We can always do something else later.