Synapse: We should create JPG thumbnails, not PNGs

Created on 22 Mar 2017 · 11 Comments · Source: matrix-org/synapse

Apparently if you upload a PNG it creates a PNG thumbnail. Surely thumbnails should always be lossy JPEGs

enhancement media-repository performance


ara4n

👍1


All 11 comments

I was looking into this issue and found that I'm not able to upload PNG files.
Synapse (develop branch) outputs this error:
IOError: cannot identify image file

Full log http://pastebin.com/n2CRXs1p

EDIT: This was a problem with PIL and not Synapse, because I couldn't open that PNG file using PIL.

APwhitehat on 23 Mar 2017

@ara4n Do we really want to have only JPEG thumbnails?
In the Synapse code there is a variable named output_type being passed around.
If we only want "JPEG" thumbnails, why have output_type at all? We could just implicitly assume it will be "JPEG".
As for forcing "JPEG", editing https://github.com/matrix-org/synapse/blob/develop/synapse/rest/media/v1/thumbnailer.py#L89 to
# Force thumbnails to be JPEG
output_image.save(output_bytes_io, "JPEG", quality=80)
is enough.

APwhitehat on 23 Mar 2017
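
One caveat with a hard-coded "JPEG" save: Pillow refuses to write images that carry an alpha channel as JPEG, so a PNG source usually needs converting first. The following is a minimal sketch of that idea, not Synapse's actual thumbnailer; the function name save_thumbnail_as_jpeg and the white background are assumptions made for illustration.

```python
from io import BytesIO

from PIL import Image


def save_thumbnail_as_jpeg(image, width, height, quality=80):
    """Scale `image` to fit within width x height and return JPEG bytes.

    Hypothetical helper for illustration; not part of Synapse.
    """
    thumb = image.copy()
    # thumbnail() resizes in place and preserves the aspect ratio.
    thumb.thumbnail((width, height))

    if thumb.mode != "RGB":
        # JPEG cannot store transparency. For simplicity, anything that is
        # not plain RGB is flattened onto a white background (an assumption).
        rgba = thumb.convert("RGBA")
        background = Image.new("RGB", rgba.size, (255, 255, 255))
        background.paste(rgba, mask=rgba.split()[3])  # alpha band as mask
        thumb = background

    out = BytesIO()
    thumb.save(out, "JPEG", quality=quality)
    return out.getvalue()
```

Flattening onto white is only one possible choice; a client rendering on a dark background might prefer a different fill colour.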

Surely thumbnails should always be lossy JPEGs

But why?

iav on 27 Jun 2017

'cos thumbnails are already inherently lossy, so there's no point in using a typically lossless format like PNG given they are already imperfect. it just wastes bandwidth.

ara4n on 27 Jun 2017

👍2

Even for black-and-white line-art icons, and for paletted PNGs or GIFs that start out with a tiny number of colors? That's wrong. For such files a poor, artifact-ridden JPG can be 10 KB against a 700 B PNG.

iav on 28 Jun 2017
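
The size claim above is easy to check on a synthetic image. This is a rough sketch using Pillow; the 64x64 test icon and the helper name encoded_size are made up for illustration, and the exact byte counts will vary with the Pillow version and the image content.

```python
from io import BytesIO

from PIL import Image, ImageDraw


def encoded_size(image, fmt, **save_options):
    """Return the number of bytes `image` occupies when saved as `fmt`."""
    buf = BytesIO()
    image.save(buf, fmt, **save_options)
    return len(buf.getvalue())


# A small two-colour line-art icon: black outline and diagonal on white.
icon = Image.new("L", (64, 64), 255)
draw = ImageDraw.Draw(icon)
draw.rectangle([8, 8, 55, 55], outline=0)
draw.line([8, 8, 55, 55], fill=0)

png_size = encoded_size(icon, "PNG", optimize=True)
jpeg_size = encoded_size(icon, "JPEG", quality=80)
print(f"PNG: {png_size} bytes, JPEG: {jpeg_size} bytes")
```

For flat line art like this the PNG should come out smaller; swapping in a photographic image would be expected to reverse the result, which is the trade-off the thread is arguing about.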

No solution is going to solve all problems. I'd imagine there's a higher ratio of images that produce nicer jpg thumbnails than png thumbnails.

There's nothing that I can see stopping the use of multiple thumbnail formats, but the determination of which container to use gets fairly arbitrary.

turt2live on 28 Jun 2017

👍1

it's true that for lineart PNGs can be smaller. But i suspect the majority of thumbnails are not for lineart (and i'm not sure we're going to get stuck into trying to guess if an image is lineart in order to pick the right compression algorithm! in other news, it sucks that nobody's written a ubiquitous image format that handles both nicely, yet)

ara4n on 28 Jun 2017

Then in most cases the author of the source picture has already made the optimal choice: JPG by default and PNG for special cases?

iav on 28 Jun 2017

And what if the picture is an SVG?

iav on 28 Jun 2017

the point is that even if people made the optimal choice with the source imagery, it will still be lossy when rescaled to a thumbnail, at which point (other than perhaps for lineart) using a lossless format is a waste of time. for instance, most OSes store screencaps as lossless PNGs, causing them to be huge if they have any photographic content. So thumbnailing these as PNGs is a waste of space.

I suspect the right solution here is to get the client to gen the thumbnail itself, and the sender can tune as they choose. This is how e2e thumbs and svg thumbs would work if they existed, as the server cannot really generate the thumbs due to lack of data or lack of security (svg). This is already implemented in Riot; just needs to be hooked up. It does pose the trust issue that clients can lie about thumbnail contents, but this is inevitable with e2e.

We can't use svgs for svg thumbs as if I send you a 10MB SVG you certainly don't want a 10MB thumbnail; a crappy 100KB 640x480 jpeg should be fine.

ara4n on 28 Jun 2017

👍1

This came up again (somewhat) in #7586. Note that currently jpeg and webp become jpeg thumbnails and png and gifs become png thumbnails, see:

https://github.com/matrix-org/synapse/blob/789606577ade2335f19e944efcfecfe808519b36/synapse/config/repository.py#L72-L75
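
The behaviour described in that comment amounts to a lookup from the uploaded media type to the thumbnail format. A minimal sketch of such a lookup follows; the names THUMBNAIL_FORMAT_BY_MEDIA_TYPE and thumbnail_format_for are hypothetical and are not taken from the linked file.

```python
# Hypothetical names; the entries mirror the behaviour described in the
# comment above (jpeg/webp -> jpeg thumbnails, png/gif -> png thumbnails).
THUMBNAIL_FORMAT_BY_MEDIA_TYPE = {
    "image/jpeg": "jpeg",
    "image/webp": "jpeg",
    "image/png": "png",
    "image/gif": "png",
}


def thumbnail_format_for(media_type):
    """Return "jpeg" or "png" for a supported media type, else None."""
    return THUMBNAIL_FORMAT_BY_MEDIA_TYPE.get(media_type.lower())


print(thumbnail_format_for("image/webp"))
print(thumbnail_format_for("image/gif"))
```

Under this scheme, always producing JPEG thumbnails (as the issue title proposes) would mean pointing every entry at "jpeg".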