Sharp: Lambda storage space shrinking

Created on 9 Feb 2017  ·  7 Comments  ·  Source: lovell/sharp

We are using Sharp in an AWS Lambda function, and we have an issue with storage space (of all things...).
The function downloads an image from S3 and places it in the /tmp directory. After that:

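var sharp = require('sharp');

// Load the image previously downloaded from S3 into /tmp
var image = sharp('/tmp/imageName.png');
image.rotate(); // no angle given: auto-orient using the EXIF Orientation tag
image.toFormat('PNG').toBuffer(function(err, data) {
    // here we check storage space available and files inside the /tmp dir
    // some more stuff
});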

With one 18MB PNG image we observed that, after outputting to a buffer, the available storage space had dropped from 496MB (the 514MB available to Lambda minus 18MB for the image we placed in /tmp) to 142MB. This is a problem because Lambda may reuse a previously used container, in which case the available storage stays at 160MB (142MB plus the 18MB image, which we delete before the function exits). If we upload the same image twice in a row and the same container is reused, the function runs out of free disk space and dies.
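The figures in this report can be checked with a little arithmetic. The sketch below is only an illustration: the helper name `freeAtPeakOfRun` is hypothetical, and it assumes that every run leaves the same amount of unreclaimed space behind, which the report does not confirm.

```javascript
// All sizes in MB, taken from the report.
const TMP_TOTAL = 514;      // storage available to Lambda
const IMAGE_SIZE = 18;      // PNG downloaded to /tmp
const FREE_AFTER_RUN = 142; // free space measured inside the toBuffer callback

// Free space once the image is in /tmp, before processing starts.
const freeBefore = TMP_TOTAL - IMAGE_SIZE;       // 496

// Space consumed during one run by something other than the image itself.
const hiddenUsage = freeBefore - FREE_AFTER_RUN; // 354

// Hypothetical helper: free space at the peak of run number `run` (1-based)
// in a reused container, assuming the hidden usage is never reclaimed and
// the image itself is deleted after each run.
function freeAtPeakOfRun(run) {
  return TMP_TOTAL - IMAGE_SIZE - hiddenUsage * run;
}

console.log(freeAtPeakOfRun(1)); // first run in a fresh container
console.log(freeAtPeakOfRun(2)); // second run in the same container
```

Under these assumptions the second run would need more space than the container has left, which matches the failure described.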

The question is simple: do Sharp and/or its underlying libraries create any temporary files that may not get deleted, or could this just be bad memory management causing a swap file to fill up?

I'm still waiting on a response from AWS, because I do not know if a swap file even exists inside the Lambda instance, so I'm not sure what might be causing this sudden spike in storage space usage. Also, other images don't seem to be behaving this way - a 9MB PNG file doesn't cause any spikes in disk space usage from my observations.

Sadly I cannot share the image itself.

I thank you for any guidance in advance.

question

All 7 comments

Hello, you'll either need to try sequentialRead to reduce memory or set the VIPS_DISC_THRESHOLD environment variable.
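As a rough sketch of the second suggestion: the variable has to be visible to libvips before it opens an image, so here it is set before sharp is loaded. The helper name `setDiscThreshold` and the value `600m` are illustrative assumptions, not recommendations from this thread. Whether a value assigned at runtime through `process.env` is picked up is also an assumption; setting the variable in the Lambda function's environment configuration avoids that question.

```javascript
// Hypothetical helper: validate and set VIPS_DISC_THRESHOLD.
// Accepts an integer with an optional k, m or g unit postfix, e.g. "600m".
function setDiscThreshold(value, env = process.env) {
  if (!/^\d+[kmg]?$/i.test(value)) {
    throw new Error('Invalid VIPS_DISC_THRESHOLD value: ' + value);
  }
  env.VIPS_DISC_THRESHOLD = value;
  return env.VIPS_DISC_THRESHOLD;
}

// Illustrative value only; choose one that suits the function's memory limit.
setDiscThreshold('600m');

// Load sharp only after the variable is set (requires the sharp package):
// const sharp = require('sharp');
// For the first suggestion, the call used in this thread was
// sharp('/tmp/imageName.png').sequentialRead()
```

Raising the threshold keeps larger decoded images in memory instead of in temporary files, so it trades disk usage for RAM.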

Thank you @lovell, setting VIPS_DISC_THRESHOLD fixed the issue.

There is one thing to add, though: at first I tried it with sequentialRead(), and that worked most of the time, but one image did fail.
[Image: 20161108_132059]

This image, taken by a Samsung Galaxy S6, fails at toFormat with the following Lambda logs:
2017-02-10T16:17:42.506Z 7271c281-efac-11e6-9c2e-c52e7d71286e [Error: VipsJpeg: out of order read at line 2988 VipsJpeg: out of order read at line 2988 VipsJpeg: out of order read at line 2988 VipsJpeg: out of order read at line 2988 VipsJpeg: out of order read at line 2988 VipsJpeg: out of order read at line 2988 VipsJpeg: out of order read at line 2988 VipsJpeg: out of order read at line 2988 VipsJpeg: out of order read at line 2988 VipsJpeg: out of order read at line 2988 VipsJpeg: out of order read at line 2988 VipsJpeg: out of order read at line 2988 VipsJpeg: out of order read at line 2988 VipsJpeg: out of order read at line 2988 VipsJpeg: out of order read at line 2988 VipsJpeg: out of order read at line 2988 VipsJpeg: out of order read at line 2988 ]
Someone reported a similar problem, using libvips directly, at https://github.com/jcupitt/libvips/issues/261.

This is not an issue for me anymore, because setting the environment variable fixed it, so this is more of an FYI.

That image has a Rotate 90 CW orientation tag. Thus it will try to correctly rotate the image (EXIF auto-orient).

@lovell I think we need to force baton->accessMethod to VIPS_ACCESS_RANDOM if we are trying to rotate the image (just like the trim operation). I've also seen these errors with the sharpen and blur operations in combination with sequentialRead. I'm not sure whether there are any more operations that need the access method forced to VIPS_ACCESS_RANDOM.
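The proposal can be pictured as a small decision function. This is a JavaScript illustration only, not sharp's actual C++ code: the function name `chooseAccessMethod` and the operation list are hypothetical, with the list drawn from the operations mentioned in this comment.

```javascript
// Operations mentioned in this thread as unsafe with sequential access.
// trim already forces random access; rotate, sharpen and blur are the
// candidates raised in this comment.
const NEEDS_RANDOM_ACCESS = new Set(['trim', 'rotate', 'sharpen', 'blur']);

// Hypothetical helper mirroring the idea of overriding baton->accessMethod.
function chooseAccessMethod(operations, sequentialRequested) {
  const needsRandom = operations.some((op) => NEEDS_RANDOM_ACCESS.has(op));
  if (sequentialRequested && !needsRandom) {
    return 'VIPS_ACCESS_SEQUENTIAL';
  }
  return 'VIPS_ACCESS_RANDOM';
}

console.log(chooseAccessMethod(['resize'], true));
console.log(chooseAccessMethod(['resize', 'rotate'], true));
```

The design choice here is to let the pipeline silently fall back to random access rather than fail with an out-of-order read, at the cost of the memory savings that sequential mode provides.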

@KrissKulins Thanks for confirming, I'll close.

@kleisauke Good idea, thank you. We could also try adding a tilecache operation, like resize does here, to allow the continued use of sequential read. I've created #709 to track this.

The thumbnail operation gets around this problem by copying the shrunk image to a memory buffer before the rotate. You can see the logic here:

https://github.com/jcupitt/libvips/blob/master/libvips/resample/thumbnail.c#L479

The idea is that the shrunk image will usually be smaller (!!), so copying that to memory (rather than the source image) saves some RAM. It also drops latency, since you can hide the decode inside processing.

@KrissKulins @lovell I have a similar issue. First, is VIPS_DISC_THRESHOLD set as a Lambda environment variable? If so, what would be the ideal value? I'm trying to process files greater than 25MB.

What is the advantage of using sequentialRead()? Does it not use up the disk space?

@5um1th Did you see the following?

http://www.vips.ecs.soton.ac.uk/supported/current/doc/html/libvips/VipsImage.html#vips-image-new-from-file (old doc web site)

http://jcupitt.github.io/libvips/API/current/VipsImage.html#vips-image-new-from-file (new doc web site)

"Large images are decompressed to temporary random-access files on disc and then processed from there ... The disc threshold can be set with ... the VIPS_DISC_THRESHOLD environment variable. The value is a simple integer, but can take a unit postfix of "k", "m" or "g" to indicate kilobytes, megabytes or gigabytes. The default threshold is 100 MB."

"VIPS_ACCESS_SEQUENTIAL means you will read the whole image exactly once, top-to-bottom. In this mode, vips can avoid converting the whole image in one go, for a large memory saving."

If you're still having problems, please provide specific examples of what is not working as expected/documented in a new issue.
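To make the unit postfix from the quoted documentation concrete, here is a minimal sketch of how such a value could be turned into a byte count. The function `parseThreshold` is hypothetical and not part of sharp or libvips, and the 1024-based multipliers are an assumption; the quoted text only says kilobytes, megabytes or gigabytes.

```javascript
// Hypothetical parser for values such as "100m", "2g" or "512".
function parseThreshold(value) {
  const match = /^(\d+)\s*([kmg])?$/i.exec(String(value).trim());
  if (!match) {
    throw new Error('Unrecognised threshold value: ' + value);
  }
  // Assumed multipliers (powers of 1024).
  const multipliers = { k: 1024, m: 1024 ** 2, g: 1024 ** 3 };
  const unit = match[2] ? match[2].toLowerCase() : null;
  return Number(match[1]) * (unit ? multipliers[unit] : 1);
}

console.log(parseThreshold('100m')); // the documented default of 100 MB, in bytes
```

Comparing the result with an estimate of the decoded image size gives a rough idea of whether an image would be processed in memory or via a temporary file.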
