**Describe the bug**
When you repeatedly upload a file stream to S3 using the `upload` method, the memory usage of the Node.js process grows without bound.

**Is the issue in the browser/Node.js?**
Node.js

**If on Node.js, are you running this on AWS Lambda?**
No

**Details of the browser/Node.js version**
v12.13.0

**SDK version number**
v2.558.0

**To Reproduce (observed behavior)**
Run this code:
```js
const AWS = require('aws-sdk')
const fs = require('fs')
const http = require('http')

const credentials = new AWS.Credentials({
  accessKeyId: '......',
  secretAccessKey: '........'
})

const S3 = new AWS.S3({
  credentials,
  region: 'eu-west-1'
})

http.createServer(async (req, res) => {
  const stream = fs.createReadStream('./test.mp4')
  await S3.upload({ Bucket: 'test', Key: 'test', Body: stream }).promise()
  res.end(JSON.stringify(process.memoryUsage(), null, 2))
}).listen(1234)
```
As you send requests to this server, memory usage will keep growing without bound.
I am using a small video file, but you can use any file.
Hey @glepur, thank you for reaching out to us with this issue.
I believe the code snippet by itself won't be enough to determine whether there's a memory leak.
We'd have to upload multiple videos over a period of time, maybe hours, to see if there is a memory leak. This might be a duplicate of #2552, which was closed due to inactivity; happy to keep this one open to address #2552 as well.
@mreinstein @brendanw this issue can be tracked here. Since the lock bot locked the last thread (apologies), I'll reach out to the team to work on it more closely.
This looks suspiciously like missing back-pressure in your managed_upload implementation.
To reproduce, I would suggest throttling your network and uploading a file a few GB in size; the issue should be observable under those conditions.
@glepur I would try using a plain HTTP request instead of the managed upload, e.g.
```js
const http = require('http')
const fs = require('fs')
const pipeline = require('util').promisify(require('stream').pipeline)

const {
  pathname,
  search,
  hostname,
  protocol,
  port
} = new URL(await s3.getSignedUrlPromise('putObject', { Bucket: 'test', Key: 'test' }))

await pipeline(
  fs.createReadStream('./test.mp4'),
  http.request({
    method: 'PUT',
    path: `${pathname}${search}`,
    hostname,
    port,
    protocol
  })
)
```
I spent a bit of time debugging a pesky memory leak in my app while uploading images to S3, but sadly couldn't resolve it and had to take other measures, namely killing the Node processes after a while to release the memory (using cluster/worker/thread pools).
Our task is to perform a GET request to a server (using the Node native HTTPS module), use the sharp npm module to resize the image, then upload it to S3.
@ronag Unfortunately that code doesn't work for me. For anyone finding this in the future, here are my findings:
For starters, you need to require("https"), since Node has separate modules for HTTP and HTTPS.
Secondly, and this is the real issue with streams, the PUT request needs a Content-Length header. This is difficult to supply with a stream!
Anything you can suggest?
I found a somewhat related issue: #230.
It's the same in version 3: https://github.com/aws/aws-sdk-js-v3/issues/1897