In my stack I've created several lamdas using the new NodejsFunction from @aws-cdk/aws-lambda-nodejs
. Functionally everything works fine, code in bundled with Parcel and the stack deploys without issues. There are however two (Parcel related?) performance issues:
I'm wondering if this is the expected performance or if I'm missing something? I did some tests with webpack to manually bundle the Lambdas, this is about 3 times faster than webpack. I'm also wondering about the caching feature in Parcel, shouldn't the second performance issue be tackled it?
I've created a simple sample app to show the issue: https://github.com/jaapvanblaaderen/cdk-lambda-bundling-performance
The app contains a single stack with one lambda that's referenced in two tests. In the test output you can observe that bundling an empty lambda function takes about 8 seconds (initially slow) and for the second test it's still around 6-7 seconds (no/bad caching?).
Bundling asset simple-stack/HandlerLambda/Code/Stage...
RUNS lib/simple-stack.test.ts
✨ Built in 8.01s
Bundling asset simple-stack/HandlerLambda/Code/Stage...
RUNS lib/simple-stack.test.ts
✨ Built in 6.66s
PASS lib/simple-stack.test.ts (21.056 s)
Simple Stack
✓ Should have memory size set to 512MB (9505 ms)
✓ Should use Node 12 as runtime (7459 ms)
Test Suites: 1 passed, 1 total
Tests: 2 passed, 2 total
Snapshots: 0 total
Time: 22.211 s
Ran all test suites.
@jaapvanblaaderen currently investigating this... looks indeed like there's an issue with the cache.
@jogold Ok cool, just let me know if I can help testing something if needed.
So... it's complicated and has to do with the content of cdk.out
changing (we are bundling in there in a temporary directory) which invalidates part of the cache...
@jaapvanblaaderen we'll see how #10763 improves the cache performance. Let me know your experience when it's merged and released.
Thanks @jogold, I'll let you know.
@jogold I've done some testing with the latest 1.68 release, which should include the changes in https://github.com/aws/aws-cdk/pull/10763 if I'm correct. I've tested with the exact same test app as mentioned in the issue description and took the average of four test runs (running npm run test
).
Results with parcel 2.0.0-beta.1
Not very promising results as it apparently got 10 seconds slower. I tested a few other versions as well, 1.62-1.66 are similar in performance and from 1.67 and upwards it's around 10 seconds slower.
Now the interesting part... I noticed that there are quite some security vulnerabilities with the currently supported version of Parcel, not really a big issue, but decided to upgrade it using npm audit fix
. This upgrades parcel to ^2.0.0-nightly.425
. Before I could re-test, I had to tweak some stuff in https://github.com/aws/aws-cdk/blob/master/packages/@aws-cdk/aws-lambda-nodejs/lib/bundlers.ts to make the build work: Change the PARCEL_VERSION
and remove the --no-autoinstall
option.
Results with parcel 2.0.0-nightly.425
That 5.4 seconds looks promising 🙂 . I hope this info helps to improve the situation? I guess you can compare CDK 1.66 and 1.67 to see why the performance (from CDK perspective) got worse? Then there is Parcel, I find it a bit weird that the latest 2.x release is quite old especially as they are still in beta and apparently making quite good (performance) improvements.
Also note that there is already a (dependabot PR) for updating parcel in CDK: https://github.com/aws/aws-cdk/pull/10953
You can solve the slow tests problem by mocking out Parcel bundling with this module - https://www.npmjs.com/package/cdk-lambda-nodejs-mock
You can solve the slow tests problem by mocking out Parcel bundling with this module - https://www.npmjs.com/package/cdk-lambda-nodejs-mock
@alan-cooney This indeed works great for running tests, thanks! 👍 Is there also an open GitHub repo linked to this module?
Hey @jaapvanblaaderen . I have implemented a new cdk construct aws-lambda-nodejs-esbuild
See comparison results on my machine:
@aws-cdk/aws-lambda-nodejs
:
Bundling asset simple-stack/HandlerLambda/Code/Stage...
RUNS lib/simple-stack.test.ts
✨ Built in 11.88s
PASS lib/simple-stack.test.ts (84.678 s)s/_qq218h940s33jg73wss4k5w0000gn/T/cdk.outeeruBW/bundling-temp-dc2901b128dbc7ef10d25ae6a7a15a98d5e3ac81c654c3ad726474eee5ec Simple Stack
✓ Should have memory size set to 512MB (13909 ms)
✓ Should use Node 12 as runtime (67285 ms)
Test Suites: 1 passed, 1 total
Tests: 2 passed, 2 total
Snapshots: 0 total
Time: 86.375 s, estimated 133 s
Ran all test suites.
vs
aws-lambda-nodejs-esbuild
:
PASS lib/simple-stack.test.ts
Simple Stack
✓ Should have memory size set to 512MB (127 ms)
✓ Should use Node 12 as runtime (50 ms)
Test Suites: 1 passed, 1 total
Tests: 2 passed, 2 total
Snapshots: 0 total
Time: 2.045 s, estimated 4 s
Ran all test suites.
I forked your sample, so you can see how to migrate
Just saying.
I have implemented a new cdk construct aws-lambda-nodejs-esbuild
Very interesting!!
@floydspace This is great work! I will definitely have a look at this.
The benchmarks look really promising regarding the https://github.com/floydspace/aws-lambda-nodejs-esbuild package. As a regular user of AWS CDK + NodejsFunction constructs, I vote strongly on switching to the esbuild
package internally, instead of the current Parcel implementation.
A side note: we've had good results with using Webpack back in the days before the NodejsFunction construct existed. Are there any plans for evaluating that one as a possible technology to back the construct?
@eladb what's your opinion here?
I've made some tests with esbuild
and it's incredibly fast. This looks like the perfect tool for aws-lambda-nodejs
and would solve some of the tweaking we do to make Parcel work (like playing with the package.json
).
If we go in this direction would you agree to include esbuild
as a real (bundled) dependency in aws-lambda-nodejs
? It has no dependencies, it only installs its binary with a postinstall
script. Moreover, it offers a synchronous Javascript API (with TS typings). This means that we could completely drop the Docker + tryBundle()
implementation. (Docker could still be used when installing packages that are not bundled though)
@floydspace would you be ready to contribute for this?
Hey @jogold , sure would love to.
I have a couple of questions tho:
docker
when installing not bundled packages, we can just use the user's environment. To be honest, I use the forked and adapted for cdk
implementation from serverless-webpack/lib/packExternalModules.js and it's covered and works perfectly, (though I need to adapt tests as well)docker
if we could just spawn parcel
cli as a child process to make it synchronous? It's basically what the esbuild
synchronous API does. Maybe I miss some important detail in how cdk ecosystem worksThank you.
Can I just post the code from my package as is as a first version?
Let's wait for @eladb's feedback
Why to use
docker
when installing not bundled packages, we can just use the user's environment. To be honest, I use the forked and adapted forcdk
implementation from serverless-webpack/lib/packExternalModules.js and it's covered and works perfectly, (though I need to adapt tests as well)
For modules using native dependencies you could want to install in a Lambda compatible environment. This could be an option of course.
Why to use
docker
if we could just spawnparcel
cli as a child process to make it synchronous? It's basically what theesbuild
synchronous API does. Maybe I miss some important detail in how cdk ecosystem works
We started to use docker
because we didn't want to include parcel
as a direct dependency in aws-lambda-nodejs
. It has too many dependencies. See https://github.com/aws/aws-cdk/issues/6340.
Sounds promising but I would still be reluctant to bundle it with the CDK. Going towards 2.0 the CDK will be a single module, which means our bar for bundling deps is going to be very high.
Can we still employ the same docker fallback approach here? What are the downsides?
Seems like a reasonable approach. To let the developers opt for Docker based or esbuild
based compilation. However, let's consider these facts:
Based on these facts, I consider the "meaningful default" to be an option that is FASTER in 99% of the use cases and requires fewer system dependencies (Docker). This option currently is either Parcel or esbuild
.
Long story short, it's about finding the right balance of keeping package dependencies low, and at the same time - keeping the development experience of consumers of the NodejsFunction construct satisfactory.
Just curious: If the NodejsFunction construct declares Parcel or esbuild
as a dependency on a fixed version, isn't it enough to mitigate risks for AWS CDK?
Can we still employ the same docker fallback approach here? What are the downsides?
Yes but one of the problems we have with Docker here is that we need to mount the project root (in order to have access to the top level node_modules
folder, this is a different situation compared to aws-lambda-python
for example). This results in poor performances (almost unusable on macOS). And since installing parcel
locally greatly improves performances without any constraints I think everyone is (or should be) using local bundling. And then the default becomes the exception....
Implementing a local only solution would result in much cleaner code also. We could maybe go back to having an "undeclared peer dependency".
Note: local bundling could still use Docker (as an option) when installing unbundled dependencies that should be built in a Lambda compatible environment.
Why exactly do we need access to the top-level node_modules? I normally have my own package.json in the lambda subfolder and I think bundling should work in this folder without the CDK project, shouldn't it?
That's a "classical" setup yes, but not a necessary one. In our organization, instead, we create Lambdas alongside/next-to the CDK constructs. So there's a single package.json and it's at the root of the CDK app. This makes it easy to centralize third party version upgrades of dozens of Lambdas.
I normally have my own package.json in the lambda subfolder
How do you easily install
such a setup? Nesting package.json
is usually an anti-pattern. There's no reason not to declare all your projects dependencies in one place (along with tooling configurations etc.), especially when bundling. The only exception are monorepos (still there's no nesting) but in this case dependencies are hoisted to the top-level node_modules
folder so you need access to it when bundling.
I see that with bundling this might indeed not be necessary. Sometimes I zip the folder as Lambda asset and then I need it contained in a subfolder. Installing is done using the normal npm ci
command. I am setting up my project with projen so this is called for all my folders automatically.
How do I specify Lambda-specific deps in one subproject and not in another?
I am open to any best-practices and happy to improve my setup.
@alan-cooney This indeed works great for running tests, thanks! 👍 Is there also an open GitHub repo linked to this module?
Currently it's on my company's monorepo, but I can split it out if useful? It essentially just replaces all NodejsFunction classes with the standard Function from @aws-cdk/aws-lambda, and sets the code as Code.fromInline('return;')
rather than bundling the actual lambda.
@hoegertn
Sometimes I zip the folder as Lambda asset and then I need it contained in a subfolder.
Got it, in this case you have no other solution (but you're creating a zip that contains much more than what's needed by your function).
That's exactly the problem we're trying to solve with the NodejsFunction
construct (zero-config, bundling of exactly what's needed). Once we solve the performance issue there won't be any reason for nested package.json
files anymore I think.
Installing is done using the normal
npm ci
command. I am setting up my project with projen so this is called for all my folders automatically.
This means that you're sometimes downloading and installing the same package multiple times. It's also complicated when you want to update packages.
I see your arguments. Performance is one issue, packaging additional files is sometimes an issue, and different dependency versions are sometimes a wanted thing.
But I would love to get rid of all this.
@floydspace we are going to replace parcel
with esbuild
while still maintaining local bundling + fallback to Docker approach for the moment. People installing esbuild
will have very fast bundling performances.
Let me first draft a PR for this then we can work on it together.
Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.
Most helpful comment
@floydspace we are going to replace
parcel
withesbuild
while still maintaining local bundling + fallback to Docker approach for the moment. People installingesbuild
will have very fast bundling performances.Let me first draft a PR for this then we can work on it together.