Azure-functions-host: Cold start of Azure Function with large npm module trees is slow in Consumption plan

Created on 25 Apr 2016 · 127 Comments · Source: Azure/azure-functions-host

Hi--,

I've noticed that the initial call of an Azure function (i.e. a couple of minutes after the last call) takes a significant amount of time (up to 15 secs). Is there any way to pre-warm a function other than initially calling it? Will the startup performance be improved in the near future? I'm using it on the Dynamic plan. Thanks!

Best regards
--Oliver.

high-pri improvement


All 127 comments

Currently, there is a short idle timeout on Function apps, so if they don't get requests for 5 minutes the next request causes a cold start which takes longer. We are looking into increasing this timeout, which will make the situation much less common.

I've seen delays of up to 2 minutes, or even timeouts, with an Azure Function app and Node.js on a dynamic hosting plan. This is terrible! Running Functions app version 0.5.

Since upgrading to 0.5 delays have increased from 15 secs up to 50 secs on my side.

I'm currently mitigating the cold-start issue with an additional pre-warming Azure function that periodically calls the other Azure function like so.

let http = require('http');

module.exports = function (context, myTimer) {
    http.get('http://your.azure-function.net', (res) => {
        context.log(`Successfully called endpoint. Got response: ${res.statusCode}`);
        context.done();
    }).on('error', (e) => {
        context.log(`Error: ${e.message}`);
        context.done();
    });
};

This function is invoked every 4 minutes by a timer trigger with the following cron expression: 0 */4 * * * *
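For reference, that schedule lives in the timer function's function.json; a minimal sketch (the binding name myTimer matches the code above):

{
  "disabled": false,
  "bindings": [
    {
      "type": "timerTrigger",
      "name": "myTimer",
      "direction": "in",
      "schedule": "0 */4 * * * *"
    }
  ]
}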

I tried this. I have a timer calling every 4 minutes. I still get delays of up to 5 MINUTES. Here's the log of it: fine, fine, fine, TERRIBLE...

[Screenshot: function log, 2016-09-05 12:12:35 AM]

So, my classic plan app is running Function App v0.4, and it seems that the debugging messages may work better? I see errors there that I don't get in the dynamic app (the dynamic app just hangs for long periods). Could faults in one function affect the performance of the whole app? Sometimes I even get "BAD GATEWAY".

This is strange. To make sure that this is indeed related to 0.5 and not some other factors, can you try going back to 0.4 (instructions here)?

For me personally, it is not only about the cold start/idle issue, but as I wrote in #529 I think it would be good to have some transparency of the scenarios that are targeted by Azure functions.

If I understand things correctly, the startup time of a function does not only occur on a cold start of a single container, but also affects new instances when the dynamic plan wants to scale out.

This means that if I have an HTTP-triggered API function, and one instance of that function is already running (let's say an execution takes 100ms), a new request could wait until the current execution finishes and then be served, leading to a maximum execution time of 200ms (which is kind of fine).

But instead, the dynamic plan could decide to scale out, and now we have the cold start penalty again. This means that instead of a maximum execution time of 200ms, this request will go above and beyond 5 seconds.

Could you explain how you are handling these cases?

@peterjuras let's keep this issue focused on the potential cold start regression which is causing cold starts of over 1 minute for some users. We can use a separate thread for a more general discussion of how the system works.

For users running into these extreme cold starts, various questions that would help investigating:

  1. Does 0.4 vs 0.5 make a difference?
  2. What functions are in the Function App? Is it just the one JS function, or are there others? In particular, do you have any F# functions (there may be an issue there)?
  3. Can you share the name of a Function App where you saw this, and the UTC time of one specific slow cold start?

So I did a long running test again where I ping every 3 minutes; the response times are all under 1,000 ms unless one of the scheduled jobs fails to fire. Then, after the six-minute lapse, the response time goes to a whopping 150,000-280,000 ms. I will try reverting to 0.4 and see what happens.

@vance so you mean you are measuring http functions, but the app happens to also have a scheduled function? What's the function's schedule? Where does 'six minute lapse' come from?

I mean I have a timer triggered function doing a HTTP request to my other function. It is set to 3 minutes. However, sometimes it does not fire, and six minutes elapse. The next time it runs, the execution time is extreme. Sometimes it's so slow, it creates a ripple effect where it goes from 280K to 140K to 30K or so, before it finally returns to under 1k MS for the next 10-20 calls.

That's very odd behavior. If you can share your Function App name (either directly or indirectly), and the UTC time when you saw this, we can hopefully analyse the situation.

Sure, here's a dummy app in the same region. fscape-dynamic3. Any function endpoint shows the extreme time in the logs, I think.

The screenshot is dated

05 Sep 2016 19:12:35 GMT

so it should be around this time.

So I reverted to v0.4 with similar results. Also, I don't know if it's related, but sometimes the "pinged" services even time out, or issue a CORS error, while other requests work. I tried just having a local script run a query every 10 minutes; see that result in the second image. No server changes happened at this time.

[Screenshots: response-time logs]

I'm seeing similar symptoms. Although this might be an unrelated issue, in my case most of the time seems to be spent loading the azure-storage library. Any ideas if this is normal or how this can be fixed?

context.log('requiring azure');
const azure = require('azure-storage');
context.log('required azure'); 

2016-09-10T21:24:46.626 requiring azure
2016-09-10T21:27:04.641 required azure

Actually, this might explain a lot of the delay. I think it may be multiplied if you hit a bunch of functions simultaneously when coming out of sleep. Maybe it pegs the I/O for the VM. I do know that this dependency seems rather large...

I have no idea how this virtualization is engineered, but I wonder if there's some way to snapshot these static resources as globals in a shared memory location that's immutable. Just a thought, don't know if that's even possible.

Yeah, keeping it in memory might not be possible in some scenarios (at least not within the host process, which might restart etc.). I think the main question though is why does even fresh initialization take so long? azure-storage indeed has tons of dependencies (I was shocked when I first saw it, way too many for a library that effectively is just a simple REST API wrapper). But when I run the same code locally (in a new node process) it executes instantly without any issues, which makes me think it could be an Azure runtime issue...
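For what it's worth, that local measurement is easy to reproduce with a three-line script (a sketch; assumes azure-storage is installed locally):

// measure-require.js — run with: node measure-require.js
console.time('require azure-storage');
require('azure-storage');
console.timeEnd('require azure-storage');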

Wonder if it's the network or storage used. Maybe when these instances come back from idle, the data must get loaded over the network since they are not bound to any particular hardware (or VM)? I did not see these long startup times when hitting Classic Service Plan app (with AlwaysOn turned off).

Good point @vance . Deleting an "average" node_modules folder on a dynamic plan takes well over 5 minutes in many cases. IOPS seems to be very low...

Any way to have the container manager or whatever app pool manager cache the resources on the container disk, so they at least come back to life without going over the network? ... at least for the first instance.

Looks like the issue is still happening. Running a simple script that has one line to require 'azure-storage' takes more than a minute. What's the plan to address this issue? Is there any timeline? We need to see if we can keep using Azure functions, because currently this looks like a blocking issue.

Also just tested this with the Classic plan. The delay is 5 seconds, not as bad as the Dynamic plan, but still seems higher than normal.

This is still happening on 0.6, and we seem to be seeing huge delays when the function has been inactive for a while.

Normal behaviour is around 200ms but when inactive for a while we are seeing 2+ minute response times.

We are using dynamic plan, 512mb, node http triggers.

Yes, we have an issue when dealing with large npm trees and 'require' directives. The reason is that the file system we're on is Azure Files, which is very slow.

We are looking at options to optimize this.

Has there been any progress on this? We are seeing 30+ second cold startup times on some very simple c# azure functions.

@lemkepf I just renamed the issue to better capture the core issue we are discussing here, which is a Node/npm specific thing.

Since you're using C# with a simple function, you are likely running into something else. To avoid merging the two, can you open a new issue with more details:

  • Your function app name (that's the app name, not function name!), either directly or indirectly
  • UTC time at which you observed this
  • Whether it was cold start or whether your function had just been hit before

Any updates on this? The warmup time for our app is normally around ~10 secs; the node_modules folder is about ~25 MB. Is there a recommended workaround in the meantime, like using the App Service plan instead of the Consumption plan?

[Screenshot: disk usage diagram, 2016-11-29 11:37:54 AM]

Sorry, no significant update yet, but we are looking at approaches that will make this faster.

Using a Classic app service plan should indeed help a bit as it uses a faster file system. Still, it will be slower than what you'd see on a local machine.

@davidebbo thanks, does the Classic app service plan still shutdown after a certain amount of idle time?

@cloudkite not if you use a dedicated plan (Basic/Standard/Premium) and turn on Always On.

Of course, at that point you're no longer using a 'serverless' offering, but for some scenarios, it can still be a good approach.

Great, thanks, will use that as a workaround until module loading speed is usable on the Consumption plan.

Hi team - working on a demo project to explore "serverless" designs with my customers, and running into this same issue of slow response times.

Essentially, I have a few functions tied together via ServiceBus topics that do some basic tasks such as send SMS messages (via Twilio), insert data into DocDB, and even one that responds to HTTP requests from the browser (running a DocDB query).

Since the browser endpoint has visible impact, the delay is quite noticeable - often upwards of 10 seconds, but will then be <100msec for a few requests before going back to ~10 seconds. The DocDB metrics show consistent query times of a few milliseconds, so I'm pretty sure this is just the Function runtime overhead causing the variability.

These functions are written in NodeJS, and both have fairly large package trees. I have one function written in PowerShell (just to test) that is running from a Timer that seems to be fine - it's mostly the NodeJS ones that seem impacted.

Glad to share any other details needed.

(@cloudkite how did you generate this nice diagram?)

@yvele its a mac app https://daisydiskapp.com/

Any update on this? I'm on the latest (~1) runtime, using standard plan & Always On checked:
[Screenshot: Always On setting]

Still getting around 15s cold starts on each fn; subsequent runs are less than a sec. (This was 1-2 mins on the Consumption plan.)

node_modules is around 20MB (using aws-sdk, request, queryparser)

Didn't figure out the timeout on standard plan, but seems like more than 5 mins..

@jarben I don't know if that can help with Azure Function startup, but I have a cleaning script that removes useless files in node_modules. I made a private gist of it for reference: https://gist.github.com/yvele/32bd59dbf88bd23f8ad2468f8a32391e (that's only a hacky script for now but I'm gonna make something more robust)

After cleaning, my application went from 20 MB to 10 MB... My CI with deployment script is faster... But I don't know if that can help with startup time.

Thanks @yvele for the suggestion! Every second is definitely worth saving, although even 5-7s is still way too much.

I think it's acceptable when cold starts are maximum of 1-2s but anything above is a show stopper for us.

Also, not sure why this is still an issue when Always On is activated as part of the Standard (S1) plan; as far as I understand it, it should be a dedicated VM as per this description?

[Screenshot: plan description]

@jarben I changed the issue title to clarify that this is discussing the Consumption plan. When you use dedicated plans like 'Standard' with Always On, you should not see any slow downs, as the app stays warm all the time. If you do, let's discuss separately.

Sure @davidebbo , I'll make a sample demo and file another issue.

My function (consumption plan) has 5 lines of code to make a http request using the require('request') module. Cold start is always about 28 seconds!! Then 90ms or so for subsequent requests until it's unloaded again.

I'm surprised node.js support for Functions GA'ed.

I can understand why it's so slow if it's loading individual files over a remote network share. With AWS Lambda you zip up the function and node_modules folder as a single file for deployment.

The most problematic thing is that sometimes we have timeouts on cold starts (today I had one in Node.js with consumption plan in version ~1)

Any updates on that issue? It is a bit confusing when it looks like serverless concepts are not for real-life scenarios :(
I believe many of these cases are related to huge libraries for Azure-related services, so if it is not possible to connect these services with full-functionality bindings, then maybe a different fix would be possible.
If many people are installing and then loading the same copies of Node.js libraries, why not try to store the most common libraries in one fast directory (like ultra-global libraries or straight npm) instead of thousands of separate copies in individual users' directories?
I believe the real structure might be very different, but looking at Kudu it looks like you are creating something like an OS-level VM, when in my opinion it should be something much closer to a function-level VM structure.

We are actively looking at ways to make this faster, but it is still going to take a while longer to get this right. We're hoping to find some initial bits people can try in about a month.

The suggestion to have a local copy of those packages on the local drive is something we considered, but in practice it would not work well. The problem is that all those packages are constantly updated, and we don't want to force everyone to use the same one. Also, it breaks down the minute you need some other package that is not in the local list. So we're looking for a more general solution that will limit access to the fast drive, without reducing generality.

Same issue here... 2-3 minutes cold start time (!) sometimes timeouts too.. ironically it's the Azure Storage SDK filling up node_modules..

For anyone interested I managed to find a reasonable workaround, dramatically reduced the cold start time for some of my functions from 2-3 minutes (or even timeouts!) to couple seconds. Here's what I did:

  1. node_modules went into .gitignore
  2. Created a dependency file which requires the modules I need from node_modules
  3. Created a single server-side bundle file using browserify, i.e. browserify --node deps.js > deps.bundle.js
  4. Require the bundled dependency file instead - even better, an uglified version (uglifyjs or similar)

I've set up a simple watcher: whenever package.json changes, the bundle gets recreated so I don't have to think about it. That's it, no node_modules no cry.
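Such a watcher can be a few lines; a sketch assuming the chokidar package (any file watcher would do):

// watch-deps.js — rebuild the bundle whenever package.json changes
const { execSync } = require('child_process');
const chokidar = require('chokidar');

chokidar.watch('package.json').on('change', () => {
  execSync('browserify --node deps.js > deps.bundle.js', { stdio: 'inherit' });
});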

To keep the functions warm I'm intending to set up a cron function which calls all functions every 5 minutes.

@davidebbo Is there any way I could store and require cross-function / shared modules? i.e. something like require('../shared/azstorage.js')?

@1N50MN14 Nice! I'm gonna try that but with webpack instead of browserify ;)

You may have a look at https://github.com/asprouse/serverless-webpack-plugin

@1N50MN14 that is exactly how we handle extensions to our code. We have a lib folder as a function that holds a bespoke SDK. Then we reference this from within our other functionapps.

Example:
/Library/sdk.js
/Function1/index.js

In /Library/sdk.js

var https = require('https');
var RestClient = (function () {
    //Some code
    return RestClient;
}());

exports.RestClient = RestClient;

In /Function1/index.js

var sdk = require('../Library/sdk');

module.exports = function(context, req) {
    //Some code
    var rest = new sdk.RestClient(context, url);
};

for those considering bundling nodejs code with webpack, this might be helpful https://github.com/palmerhq/backpack

@kevinbayes Ah cool thank you good to know it's possible - this should reduce the need of having to update the functions so very often :)

Now we just need to convince the IoT Hub team to support CORS so we could call our functions via websockets..

I was hoping webpack would be able to solve the problem. I'm glad to see that others have tried it.

I am also seeing start times upwards of 2 minutes using Azure Storage and other dependencies. Once warmed up, the function takes about 5 ms to respond.

I will try webpack and removing any need for node_modules.

I can't imagine what is going on behind the scenes, it's almost as if it is reinstalling node module dependencies at each cold start.

If that is true, why not prepare this at deploy time:

Idea: Resolve and Store Node Modules at Deployment

  • Install dependencies into node_modules
  • Zip node_modules and save to a blob
  • At cold start, load from blob and unzip

Now, I can see that getting large or unwieldy because of deep dependency packages. So here is another idea:

Idea: Auto Package Node_Modules

Perhaps there is a way to create a standard deployment script when using Git Deploy that would bundle all the node_modules with tree shaking for each individual function.

I can see this not working with some modules, but there could be a way to configure this in .deployment.

Anything is better than a 2 min cold start which is a complete failure for production usage.

Also, as far as I understand, anytime this scales out the new function instance would also have the same cold start which is strange.

Anyway, I hope webpack can solve the problem for me.

Catching up on this. @1N50MN14 your approach sounds quite interesting! I see mentions of browserify, webpack, backpack, which seem to be alternatives for achieving the same thing? It would be great if one of you could post step-by-step instructions on making that all work with an Azure Function, starting with a Function that has a large npm tree. Thanks!

I can't imagine what is going on behind the scenes, it's almost as if it is reinstalling node module dependencies at each cold start.

@ricklove It's all caused by a file system that's way slower than we'd like (it uses Azure File storage).

The approach we've been looking at involves zipping up all the contents into one file, and then use a special file system handler which basically uses the zip content as a read-only file system.

In some ways, it's probably analogous to the browserify/webpack techniques others describe here, but applied to the whole content (not npm/Node specific). But there are challenges to making this work well, which is why it's taking time, and also why I'm curious about the webpack approach, which could provide a good alternative. :)

@davidebbo backpack is a build system around webpack; you can safely ignore that for now and focus on webpack and browserify - they're both "smart" module bundlers allowing you to write CommonJS/Node-style code that runs in the browser. Instead of loading multiple <script> tags, you require a Node.js module the same way you would server-side. When you bundle a file (say index.js), both browserify and webpack will look up the modules you actually require/use and output a single file which contains all of the code that index.js needs to run.

Using the same concept, both browserify and webpack allow for generating server-side bundles (in the case of browserify using the --node flag), and both are very flexible, supporting transforms (say you like to write your code in TS, a transform will take care of compiling that to JS at bundle time) and shims (turning non-CommonJS code into a Node.js module); you could even bundle images if you wanted to.

The overall idea here is to avoid hitting the filesystem altogether by loading a single file once. To optimize things further, we minify the bundle file; my "all-modules-big-bundle" went from 350K to 100K minified, and it loads very fast.

Instead of creating a separate bundle for each Azure Function, which was my initial approach, I ended up bundling all of the modules that I used across my functions and place them under /shared, then in each Azure Function I simply require('../shared/modules.bundle.min.js').

Despite the obvious overhead drawback in functions where I don't need to load all modules, it was still faster to load 100K into memory than to perform multiple requires, and most importantly continuous integration is no longer a pain and no longer suffers from timeouts, because each function consists of a flat file tree; if I need to update a dependency, the only folder which needs to be touched is shared, rather than having to update all relevant functions.

The whole thing is now performant enough that I'm running my tests directly against Azure Functions, and cold starts are now averaging 2 seconds; sometimes I can't tell whether a function cold started or not.

I'll put up a sample repo over the weekend with README / detailed instructions, hope this provides a better idea in the meantime ;)

@1N50MN14 thanks for all the info! So I'm trying this, and probably doing something dumb. My original test function has:

module.exports = function(context, req) {
    var azure = require('azure-storage');
    var tableService = azure.createTableService();
    // etc...

Then I create a deps.js with:

require('azure-storage');

And I run browserify --node deps.js > deps.bundle.js. Then I upload the big deps.bundle.js to my function and delete the node_modules folder. What I don't get is: how should the code change to consume this? I tried:

module.exports = function(context, req) {
    var azure = require('./deps.bundle');
    var tableService = azure.createTableService();
    // etc...

But that blows up with azure.createTableService is not a function. So I think I'm just misunderstanding the export model that comes with using such a bundle. Thanks for your help!

@davidebbo No worries! Assuming your function is in index.src.js

module.exports = function(context, req) 
{
    var azure = require('azure-storage');
    var tableService = azure.createTableService();
    // etc...
}

You would normally run browserify --node index.src.js > index.js, where index.js is a bundle which contains your module.exports along with all of the code that needs to run. But for some reason (I might be wrong), Azure Functions checks for the physical presence of module.exports = function(context) and throws an error otherwise (the bundle starts with an anonymous function).

Instead I'm using this approach, which might even be better because I could (optionally) reuse the same dependencies in multiple functions.

/*
deps.js 
require your dependencies here, had no choice but to pollute global, 
then again "global" is singleton here so we're fine. 
*/
global.deps = {
  azurestorage: require('azure-storage')
};

Then
browserify --node deps.js > deps.bundle.js

And in your index.js:

require('./deps.bundle');
module.exports = function(context, req) 
{
    var azure = global.deps.azurestorage;
    var tableService = azure.createTableService();
    // etc...
}

Your Function should now only consist of index.js, deps.bundle.js (and deps.js), package.json and function.json. node_modules goes into .gitignore to avoid accidental upload.

You could also:
npm install uglify -g
uglify -s deps.bundle.js -o deps.bundle.min.js

and require('./deps.bundle.min') instead, loads way faster.

Yep, that works, thanks! I'll play with it some more.

Now one big question is: to what extent could this be automated, such that we could make things 'fast' when the user uploads something 'slow'? I'd guess with some parsing we could extract all the requires from the function, create the pack, and then change the code to use it, e.g. as something that could happen as a post step during git deployment. Do you think that's a good direction, or is it best to let it be an explicit user step?

@davidebbo imho it's nearly impossible to automate this and accomplish 100% coverage at the same time; it's the nature of the beast, I've seen it with Lambda as well, especially when/if you support direct install from package.json (how do you deal with binary modules, aka node-gyp based, etc.). What you're describing is pretty much a build chain (hence my previous comment to ignore backpack for now), and there are several of those available and widely used by the community; both gulp and grunt work just fine with browserify and/or webpack (there's also Lasso), and one could combine them with LiveReload and many other available modules.

In my opinion this belongs to user land, given proper documentation and "best practice" examples on how to optimize functions for speed/performance.

@davidebbo I agree with @1N50MN14 that this belongs in user land.

Bundling may fail depending on what Node.js modules the project includes.
Webpack and Browserify will work fine for frontend bundling, but may fail to bundle some backend-specific Node.js modules, as they may be using require in weird ways (dynamic requires; see the sketch below) or have binary dependencies.
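For example, a dynamic require like this defeats the static analysis both bundlers rely on (illustrative sketch):

// The bundler cannot know at build time which module this resolves to,
// so it either fails or silently leaves the dependency out of the bundle.
const driverName = process.env.DB_DRIVER || 'mysql';
const driver = require(driverName);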

It's actually interesting, azure-storage (since it was mentioned in the sample) fails to bundle for browser use (https://github.com/Azure/azure-storage-node/issues/198) whereas it succeeds for node.js use - I'm assuming modules that depend on compilation upon npm install are the ones most bound to break (which kind of makes sense given different environments)

@davidebbo If you're thinking of implementing this as a deploy script, I don't see any reason why this couldn't be a user-land choice, but still built in as a feature.

e.g. You create a new function-app and there is a drop-down that gives a list of a couple of built-in deploy scripts. You have tooltips/notes which let you know the advantages of one over another; and big warnings that you may need to update the script if it fails (and a link to the relevant documentation that tells you how to do this).

I for one would absolutely love it if there was a default option that worked in "most" cases that I could easily select, and if it failed, then use it as a basis for any more tinkering that I needed to do.

Having made a number of custom build scripts to do Gulp, etc., it is a pain to troubleshoot those and get them working as designed, since each time you have to push a change to source control to see if the new fix worked... and then wait. If there were example scripts already that did most of it, this would greatly reduce time, even if they had to be customized.

@davidebbo

I spent this morning updating my build tool to use webpack and it is working great (well ~40x better at least) now:

[Chart: Webpack performance comparison]

You can look at my build tool here, but it's for typescript and does some extra stuff:

https://github.com/toldsoftware/azure-functions-server

This is the config I use to call webpack (v2.2.0, but I think any version will do):

    entry: {
        // './EXAMPLE.webpack.js': './EXAMPLE.js',
    },
    output: {
        path: './',
        filename: '[name]'
    },
    target: 'node',
    node: {
        __filename: false,
        __dirname: false,
    }

Note: Although I was using TypeScript as the original source, I actually first transpile to ES5 with CommonJS modules and then use webpack on the .js files.
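That transpile step might look like this in tsconfig.json (a sketch; the src/lib folder names are illustrative):

{
  "compilerOptions": {
    "target": "es5",
    "module": "commonjs",
    "outDir": "lib"
  },
  "include": ["src/**/*"]
}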

Point taken about not trying to do it all by magic. Offering the option via deployment script does seem like a reasonable approach to look into.

@ricklove thanks for sharing your perf numbers and tools. If you try hitting it every 10 minutes, you will likely more consistently get a big difference, as the Function App would have fallen asleep by then.

Ah, I assumed it shut down after 5 mins (and got impatient sometimes). I also noticed having it open in Azure Portal seems to prevent it from shutting down or something. I'll set up an automated test to ping it every 12mins and will be able to give you better numbers.

For anyone that's interested in talking to their super awesome Azure Functions over websockets please +1 issue #1139

//cc @davidebbo

@1N50MN14 I've searched for that before, it would be great. I have settled on using long polling.

@ricklove Yup, unfortunately long polling is not an option for me :( crossing my fingers someone will take notice. I've seen a couple of Stack Overflow posts and even a request about this issue dating back to late 2015; I was surprised that CORS support was even a question..

Guys, would be best to discuss that on #1139, as it's pretty unrelated to this issue :)

Ok, I set up an automated timer to test this and have consistent results now: Using Webpack is 25x faster.

[Chart: Webpack performance comparison]

@ricklove thanks for sharing your numbers!

I think the key here is to iron out the process to make it easier for any Node developer to move to this model. This can hopefully be done with a mix of:

  • Good documentation, walkthroughs, samples, etc,,,
  • Possibly some tooling that makes this easier
  • Possibly some Kudu deployment script support to help automate some of it as discussed above

I think, first of all, there should be a friendly warning (and perhaps an explanation) in the portal that, as of now, cold start performance may be several seconds (or even minutes) when using external dependencies.

This kind of ruined my Azure honeymoon: I spent a few hours happily writing Azure Functions, blissfully unaware that, for my use-case, where a cold start may not take longer than 1 second, those Functions are completely unusable right now...

Partial workaround:

I have 4 functions, each using mongodb. Instead of installing mongo in each of the function-directories, I install it in the root folder (e.g.: D:\home\site\wwwroot). Then, in my functions, I require mongo as follows: require('./../node_modules/mongodb'). This reduced the cold-start time from >30s to <100ms.

Edit: Nevermind. The first function call is still slow. However, only 1 function needs to be "woken up" - then the other 3 do not experience slow cold starts.

Edit: Nevermind, again. This does not work reliably.

@MarkTiedemann agreed that we'll want the UI guiding users toward this technique.

Yes, the (edited) behavior you describe is what I would expect. This helps a bit but does not solve the fundamental issue that at some point, the big module tree is going to need to get processed.

@MarkTiedemann, I have had some improvement running a keep-alive timed trigger that runs every 4 mins and calls my functions. I pass a query ?ping=true, which returns "pong" instead of trying to process the request.

        if ((req.query as any).ping != null) {
            context.done(null, {
                status: 200,
                headers: { 'Content-Type': 'text/plain' },
                body: 'PONG' as any,
            });
            return;
        }

From: https://github.com/toldsoftware/azure-functions-server/blob/master/src-server/example-timer-keep-alive.ts

    let urls = [/* urls */];
    let doneCount = 0;

    let callDone = (url: string) => {
        doneCount++;
        context.log('Keep Alive END: ', url);

        if (doneCount >= urls.length) {
            context.done();
        }
    };

    for (let x of urls) {
        context.log('Keep Alive START: ', x);
        let http_s = http;
        if (x.match(/^https/)) {
            http_s = https;
        }

        http_s.get(x, (res: any) => {
            console.log('statusCode:', res.statusCode);
            callDone(x);
        });
    }

    let timeStamp = new Date().toISOString();

    if (timer.isPastDue) {
        context.log('Timer is Past Due');
    }
    context.log('Timer started!', timeStamp);

I don't know what this does to costs, but I guess I'll find out at the end of the month.

Yeah, I am doing the same to keep my functions warm.

BTW, the first 1 million executions per month are free. If you run your keep-alive function every 4 minutes, there are 60 / 4 * 24 * 30.4 = 10944 function calls a month (times 2 for the function being called by the keep-alive function), so plenty of free function calls left. :)

@MarkTiedemann Of course, that is assuming the keep-alive is in the same azure functions app, I might just create a keep-alive functions app that is dedicated for calling others (and that way it can go through a list of urls at once). In fact, that also allows me to create my own service health monitoring system that can send me an alert if I have a non-responsive service... I love azure functions I can make so many tools so quickly now.

Great job everyone who made this happen!

Okay, so I've attempted the concept of a Webpack Azure deploy script. (Don't use it yet, 'cuz it's not quite working yet. See below.)

The idea is that you:
1) Deploy code from source-control (goes to site/repository)
2) Run NPM install on all the folders (but run the NPM Install in site/repository).
3) Run some magic to go through each index.js, pull out all the requires, create a new dependency file, run Webpack/uglify on that file, and update the index.js to use the dependency file.
4) These files are then pushed to wwwroot.

As a result site/repository is your normal code with all your node_modules, etc; and site/wwwroot is the WebPacked/uglified code.

To be all fancy shmancy, I figured I'd implement the deploy script as its own project (See link above), and include it as a git submodule on my main project. Therefore, one deploy script to maintain potentially for multiple projects.

If this worked, it would be a beautiful thing. I'm still debugging a few of what I believe are minor issues to get it running properly; but it's taking forever to debug with the extremely slow file system.

Please, please, please fix that issue in addition to speeding up Azure Functions. Yesterday I had a deploy take over 40 minutes, the equivalent of which ran in 45 seconds locally.

@securityvoid You might reference my build tool: https://github.com/toldsoftware/azure-functions-server

The process is in the folder src-cli:

  • Build all source to "lib" folder (I use typescript, so I use tsc -w for that step).
  • Build a "deployments" folder that includes a FUNCTION.source.js file inside each function folder. This points to the source in the "lib" folder and requires the correct main method for each function.
  • Use webpack against all the *.source.js files (this allows webpack to package them simultaneously without needing to reload modules). This only includes the required modules for each target. Also, webpack 2 does tree shaking that should reduce the output size. (However, I think this only works with ES6 modules.)

I run this locally each time before I deploy to production.

The only problem with this is that it has a lot of generated code in the source control.

Concerning using a similar process as a deploy script: I would worry about deployment time causing downtime on the service. There is no staging support for azure functions, so a long deployment might cause your service to be down for a period. (Although, it might be possible to use Azure Traffic Manager with 2 azure function apps and do a scripted staging with that.)

@ricklove Thank-you, I actually was looking at your project earlier.

There should actually be minimal downtime with the deployment script even if it takes forever to deploy. Until you mess with site/wwwroot your old code is live. I build everything under site/repository/dist, then do a move of wwwroot to wwwroot2, and then a quick move of dist to wwwroot. After this is done the delete for wwwroot2 is triggered. Moving files seems to be the fastest operation on the slow file system, and this final move operation only takes a few seconds.
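The swap itself can be a couple of renames at the end of the deploy script; a sketch of the approach just described (paths as above, error handling omitted):

const fs = require('fs');

// The old code stays live in wwwroot right up to this rename
fs.renameSync('D:/home/site/wwwroot', 'D:/home/site/wwwroot2');
fs.renameSync('D:/home/site/repository/dist', 'D:/home/site/wwwroot');
// Deleting wwwroot2 happens afterwards, off the critical path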

My problem right now has something to do with the file filter function I wrote to avoid copying everything. Somehow the file paths resolve differently locally than on Azure, which results in node_modules not being filtered out. As a result it's copied everywhere. This slows EVERYTHING down, and once I get this fixed it should make everything a lot faster.

I'm hoping the final deploy will be 5-10 minutes. If it is, I'll definitely prefer the deployment script over having the node_modules in the repository.

I'm new in the Azure Functions world. I created a simple function which is using some npm packages. I have the similar problem with the loading of the npm packages when there is a cold start. Because I'm new in this world and my Node.js experience is not huge, I don't know how I can increase the performance with the tool made by ricklove. Could someone please create a step-by-step instruction for it? Thank you. Right now I'm coding in Visual Studio in TypeScript and uploading the generated .js-files via FTP to Microsoft Azure.

@picarsite I wouldn't recommend using my build tool since it is rather advanced and does a lot of things behind the scenes that is specific to my preferences.

Perhaps @securityvoid 's solution is something more appropriate for general use.

@picarsite You should be able to follow the directions in the README and get my deploy script working. Just realize that the first time you run it, it will take a LONG time if you have a lot of npm installs. It also currently shows that the deploy failed when viewing the logs via the portal.azure.com site, even though the deploy worked fine. (If you see output in the log on the move of dist to wwwroot, it completed properly.)

I'm still fine tuning these details and will hopefully get that last error cleared up here soon.

BTW, going to:
http://YOURAPP.scm.azurewebsites.net

Can be very useful for manually running the npm installs for each package.json within:
D:\site\repository

My app took 40+ minutes for the deploy to run with the npm installs originally, but now deploys in around 10 min. The code is all webpacked and uglified.

@securityvoid I have some trouble to get your tool to work. When pushing my project to the azure git repository the deployment fails:

remote: Running Webpack...
remote: return binding.lstat(pathModule._makeLong(path));
remote: base:D:\home\site\repository folder:checknumber outFile:azure.deps.js outputDir:D:\home\site\repository\dist\checknumber
remote: ^
remote: An error has occurred during web site deployment.
remote:
remote: Error: ENOENT: no such file or directory, lstat 'D:\home\site\repository\lib'
remote: at Error (native)
remote: at Object.fs.lstatSync (fs.js:982:18)
remote: at Object.realpathSync (fs.js:1647:19)
remote: at D:\home\site\repository\.deploy\deploy.js:20:18538
remote: at Array.map (native)
remote: at Object. (D:\home\site\repository\.deploy\deploy.js:20:18513)
remote: at t (D:\home\site\repository\.deploy\deploy.js:1:169)
remote: at Object. (D:\home\site\repository\.deploy\deploy.js:20:15566)
remote: at t (D:\home\site\repository\.deploy\deploy.js:1:169)
remote: at r (D:\home\site\repository\.deploy\deploy.js:21:22477)
remote:
remote: Error - Changes committed to remote repository but deployment to website failed.

@picarsite I apologize, I'm still in active development of this. Pull the dev branch rather than the master and that should work. The deploy script is not uglified itself yet. Uglify apparently has issues self uglifying (Which is what you're running into with that error).

I tried your dev branch but it does not work either. I assume I'm doing something wrong? I checked the 'D:\home\site\repository\.deploy\' folder and it is empty. Should it be empty?

remote: Updating branch 'master'.
remote: .........................................
remote: Updating submodules.
remote: Preparing deployment for commit id '2ca45d2e50'.
remote: Running custom deployment command...
remote: Running deployment command...
remote: 'D:\home\site\repository\.deploy\deploy.cmd' is not recognized as an internal or external command,
remote: operable program or batch file.
remote:
remote: Error - Changes committed to remote repository but deployment to website failed.

@picarsite We should probably move this off this thread. Can you open an issue at:
https://github.com/securityvoid/.deploy/issues

Just looking at that, it looks like your path in your .deployment script might be wrong.

I made a very simple example here which is a bit more minimal, which might be better for some folks: https://github.com/christopheranderson/azure-functions-webpack-sample

Instructions:

  1. Add the webpack.config.json into the root of your Function App

  2. Add webpack as a dev dependency and "build":"webpack" to your scripts section in your package.json

    "scripts": {
        "build": "webpack",
    },
    "devDependencies": {
        "webpack": "^2.2.1"
    }
    
  3. Add an index.js file at the root of your Function App which references all your JS Functions as in the example

  4. Add the entryPoint and scriptFile properties to each of your function's function.json files.
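For step 4, a hedged sketch of what one function's function.json might end up looking like (the ../index.js path and myFunction entry point are illustrative; they must match your webpack output and the exports in your root index.js):

{
  "bindings": [
    { "type": "httpTrigger", "name": "req", "direction": "in", "authLevel": "function" },
    { "type": "http", "name": "res", "direction": "out" }
  ],
  "scriptFile": "../index.js",
  "entryPoint": "myFunction"
}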

Keep in mind that the above will not work on the current release (1.0.10690). The CLI preview is a bit ahead at this point, so it will work locally with the CLI and the VS tools, but those bits are not in production yet.

So to make it clear, the above will work on deployed versions > 1.0.10690

Anyone curious about my last post - I wrote a quick tool to do the transform for you, still a bit experimental here: https://github.com/christopheranderson/azure-functions-pack

npm install -g christopheranderson/azure-functions-pack
funcpack

Keep in mind it will only work locally on the latest CLI until we do our next release early next week.

@christopheranderson With AutoDeploy from Git, am I correct in understanding your solution would require returning your node modules to your source-control (At least the webpacked version)?

@securityvoid what do you mean by 'returning' in this context? I would think it should be possible to run the tool as part of deployment, and not have any node modules committed to your repo.

Once the new Functions runtime that the tool needs is available, we'll try to come up with a Kudu custom deployment script that uses it, and we can iterate from there. The tool seems quite promising from my initial playing with it.

@davidebbo Interesting... I'll have to try it out. I wonder if he handles dynamic includes. That was a fun part to get working in my deployment script :-).

@davidebbo Chris's tool is very cool. Being able to set the entry-point simplifies everything from a Webpack standpoint (And gets rid of the need of 3/4 of my deploy script :-P) . When is that update going into Azure?

@securityvoid hopefully later today!

@davidebbo Hmmm... a few of my functions are running fine when they're cold, and erroring out now when they've run once recently. Hopefully unrelated to the update?

@securityvoid update is not out, so definitely unrelated. Please open new issue to discuss if needed.

@securityvoid ok, the new bits are live!

Sorry, due to an unrelated issue with the new build, we had to pull it off. It'll be back in a few days after we can fix those issues.

@davidebbo Why not just add a standard warm up route to every function app that is called before traffic is routed to the instance of the function app? This could work like authentication and could be automatically called after a deployment. For example, every function app could have a route like ".azurewebsites.net/.healthcheck".

It seems to me that this would fix the scaling out scenario. You could then enable a keep alive option that would prevent the app from completely shutting down.

@Stewartarmbrecht keeping the app alive (e.g. via timer) is a workaround, and is mentioned above. But the whole idea is that we want the cold start to be faster when the app really is cold. Keeping every Function App always hot is not a good option for the system as a whole.

I can also confirm that using Webpack I switched from 5-6 minutes cold start (but mostly timeouts with bad gateway) to 5-6 seconds cold start!


I'm using Express.js with azure-function-express (I have a single entry point in my Azure Function to handle all my endpoints).

My webpack.config.babel.js is really simple 😉 Hope it may help someone

import path from "path";
import webpack from "webpack";
import CopyWebpackPlugin from "copy-webpack-plugin";

const pathRoot = (...args) => path.resolve(__dirname, ...args);
const pathSrc = (...args) => pathRoot("src", ...args);

export default {
  target: "node",
  module : {
    rules : [{
      test    : /\.js$/,
      exclude : /(node_modules)/,
      loader  : "babel-loader"
    }]
  },
  plugins: [
    new CopyWebpackPlugin([{
      from : pathRoot("functions")
    }])
  ],
  entry : [
    "babel-polyfill",
    "./src/handle.js"
  ],
  output : {
    path          : pathRoot("dist"),
    filename      : "routes/handle.js",
    library       : "handle",
    libraryTarget : "commonjs2"
  }
};

PS: Next step will be code minification (really simple with Webpack) 😉

@arafato With your "keep-warm" function, do you need to hit every function in the function-app; or is hitting a function in the function-app sufficient to keep the whole app warm?

@securityvoid

In my recent testing, I saw a global variable being shared across multiple node functions.

So it seems that they are running under the same process, and therefore a keep-warm should work by just calling a single function.
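That behavior is easy to observe with a counter on global (a sketch; deploy the same body as two functions in one app):

module.exports = function (context, req) {
    // Both functions see and increment the same counter,
    // which shows they share one Node process.
    global.counter = (global.counter || 0) + 1;
    context.res = { body: 'counter = ' + global.counter };
    context.done();
};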

FYI, the change mentioned above which allows @christopheranderson's tool to work is again deployed, this time hopefully for good!

Are any other developers having trouble with webpack and babel-polyfill?

Error: only one instance of babel-polyfill is allowed

You may understand that global vars may be shared between Azure Function instances and deployments.

More information in a dedicated issue https://github.com/Azure/azure-webjobs-sdk-script/issues/1234

Okay, I'm having some issues with the Webpack technique and I'm wondering what other people are doing.

I've mentioned this before, but I created a build script. It goes through each function's index.js and extracts every require. These then go into a single, central JavaScript file, as @1N50MN14 mentioned earlier.

e.g.
global.deps = {
    package1: require('package1'),
    package2: require('package2'),
}

The final file is WebPacked and uglified, each index.js is set to include this file, and the references for each library are updated to reference the appropriate require.

e.g.
var myvar = global.deps['package1'];

The problem I'm running into is that the final JavaScript file, while only 700 KB, is large enough that if I add it to a VERY simple test function, the cold start takes longer than 5 minutes. Since I'm on a Consumption plan, you can't set the timeout above 5 minutes.

For example, the following will take over 5 minutes to run:

'use strict';
require('D:\\home\\site\\repository\\azure.deps.js');

module.exports = function(context){
    context.res =  {
        status: 200,
        headers: {
            "Content-Type" : "application/json"
        },
        body : { "status" : "alive"}
    }
    context.done();
}

Now, in the above example I don't even need the require, so it can be taken out. However, just as an example if I drop out the require, it of course runs really quickly. I'm sure if I ever got it cold started, the warm function would not have any issue with the 5 minutes. However, with the 5 minute timer it just dies before it can ever get warm.

What I don't understand is I don't think my dependency list is too terribly long:

  "dependencies": {
    "crypto": "0.0.3",
    "dotenv": "^4.0.0",
    "fs": "0.0.1-security",
    "https": "^1.0.0",
    "jsonwebtoken": "^7.3.0",
    "jwk-to-pem": "^1.2.6",
    "lodash": "^4.17.4",
    "path": "^0.12.7",
    "q": "^1.4.1",
    "xml2js": "^0.4.17",
    "xmlbuilder": "^8.2.2"
  }

As a result it seems like there should be some way to get this to work, and in fact if anyone is getting anything to run, I feel like I should be able to get my code to run.

@securityvoid 700 KB of uglified code sounds like a lot, are you sure you're doing it right? path, fs, crypto, https all sound like native Node modules which you don't need to include in your bundle. Also, did you make sure your node_modules folder is either in .gitignore or completely deleted? This has been the reason cold starts are too slow. To be on the safe side, double check in the portal that node_modules does not exist in your bundle directory (or other directories where you don't need it) and that your git integration is properly in sync.

My entry file (named index.src.js) looks something like this:

global.deps = {
  Promise: require('bluebird'),
  jwt: require('jsonwebtoken'),
  uuid: require('uuid')
}

then I have a shell script (that would be .bat on Windows I guess?) that I use to bundle and minify, which looks like this:

rm index.bundle.min.js
rm index.bundle.js
npm install
browserify --node index.src.js > index.bundle.js
uglify -s index.bundle.js -o index.bundle.min.js 
rm -rf ./node_modules

And in my function code I use the deps like so:

require('../SDK/index.bundle.min')

var Promise = global.deps.Promise;
var uuid = global.deps.uuid;

module.exports = function(context)
{
  //code
}

Hope some of this helps ;)

@1N50MN14 I'll go back and double-check all the dependencies to be sure; but I think I have it all set up right. I can tell you that I'm gitignoring node_modules, and that the code runs fine locally. As a result it should be right.

Since I'm doing this as a deploy script my code actually has the "normal" requires prior to the script running that does all the updates.

Only the optimized code ever makes it into the wwwroot (no node_modules).

@1N50MN14 I went back and did a print-out of the JSON stats from Webpack to see everything that was being packed up. It looks like the ~700 KB is correct, a good 3/4 of which is associated with the jsonwebtoken package. That package does do a lot, so it sort of makes sense.

@davidebbo Are there plans on how to handle this? I can effectively undo all the Webpacking, but then I'm still back to the slow response times.

@securityvoid

With my technique, I have around a 10 second cold start (which is equivalent to the average C# cold start from my tests). Some of my files are 3MB which is huge, but I don't run uglify and it doesn't seem to matter. (It seems that the cold-start issue with Azure Functions has to do with latency not bandwidth. It seems the number of file dependencies is the problem and large files don't slow it down much.)

Some observations:

  • I don't manually do anything to prune dependencies.
  • I leave everything in the original node_modules, place everything in its final position and then run webpack. This pulls in the node_modules dependencies so that it is no longer needed at all. (In my deployment folder, I have no node_modules and I have no project.json.)
  • I do all this on my machine before I deploy.
  • It takes around a minute for my build tool to run webpack on everything.
  • When I deploy, it takes a few minutes for everything to update from the time I git push.
  • Cold start time is around 10 seconds

    • I have run my cold-start measurement tests from a separate Azure Function and concluded that JavaScript, C#, and F# Functions all have equivalent (within 20%) cold start times (assuming a webpack pre-built solution like this).

  • There are probably some points of improvement, but I have obtained sufficient performance for production use.

Folder structure:

  • root
    • node_modules
    • project.json
    • src (TypeScript)
    • lib (generated from src)
    • .deployments (points to deployments)
    • deployments
      • no project.json
      • [function-folder]
        • function.json
        • index.js (requires build.js)
        • build.source.js (points to entry point in lib folder)
        • build.js
          • built using webpack
          • contains all dependencies from lib and node_modules
          • largest is around 3 MB (I don't use uglify)

Process:

  • Run tsc to transpile from the src folder into a lib folder in my project.
    • The code in this folder still points to the original node_modules for external requires.
  • Generate a function-folder for each endpoint (this can be done manually).
    • This includes: function.json, index.js, build.source.js
  • Run webpack against each build.source.js, which produces a build.js.
    • This now has all the code from the lib folder and all dependencies from node_modules.
  • Git push the project with the deployments folder to the production branch.
    • No node_modules folder is included.
    • The lib folder is not needed (but I include it because my CI tests run against it).

@ricklove Thanks for sharing. If you have 3 MB files, the 700 KB definitely isn't why my stuff is so slow. I'm going to try without uglifying and see if there is a difference.

BTW, are you using the standard consumption plan?

@securityvoid Yes, I'm using the consumption plan.

For your case, I suspect there is something in your deployments folder that has many files that is getting loaded during cold start.

Also, I just extended my cold-start timing tests so I can directly compare a JavaScript run with no dependencies against a 3MB function that uses azure-storage.

My test runs every 12 minutes alternating which function it calls (from an external function app). So it's a good test from the perspective of a client - measuring the actual turn around time from initial call to client receipt of the response (the monitor page of the function itself only shows ~2 seconds, when this test shows ~10 seconds).

I'll give a report tomorrow.

@ricklove I would think if it was something generic in my deployments folder then the function would not behave differently whether I had that require or not.

This times out every time:

'use strict';
require('D:\\home\\site\\repository\\azure.deps.js');

module.exports = function(context){
    context.res =  {
        status: 200,
        headers: {
            "Content-Type" : "application/json"
        },
        body : { "status" : "alive"}
    }
    context.done();
}

This runs every time:

'use strict';

module.exports = function(context){
    context.res =  {
        status: 200,
        headers: {
            "Content-Type" : "application/json"
        },
        body : { "status" : "alive"}
    }
    context.done();
}

It doesn't make any sense for me to have issues with the 700 KB file when you have a 3 MB file, but it seems like the file itself is somehow the issue. I suppose the alternative is that the file is just the last little bit that pushes it over the 5-minute mark.

In any case, removing uglify did not really help.

@ricklove Arg! I think I just noticed my own typo.

If you notice, the file path in the require says "repository". That is the non-webpacked version of the file. The webpacked version is in wwwroot.

Since the file existed, it didn't error out, and in fact would work... given enough time.

I'm redeploying now to confirm that solves things.

@davidebbo @1N50MN14 I've made the Browserify technique work without the use of global...

The shared/library module is as follows...

/**
 * ./library/shared.js -->
 * export your dependencies here
 */
module.exports = {
    require: require('request'),
    requestPromise: require('request-promise'),
    common: require("./common"),
};

And the Function code file is...

/**
 * ./functionname/index.js -->
 */
var shared = require("../library/shared.min");

function functionHandler(context, req) {
    context.log('FuncImpersonate01 HTTP trigger function received a request.');
    shared.common.logRequestHeaders(context, req);

    context.res = {
        status: 200,
        headers: { "Content-Type": "application/json" },
        body: {
            Success: true,
        },
    };
    context.done();
};
module.exports = functionHandler;

The Build/Browserify commands are...

browserify --node ./library/shared.js --standalone shared -o ./library/shared.bundle.js
uglifyjs ./library/shared.bundle.js -o ./library/shared.min.js

@phillipharding thanks for sharing!

As an aside, note that we've added support to automatically run funcpack on Kudu git deployments. To enable it, simply set an Azure App Setting: SCM_USE_FUNCPACK=1. In theory this should just work, but it's still new, so feedback is welcome (https://github.com/Azure/azure-functions-pack is a good place for it).

@phillipharding Awesome, thanks for sharing this! It seems that browserify's --standalone did the trick!

I've been running into this issue for some time now and it's very frustrating as I'm seeing 60 second startup times on a regular basis. I'll give azure-functions-pack a shot at some point, and I think that the azure-functions-pack tool is a decent workaround for the short term, but is there anything on the roadmap to address the root cause? (i.e. the underlying file system being prohibitively slow)

Specifically, I'm using the Slack Events API, and if Slack determines that a request to my API takes more than 5 seconds to complete (which happens regularly whenever whatever orchestrates Azure Functions decides to spin down our API), it'll automatically be retried over and over. This behavior in combination with Functions' filesystem issues means I can't really work with Slack's API without a hack that ignores "retry" messages based on a certain request header value, which isn't ideal for various reasons.
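For anyone else stuck on this, a sketch of that header-based guard (Slack marks redelivered events with the X-Slack-Retry-Num header; header keys arrive lower-cased in the Node worker):

module.exports = function (context, req) {
    if (req.headers && req.headers['x-slack-retry-num']) {
        // Acknowledge the retry immediately without reprocessing the event
        context.res = { status: 200, body: '' };
        return context.done();
    }
    // ...normal event handling...
    context.done();
};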

Thanks

I'm having the exact same issue with the Shopify Webhook API. It will retry if the request doesn't finish in 5 seconds. I usually get about 3-4 requests before one returns in time for it to stop trying.

Microsoft needs to fix this ASAP. Node.js runs fine on AWS Lambda. Disappointing when you see articles like this: seven reasons why so many Node.js devs are excited about Microsoft Azure

Forget the webpack flag; a huge number of modules won't compress properly with it. So basically still no solution, and a pretty useless product if you want serverless Node.

@tony-gutierrez agree - this is something I would have expected Azure Functions to launch with, not something that still isn't fixed a full year after the original issue was filed.

We started with Functions trying to use the Consumption plan. Not even close to working. Then we ran the App Service plan for a while, and at least it worked. Then we realized we basically had none of the benefits of a serverless host, and all the horrid things function apps incur (like no access to console.log, super slow storage, not easy to dev locally, etc.). We did a quick conversion to a normal Node app and deployed to a VM and App Service. The perf increase is pretty major on both.

We used webpack to bundle the npm module dependencies into a single file to reduce the cold start time; see this Medium article.

See also the newly announced Run-From-Zip feature, which also helps with cold start.
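As a sketch, enabling it comes down to one app setting on the Function App; note the setting name has evolved (WEBSITE_USE_ZIP originally, WEBSITE_RUN_FROM_PACKAGE today), so check the announcement for the form your runtime expects:

az functionapp config appsettings set --name <app-name> --resource-group <group> --settings WEBSITE_RUN_FROM_PACKAGE=1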

Closing this issue as Run-From-Zip is working well enough for most people (and some others are using funcpack). It doesn't mean that there can't be more optimizations, but we can always track specific work separately, as this issue is too open-ended.
