Azure-functions-host: App in Prod: Error code 404 on Function HTTP APIs after Cold Starts

Created on 16 Oct 2019 · 15 Comments · Source: Azure/azure-functions-host

Hi,

Last week we started getting random 404 errors in our production application. After investigating our own application, we believe the problem is within the Functions Host: after a cold start, the first API call hitting our Function App returns 404.

Investigative information

Please provide the following:

  • Timestamp: 10/16/2019, 1:07:44 PM (Local time)
  • Function App version (1.0 or 2.0): 2.0
  • Function name(s) (as appropriate): GetSurveys
  • Host Instance Id: dfbb63e1-c3d2-401c-ad57-7403dddba979 -- Invocation Id: 8bc52f8d-3d36-4a6a-8222-686d6460c1b8 (this is a 404 from our API Gateway to the Application, step 3 below)
  • Host Instance Id: 6902b90b-97af-4e50-bfc2-9e69b624afbc -- Invocation Id: c36648a2-e2a8-4582-b17a-103a53a29580 (this is step 4)
  • Host Instance Id: 6902b90b-97af-4e50-bfc2-9e69b624afbc -- Invocation Id: b10351cd-9192-432e-a066-e73bb13de548 (this is step 5)
  • Region: Canada Central

Repro steps

Provide the steps required to reproduce the problem:

  1. Stop Azure Function App to force a cold start
  2. Start Azure Function App and wait a bit
  3. Call API --> Receive result code 404 from Application, after around 5 seconds. No trace inside application insights, Function App code not reached.
  4. Call API --> Receive result code 200 from Application, after around 5 to 8 seconds, with actual data.
  5. Call API --> Receive result code 200 from Application, after around 300 ms, with actual data.
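
Steps 3 to 5 can be scripted so that status codes and latencies are captured the same way on every run. The following is a minimal sketch, not part of the original report: the URL in `main` is a hypothetical placeholder, and `http_status` and `probe` are illustrative helper names.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, List, Tuple


def http_status(url: str, timeout: float = 30.0) -> int:
    """Return the HTTP status code of a GET request, including 4xx/5xx codes."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the code itself is what we want to record.
        return err.code


def probe(fetch: Callable[[], int], calls: int = 3) -> List[Tuple[int, float]]:
    """Call `fetch` sequentially and record (status, elapsed_seconds) per call."""
    results = []
    for _ in range(calls):
        start = time.perf_counter()
        status = fetch()
        results.append((status, time.perf_counter() - start))
    return results


def main() -> None:
    # Hypothetical endpoint: replace with a real HTTP-triggered function URL.
    url = "https://example.azurewebsites.net/api/GetSurveys"
    # Numbering starts at 3 to line up with steps 3-5 of the repro above.
    for step, (status, elapsed) in enumerate(probe(lambda: http_status(url)), start=3):
        print(f"step {step}: status={status} elapsed={elapsed:.2f}s")
```

Run `main()` right after restarting the Function App (steps 1 and 2). In the scenario described here, the first line would be expected to show the 404.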

Remarks:

  • From the application Insights perspective, we can see that the API Gateway receives a 404 from the Application, however we have no trace of the 404 in the Application.
  • The issue is 100% reproducible with our app, whenever there is a cold start, we face the 404 issue (on different APIs in the system)
  • Our current assumption is that at the first API call the Host reports ready but returns 404 directly without hitting our Function. On the second API call the host is properly initialized, but we hit a Function-level cold start. By the third API call everything is ready and behaves normally.
  • We were not able to reproduce with a simple Hello World app.

Expected behavior

  • At step 3, I would expect the API to wait until everything is ready before responding, and never return a 404.

Actual behavior (as seen for Invocation state above)

Known workarounds

None found.

Related information

Provide any related information

  • Programming language used : C#
  • Bindings used: HttpTrigger
  • Runtime Version: 2.0.12775.0 (that version was never announced at https://github.com/Azure/app-service-announcements/issues ; I think all deployments should be announced there to help people understand what is running)
  • Ref to similar reported issue: https://github.com/Azure/azure-functions-host/issues/4588#issuecomment-539204071
  • We think the issue started 7-10 days ago or more, but it seems to be more frequent now (users are reporting it more often).

If you provide us with valid previous versions runtime that we could use to force a runtime version and compare behavior, we would be able to do that.

All 15 comments

@SimonLuckenuik Please try pinning back to 2.0.12742.0. We've seen a couple of reports of this behavior and investigation is in progress. Also I'd like to apologize for the confusion around releases. The announcements, releases and actual released bits are clearly not in sync right now. We will fix that (and figure out how we screwed it up).

Thanks for the follow-up Paul. I just tried the suggested downgrade, and I was not able to reproduce my problem with the steps listed above.

If you need us to do some tests to support you, let me know. We can easily reproduce and have some environments that could be used if you want to investigate or enable more logging and so on. This issue has got attention internally since it affects our customers in production and made management nervous about Azure Functions stability (no update in our application and application stops working from a customer perspective).

@paulbatum How long will this runtime actually be available? We see that runtimes are being deprecated pretty quickly.

cc @mathewc @fabiocav

We now understand the root cause. JobHost state is set to "Initialized" before Http routes are added here: https://github.com/Azure/azure-functions-host/blob/dev/src/WebJobs.Script.WebHost/HttpInitializationService.cs#L95
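
To illustrate why that ordering produces a 404, here is a toy model of the race. It is a sketch under stated assumptions and not the host's actual code: `ToyHost`, `start_buggy`, and `start_fixed` are hypothetical names, a "held" request is modelled as `None`, and the reordering in `start_fixed` is only one plausible fix (the change in #5102 may differ).

```python
from typing import Callable, Dict, Optional

ROUTE = "/api/GetSurveys"


class ToyHost:
    """Toy stand-in for a host with a state flag and an HTTP route table."""

    def __init__(self) -> None:
        self.state = "Starting"
        self.routes: Dict[str, Callable[[], str]] = {}

    def handle(self, path: str) -> Optional[int]:
        # Assumption: requests are held (None) until the host reports ready.
        if self.state != "Initialized":
            return None
        # Once the host claims to be ready, an unknown route is a plain 404.
        return 200 if path in self.routes else 404


def start_buggy(host: ToyHost, between_steps: Callable[[], None]) -> None:
    host.state = "Initialized"            # state flips first ...
    between_steps()                       # ... a request can arrive here ...
    host.routes[ROUTE] = lambda: "data"   # ... before the route exists


def start_fixed(host: ToyHost, between_steps: Callable[[], None]) -> None:
    host.routes[ROUTE] = lambda: "data"   # routes are registered first
    between_steps()
    host.state = "Initialized"            # only then report ready


def first_request_result(start: Callable[[ToyHost, Callable[[], None]], None]) -> Optional[int]:
    """Return what a request sees if it arrives between the two startup steps."""
    host = ToyHost()
    seen = []
    start(host, lambda: seen.append(host.handle(ROUTE)))
    return seen[0]
```

In this model, `first_request_result(start_buggy)` yields 404 while `first_request_result(start_fixed)` yields `None` (the request is held), which matches the reported symptom: only the first call after a cold start fails.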

We are working on the fix and will prepare to roll out a hot fix ASAP.

@SimonLuckenuik Please try pinning back to 2.0.12742.0.

Can you point me in the direction on how to do this?

To pin the runtime version simply change the value of FUNCTIONS_EXTENSION_VERSION in the application settings of your function app.

Ref: https://docs.microsoft.com/en-us/azure/azure-functions/set-runtime-version
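
For anyone scripting this rather than using the portal, the setting can also be changed with the Azure CLI. The sketch below only builds the `az functionapp config appsettings set` command line and does not run it. The app and resource group names are hypothetical placeholders, and `pin_runtime_command` is an illustrative helper name.

```python
import shlex
from typing import List


def pin_runtime_command(app_name: str, resource_group: str, version: str) -> List[str]:
    """Build the Azure CLI argv that sets FUNCTIONS_EXTENSION_VERSION on a function app."""
    return [
        "az", "functionapp", "config", "appsettings", "set",
        "--name", app_name,
        "--resource-group", resource_group,
        "--settings", f"FUNCTIONS_EXTENSION_VERSION={version}",
    ]


# Hypothetical names; 2.0.12742.0 is the version suggested earlier in this thread.
command = pin_runtime_command("my-function-app", "my-resource-group", "2.0.12742.0")
print(shlex.join(command))
```

Returning an argv list rather than a string means the result can be passed straight to `subprocess.run(command, check=True)` without shell quoting concerns, assuming the Azure CLI is installed and logged in.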

Resolved by #5102 . We'll post an update in the release announcements when the hotfix starts rolling out.

Thank you @SimonLuckenuik for reporting this!

Thank you to the team for the quick turnaround time!

Thanks @fabiocav !

@sneckelmann That won't work. The local.settings.json file is only intended for local development. You'll need to follow the steps outlined here.

Thanks @paulbatum . I was trying to get the "pin" for local development (and of course what I did didn't work!).

Any resources for local dev "pinning"?

@sneckelmann You should not need to pin locally, because we did not do a core tools release for the build with this bug. When you F5 in VS you should currently see version "2.0.12763.0" which is prior to 12775.

To answer your question more generally, in order to have direct control over what version of the runtime you are running locally from VS, you need to download a specific functions core tools release and then customize VS to use that. This page outlines the general approach of how to do this.

Thank you @paulbatum .

@gbilodeau How long a given version lasts will vary, as it depends on how often we are releasing (unfortunately there is a limit on how many parallel versions we can allow). We will not remove 2.0.12742.0 until some time after the release of the hotfix for this issue.
