Sp-dev-docs: Increasing REST API Throttling errors in SharePoint Online

Created on 2 Sep 2019 · 22 Comments · Source: SharePoint/sp-dev-docs

Category
Question
Typo
Bug
Additional article idea
Expected or Desired Behavior
Before July 17 2019 we got 429 errors here and there, but their ratio was not significant and allowed our production servers to work as expected.

Observed Behavior
Starting July 17 2019, we see 10 times more 429 errors on our SharePoint Online REST API calls, which slows our production operations down by a factor of 10.

Steps to Reproduce
Which tenant has the issue?
All tenants that we access (hundreds, across different regions of the world).

When did the issue happen, so that we can check right log entries?
Since around July 17 2019.
Related https://github.com/SharePoint/sp-dev-docs/issues/4336

The issue is now more severe. It doesn't matter whether authentication is OAuth (using our app) or basic (without any app ID). There are no parallel requests, only one request at a time, and it is heavily throttled.

Where in the past we saw a 10-fold jump in throttled responses, it is now close to 30-fold. This is very strange behaviour from SharePoint (we have seen it for the last 4 days)!

Thanks

csorest throttling question

All 22 comments

Thank you for reporting this issue. We will be triaging your incoming issue as soon as possible.

@JeremyKelley, this is something new; we have never seen such a high throttling rate, even when no OAuth is used (basic auth means no app ID is used). Single requests, not parallel.

@justdevelopment are you seeing a high throttling rate?
Thanks

Hi Slavag, (on behalf of my colleague @justdevelopment), yes we still have the throttling issues.

I am not sure if the problem is the same because we mostly use Graph calls and not REST, but we have been getting "too many requests" errors since July 17 as well.

@MRuimerman Have you seen a higher rate since last week? Thanks

No, the number of errors is more or less the same.

I have been experiencing the same problem/behavior since 26 Aug on our [large] tenancy. A process that ran reliably for literally years started failing with massive numbers of 429 errors. I have throttling detection and handling in place, but jobs are still not completing in a reasonable time, or at all. We need some relief.
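For readers wondering what "throttling detection and handling" typically looks like, here is a minimal sketch, not the commenter's actual code. It retries on 429 and 503, honors a numeric `Retry-After` header when one is present, and otherwise falls back to exponential backoff. The function name, the `send` callable, and all default values are assumptions made for illustration.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status code, headers).
Response = Tuple[int, Mapping[str, str]]


def call_with_throttle_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it is no longer throttled or attempts run out."""
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        status, headers = send()
        # 429 (Too Many Requests) and 503 (Service Unavailable) are the
        # status codes treated as throttling here.
        if status not in (429, 503):
            return status, headers
        if attempt == max_attempts:
            break
        # Prefer the server's hint. This assumes the header key is spelled
        # exactly "Retry-After" and holds a number of seconds; anything
        # else (missing, HTTP-date) falls back to our own backoff delay.
        retry_after = headers.get("Retry-After")
        try:
            wait = float(retry_after) if retry_after is not None else delay
        except ValueError:
            wait = delay
        sleep(min(wait, max_delay))
        delay = min(delay * 2, max_delay)
    # Still throttled after the last attempt: hand the response back.
    return status, headers
```

Here `send` would wrap whatever HTTP client you use and return the status code and headers of one request. As this thread shows, correct retry handling does not help much when nearly every call is throttled; it only keeps the client from making things worse.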

It is worse again today for us.

For the last 5 days we have seen a very high throttling rate.
All servers spend most of their time in throttling sleep.

It's not getting any better; jobs that used to finish in a few hours now take more than a week.
Microsoft, please respond, this is a total denial of service by SharePoint.

We are facing a similar issue this week. We observed it when applying PnP templates to multiple sites (about 4 sites) in parallel, which used to work earlier!

It seems that Microsoft doesn't really care about this.
@JeremyKelley can you please answer?

We opened a Premier Support ticket on this issue and spoke to a Microsoft engineer.

MS is monitoring "cost" on a per AppID basis. If your app is determined to be "expensive", they throttle/DOS that app.

  • How to compute Cost per REST API call is not documented
  • The threshold where "expensive" cost is reached and throttling is initiated is not documented or exposed
  • Microsoft provides no alerts that throttling will be turned on
  • Microsoft provides no diagnostics as to which apps are being throttled

We received 0 Message Center warnings about our apps, 3 of which are currently throttled/DOSed heavily. I received no other communication from O365 - TAM/SDM, email, phone call, etc - about my apps. There is no way for me to know WHY I am being throttled, so there is no corrective action or improvements I could ever make. As a paying customer, I just have to eat worse and worse performance, I guess.

Do they care about this issue? No, there is no evidence that they care about its effects on customers. They manage solely to unpublished metrics that are completely invisible to customers.

@wadehgsk It's not exactly as they say. For example, if a customer (my customer) doesn't use OAuth but just a username and password, Microsoft doesn't know the App ID (it's simply not provided in the request), and we see throttling on those accounts as well (no concurrent calls, a single call every 3-5 seconds). The same applies to OAuth: we see heavy throttling on a single call every 3-5 seconds.

The credentials you are using ARE the app, imo. I use tenant Full Control client IDs set up via AppRegNew. They see that ID and are aggregating everything that runs under that identity. I suspect they are doing the same for your user identities.

@wadehgsk Credentials for basic authentication have nothing to do with an app. They are the customer's credentials and have no relation to our product. And yes, it seems that they don't care at all; we have spoken a lot with support, and their answer is business as usual.

I can't explain your behavior. I have no insights into how they compute costs and impose throttling other than what I posted above. I am suggesting they are tracking it somehow so there must be something lumping your BA calls together.

We got some confirmation from MS Support that they had a performance issue:

_I have been in discussion with my seniors about the issue and would like to update you that there was a Service Interruption causing performance issues which was resolved on Aug 15 and the patch was deployed on Aug 31, 2019. The Service Interruption number is SP188056._

As of now, we still see a high throttling rate for SharePoint / OneDrive, but it's lower than it was 4 days ago.

Did anyone hear anything? For us throttling is still high, and happening now on more tenants, even our test tenant with only a few active users.

It's still high for us as well; we're trying to work with MS support, but so far no progress.

Recently, we have been seeing increased throttle.htm redirects in our custom web pages that are hosted in a site collection and use AJAX REST API calls. The redirects seem to be quite intermittent and often happen for users who cannot be described as heavy users. We cannot utilize user-agent decoration, because the applications are executed in web browsers. Nor can we use the Retry-After header, because all we get as a response is the throttle.htm page. We do not have an App ID to identify the application; however, the throttling redirects are user-specific.

Since we started getting these issues quite recently and nothing drastically changed in our apps, they have to be related to changes made by Microsoft. How do we pinpoint the heavy use queries? We looked through all guidance materials and could not find answers. Any help will be appreciated.
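For context on the "user-agent decoration" mentioned above: Microsoft's throttling guidance asks non-browser clients to send a User-Agent of the form `ISV|CompanyName|AppName/Version` or `NONISV|CompanyName|AppName/Version`. The sketch below only builds such a header for a server-side or console client; it does not help browser-hosted pages like the ones described in this comment, where the browser controls the User-Agent. The function names and the `Accept` value are assumptions made for illustration.

```python
from typing import Dict


def decorated_user_agent(company: str, app: str, version: str, isv: bool = False) -> str:
    """Build a User-Agent string in the decorated format described above."""
    # "ISV" is for vendors shipping an app to customers; "NONISV" is for an
    # organization's own in-house tooling.
    prefix = "ISV" if isv else "NONISV"
    return f"{prefix}|{company}|{app}/{version}"


def build_headers(company: str, app: str, version: str, isv: bool = False) -> Dict[str, str]:
    """Return request headers carrying the decorated User-Agent (hypothetical helper)."""
    return {
        "User-Agent": decorated_user_agent(company, app, version, isv),
        # Assumed Accept value for JSON responses from a REST endpoint.
        "Accept": "application/json",
    }


headers = build_headers("Contoso", "GovernanceCheck", "1.0")
print(headers["User-Agent"])
```

The resulting dictionary can be passed to whichever HTTP client makes the calls. Decoration identifies the traffic; nothing in this thread suggests it exempts a client from throttling.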

You need to try to talk with MS support. It's not easy at all, it can take months of back and forth, and the result is not guaranteed. This is the only way: the MS people have stopped responding here, and it looks like no one cares about the heavy throttling issues. It's a real problem. MS doesn't provide throttling limits (like Google does), nor a dashboard to monitor and pinpoint problematic queries. What's more, throttling can be caused by their own internal bug (we have confirmation of this from MS support). So I'm afraid that, at the end of the day, there is no solution for this. MS doesn't really care about the ecosystem, and here you're on your own.

We opened a ticket with Microsoft, and after reviewing the logs, they pinpointed the queries that were causing problems. In our case it was actually a problem in our code. I wish we had tools to do this troubleshooting ourselves.
