Elasticsearch: Improve "TOO_MANY_REQUESTS/12/index read-only / allow delete (api)" message when running low on disk

Created on 19 Jun 2020 · 6 Comments · Source: elastic/elasticsearch

Elasticsearch version (bin/elasticsearch --version): 7.7

Plugins installed: []

JVM version (java -version): -

OS version (uname -a if on a Unix-like system): -

Description of the problem including expected versus actual behavior:

When hitting the flood stage watermark we set all indices on that node to read_only_allow_delete as documented here.

Previously (?) when this happened, the messages returned to each request would usually say FORBIDDEN/12/index read-only / allow delete (api) but it seems that something recently changed and it now says TOO_MANY_REQUESTS/12/index read-only / allow delete (api).
I think the TOO_MANY_REQUESTS is misleading, as it suggests that this could be caused by throttling.

I also found several threads on Discuss where users found this confusing. Here are two examples:
https://discuss.elastic.co/t/elastic-7-7-0-too-many-requests-with-one-request/233417/3
https://discuss.elastic.co/t/too-many-requests-12-index-read-only-allow-delete-api/236718/3

@DaveCTurner since you happened to reply to both of them I'm tagging you here.

:DistributeAllocation >bug Distributed

jakommo

All 6 comments

I agree the message could be improved, substantially. I don’t think it’s any less clear with TOO_MANY_REQUESTS vs. FORBIDDEN. I want to clarify the reason for TOO_MANY_REQUESTS (vs. FORBIDDEN). Previously we returned a 403 status code in this situation, which translates to FORBIDDEN, and clients would not retry. That’s bad since disk full is transient: once an administrator cleans up, we resolve the situation by removing the block. So a client could have retried. A retryable condition should be indicated by 429, so we made this change, and 429 translates to TOO_MANY_REQUESTS.

jasontedor on 19 Jun 2020
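
To illustrate the retry argument above, here is a minimal Python sketch of a client that treats 429 as retryable and 403 as final. It is not taken from any Elasticsearch client library; `send_with_retry`, its parameters, and the `send` callable (assumed to return a `(status, body)` tuple) are hypothetical names chosen for the example.

```python
import time

# Assumption for this sketch: only 429 is treated as retryable.
RETRYABLE_STATUSES = {429}


def is_retryable(status: int) -> bool:
    """Return True if the HTTP status signals a transient condition."""
    return status in RETRYABLE_STATUSES


def send_with_retry(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `send` until it returns a non-retryable status or attempts run out.

    `send` is a hypothetical zero-argument callable returning (status, body).
    The delay doubles after each retryable response (exponential backoff).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        # Stop on a final answer (e.g. 200 or 403) or on the last attempt.
        if not is_retryable(status) or attempt == max_attempts - 1:
            return status, body
        sleep(base_delay * 2 ** attempt)
```

With a 403, a client written this way gives up after the first response; with a 429 it keeps trying, which is the behaviour the status code change was meant to allow once disk space has been freed.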

Ah, thanks for the background @jasontedor . Didn't think about this being reflected from the HTTP status code.

Fully agree that FORBIDDEN wasn't any better.

Would we be able to add something about disk space/watermarks to the message, or is it derived from the HTTP status code, index settings, etc.?

"reason": "index [test] blocked by: [TOO_MANY_REQUESTS/12/index read-only / allow delete (api)];"

jakommo on 19 Jun 2020

Yes, currently it’s produced from the block placed on the index which only carries a status code and a message that indicates the type of block (note a user can manually apply the block, so its presence doesn’t mean disk full necessarily). It will be some effort to make the situation clearer.

jasontedor on 19 Jun 2020
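
Since the block description carries only a status name, a numeric id, and a message, a client that wants to recognise this block regardless of the FORBIDDEN vs. TOO_MANY_REQUESTS wording could match on the id instead. The following Python sketch assumes the `STATUS/id/description` layout seen in the `reason` string quoted in this thread; `parse_block` and `is_read_only_allow_delete` are hypothetical helpers, not part of any Elasticsearch API.

```python
import re
from typing import NamedTuple, Optional

# Block id taken from the messages quoted in this thread ("…/12/index read-only…").
READ_ONLY_ALLOW_DELETE_BLOCK_ID = 12

# Matches "[STATUS/id/description]"; layout assumed from the quoted reason string.
BLOCK_PATTERN = re.compile(
    r"\[(?P<status>[A-Z_]+)/(?P<block_id>\d+)/(?P<description>[^\]]+)\]"
)


class IndexBlock(NamedTuple):
    status: str
    block_id: int
    description: str


def parse_block(reason: str) -> Optional[IndexBlock]:
    """Extract the first block description from an error reason, if any."""
    match = BLOCK_PATTERN.search(reason)
    if match is None:
        return None
    return IndexBlock(
        status=match.group("status"),
        block_id=int(match.group("block_id")),
        description=match.group("description"),
    )


def is_read_only_allow_delete(reason: str) -> bool:
    """True if the reason mentions block id 12, whatever the status name is."""
    block = parse_block(reason)
    return block is not None and block.block_id == READ_ONLY_ALLOW_DELETE_BLOCK_ID
```

Matching on the id rather than the status name means the check gives the same answer for both the older FORBIDDEN wording and the newer TOO_MANY_REQUESTS wording. As noted above, it still cannot tell whether the block was applied because of low disk space or by hand.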

> a user can manually apply the block, so its presence doesn’t mean disk full necessarily

In practice I don't think users really do apply this block manually. If they did, it would today be automatically removed a short while later. We discussed this when contemplating the auto-release behaviour and decided that this block should be considered as under the control of the disk-based shard allocator; if users want a read-only index then they can apply other blocks, and if they want to delete such an index then they can manually remove the block first.

This was technically a breaking change but I haven't seen any real-world impact in the 9 months since 7.4.0 was released.

Given this, I think rewording the description of this block is the right thing to do.
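
For readers who want to remove the block manually, as mentioned above, a rough Python sketch of the request follows. It only builds the HTTP request and does not send it. The setting name `index.blocks.read_only_allow_delete` is the real index setting; the helper name `build_clear_block_request`, the base URL, and the absence of authentication are assumptions for illustration.

```python
import json
import urllib.request


def build_clear_block_request(base_url: str, index: str) -> urllib.request.Request:
    """Build a PUT /<index>/_settings request that resets the block setting.

    Setting the value to null (None in Python) resets it to its default,
    which removes the read_only_allow_delete block from the index.
    """
    body = json.dumps({"index.blocks.read_only_allow_delete": None}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Usage (hypothetical local node, no authentication):
# request = build_clear_block_request("http://localhost:9200", "test")
# urllib.request.urlopen(request)
```

Because the block is managed by the disk-based shard allocator, expect it to come back if the node is still above the flood stage watermark; freeing disk space is the actual fix.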