This may be related to a previous issue we had filed: #2081
The AWS SDK publishes updates to a large number of gems, which can sometimes cause us to get throttled. We implemented batching with a sleep between batches and tried to use the Retry-After header to back off; however, we keep receiving the same, very large value (216000000) in that header.
Should we not be using this header, or is there something else we should be using? Is there a possibility of getting a rate limit increase for our account (awscloud)? We need to publish updates to a large number of gems (potentially 200+) every day, which could exceed the 650 requests in 10000 minutes (~6.9 days) limit.
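For context, our back-off loop is roughly shaped like the sketch below (a simplified, hypothetical version, not our actual release scripts): push via the RubyGems API, and on a 429 sleep for the Retry-After value, capped at one hour so an absurd header value can't stall the whole release.

```ruby
# Illustrative only: hypothetical helper, not our actual release tooling.
require "net/http"
require "uri"

MAX_BACKOFF_SECONDS = 60 * 60 # cap a single wait at one hour

def push_gem(path, api_key)
  uri = URI("https://rubygems.org/api/v1/gems")
  req = Net::HTTP::Post.new(uri)
  req["Authorization"] = api_key
  req["Content-Type"] = "application/octet-stream"
  req.body = File.binread(path)

  Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(req) }
end

def push_with_backoff(path, api_key)
  loop do
    res = push_gem(path, api_key)
    return res unless res.code == "429"

    wait = res["Retry-After"].to_i
    sleep(wait.positive? ? [wait, MAX_BACKOFF_SECONDS].min : MAX_BACKOFF_SECONDS)
  end
end
```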
@simi @dwradcliffe
Hello all, just checking in on this since it is currently preventing us from publishing any updates to the AWS SDK gems. Appreciate any help in advance!
Would it be possible to at least reset the current cache / limits as a short term fix?
I found a problem on our end and I'll try to get a fix pushed tonight.
Great - thanks! Let me know if there is anything I can do to help out.
We shipped the fix, can you try again?
Thanks! I'll confirm this works correctly during our release today.
Looking at the fix (#2268), the backoff calculation for all of the other throttling levels (for operations outside of gem push) still looks incorrect: they do (10.minutes ** level), which gives very long time frames for the higher levels (the documentation lists 10000 minutes for level 4, but this computation gives 2160000000 minutes).
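To make the numbers concrete, here is the arithmetic in plain Ruby (illustrative only, using raw seconds rather than ActiveSupport durations; not the rubygems.org source):

```ruby
# Exponentiating the whole 10-minute duration vs. exponentiating the base 10.
TEN_MINUTES_IN_SECONDS = 10 * 60
level = 4

wrong   = TEN_MINUTES_IN_SECONDS**level # exponentiates the duration in seconds
correct = (10**level) * 60              # 10^level minutes, converted to seconds

puts wrong / 60   # => 2160000000 minutes, the value reported above
puts correct / 60 # => 10000 minutes, the documented level-4 back-off
```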
Really appreciate you getting this change in! The fix worked for us today and the Retry-After header now seems to be set correctly; however, I think there is still a longer-term problem for us here.
We own ~230 gems and occasionally we need to update many or all of them together, which will cause us to exceed this threshold. Using the Retry-After header, we could sleep for ~1 hour between batches, but that drags our release process out over a long period of time. We could also distribute our release across multiple hosts since the throttling is IP-based, but again this isn't really ideal.
Could we discuss the possibility of account whitelists or other solutions (note: we're happy to do the work to add this if needed, as I know it may be a non-trivial change)? I'm also happy to open this up as a new feature request and close this issue out, since the Retry-After header issue has been fixed.
We have been discussing this internally. I'll update here when we have a plan.
Just a reminder: I have proposed introducing "verified accounts" to skip the limits.
An initial implementation was done in https://github.com/rubygems/rubygems.org/pull/2163, but it wasn't working as expected. I can take another pass at it if needed.
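Roughly, the idea was something like this (a sketch only, assuming a hypothetical verified flag on accounts and an api_key lookup; not the exact code from that PR):

```ruby
# Rough sketch of a "verified accounts" exemption with Rack::Attack.
# `verified?` and the api_key lookup are assumptions for illustration.
Rack::Attack.safelist("verified account gem push") do |req|
  next false unless req.post? && req.path == "/api/v1/gems"

  api_key = req.env["HTTP_AUTHORIZATION"]
  user = api_key && User.find_by(api_key: api_key)
  user&.verified?
end
```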
Thanks everyone for jumping on this. As a quick hack, what about whitelisting an IP range? I wouldn't think Amazon's IP addresses would change any time soon. Let me know if this sounds feasible.
Is this also something we could contribute to? (i.e. do open source contributors have access to relevant things to test this?)
what about whitelisting an IP range?
I personally would prefer to improve things overall so that we have more relaxed limits instead of adding a workaround for one user.
Retry-After = 216000000
was grossly incorrect, and I am sincerely sorry you have faced issues because of this. Since then we have worked on a few improvements: applying the back-off rate limit only to failed gem push requests (#2311), reducing the number of back-off levels (#2330), increasing our per-hour limit on total push requests to 400 req/hr (#2407), and adding close monitoring of 429s on push requests. We have also added a guide page for the rate limit.
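Roughly, the push throttle now corresponds to something like the following rule (an illustrative paraphrase; the exact configuration lives in our Rack::Attack initializer and the linked PRs):

```ruby
# Illustrative paraphrase of the 400 push requests per hour per IP throttle.
Rack::Attack.throttle("api/push", limit: 400, period: 3600) do |req|
  req.ip if req.post? && req.path == "/api/v1/gems"
end
```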
Feel free to re-open this or file a new issue if you are still seeing 429s on gem push.
You rock! Thank you! We will "open up the floodgates" on our end and try publishing ~220 gems instead of batches of 42 every 10 min.