Pygithub: github.GithubException.RateLimitExceededException

Created on 7 May 2019 · 15 comments · Source: PyGithub/PyGithub

I am trying to fetch the number of open issues using the following code in my Flask application.

from github import Github

g = Github()

repo = g.get_repo(repo_name)

open_pulls = repo.get_pulls(state='open')
open_pull_titles = [pull.title for pull in open_pulls]

open_issues = repo.get_issues(state='open')
open_issues = [issue for issue in open_issues if issue.title not in open_pull_titles]

and I get the error github.GithubException.RateLimitExceededException.

repo.get_issues() returns open pull requests as well as open issues, which is why the pull request titles are filtered out above.
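As an aside (not from the original report), PaginatedList exposes a totalCount property, so the overlap between issues and pull requests can be seen directly. A minimal sketch, with a placeholder token and repository:

from github import Github

g = Github("<token>")                    # placeholder token
repo = g.get_repo("PyGithub/PyGithub")   # placeholder repository

# get_issues() counts open pull requests as issues as well,
# so the difference between the two numbers is the plain-issue count.
print(repo.get_issues(state='open').totalCount)  # issues + pull requests
print(repo.get_pulls(state='open').totalCount)   # pull requests only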

stale

Most helpful comment

If I understand correctly, the return type of get_issues and get_pulls is PaginatedList. It yields elements lazily during iteration, so no request is performed until the list comprehension open_issues = [issue for issue in open_issues if issue.title not in open_pull_titles] runs. If your token's rate limit is exhausted at that point, it will throw RateLimitExceededException.

All 15 comments

If I understand correctly, the return type of get_issues and get_pulls is PaginatedList. It yields elements lazily during iteration, so no request is performed until the list comprehension open_issues = [issue for issue in open_issues if issue.title not in open_pull_titles] runs. If your token's rate limit is exhausted at that point, it will throw RateLimitExceededException.
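A small illustration of that lazy behaviour (a sketch, not from the thread; the token and repository name are placeholders):

from github import Github

g = Github("<token>")
repo = g.get_repo("PyGithub/PyGithub")

open_issues = repo.get_issues(state='open')       # builds a PaginatedList, no request yet
print(g.get_rate_limit().core.remaining)          # quota before iterating

titles = [issue.title for issue in open_issues]   # pages are fetched here, one request per page
print(g.get_rate_limit().core.remaining)          # noticeably lower for large repositories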

Is there any workaround?

g = Github()
Did you authenticate at this step? The unauthenticated public API has a much lower rate limit.

g = Github()
Did you authenticate at this step? The unauthenticated public API has a much lower rate limit.

Yes, I did authenticate at that step.

@242jainabhi
The thing I usually do when reaching the rate limit is simply to hold the program off for some time.

Instead of using the list comprehension, I would use a plain loop with try/except. Once a rate limit exception is caught, call the sleep function to wait for a while, then check the rate limit with the GitHub API again. The code only proceeds once the rate limit is back to 5000; a rough sketch of that check follows below.
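A minimal sketch of that idea (untested; g is assumed to be an authenticated Github instance):

import time

def wait_for_core_reset(g):
    # Block until the core rate limit has recovered.
    # The /rate_limit endpoint itself does not count against the quota.
    core = g.get_rate_limit().core
    while core.remaining == 0:
        time.sleep(60)                  # check again every minute
        core = g.get_rate_limit().core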

I am sorry for missing a part of the code. Below is the code in continuation of the code in the first comment.
I am accessing the created_at date for all the open issues. This again hits the API for every issue and hence ends up making more calls than the limit allows.

for issue in open_issues:
    created_at = issue.created_at.timestamp()

I could not find a solution to this problem. Even if I authenticate the requests, the limit will be exhausted if there are too many issues (say, 2000).

Something like this:

repositories = g.search_repositories(
    query='stars:>=10 fork:true language:python')

This also triggers the rate limit. I assume it's doing the pagination automatically and that's what triggers the rate limit? Is there any way for me to do it manually so I can pause?
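For what it's worth, PaginatedList also exposes get_page(), so one way to paginate manually and pause between pages could look like the sketch below (assuming an authenticated client; the sleep interval is arbitrary):

import time
from github import Github

g = Github("<token>")
repositories = g.search_repositories(
    query='stars:>=10 fork:true language:python')

page = 0
while True:
    results = repositories.get_page(page)  # one API request per page
    if not results:
        break
    for repo in results:
        print(repo.full_name)
    page += 1
    time.sleep(2)  # pause between pages to stay under the search rate limit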

I now realize that this is a real issue.
One possible workaround could be a code snippet like the one below. I have not tried it yet; let me know whether it works or not.

from time import sleep

import github

iter_obj = iter(open_issues)  ## PaginatedList is lazily paginated, like a generator
while True:
    try:
        issue = next(iter_obj)
        ## do something
    except StopIteration:
        break  # loop end
    except github.GithubException.RateLimitExceededException:
        sleep(3600)  # sleep 1 hour
        ## check token limits
        continue

@wangpeipei90 Does not work.

For some reason this behavior is highly unpredictable and it's maddening.
My program can cycle and preemptively call the rate-limit API to check that it stays within the limits for one to two hours, and then it randomly gets a 403.

Some native rate-limit handling would be a warm welcome here. Having to implement sleeps based on intuition, because your application decides to fall over after two hours of running smoothly, should not be expected behavior.

 File "word.py", line 126, in get_stargazers_inner
    for i in repo.get_stargazers_with_dates():
  File "/usr/local/lib/python3.6/dist-packages/github/PaginatedList.py", line 62, in __iter__
    newElements = self._grow()
  File "/usr/local/lib/python3.6/dist-packages/github/PaginatedList.py", line 74, in _grow
    newElements = self._fetchNextPage()
  File "/usr/local/lib/python3.6/dist-packages/github/PaginatedList.py", line 199, in _fetchNextPage
    headers=self.__headers
  File "/usr/local/lib/python3.6/dist-packages/github/Requester.py", line 276, in requestJsonAndCheck
    return self.__check(*self.requestJson(verb, url, parameters, headers, input, self.__customConnection(url)))
  File "/usr/local/lib/python3.6/dist-packages/github/Requester.py", line 287, in __check
    raise self.__createException(status, responseHeaders, output)
github.GithubException.RateLimitExceededException: 403 {'message': 'API rate limit exceeded for user ID xxxx.', 'documentation_url': 'https://developer.github.com/v3/#rate-limiting'}

Additionally, one could use the backoff library; however, it cannot account for the current position in the iteration and will therefore start from scratch again. A rough illustration follows below.
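For illustration only (not from the thread), a backoff-based retry might look like this sketch; because the whole function is retried, iteration restarts from the first page after every rate-limit hit, which is exactly the limitation mentioned above.

import backoff
from github import RateLimitExceededException

@backoff.on_exception(backoff.expo, RateLimitExceededException, max_tries=5)
def fetch_stargazer_logins(repo):
    # If the exception is raised mid-iteration, backoff re-runs the whole
    # function, so already-fetched pages are requested again.
    return [s.user.login for s in repo.get_stargazers_with_dates()]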

Well, I visited https://github.com/settings/tokens and did a "Regenerate token". That got me rolling again, but I'm not sure for how long.

I used the "token" method of authentication. Example:

    github = Github("19exxxxxxxxxxxxxxxxxxxxxe3ab065edae6470")

See also #1233 for excessive requests.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@wangpeipei90

It works for me, but the RateLimitExceededException is not under GithubException in the version I used. Here is my code.

import calendar
import time

from github import RateLimitExceededException

issues = g.search_issues(query=keyword, **{'repo': repo, 'type': 'pr'})
iter_obj = iter(issues)
while True:
    try:
        pr = next(iter_obj)
        with open(pr_file, 'a+') as f:
            f.write(pr.html_url + '\n')
        count += 1
        logger.info(count)
    except StopIteration:
        break  # loop end
    except RateLimitExceededException:
        search_rate_limit = g.get_rate_limit().search
        logger.info('search remaining: {}'.format(search_rate_limit.remaining))
        reset_timestamp = calendar.timegm(search_rate_limit.reset.timetuple())
        # add 10 seconds to be sure the rate limit has been reset
        sleep_time = reset_timestamp - calendar.timegm(time.gmtime()) + 10
        time.sleep(sleep_time)
        continue

Here is part of the log:

2020/01/08 23:42:09 PM - INFO - search remaining: 0

Thanks @Xiaoven, I am finally able to solve this with your code.
