Hi, sometimes when Atlantis is triggered on a PR in github, Atlantis posts the following error onto the PR:
Plan Error
GET https://api.github.com/repos/myorg/myrepo/pulls/1163/files?per_page=300: 404 Not Found []
Looking at github's API docs, that per_page=300 seems okay:
Note: Responses include a maximum of 3000 files. The paginated response returns 30 files per page by default.
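For anyone comparing against the docs: as far as I know, the GitHub REST API caps per_page at 100, so a client that wants every changed file has to walk the pages. Below is a minimal sketch of that loop, not Atlantis's actual code. `fetch_page` is a hypothetical callable standing in for the `GET .../pulls/{number}/files` request, and the fake data exists only so the sketch runs without network access.

```python
from typing import Callable, Iterator, List


def iter_pull_request_files(
    fetch_page: Callable[[int, int], List[dict]],
    per_page: int = 30,
    max_files: int = 3000,
) -> Iterator[dict]:
    """Yield changed files page by page until a short page or the cap is hit.

    fetch_page(page, per_page) is a hypothetical stand-in for the HTTP call
    that lists the files of a pull request.
    """
    seen = 0
    page = 1
    while seen < max_files:
        batch = fetch_page(page, per_page)
        for item in batch:
            yield item
            seen += 1
            if seen >= max_files:
                return
        if len(batch) < per_page:
            return  # a short (or empty) page means nothing is left
        page += 1


# In-memory stand-in for the API: 95 fake changed files (illustrative only).
fake_files = [{"filename": f"dir{i}/main.tf"} for i in range(95)]


def fake_fetch(page: int, per_page: int) -> List[dict]:
    start = (page - 1) * per_page
    return fake_files[start:start + per_page]


files = list(iter_pull_request_files(fake_fetch))
```

Note that pagination does not explain the intermittent 404 by itself; it only shows how the 30-per-page default and the 3000-file limit from the docs fit together.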
We can replan and it works, i.e., it appears to be intermittent. Looking in the Atlantis logs, I see the following (I've removed the timestamps and redacted IPs/private info):
[INFO] server: POST /events – from xxx:51544
[INFO] server: Identified event as type "other"
[INFO] server: POST /events – respond HTTP 200
[EROR] myorg/myrepo#1163: GET https://api.github.com/repos/myorg/myrepo/pulls/1163/files?per_page=300: 404 Not Found []
[EROR] myorg/myrepo#1163: Unable to hide old comments: GET https://api.github.com/repos/myorg/myrepo/issues/1163/comments?direction=asc&sort=created: 404 Not Found []
We have a few theories but haven't been able to reproduce it. First, it has only happened since we updated to v0.12.0, the current release (we also added --hide-prev-plan-comments and --disable-markdown-folding at that time).
The second is that it may happen with a largish number of directories, though generally the number of changed directories is under 50 and the number of changed files is under 100.
The third theory is that it might happen when two unrelated repos are processing at the same time. That can be seen here; I've left the timestamps so you can see the overlap, "myrepo" is the same as above, and "REPO2" is the other repo that is planning.
2020-04-30T10:29:45.155000 [INFO] myorg/myrepo#1163: Creating dir "/home/atlantis/.atlantis/repos/myorg/myrepo/1163/default"
2020-04-30T10:29:45.252000 [INFO] myorg/REPO2#276: Creating dir "/home/atlantis/.atlantis/repos/myorg/REPO2/276/default"
2020-04-30T10:29:45.961000 [INFO] myorg/REPO2#276: Successfully parsed atlantis.yaml file
2020-04-30T10:29:45.963000 [INFO] myorg/REPO2#276: 1 projects are to be planned based on their when_modified config
2020-04-30T10:29:46.152000 [INFO] myorg/myrepo#1163: Successfully parsed atlantis.yaml file
2020-04-30T10:29:46.156000 [INFO] myorg/myrepo#1163: 13 projects are to be planned based on their when_modified config
2020-04-30T10:29:46.157000 [INFO] myorg/myrepo#1163: Acquired lock with id "myorg/myrepo/xxx"
2020-04-30T10:29:46.457000 [INFO] myorg/REPO2#276: Acquired lock with id "myorg/REPO2/./yyy"
2020-04-30T10:29:46.457000 [INFO] myorg/REPO2#276: Creating dir "/home/atlantis/.atlantis/repos/myorg/REPO2/276/yyyy"
This is happening to us too. It always happens when opening a PR, but then we comment `atlantis plan` and it works. :man_shrugging:
we are having this issue as well:
Are you guys implementing a retry/backoff mechanism to handle eventual consistency?
--- PR CREATION
Jul 24 11:31:25 ip-42-42-42-42 bash[5186]: 2020/07/24 11:31:25+0000 [INFO] server: Identified event as type "opened"
Jul 24 11:31:25 ip-42-42-42-42 bash[5186]: 2020/07/24 11:31:25+0000 [INFO] server: Executing autoplan
Jul 24 11:31:25 ip-42-42-42-42 bash[5186]: 2020/07/24 11:31:25+0000 [INFO] server: POST /events – respond HTTP 200
--- ERROR
Jul 24 11:31:25 ip-42-42-42-42 bash[5186]: 2020/07/24 11:31:25+0000 [EROR] owner/atlantis-repo-name#1: GET https://api.github.com/repos/owner/atlantis-repo-name/pulls/1/files?per_page=300: 404 Not Found []
--- END
Jul 24 11:31:27 ip-42-42-42-42 bash[5186]: 2020/07/24 11:31:27+0000 [INFO] server: POST /events – from 127.42.42.42:4242
Jul 24 11:31:27 ip-42-42-42-42 bash[5186]: 2020/07/24 11:31:27+0000 [INFO] server: POST /events – respond HTTP 200
Looks like we need to add a retry.
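To make the idea concrete, here is a minimal sketch of retrying a call that intermittently returns 404. It is only an illustration under stated assumptions, not how Atlantis implements it: `NotFoundError`, `retry_on_not_found`, and `flaky_list_files` are hypothetical names, and the attempt count and delay are arbitrary.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class NotFoundError(Exception):
    """Hypothetical error type standing in for an HTTP 404 response."""


def retry_on_not_found(
    call: Callable[[], T],
    attempts: int = 3,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying with a fixed delay while it raises NotFoundError."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except NotFoundError:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            sleep(delay_seconds)
    raise AssertionError("unreachable")


# Fake API call that fails twice and then succeeds (illustrative only).
calls = {"count": 0}


def flaky_list_files() -> list:
    calls["count"] += 1
    if calls["count"] < 3:
        raise NotFoundError("404 Not Found")
    return ["main.tf"]


# Pass a no-op sleep so the example finishes instantly.
result = retry_on_not_found(flaky_list_files, sleep=lambda _seconds: None)
```

Only the 404 case is retried here, so other failures (bad credentials, rate limits) still surface immediately.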
FWIW we hadn't seen this for a while and started seeing it again yesterday (or the day before?). I'm sure it's a github problem, not an Atlantis problem, but Atlantis probably needs to work around it.
We have been using Atlantis for ~1 week and just saw this problem for the first time, in a project with ~30 Terraform files and on a PR with only 1 changed file. We are using Atlantis v0.14.0.
> FWIW we hadn't seen this for a while and started seeing it again yesterday (or the day before?). I'm sure it's a github problem, not an Atlantis problem, but Atlantis probably needs to work around it.
Yeah totally, and it shouldn't be too hard to throw some retries in there.
We just started seeing messages like this on PRs. I wonder if there need to be more retries, or whether exponential backoff should be implemented? Mostly commenting just to see if others who happen to stop by here are having the same issue.
> We just started seeing messages like this on PRs. I wonder if there need to be more retries, or whether exponential backoff should be implemented? Mostly commenting just to see if others who happen to stop by here are having the same issue.
We have been getting this more often as well.
After trying the version with the fix, we stopped seeing this.
We're running with the fix implemented in #1131 and have still seen this issue occur, relatively often within the past week (presumably due to GitHub performance), so it seems like it might be worth implementing a different retry strategy such as exponential backoff, as suggested in that PR.
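For reference, the exponential backoff suggested above boils down to a delay schedule like the one sketched here. This is an illustration only and makes no claim about what #1131 or Atlantis actually does: `backoff_delays` and all of its defaults (base delay, factor, cap, jitter) are assumptions picked for the example.

```python
import random
from typing import Callable, List


def backoff_delays(
    attempts: int,
    base_seconds: float = 0.5,
    factor: float = 2.0,
    max_seconds: float = 30.0,
    jitter: float = 0.0,
    rng: Callable[[], float] = random.random,
) -> List[float]:
    """Return the wait before each retry: base * factor**n, capped at max_seconds.

    jitter is the fraction of each delay that may be added at random, which
    helps keep many clients from retrying in lockstep.
    """
    delays = []
    for attempt in range(attempts):
        delay = min(base_seconds * factor ** attempt, max_seconds)
        delay += delay * jitter * rng()
        delays.append(delay)
    return delays


# Without jitter the schedule is deterministic and doubles each time.
schedule = backoff_delays(5)

# With jitter, each delay grows by up to 25% of its capped value.
jittered = backoff_delays(5, jitter=0.25)
```

Each value would be used as the sleep between one failed attempt and the next, so a slow-to-converge API gets progressively more time instead of a burst of quick retries.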