Describe the bug
My Mac OS X jobs fail frequently. Usually, when a job fails, the subsequent steps display as canceled and take 0s, and the if: always() steps run anyway, as in:

(From https://github.com/mit-plv/fiat-crypto/runs/593609449?check_suite_focus=true )
However, the Mac job failures, such as https://github.com/mit-plv/fiat-crypto/pull/753/checks?check_run_id=593732097 , display broken logs, where the failing step has no contents (no down arrow), and all subsequent steps fail with 0s, including the if: always() steps:

Furthermore, if I click the three dots and click "View raw logs", the logs are missing; I get directed to a page like https://github.com/mit-plv/fiat-crypto/commit/c31a955db7a356f1788d979ac9b1ed1a4fc67674/checks/593732097/logs which says only
2020-04-16T22:40:17.1881748Z ##[section]Starting: Request a runner to run this job
2020-04-16T22:40:17.9419267Z Requesting a hosted runner in current repository's account/organization with labels: 'macos-latest', require runner match: True
2020-04-16T22:40:18.0349714Z Labels matched hosted runners has been found, waiting for one of them get assigned for this job.
2020-04-16T22:40:18.0610578Z ##[section]Finishing: Request a runner to run this job
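For anyone triaging similar runs, a truncated raw log like the one above can be recognized mechanically. This is a minimal Python sketch rather than an official tool: it assumes the raw log has already been saved as text, and the helper name `log_ends_after_runner_request` is hypothetical. It only checks whether the log stops right after the "Request a runner to run this job" section, which is the symptom shown above.

```python
# Marker that closes the runner-request section in the raw log shown above.
FINISH_MARKER = "##[section]Finishing: Request a runner to run this job"


def log_ends_after_runner_request(raw_log: str) -> bool:
    """Return True if the last non-empty log line closes the runner request.

    Hypothetical helper: an empty log returns False because there is
    nothing to classify.
    """
    lines = [line.strip() for line in raw_log.splitlines() if line.strip()]
    if not lines:
        return False
    # Each line starts with a timestamp, so compare against the line's tail.
    return lines[-1].endswith(FINISH_MARKER)
```

A log for which this returns True contains no output from the job's own steps, so it cannot explain why the job failed.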
Area for Triage: Apple
Question, Bug, or Feature?: Bug
Virtual environments affected
Expected behavior
I should get sensible logs, or, better, the jobs should not be failing at all (they work fine if I restart the job enough times, and they work fine consistently on Linux and often on Windows)
Actual behavior
See above. Link: https://github.com/mit-plv/fiat-crypto/pull/753/checks?check_run_id=593732097
@JasonGross , thank you for reporting this issue!
Unfortunately, this repository manages only image content, but we will try to escalate this issue to the appropriate team.
@TingluoHuang , @ericsciple , @alepauly , is it something that actions/runner does?
@maxim-lobanov can you check the runner diagnostics for this job to see why the runner can't upload logs? I don't know how to access the runner log for the hosted Mac pool.
I'm now seeing this happen on our Linux jobs too, such as https://github.com/mit-plv/fiat-crypto/pull/766/checks?check_run_id=606437846

It seems to be happening on the artifact upload step on Linux, maybe the machines are running out of space or something?
Hello, just a quick update: the issue may come from a bug on our backend. We are still looking at it.
@JasonGross , Hello!
Could you please check if you still see the same issues?
Closing this for now but please let us know if you still see the same issue
It's been happening less often, but it just happened again: https://github.com/JasonGross/fiat-crypto/runs/697783422

I've also seen the Mac OS jobs frequently show up as "cancelled" when I didn't cancel them, and I don't believe anyone else did, either.
So I guess this issue should be re-opened
We've had the same issue on one of our builds scheduled to run Monday through Friday at 01:30 UTC. It would be cancelled _randomly_ after running for about 10-15 minutes.
Last night it succeeded for the first time in weeks but I will continue to monitor.
I had this problem, so I made an example project to show GitHub support https://github.com/joehinkle11/Mac-GitHub-Actions-Test/actions
They also responded with an email saying
Hi Joe,
Thank you for your continued patience while we investigated these issues. For context, there is an existing issue tracking this:
https://github.com/actions/virtual-environments/issues/736
Due to similar reliability reports and errors when using our current MacOS platform for GitHub Actions, we have decided to make larger changes that will provide a long-term solution.
We understand that you may continue to experience reliability issues while on the current platform, and hope to provide a better experience as soon as possible. If you notice any issues with billing on the next billing cycle, please reach out.
At this time we have improvements planned for early July and will keep our customers up to date through our blogs and changelog
Please let us know if you have any questions or concerns!
Cheers,
GitHub Support
Hope this helps anyone who is working on an Action and doesn't yet realize it's a bug with GitHub and not their scripts.
I've also had frequent random cancellations of GH Actions jobs (especially Mac OS), with missing logs, such as https://github.com/mit-plv/fiat-crypto/runs/791672004?check_suite_focus=true

And here's one where the logs are present https://github.com/mit-plv/fiat-crypto/runs/791678094?check_suite_focus=true :

GitHub won't even tell me who canceled these jobs, or why they were canceled. (Was it because I pushed another commit that triggered the workflow? Is GitHub now forcibly canceling jobs on old commits, even those which are on the tip of their branch but are not the newest one running across all branches?)
I can also confirm that jobs on MacOS are cancelled for no apparent reason: https://github.com/cytopia/pwncat/pull/80/checks?check_run_id=792119613
Additionally, there are no logs or other info regarding why it was cancelled.
We fixed a configuration issue in the service that caused Mac hosted builds to hit this error every day between 1:00 and 2:00 AM UTC.
That sounds promising 🎉 Is that fix already live?
@svenmuennich the fix is already live, and I can confirm from the telemetry that the fix works as expected.
👇 we no longer have the big spike every night.

Great! Thank you
I will keep this issue open for a few more days. @svenmuennich , @cytopia , @JasonGross , could you please report back if you still see the same issues?
We still see the same issues. Here is a build from 8 hours ago (Wed, 24 Jun 2020 08:08:20 GMT) that failed in this way: https://github.com/mit-plv/fiat-crypto/pull/817/checks?check_run_id=802532325

Attempting to fetch the raw logs gives
2020-06-24T08:08:06.3800369Z ##[section]Starting: Request a runner to run this job
2020-06-24T08:08:06.6634916Z Can't find any online and idle self-hosted runner in current repository that matches the required labels: 'macos-latest'
2020-06-24T08:08:06.6634949Z Can't find any online and idle self-hosted runner in current repository's account/organization that matches the required labels: 'macos-latest'
2020-06-24T08:08:06.6634965Z Found online and idle hosted runner in current repository's account/organization that matches the required labels: 'macos-latest'
2020-06-24T08:08:06.8916332Z ##[section]Finishing: Request a runner to run this job
which is bizarre.
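One quick check is whether this run falls inside the nightly 1:00-2:00 AM UTC window that the configuration fix mentioned earlier in the thread addressed. The following is a rough Python sketch under stated assumptions: the helper names `parse_log_timestamp` and `in_nightly_window` are hypothetical, and the window is taken to be 01:00 inclusive to 02:00 exclusive, which the thread does not specify precisely.

```python
import re
from datetime import datetime, timezone

# Raw log lines start with a UTC timestamp carrying 7 fractional digits,
# e.g. "2020-06-24T08:08:06.3800369Z"; the fraction is ignored here.
TIMESTAMP_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.\d+)?Z")


def parse_log_timestamp(line: str) -> datetime:
    """Extract the leading UTC timestamp from a raw log line."""
    match = TIMESTAMP_RE.match(line)
    if match is None:
        raise ValueError("line does not start with a log timestamp")
    parsed = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


def in_nightly_window(timestamp: datetime) -> bool:
    """Assumed window: 01:00 <= time < 02:00 UTC."""
    return timestamp.astimezone(timezone.utc).hour == 1


first_line = ("2020-06-24T08:08:06.3800369Z ##[section]Starting: "
              "Request a runner to run this job")
print(in_nightly_window(parse_log_timestamp(first_line)))  # prints False
```

The first log line above is stamped 08:08 UTC, well outside that window, so this failure is unlikely to be the nightly spike.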
@TingluoHuang , can it be something different?
@maxim-lobanov yes, https://github.com/mit-plv/fiat-crypto/pull/817/checks?check_run_id=802532325 failed due to https://github.com/github/c2c-actions-compute/issues/643 which is not the one I fixed.
https://github.com/github/c2c-actions-compute/issues/643 is a 404 for me; is there any issue I can track about this (other than this present one)?
Last night our scheduled build failed again. This time we got an error though:
An error occurred while provisioning resources (Error Type: Disconnect).
No idea whether that is related to this issue.
Hello everyone!
We have recently done some changes on our side. Could you please check if you still see the same issue (steps without logs)?
I have the same problem on macOS https://github.com/atlas-engine/AtlasStudio/runs/1215960239.

@maxim-lobanov