Gatsby: [EPIC] Improve reliability of Windows CI

Created on 24 Aug 2018 · 12 Comments · Source: gatsbyjs/gatsby

Who will own this?

_What Area of Responsibility does this fall into? Who will own the work, and who needs to be aware of the work?_

Area of Responsibility:

_Select the Area of Responsibility most impacted by this Epic_

  • [x] OSS

Summary

Gatsby uses a free AppVeyor account to run Windows CI tests. The tests are very slow, and sometimes their results are not reported at all. As a result, PRs are often merged without being tested on Windows.

How will this impact Gatsby?

Domains

_List the impacted domains here_

Components

_List the impacted Components here_

Goals

_What are the top 3 goals you want to accomplish with this epic? All goals should be specific, measurable, actionable, realistic, and timebound._

How will we know this epic is a success?

_What changes must we see, or what must be created for us to know the project was a success. How will we know when the project is done? How will we measure success?_

User Can Statement

  • User can...

Metrics to Measure Success

  • We will see an increase /decrease in...

Additional Description

_In a few sentences, describe the current status of the epic, what we know, and what's already been done._

What are the risks to the epic?

_In a few sentences, describe what high-level questions we still need to answer about the project. How could this go wrong? What are the trade-offs? Do we need to close a door to go through this one?_

What questions do we still need to answer, or what resources do we need?

_Is there research to be done? Are there things we don’t know? Are there documents we need access to? Is there contact info we need? Add those questions as bullet points here._

How will we complete the epic?

_What are the steps involved in taking this from idea through to reality?_

How else could we accomplish the same goal?

_Are there other ways to accomplish the goals you listed above? How else could we do the same thing?_

Next Steps

  • [ ] Under Pipeline select Proposed Epics (only if you are NOT the AoR owner)
  • [ ] Under Assignees select the AoR Owner you listed in the Epic
  • [ ] Under Labels select Epic
  • [ ] Select Create Epic

You're all done!

All 12 comments

cc @m-allanson @KyleAMathews I've created #7652, which describes the two problems currently causing all Windows tests to fail. However, fixing these issues won't affect the speed of Windows testing, so there might still be problems.

Great stuff, thanks @davidbailey00 👍

Here's some WIP notes on improving AppVeyor build times, which should also help with reliability.

Rolling builds

There's a "rolling builds" configuration setting for AppVeyor that will tell it to only test the newest commit from any given PR: https://www.appveyor.com/docs/build-configuration/#rolling-builds. This has to be enabled through the AppVeyor UI.

From the Appveyor docs:

"rolling builds" are great for very active OSS projects with lengthy queue. Whenever you do a new commit to the same branch OR pull request all current queued/running builds for that branch or PR are cancelled and the new one is queued. Other words, rolling builds make sure that only the most recent commit is built.

I can't see this option in the AppVeyor UI. I assume @KyleAMathews needs to give @pieh and me additional permissions on the AppVeyor account?

Fail strategy

AppVeyor's default behaviour is to run all build jobs even if one of them fails. There is a fast_finish option which will cancel all other jobs as soon as one job fails.

https://www.appveyor.com/docs/build-configuration/#failing-strategy
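
As a rough sketch, the option sits under the `matrix` key of `appveyor.yml`. This fragment shows only the setting itself and assumes the rest of the build configuration stays unchanged.

```yaml
# appveyor.yml (fragment)
matrix:
  # Finish the build as soon as one job in the matrix fails,
  # instead of running the remaining jobs.
  fast_finish: true
```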

Concurrent jobs

Appveyor offers one concurrent job for OSS builds. Additional concurrency can be added by paying for a basic account, and then paying $25/month per additional concurrent job: https://www.appveyor.com/pricing/

Investigate caching

Cache node_modules between builds? Cache anything else?

https://www.appveyor.com/docs/build-cache/
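
A minimal sketch of what caching node_modules could look like in `appveyor.yml`. It assumes dependencies are installed into a top-level `node_modules` directory and pinned by a `yarn.lock` file; neither is confirmed in this thread, so adjust the paths to the real layout.

```yaml
# appveyor.yml (fragment)
cache:
  # Keep node_modules between builds. The "-> yarn.lock" suffix asks
  # AppVeyor to invalidate this cache entry when yarn.lock changes.
  # Both the directory and the lockfile name are assumptions.
  - node_modules -> yarn.lock
```

Whether this actually shortens builds would still need measuring, since restoring a large cache takes time too.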

Job matrix configuration

There is an install script that cancels most jobs in the matrix, running them only for releases or forced builds. However, this script does not run until after the repo has been cloned for each job, meaning they can take a couple of minutes to be cancelled. See example.

Can this functionality be replicated via Appveyor's config options? See config reference.

An alternative would be to temporarily drop these extra jobs, and look at adding them back in once everything else here has been investigated.

I assume these jobs don't run on every PR because they take a while, but it seems counterproductive to have tests that are only run under certain conditions. Maybe we should reduce the number of jobs in the matrix and always run them, instead of having many jobs that are only run under certain conditions.

Other things to investigate

Nice investigation @m-allanson! Stopping builds + paying for more concurrency seems like easy wins.

@KyleAMathews has enabled the rolling builds feature

I'm a product manager on Azure Pipelines. Let me know if you have any questions or suggestions.

@jeremyepling Sorry for getting back to you late. We only recently started experimenting with Azure Pipelines, and right now we are facing a git checkout CRLF/LF problem.
https://github.com/gatsbyjs/gatsby/pull/8836 is an attempt to fix our unit tests on Windows (they currently pass on AppVeyor CI but fail in our rudimentary Azure Pipelines setup). The snapshots don't match, most likely because the saved snapshots are checked out with CRLF line endings, while the functions we test produce LF line endings. Is there an option to make the Azure Pipelines checkout use LF line endings?

I saw there is a checkout configuration, but it doesn't seem to cover that part - https://docs.microsoft.com/en-us/azure/devops/pipelines/yaml-schema?view=vsts#checkout

Seems like we can use .gitattributes to handle CRLF/LF issues - https://github.com/gatsbyjs/gatsby/pull/8922 :)

I got a successful build after I created a .gitattributes file that sets all line endings to LF for your repository. This file will override Git user settings for CRLF.

I'd like to have a way to specify gitconfig values from the Azure Pipelines YAML, but we don't have that yet.

I should have refreshed the page earlier. I'm just now seeing your comment. :)

Old issues will be closed after 30 days of inactivity. This issue has been quiet for 20 days and is being marked as stale. Reply here or add the label "not stale" to keep this issue open!

Hey again!

It’s been 30 days since anything happened on this issue, so our friendly neighborhood robot (that’s me!) is going to close it.

Please keep in mind that I’m only a robot, so if I’ve closed this issue in error, I’m HUMAN_EMOTION_SORRY. Please feel free to reopen this issue or create a new one if you need anything else.

Thanks again for being part of the Gatsby community!
