Motivated by https://github.com/neuropoly/spinalcordtoolbox/issues/3057#issuecomment-735281988, and as discussed during the weekly meeting, we will migrate the continuous integration infrastructure to GitHub Actions.
I am creating this issue to document the process and centralize discussions on the topic.
Working on this! I got motivated today because @joshuacwnewton and shimming-toolbox were hitting pointless CI bugs while trying to use sct as part of other projects.
So far https://github.com/neuropoly/spinalcordtoolbox/runs/1602878596 has gotten to running the tests in 3 minutes. Travis always took at least 6 to get to the same place.
I hate to say it but vive la GithubCI.
I've ported the platform matrix over (almost); I sidestepped my util/dockerize.sh because Actions supports docker via its API: just say: container: debian:10, so that's nice.
However I've hit a stupid growing-pains snag: in travis we blocked off platforms into always/nightly, e.g.
always:
nightly:
Actions doesn't let you use if: within matrix: the way Travis does, so the most obvious equivalent on Actions I could think of is to add a nightly flag to the matrix and then read it:
but this fails with
The workflow is not valid. .github/workflows/tests.yml (Line: 57, Col: 9): Unrecognized named-value: 'matrix'. Located at position 3 within expression: !(matrix.nightly) || (matrix.nightly && (github.event_name == 'schedule' || (github.event_name == 'push' && github.ref == 'refs/heads/release')))
This is a discovered bug, though Github hasn't addressed it so maybe it's not a known one? https://github.community/t/how-to-conditionally-include-exclude-items-in-matrix-eg-based-on-branch/16853/5. It sounds like if: on a job: got added only as an afterthought, and I guess not done very well. Indeed, it works if I move the if: a level down into step:. That'll work but leave us with a bunch of vacuous test runs cluttering up the UI and I don't want that.
The suggested fix there is to compute a run/don't run flag in the build matrix based on whatever condition we want -- just like we do with Travis -- but then add an exclude: block to cross off cases based on that. I don't really understand how exclude:/include: interact though and I haven't got this working yet.
EDIT: A couple of suggestions in https://github.community/t/conditional-matrices/17206/2; one unanswered SO post at https://stackoverflow.com/questions/65384420/how-to-make-a-github-action-matrix-element-conditional.
This person has nested a list inside the build matrix, which is already a list, but apparently that helped? https://github.com/rotators/ReDefine/blob/636a6676e48402489c88ea42a4874cdc5b98aa03/.github/workflows/Build.yml#L13-L28
This person is integrating information from github.* in their matrix https://github.community/t/how-to-conditionally-include-exclude-items-in-matrix-eg-based-on-branch/16853/6 but I don't yet see if or how that lets them add or remove cases.
This one https://github.community/t/conditional-matrices/17206/2 suggests splitting the jobs in two; that would work, I think, though inelegant: we'd have one 'nightlies' job with a build matrix covering all the nightly builds, and one 'always' job with everything else, and the nightlies job could use if: in it because it wouldn't have to look into matrix from there, which is currently broken/unsupported in Actions (even though the other lines like runs-on and container can read it!); in the end it would all expand into one large pool of jobs running in parallel.
I get the sense GH Actions expects people to manage triggers by splitting their test suites into multiple workflows (i.e. multiple *.yml files), since event triggers are applied on a per-workflow basis (the on: keyword applies to the entire workflow file).
With Travis, we had one big .travis.yml file, but with GH Actions it seems they want us to think more in terms of a workflow directory (containing multiple workflows) instead.
So, we could have these two .yml files:
- core_platforms.yml: Run on: branch push/pull request/nightly, with nightly timing being handled by the schedule: syntax.
- extra_platforms.yml: Run on: nightly only.
Or, slightly different organization:
- push-pr.yml: Core platforms only.
- nightly.yml: Core platforms + extra platforms.
Thanks for the perspective @joshuacwnewton. I think you're probably right. I also think this is going to turn out to be an oversight and they'll come around to allowing this sort of thing because it forces code duplication for nothing.
Update: that SO thread now has an answer, which says to use two jobs: one to generate and filter the jobs -- using jq of course -- and the second to actually take that list of jobs and expand it: https://stackoverflow.com/a/65434401
I'm pretty sure this is just a stupid bug though. Working around it with jq is a good idea but hard to maintain because you can't use the standard yaml build matrix syntax for it -- you have to feed in the same data as json as a string or from a file instead which is annoying.
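For reference, the generate-then-expand idea from that answer can be sketched without jq. This is a minimal sketch in Python, not what the SO answer or #3125 actually does: the platform names and the nightly flag in PLATFORMS are made up for illustration, and the filter just mirrors the condition from the failing if: expression above.

```python
import json

# Hypothetical platform list: the names and the "nightly" flag are
# illustrative, not the project's real build matrix.
PLATFORMS = [
    {"os": "ubuntu-18.04", "nightly": False},
    {"os": "macos-10.15", "nightly": False},
    {"os": "ubuntu-16.04", "nightly": True},
]


def select_platforms(platforms, event_name, ref):
    """Keep always-on entries, plus nightly ones on schedule or release pushes."""
    run_nightly = event_name == "schedule" or (
        event_name == "push" and ref == "refs/heads/release"
    )
    return [p for p in platforms if run_nightly or not p["nightly"]]


def matrix_json(platforms, event_name, ref):
    """Serialize the filtered list in the {"include": [...]} matrix shape."""
    return json.dumps({"include": select_platforms(platforms, event_name, ref)})


# A pull request only gets the two non-nightly platforms.
print(matrix_json(PLATFORMS, "pull_request", "refs/pull/3125/merge"))
```

The first job would expose this string as a job output, and the second job would feed it to its matrix through the fromJSON() expression function. The data still lives outside the normal YAML matrix syntax, so the maintenance complaint stands.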
In other exciting adventures: Actions is tripping some kind of locale bug in sct_download_data: https://github.com/neuropoly/spinalcordtoolbox/runs/1606746772?check_suite_focus=true
Trying URL: https://github.com/sct-data/PAM50/releases/download/r20201104/PAM50-r20201104.zip
Downloading: PAM50-r20201104.zip
--- Logging error ---
Traceback (most recent call last):
File "/github/home/sct_dev/python/envs/venv_sct/lib/python3.6/logging/__init__.py", line 996, in emit
stream.write(msg)
UnicodeEncodeError: 'ascii' codec can't encode character '\u201c' in position 42: ordinal not in range(128)
Call stack:
File "/github/home/sct_dev/spinalcordtoolbox/scripts/sct_download_data.py", line 168, in <module>
res = main()
File "/github/home/sct_dev/spinalcordtoolbox/scripts/sct_download_data.py", line 160, in main
install_data(url, dest_folder, keep=arguments.k)
File "/github/home/sct_dev/spinalcordtoolbox/download.py", line 141, in install_data
logger.warning("Removing existing destination folder \u201c%s\u201d", dest_folder)
Message: 'Removing existing destination folder \u201c%s\u201d'
Arguments: ('/github/home/sct_dev/data/PAM50',)
so that's on the pile to fix.
EDIT: probably because they set LC_ALL=LC_CTYPE=en_US.UTF-8. But this bug is on our end: we should be able to handle different locales; or else we should use locale.setlocale(locale.LC_ALL, 'C') (i.e. american english ascii) explicitly at the start of all our programs.
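A small sketch of what is going wrong and one possible way to make the logger tolerant of it, assuming the stream the logger writes to ends up ASCII-only inside the container. The make_logger and can_encode helpers and the sct_demo logger name are hypothetical, and the BytesIO stands in for stderr; this is not how SCT configures logging.

```python
import io
import logging


def can_encode(text, encoding):
    """Return True if `text` survives a strict encode to `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def make_logger(stream, name="sct_demo"):
    """Build a logger whose only handler writes to `stream`."""
    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.propagate = False
    logger.addHandler(logging.StreamHandler(stream))
    logger.setLevel(logging.WARNING)
    return logger


# An ASCII-only text stream, like the one in the traceback, except that
# unencodable characters are escaped instead of raising UnicodeEncodeError.
raw = io.BytesIO()
ascii_stream = io.TextIOWrapper(
    raw, encoding="ascii", errors="backslashreplace", newline="\n"
)

log = make_logger(ascii_stream)
log.warning("Removing existing destination folder \u201c%s\u201d", "/tmp/PAM50")
print(raw.getvalue())
```

On Python 3.7+ the same errors= policy can be applied to the real stream with sys.stderr.reconfigure(errors="backslashreplace"); the traceback above is from Python 3.6, where that method is not available.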
Also this: actions (specifically actions/checkout@v2) is quietly not using git behind our backs:
Run actions/checkout@v2
/usr/bin/docker exec dcc30894dde171c7043d2c7308e176d7558eb5104d548c7ea9d9bf454d5ea0c3 sh -c "cat /etc/*release | grep ^ID"
Syncing repository: neuropoly/spinalcordtoolbox
Getting Git version info
Working directory is '/__w/spinalcordtoolbox/spinalcordtoolbox'
Deleting the contents of '/__w/spinalcordtoolbox/spinalcordtoolbox'
The repository will be downloaded using the GitHub REST API
To create a local Git repository instead, add Git 2.18 or higher to the PATH
Downloading the archive
Writing archive to disk
Extracting the archive
/usr/bin/tar xz -C /__w/spinalcordtoolbox/spinalcordtoolbox/91e91672-e878-4bc4-880e-5aed13971109 -f /__w/spinalcordtoolbox/spinalcordtoolbox/91e91672-e878-4bc4-880e-5aed13971109.tar.gz
Resolved version neuropoly-spinalcordtoolbox-ae131aa
that's probably not good. that's going to break..stuff.
At a minimum it breaks https://github.com/neuropoly/spinalcordtoolbox/blob/6c18169f2d2ea8f7b43438c5f505a7fb0eea83d7/.ci.sh#L28
More adventures: the macOS test is hanging at conda activate: https://github.com/neuropoly/spinalcordtoolbox/runs/1606814539
@joshuacwnewton and I have seen this before but I'm unclear what causes it. ???
Got WSL passing. Now I need to figure out why macOS is hanging.
On test run https://github.com/neuropoly/spinalcordtoolbox/tree/0fdc21c54d6ce49816e7e8702b444e06e603fcdd/ here are outputs from travis vs github:
Their formats aren't quite the same: Github has timestamps on each line, while Travis has them as special 'travis_time' lines at the start and end of blocks. Travis uses CR-LF line endings. Under set -x, Github wrote "+ command" and Travis wrote "+command". Both have cruft at the top and bottom I don't care about. This cleans it up:
$ cat /tmp/travis-logs.txt | tr -d '\r' | awk 'BEGIN { P=0 } /Installing SCT/ { P = 1 } P==1 { print }' | head
+echo Installing SCT
Installing SCT
+yes
+ASK_REPORT_QUESTION=false
+PIP_PROGRESS_BAR=off
+./install_sct
*******************************
* Welcome to SCT installation *
$ cat /tmp/gh-logs.txt | cut -f 2- -d ' ' | awk 'BEGIN { P=0 } /Installing SCT/ { P = 1 } P==1 { print }' | head
+ echo Installing SCT
Installing SCT
+ yes
+ ASK_REPORT_QUESTION=false
+ PIP_PROGRESS_BAR=off
+ ./install_sct
*******************************
* Welcome to SCT installation *
So now I can diff them:
$ diff -u <(cat /tmp/travis-logs.txt | tr -d '\r' | awk 'BEGIN { P=0 } /Installing SCT/ { P = 1 } P==1 { print }') <(cat /tmp/gh-logs.txt | cut -f 2- -d ' ' | sed 's/^\+ /\+/' | awk 'BEGIN { P=0 } /Installing SCT/ { P = 1 } P==1 { print }') | head -n 50
--- /dev/fd/63 2020-12-25 00:32:08.535441609 -0500
+++ /dev/fd/62 2020-12-25 00:32:08.535441609 -0500
@@ -14,7 +14,7 @@
Checking OS type and version...
-Darwin Traviss-Mac.local 19.6.0 Darwin Kernel Version 19.6.0: Mon Aug 31 22:12:52 PDT 2020; root:xnu-6153.141.2~1/RELEASE_X86_64 x86_64
+Darwin Mac-1608870864276.local 19.6.0 Darwin Kernel Version 19.6.0: Thu Oct 29 22:56:45 PDT 2020; root:xnu-6153.141.2.2~1/RELEASE_X86_64 x86_64
ProductVersion: 10.15.7
Checking requirements...
@@ -26,12 +26,12 @@
SCT version ......... dev
Installation type ... in-place
Operating system .... osx (10.15.7)
-Shell config ........ /Users/travis/.bashrc
+Shell config ........ /Users/runner/.bashrc
--> Crash reports will not be sent.
-SCT will be installed here: [/Users/travis/build/neuropoly/spinalcordtoolbox]
+SCT will be installed here: [/Users/runner/work/spinalcordtoolbox/spinalcordtoolbox]
Do you agree? [y]es/[n]o:
Skipping copy of source files (source and destination folders are the same)
@@ -40,45 +40,1203 @@
Installing conda...
-rm -rf /Users/travis/build/neuropoly/spinalcordtoolbox/python
+rm -rf /Users/runner/work/spinalcordtoolbox/spinalcordtoolbox/python
-mkdir -p /Users/travis/build/neuropoly/spinalcordtoolbox/python
+mkdir -p /Users/runner/work/spinalcordtoolbox/spinalcordtoolbox/python
-wget -O /var/folders/z3/_825pg0s3jvf0hb_q8kzmg5h0000gn/T/tmp.NIkTWTnG/miniconda.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
+wget -O /var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/tmp.3oleVKUV/miniconda.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
---2020-12-25 04:48:52-- https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
+--2020-12-25 04:41:34-- https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
Resolving repo.anaconda.com (repo.anaconda.com)... 104.16.130.3, 104.16.131.3
Connecting to repo.anaconda.com (repo.anaconda.com)|104.16.130.3|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 57112343 (54M) [application/x-sh]
-Saving to: ‘/var/folders/z3/_825pg0s3jvf0hb_q8kzmg5h0000gn/T/tmp.NIkTWTnG/miniconda.sh’
+Saving to: ‘/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/tmp.3oleVKUV/miniconda.sh’
Full output: diff.txt
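The same cleanup can be sketched in Python, in case the shell pipeline gets unwieldy. This is only a rough equivalent of the tr/cut/sed/awk steps above, under the assumption that the Github timestamp is always the first space-separated field; the normalize helper and its has_timestamps parameter are names I made up here.

```python
import difflib


def normalize(lines, has_timestamps=False):
    """Make Travis and Github log lines comparable, starting at 'Installing SCT'."""
    out, started = [], False
    for line in lines:
        # like: tr -d '\r'
        line = line.replace("\r", "").rstrip("\n")
        if has_timestamps:
            # like: cut -f 2- -d ' ' (lines without a space are kept whole)
            _, sep, rest = line.partition(" ")
            line = rest if sep else line
        # like: sed 's/^\+ /\+/' so both sides use the "+command" form
        if line.startswith("+ "):
            line = "+" + line[2:]
        # like the awk filter: print from the first matching line onwards
        if "Installing SCT" in line:
            started = True
        if started:
            out.append(line)
    return out


def diff_logs(travis_lines, github_lines):
    """Unified diff of the two normalized logs, as a list of lines."""
    return list(
        difflib.unified_diff(
            normalize(travis_lines),
            normalize(github_lines, has_timestamps=True),
            fromfile="travis",
            tofile="github",
            lineterm="",
        )
    )
```

Usage would be something like diff_logs(open("/tmp/travis-logs.txt"), open("/tmp/gh-logs.txt")), with the same file names as in the commands above.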
They're both running on almost the same macOS, builds done just a couple days apart:
ESC[0;32mChecking OS type and version...ESC[0m
-Darwin Traviss-Mac.local 19.6.0 Darwin Kernel Version 19.6.0: Mon Aug 31 22:12:52 PDT 2020; root:xnu-6153.141.2~1/RELEASE_X86_64 x86_64
+Darwin Mac-1608870864276.local 19.6.0 Darwin Kernel Version 19.6.0: Thu Oct 29 22:56:45 PDT 2020; root:xnu-6153.141.2.2~1/RELEASE_X86_64 x86_64
They have different pwds of course: /Users/travis/build/neuropoly/ vs /Users/runner/work/spinalcordtoolbox/ but that's not a big deal.
Interestingly, only Github loaded my polyfill (#3123 ); Travis must preinstall the gnu userland tools?
+realpath ()
+{
+ python3 -c 'import sys, os; [print(os.path.realpath(f)) for f in sys.argv[1:]]' "$@"
+}
The interesting parts are probably somewhere in the environment vars, so here:
+uname -a
-Darwin Traviss-Mac.local 19.6.0 Darwin Kernel Version 19.6.0: Mon Aug 31 22:12:52 PDT 2020; root:xnu-6153.141.2~1/RELEASE_X86_64 x86_64
+Darwin Mac-1608870864276.local 19.6.0 Darwin Kernel Version 19.6.0: Thu Oct 29 22:56:45 PDT 2020; root:xnu-6153.141.2.2~1/RELEASE_X86_64 x86_64
+set
-ANSI_CLEAR='\033[0K'
-ANSI_GREEN='\033[32;1m'
-ANSI_RED='\033[31;1m'
-ANSI_RESET='\033[0m'
-ANSI_YELLOW='\033[33;1m'
+AGENT_TOOLSDIRECTORY=/Users/runner/hostedtoolcache
+ANDROID_HOME=/Users/runner/Library/Android/sdk
+ANDROID_NDK_18R_PATH=/Users/runner/Library/Android/sdk/ndk/18.1.5063045
+ANDROID_NDK_HOME=/Users/runner/Library/Android/sdk/ndk-bundle
+ANDROID_SDK_ROOT=/Users/runner/Library/Android/sdk
ASK_REPORT_QUESTION=false
BASH=/bin/bash
BASH_ARGC=()
@@ -177,136 +1336,135 @@
BASH_REMATCH=([0]="y")
BASH_SOURCE=([0]="./install_sct")
BASH_VERSINFO=([0]="3" [1]="2" [2]="57" [3]="1" [4]="release" [5]="x86_64-apple-darwin19")
++tty
BASH_VERSION='3.2.57(1)-release'
BIN_DIR=bin
+BOOTSTRAP_HASKELL_NONINTERACTIVE=1
+CHROMEWEBDRIVER=/usr/local/Caskroom/chromedriver/87.0.4280.20
CI=true
-CONTINUOUS_INTEGRATION=true
+CONDA=/usr/local/miniconda
DATA_DIR=data
-DEBIAN_FRONTEND=noninteractive
DIRSTACK=()
-DISPLAY=/private/tmp/com.apple.launchd.tLNuxrYBPd/org.macosforge.xquartz:0
-DISPLAY_UPDATE_PATH='export PATH="/Users/travis/build/neuropoly/spinalcordtoolbox/bin:$PATH"'
+DISPLAY_UPDATE_PATH='export PATH="/Users/runner/work/spinalcordtoolbox/spinalcordtoolbox/bin:$PATH"'
+DOTNET_MULTILEVEL_LOOKUP=0
+DOTNET_ROOT=/Users/runner/.dotnet
+EDGEWEBDRIVER=/usr/local/share/edge_driver
EUID=501
-GEM_HOME=/Users/travis/.rvm/gems/ruby-2.6.6
-GEM_PATH=/Users/travis/.rvm/gems/ruby-2.6.6:/Users/travis/.rvm/gems/ruby-2.6.6@global
-GIT_ASKPASS=echo
+GECKOWEBDRIVER=/usr/local/opt/geckodriver/bin
+GITHUB_ACTION=run2
+GITHUB_ACTIONS=true
+GITHUB_ACTION_REF=
+GITHUB_ACTION_REPOSITORY=
+GITHUB_ACTOR=kousu
+GITHUB_API_URL=https://api.github.com
+GITHUB_BASE_REF=master
+GITHUB_ENV=/Users/runner/work/_temp/_runner_file_commands/set_env_ff0378c9-c61c-4899-84a2-f6d5739f6c5b
+GITHUB_EVENT_NAME=pull_request
+GITHUB_EVENT_PATH=/Users/runner/work/_temp/_github_workflow/event.json
+GITHUB_GRAPHQL_URL=https://api.github.com/graphql
+GITHUB_HEAD_REF=ng/ci-gh-actions
+GITHUB_JOB=test
+GITHUB_PATH=/Users/runner/work/_temp/_runner_file_commands/add_path_ff0378c9-c61c-4899-84a2-f6d5739f6c5b
+GITHUB_REF=refs/pull/3125/merge
+GITHUB_REPOSITORY=neuropoly/spinalcordtoolbox
+GITHUB_REPOSITORY_OWNER=neuropoly
+GITHUB_RETENTION_DAYS=90
+GITHUB_RUN_ID=443499730
+GITHUB_RUN_NUMBER=62
+GITHUB_SERVER_URL=https://github.com
+GITHUB_SHA=65218e7625ed0a4128aa435aedc27339f6a952f0
+GITHUB_WORKFLOW=Tests
+GITHUB_WORKSPACE=/Users/runner/work/spinalcordtoolbox/spinalcordtoolbox
GROUPS=()
-HAS_JOSH_K_SEAL_OF_APPROVAL=true
-HOME=/Users/travis
-HOMEBREW_NO_INSTALL_CLEANUP=1
-HOSTNAME=Traviss-Mac.local
+HOME=/Users/runner
+HOMEBREW_CASK_OPTS=--no-quarantine
+HOMEBREW_NO_AUTO_UPDATE=1
+HOSTNAME=Mac-1608870864276.local
HOSTTYPE=x86_64
IFS=$' \t\n'
-IRBRC=/Users/travis/.rvm/rubies/ruby-2.6.6/.irbrc
+ImageOS=macos1015
+ImageVersion=20201212.1
+JAVA_HOME=/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home
+JAVA_HOME_11_X64=/Library/Java/JavaVirtualMachines/adoptopenjdk-11.jdk/Contents/Home
+JAVA_HOME_12_X64=/Library/Java/JavaVirtualMachines/adoptopenjdk-12.jdk/Contents/Home
+JAVA_HOME_13_X64=/Library/Java/JavaVirtualMachines/adoptopenjdk-13.jdk/Contents/Home
+JAVA_HOME_14_X64=/Library/Java/JavaVirtualMachines/adoptopenjdk-14.jdk/Contents/Home
+JAVA_HOME_7_X64=/Library/Java/JavaVirtualMachines/zulu-7.jdk/Contents/Home
+JAVA_HOME_8_X64=/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home
LANG=en_US.UTF-8
LC_ALL=en_US.UTF-8
-LOGNAME=travis
+LC_CTYPE=en_US.UTF-8
+LOGNAME=runner
MACHTYPE=x86_64-apple-darwin19
MACOSSUPPORTED=13
MPLBACKEND=Agg
-MY_RUBY_HOME=/Users/travis/.rvm/rubies/ruby-2.6.6
-NVM_BIN=/Users/travis/.nvm/versions/node/v15.1.0/bin
+NUNIT3_PATH=/Library/Developer/nunit/3.6.0
+NUNIT_BASE_PATH=/Library/Developer/nunit
NVM_CD_FLAGS=
-NVM_DIR=/Users/travis/.nvm
-NVM_INC=/Users/travis/.nvm/versions/node/v15.1.0/include/node
-OLDPWD=/Users/travis/build/neuropoly/spinalcordtoolbox
+NVM_DIR=/Users/runner/.nvm
+OLDPWD=/Users/runner/work/spinalcordtoolbox/spinalcordtoolbox
OPTERR=1
OPTIND=1
OS=osx
OSTYPE=darwin19
OSver=10.15.7
-PAGER=cat
-PATH=/Users/travis/.rvm/gems/ruby-2.6.6/bin:/Users/travis/.rvm/gems/ruby-2.6.6@global/bin:/Users/travis/.rvm/rubies/ruby-2.6.6/bin:/Users/travis/.rvm/bin:/Users/travis/bin:/Users/travis/.local/bin:/Users/travis/.nvm/versions/node/v15.1.0/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/opt/X11/bin:/Library/Apple/usr/bin
+PATH=/usr/local/opt/pipx_bin:/Users/runner/.cargo/bin:/usr/local/lib/ruby/gems/2.7.0/bin:/usr/local/opt/ruby/bin:/usr/local/opt/curl/bin:/usr/local/bin:/usr/local/sbin:/Users/runner/bin:/Users/runner/.yarn/bin:/usr/local/go/bin:/Users/runner/Library/Android/sdk/tools:/Users/runner/Library/Android/sdk/platform-tools:/Users/runner/Library/Android/sdk/ndk-bundle:/Library/Frameworks/Mono.framework/Versions/Current/Commands:/usr/bin:/bin:/usr/sbin:/sbin:/Users/runner/.dotnet/tools:/Users/runner/.ghcup/bin:/Users/runner/hostedtoolcache/stack/2.5.1/x64
+PERFLOG_LOCATION_SETTING=RUNNER_PERFLOG
PIPESTATUS=([0]="0")
+PIPX_BIN_DIR=/usr/local/opt/pipx_bin
+PIPX_HOME=/usr/local/opt/pipx
PIP_PROGRESS_BAR=off
-PPID=2500
-PS4=+
-PWD=/Users/travis/build/neuropoly/spinalcordtoolbox
+POWERSHELL_DISTRIBUTION_CHANNEL=GitHub-Actions-macos1015
+PPID=1011
+PS4='+ '
+***
PYTHONNOUSERSITE=1
PYTHON_DIR=python
-RC_FILE_PATH=/Users/travis/.bashrc
+RCT_NO_LAUNCH_PACKAGER=1
+RC_FILE_PATH=/Users/runner/.bashrc
REPORT_STATS=no
-RUBY_VERSION=ruby-2.6.6
+RUNNER_OS=macOS
+RUNNER_PERFLOG=/usr/local/opt/runner/perflog
+RUNNER_TEMP=/Users/runner/work/_temp
+RUNNER_TOOL_CACHE=/Users/runner/hostedtoolcache
+RUNNER_TRACKING_ID=github_0c1419f8-feb6-47f6-8b2d-8099980a8b52
+RUNNER_WORKSPACE=/Users/runner/work/spinalcordtoolbox
SCRIPT_DIR=scripts
-SCT_DIR=/Users/travis/build/neuropoly/spinalcordtoolbox
+SCT_DIR=/Users/runner/work/spinalcordtoolbox/spinalcordtoolbox
SCT_INSTALL_TYPE=in-place
-SCT_SOURCE=/Users/travis/build/neuropoly/spinalcordtoolbox
+SCT_SOURCE=/Users/runner/work/spinalcordtoolbox/spinalcordtoolbox
SCT_VERSION=dev
SHELL=/bin/bash
SHELLOPTS=braceexpand:hashall:interactive-comments:xtrace
SHLVL=4
-SSH_AUTH_SOCK=/private/tmp/com.apple.launchd.2AtCNsNnZs/Listeners
-TERM=xterm
+SSH_AUTH_SOCK=/private/tmp/com.apple.launchd.sFBrBcEC3k/Listeners
+TERM=dumb
THE_RC=bash
-TMPDIR=/var/folders/z3/_825pg0s3jvf0hb_q8kzmg5h0000gn/T/
-TMP_DIR=/var/folders/z3/_825pg0s3jvf0hb_q8kzmg5h0000gn/T/tmp.NIkTWTnG
-TRAVIS=true
-TRAVIS_ALLOW_FAILURE=false
-TRAVIS_APP_HOST=build.travis-ci.com
-TRAVIS_APT_PROXY=http://build-cache.travisci.net
-TRAVIS_ARCH=amd64
-TRAVIS_BRANCH=master
-TRAVIS_BUILD_DIR=/Users/travis/build/neuropoly/spinalcordtoolbox
-TRAVIS_BUILD_ID=210558155
-TRAVIS_BUILD_NUMBER=14694
-TRAVIS_BUILD_STAGE_NAME=
-TRAVIS_BUILD_WEB_URL=https://travis-ci.com/neuropoly/spinalcordtoolbox/builds/210558155
-TRAVIS_CMD=./.travis.sh
-TRAVIS_COMMIT=65218e7625ed0a4128aa435aedc27339f6a952f0
-TRAVIS_COMMIT_MESSAGE='Merge ebc644d2d42928cf55ba7f12b348f662724d4ecf into 6c18169f2d2ea8f7b43438c5f505a7fb0eea83d7'
-TRAVIS_COMMIT_RANGE=42f5a2ec1857a145f2de05e2b7fb6da627b5c7e1...ebc644d2d42928cf55ba7f12b348f662724d4ecf
-TRAVIS_CPU_ARCH=amd64
-TRAVIS_DIST=notset
-TRAVIS_ENABLE_INFRA_DETECTION=true
-TRAVIS_EVENT_TYPE=pull_request
-TRAVIS_HOME=/Users/travis
-TRAVIS_INFRA=macstadium
-TRAVIS_INIT=notset
-TRAVIS_INTERNAL_RUBY_REGEX='^ruby-(2\.[0-4]\.[0-9]|1\.9\.3)'
-TRAVIS_JOB_ID=464757046
-TRAVIS_JOB_NAME='OSX 10.15 (Catalina)'
-TRAVIS_JOB_NUMBER=14694.3
-TRAVIS_JOB_WEB_URL=https://travis-ci.com/neuropoly/spinalcordtoolbox/jobs/464757046
-TRAVIS_LANGUAGE=ruby
-TRAVIS_OSX_IMAGE=xcode12.2
-TRAVIS_OS_NAME=osx
-TRAVIS_PULL_REQUEST=3125
-TRAVIS_PULL_REQUEST_BRANCH=ng/ci-gh-actions
-TRAVIS_PULL_REQUEST_SHA=ebc644d2d42928cf55ba7f12b348f662724d4ecf
-TRAVIS_PULL_REQUEST_SLUG=neuropoly/spinalcordtoolbox
-TRAVIS_REPO_SLUG=neuropoly/spinalcordtoolbox
-TRAVIS_ROOT=/
-TRAVIS_RUBY_VERSION=default
-TRAVIS_SECURE_ENV_VARS=false
-TRAVIS_SUDO=true
-TRAVIS_TAG=
-TRAVIS_TEST_RESULT=
-TRAVIS_TIMER_ID=1c567080
-TRAVIS_TIMER_START_TIME=1608871730959253000
-TRAVIS_TMPDIR=/var/folders/z3/_825pg0s3jvf0hb_q8kzmg5h0000gn/T/tmp.lM8NZWbk
+TMPDIR=/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/
+TMP_DIR=/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/tmp.3oleVKUV
UID=501
-USER=travis
+USER=runner
+VCPKG_INSTALLATION_ROOT=/usr/local/share/vcpkg
+XCODE_10_DEVELOPER_DIR=/Applications/Xcode_10.3.app/Contents/Developer
+XCODE_11_DEVELOPER_DIR=/Applications/Xcode_11.7.app/Contents/Developer
+XCODE_12_DEVELOPER_DIR=/Applications/Xcode_12.3.app/Contents/Developer
XPC_FLAGS=0x0
XPC_SERVICE_NAME=0
_=-a
-__CF_USER_TEXT_ENCODING=0x1F5:0x0:0x0
+__CF_USER_TEXT_ENCODING=0x1F5:0:0
bidon=0
change_default_path=y
-cmd='bash /var/folders/z3/_825pg0s3jvf0hb_q8kzmg5h0000gn/T/tmp.NIkTWTnG/miniconda.sh -p /Users/travis/build/neuropoly/spinalcordtoolbox/python -b -f'
+cmd='bash /var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/tmp.3oleVKUV/miniconda.sh -p /Users/runner/work/spinalcordtoolbox/spinalcordtoolbox/python -b -f'
e_status=0
macOSmajor=10
macOSminor=15
opt='?'
-profiles=/Users/travis/.bash_profile
-rvm_bin_path=/Users/travis/.rvm/bin
-rvm_path=/Users/travis/.rvm
-rvm_prefix=/Users/travis
-rvm_version='1.29.9 (latest)'
+profiles=/Users/runner/.bash_profile
sourceblock=$'\nif [[ -n "$BASH_VERSION" ]]; then\n # include .bashrc if it exists\n if [[ -f "$HOME/.bashrc" ]]; then\n . "$HOME/.bashrc"\n fi\nfi'
sw_vers_output=$'ProductVersion:\t10.15.7'
-txt='bash /var/folders/z3/_825pg0s3jvf0hb_q8kzmg5h0000gn/T/tmp.NIkTWTnG/miniconda.sh -p /Users/travis/build/neuropoly/spinalcordtoolbox/python -b -f'
+txt='bash /var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/tmp.3oleVKUV/miniconda.sh -p /Users/runner/work/spinalcordtoolbox/spinalcordtoolbox/python -b -f'
type=code
-uname_output='Darwin Traviss-Mac.local 19.6.0 Darwin Kernel Version 19.6.0: Mon Aug 31 22:12:52 PDT 2020; root:xnu-6153.141.2~1/RELEASE_X86_64 x86_64'
+uname_output='Darwin Mac-1608870864276.local 19.6.0 Darwin Kernel Version 19.6.0: Thu Oct 29 22:56:45 PDT 2020; root:xnu-6153.141.2.2~1/RELEASE_X86_64 x86_64'
Combing through this, two things jump out at me:
- Travis has DISPLAY set and is running xquartz, and Github is not.
- Github has TERM=dumb set:
> -SSH_AUTH_SOCK=/private/tmp/com.apple.launchd.2AtCNsNnZs/Listeners
> -TERM=xterm
> +SSH_AUTH_SOCK=/private/tmp/com.apple.launchd.sFBrBcEC3k/Listeners
> +TERM=dumb
Travis also has:
- PAGER=cat set, i.e. make less a no-op.
- DEBIAN_FRONTEND=noninteractive
- GIT_ASKPASS=echo
- CONTINUOUS_INTEGRATION=true
TERM=dumb could definitely choke something like conda create, if conda create is expecting to give prompts. Or any of them, really.
Anyway I did a quick test and applied my yes |-is-bad patch from #3102 and it..worked: https://github.com/neuropoly/spinalcordtoolbox/runs/1607610135 (I cancelled that run but if you scroll down you can see it getting past conda create and onto "sourcing conda.sh").
Then I went back and added
export TERM=xterm
export PAGER=cat
but it hung that time. Then I added the other three; still a hang.
Okay I am giving up on figuring this out. This is ...quite a mystery. Maybe something to do with Github Actions doing shenanigans with file descriptors? conda create -y works fine, and I wanted to get that in anyway, so that's what I'm going to do.
It's also hanging
and at
once the install finishes.
It seems that yes is refusing to die when its stdout is closed.
I can reproduce it with just
Travis gives
+command -v yes
/usr/bin/yes
+yes
+head -n 10
y
y
y
y
y
y
y
y
y
y
+echo 'That was 10 yeses'
That was 10 yeses
Github gives
+ command -v yes
/usr/bin/yes
+ yes
+ head -n 10
y
y
y
y
y
y
y
y
y
y
and then a hang.
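A timeout-guarded version of that repro, as a sketch, so that a CI job fails fast instead of hanging. The function name and the 10 second timeout are arbitrary choices of mine. It only detects the symptom (yes outliving its reader) and says nothing about the cause.

```python
import subprocess


def yes_exits_after_reader_quits(timeout=10.0):
    """Run the equivalent of `yes | head -n 10` and report whether `yes` exited."""
    # Python itself ignores SIGPIPE; Popen's default restore_signals=True
    # resets it to the default action in the child processes.
    yes = subprocess.Popen(["yes"], stdout=subprocess.PIPE)
    head = subprocess.Popen(
        ["head", "-n", "10"], stdin=yes.stdout, stdout=subprocess.PIPE
    )
    # Close the parent's copy of the read end so `head` is the only reader.
    yes.stdout.close()
    output, _ = head.communicate()
    try:
        yes.wait(timeout=timeout)
        exited = True
    except subprocess.TimeoutExpired:
        # `yes` outlived its reader: the hang described above.
        yes.kill()
        yes.wait()
        exited = False
    return exited, output


exited, output = yes_exits_after_reader_quits()
print("yes exited:", exited, "lines read:", output.count(b"\n"))
```

One caveat: because the children get a fresh SIGPIPE disposition here, this may behave differently from a plain shell step if the problem turns out to be related to inherited signal settings.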
@kousu also ran into this, see minimum repro: https://github.com/Drulex/conda-test-actions.
Also my previous WIP: https://github.com/Drulex/spinalcordtoolbox (see aj-basic-github-actions branch).
From my experimentation I only saw the macOS hang on Github hosted instances, not on self-hosted runners.
Oh cool! Just finished mine at https://github.com/kousu/hanging-actions/. conda turned out to be a red herring; it's something to do with yes, or maybe with how pipes work on those runners (which would be very bizarre).
Whatever. I'm kicking it upstream over to https://github.com/actions/virtual-environments/discussions/2352; if that doesn't get any answers hopefully at least they can point us to the right project to file a bug against.
When you say "self-hosted runners" do you mean you tested self-hosting macOS runners?
This seems to be the master script that generates their (probably buggy?) runners: https://github.com/actions/virtual-environments/blob/main/images/macos/templates/macOS-10.15.json, calling out to many of the scripts in https://github.com/actions/virtual-environments/tree/main/images/macos/provision/core. I'm not going to dig into that until someone else gives me a tip.
In the meantime I'm going to solve this by rebasing on my (growing by the day) patch #3102.
Weird thing about Actions: if the platform you're on doesn't happen to have git installed, actions/checkout shrugs and downloads a .zip via the Github web API instead.
This bugs out our installer: it makes the installer think it's installing a release because
I'm not sure how much of a problem this will be. We can head-off problems by running SCT_INSTALL_TYPE=in-place ./install_sct? Is that a good idea?
Update on the macOS hang: one of Github's employees worked on this on Christmas. They reproduced it by installing https://github.com/actions/runner in self-hosted mode, so I'm pretty sure it's a bug and I'm kicking it over to https://github.com/actions/runner/issues/884 and now I'm going to try to forget about it.
Maybe, maybe, the bug is in the combination of their macOS images plus their macOS runner, but I suspect it's just that they're subtly mishandling dup() in their runner.
I realized just now that if we want to do this well, we should have these transition periods:
That way there won't be too many surprises. I didn't know how to actually implement 2, but @joshuacwnewton figured it out: just go to https://github.com/neuropoly/spinalcordtoolbox/settings/branch_protection_rules/12845205 and turn off/on the different checks; hopefully disabling the Travis check still means Travis runs, just that it doesn't block:

Here, the labels are mostly checks added by #3125; it picked them up even though they're not on master yet. Travis means all the platforms checked by Travis.
I think I mentioned this above but: actions/checkout@v2 will fall back on downloading a .zip over HTTP if it doesn't find git on the platform it's on. Actions' ubuntu-16.04 doesn't come pre-installed with git, but ubuntu-18.04 does. For us, this means we fall into our release install path (#3140).
Should we try to fix this? So that every platform is running the same thing?
I tried to fix this by adding
and using SCT_DIR explicitly
buttttt this fails under CI (both Travis and Actions) because, contrary to the advice we give:
https://travis-ci.com/github/neuropoly/spinalcordtoolbox/jobs/466159959#L987
Open a new Terminal window to load environment variables, or run:
source /home/travis/.bashrc
Actually doing that hits
# If not running interactively, don't do anything
[[ $- != *i* ]] && return
(most distros come with a line like that prepackaged in their bashrc's).
This is a stumbling block:
What we should be doing is installing to ~/.local/bin/ or /usr/bin/ or whatever other standard system paths are out there; if we were using a pure-pip install and skipped conda entirely we would have this already in place. But because we don't we need install_sct to inform us where it decided to install, instead of us being able to tell it where to install. And the only way it currently shares that information is via .bashrc. Which I can't source under CI on most distros.
I worked around it by changing ./install_sct to use ~/.bash_profile instead. That's an API change and I'm not confident in it because even after all this time using unix I still don't know all the corners of shell thoroughly, but I think it's right: ~/.bashrc is specifically for interactive sessions; for more account-wide things, you use /etc/profile and its subsidiaries.
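To make the stumbling block concrete, here is a sketch that sources two throwaway rc files from a non-interactive bash, one with the usual interactivity guard and one without. SCT_DEMO_DIR and the /opt/sct path are hypothetical stand-ins for whatever install_sct writes, and it assumes bash is on the PATH.

```python
import os
import subprocess
import tempfile

# The guard most distros ship at the top of their default bashrc.
GUARD = "[[ $- != *i* ]] && return\n"
EXPORT = "export SCT_DEMO_DIR=/opt/sct\n"  # hypothetical variable and path


def value_after_sourcing(rc_text):
    """Source a temporary rc file non-interactively; return SCT_DEMO_DIR."""
    with tempfile.TemporaryDirectory() as tmp:
        rc = os.path.join(tmp, "rcfile")
        with open(rc, "w") as f:
            f.write(rc_text)
        result = subprocess.run(
            ["bash", "-c", 'source "$1"; printf "%s" "${SCT_DEMO_DIR:-}"', "bash", rc],
            stdout=subprocess.PIPE,
            universal_newlines=True,
            check=True,
        )
        return result.stdout


# With the guard, the export is never reached in a non-interactive shell.
print(repr(value_after_sourcing(GUARD + EXPORT)))
# Without the guard, the export takes effect.
print(repr(value_after_sourcing(EXPORT)))
```

This only shows why sourcing a guarded bashrc is a no-op under CI; it does not settle which startup file is the right place for the install path.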
BUG: I'm seeing intermittent crashes -- seems like a concurrency bug -- but so far only on macOS on Actions:
e.g. https://github.com/neuropoly/spinalcordtoolbox/runs/1625704027?check_suite_focus=true#step:4:2097
INTERNALERROR> E AssertionError: Traceback (most recent call last):
INTERNALERROR> E File "/Users/runner/work/spinalcordtoolbox/spinalcordtoolbox/python/envs/venv_sct/lib/python3.6/site-packages/_pytest/main.py", line 267, in wrap_session
INTERNALERROR> E config.hook.pytest_sessionstart(session=session)
INTERNALERROR> E File "/Users/runner/work/spinalcordtoolbox/spinalcordtoolbox/python/envs/venv_sct/lib/python3.6/site-packages/pluggy/hooks.py", line 286, in __call__
INTERNALERROR> E return self._hookexec(self, self.get_hookimpls(), kwargs)
INTERNALERROR> E File "/Users/runner/work/spinalcordtoolbox/spinalcordtoolbox/python/envs/venv_sct/lib/python3.6/site-packages/pluggy/manager.py", line 93, in _hookexec
INTERNALERROR> E return self._inner_hookexec(hook, methods, kwargs)
INTERNALERROR> E File "/Users/runner/work/spinalcordtoolbox/spinalcordtoolbox/python/envs/venv_sct/lib/python3.6/site-packages/pluggy/manager.py", line 87, in <lambda>
INTERNALERROR> E firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
INTERNALERROR> E File "/Users/runner/work/spinalcordtoolbox/spinalcordtoolbox/python/envs/venv_sct/lib/python3.6/site-packages/pluggy/callers.py", line 208, in _multicall
INTERNALERROR> E return outcome.get_result()
INTERNALERROR> E File "/Users/runner/work/spinalcordtoolbox/spinalcordtoolbox/python/envs/venv_sct/lib/python3.6/site-packages/pluggy/callers.py", line 80, in get_result
INTERNALERROR> E raise ex[1].with_traceback(ex[2])
INTERNALERROR> E File "/Users/runner/work/spinalcordtoolbox/spinalcordtoolbox/python/envs/venv_sct/lib/python3.6/site-packages/pluggy/callers.py", line 187, in _multicall
INTERNALERROR> E res = hook_impl.function(*args)
INTERNALERROR> E File "/Users/runner/work/spinalcordtoolbox/spinalcordtoolbox/conftest.py", line 31, in pytest_sessionstart
INTERNALERROR> E downloader.main(['-d', 'sct_testing_data', '-o', sct_test_path()])
INTERNALERROR> E File "/Users/runner/work/spinalcordtoolbox/spinalcordtoolbox/spinalcordtoolbox/scripts/sct_download_data.py", line 160, in main
INTERNALERROR> E install_data(url, dest_folder, keep=arguments.k)
INTERNALERROR> E File "/Users/runner/work/spinalcordtoolbox/spinalcordtoolbox/spinalcordtoolbox/download.py", line 199, in install_data
INTERNALERROR> E shutil.copy(srcpath, dstpath)
INTERNALERROR> E File "/Users/runner/work/spinalcordtoolbox/spinalcordtoolbox/python/envs/venv_sct/lib/python3.6/shutil.py", line 246, in copy
INTERNALERROR> E copymode(src, dst, follow_symlinks=follow_symlinks)
INTERNALERROR> E File "/Users/runner/work/spinalcordtoolbox/spinalcordtoolbox/python/envs/venv_sct/lib/python3.6/shutil.py", line 144, in copymode
INTERNALERROR> E chmod_func(dst, stat.S_IMODE(st.st_mode))
INTERNALERROR> E FileNotFoundError: [Errno 2] No such file or directory: 'sct_testing_data/template/template/PAM50_small_t2.nii.gz'
INTERNALERROR> E assert False
INTERNALERROR>
INTERNALERROR> python/envs/venv_sct/lib/python3.6/site-packages/xdist/dsession.py:187: AssertionError
[gw0] node down: Not properly terminated
but the previous run https://github.com/neuropoly/spinalcordtoolbox/runs/1625475620?check_suite_focus=true passed.
The diff between them was simply making an in-place install happen
which shouldn't make a difference, because the macOS machines have git installed which leads to both being in-place anyway:
So, it looks like pytest is spawning parallel testers here? Why haven't we noticed that before? Is it new? It seems likely that if, say, three tests all start at the same time then there's a race to get to https://github.com/neuropoly/spinalcordtoolbox/blob/6172cd22feeb0f64ac1bd68bce3f89f7109e68b8/conftest.py#L28-L31 first.
Here's a re-run job: https://github.com/neuropoly/spinalcordtoolbox/runs/1625782954: it passed.
Found a UI bug in Actions:
More issues from Actions:
> BUG: I'm seeing intermittent crashes -- seems like a concurrency bug -- but so far only on macOS on Actions:
> So, it looks like pytest is spawning parallel testers here? Why haven't we noticed that before? Is it new? It seems likely that if, say, three tests all start at the same time then there's a race to get to
Aha! I've experienced this exact same issue in https://github.com/neuropoly/spinalcordtoolbox/pull/3116#discussion_r549001772. From that comment:
> Sometimes it works, but sometimes redownloading the data causes a FileNotFound error, because the data folder will be wiped for each worker. (See attached log.txt.) This is related to #2957 and #2959.
> I can avoid this locally by setting -n 1 as recommended in #2957 (comment), but I'm not sure what the behavior will be like in the CI.
Possibly a sign that we should address #2957/#2959 sooner than later.
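One possible shape for a fix, as a sketch only: serialize the download behind a file lock and move the data into place in one rename, so parallel workers never see a half-wiped folder. The ensure_data function, the fake_fetch stand-in, and the .lock file name are hypothetical, fcntl is POSIX-only, and unlike the current installer this never removes an existing destination folder. It is not what #2957/#2959 propose.

```python
import fcntl
import os
import tempfile
import threading


def ensure_data(dest, fetch):
    """Populate `dest` exactly once, even when several workers call this together."""
    parent = os.path.dirname(os.path.abspath(dest))
    with open(os.path.join(parent, os.path.basename(dest) + ".lock"), "w") as lock:
        # Blocks until no other worker (thread or process) holds the lock.
        fcntl.flock(lock, fcntl.LOCK_EX)
        try:
            if not os.path.isdir(dest):
                # Download into a staging folder, then rename it into place.
                staging = tempfile.mkdtemp(dir=parent)
                fetch(staging)
                os.rename(staging, dest)
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
    return dest


calls = []


def fake_fetch(path):
    """Stand-in for the real download: records the call and writes one file."""
    calls.append(path)
    with open(os.path.join(path, "PAM50_small_t2.nii.gz"), "w") as f:
        f.write("placeholder")


# Four concurrent "workers" asking for the same dataset.
dest = os.path.join(tempfile.mkdtemp(), "sct_testing_data")
workers = [threading.Thread(target=ensure_data, args=(dest, fake_fetch)) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print("downloads performed:", len(calls))
```

Threads are used here only to keep the demo self-contained; the lock is taken on a separate open file per caller, so the same approach applies to separate worker processes.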
Whoops, wrong button. :sweat_smile:
BUG: after a while (after a cache times out? or something?) github actions won't download logs properly to the main UI. The logs are still there if you do "Download Raw Log" but you can't link to specific lines like you can with Travis.
e.g. a week ago this link took you to the line mentioned https://github.com/neuropoly/spinalcordtoolbox/runs/1629734360#step:3:2105, but now it just says "Error"

@kousu Strange... When I click that link, I get taken to the right line. I wonder what happened with that screenshot. :worried:

Now that we've started to use up credits for TravisCI (see #3273), I've added the high priority label here. PR #3125 is ready, but it just needs a reviewer at this point.