Arcade: Helix reporting fails due to non-POSIX compliant line in Helix reporter run

Created on 10 Mar 2020 · 18 Comments · Source: dotnet/arcade

  • [ ] This issue is blocking
  • [ ] This issue is causing unreasonable pain

Helix reporting fails due to this line in run.sh not being POSIX compliant and -T not being a valid option on BSD mv (macOS):
https://github.com/dotnet/arcade/blob/9131f071c1b48da0f35466e632a212913f21a3f0/src/Microsoft.DotNet.Helix/Sdk/tools/azure-pipelines/reporter/run.sh#L17

The uploading of testResults.xml succeeds, but the test reporting to AzDO fails. Only the Job node shows up as failed.

Example:
_Agent_: dci-mac-build-134
_Job_ : b5c4d0e-515d-4c1b-8382-9f6c2add90b2
_Workitem_: DiagnosticTests

+ mv -T /Users/helix-runner/.azdo-env-tmp /Users/helix-runner/.azdo-env
mv: illegal option -- T
usage: mv [-f | -i | -n] [-v] source target
       mv [-f | -i | -n] [-v] source ... directory
+ _OLD_PYTHONPATH=
+ export PYTHONPATH=
+ PYTHONPATH=
+ /Users/helix-runner/.azdo-env/bin/python -c 'import azure.devops'
/tmp/helix/working/B65F09CB/p/reporter/run.sh: line 24: /Users/helix-runner/.azdo-env/bin/python: No such file or directory
First Responder

All 18 comments

Looks like the correct fix is to use -h for Darwin systems branching on uname -s.

whoa. @alexperovich have we noticed this previously? This implies to me that we haven't been reporting results from Macs for like 6 months or so

I haven't noticed this before. It is really odd that it is failing now. I 100% saw this working on a Mac when testing it.

This definitely works for most cases. The runtime definitely gets updates and the tests get reported. I'm not sure if this build agent has a different version of mv (maybe other machines have a brew-installed one). We run on OSX.1014.Amd64. Is anyone else testing in internal? I found that aspnet is, but their variables still show .vsts-env for the virtualenv. Their script is still using -T, but it ran on a different agent (129). Is a different mv maybe getting used?

Also, looks like -h and -T are not available on macOS's BSD mv. Maybe the machines have a brew variant installed (say like a coreutils variant)?
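One way to settle whether a given agent's mv accepts -T is to probe it in a scratch directory rather than guess from the OS name. The sketch below is only an illustration: the function name mv_supports_T is hypothetical and not part of run.sh, and it assumes mktemp -d is available (it is on both macOS and common Linux distributions, though it is not specified by POSIX).

```sh
#!/bin/sh
# Hypothetical helper (not from run.sh): returns 0 if the mv found on PATH
# accepts -T, and 1 otherwise. The probe runs entirely inside a temp dir.
mv_supports_T() {
    probe=$(mktemp -d) || return 1
    mkdir "$probe/a"
    if mv -T "$probe/a" "$probe/b" 2>/dev/null; then
        rc=0    # GNU-style mv: the rename succeeded
    else
        rc=1    # e.g. BSD mv: "illegal option -- T"
    fi
    rm -rf "$probe"
    return "$rc"
}

# Example use:
#   if mv_supports_T; then echo "mv accepts -T"; else echo "mv rejects -T"; fi
```

Running `command -v mv` next to this would also show whether a brew-installed mv is shadowing /bin/mv on a particular machine.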

@alexperovich, is there anything we can do to work around this and #5013? We have no reporting for at least 3 legs with this.

I can't think of any easy workaround for this issue. You could change the file in the nuget package after restore, but that wouldn't work well in a build.

@alexperovich any advice on how to proceed here? Reviewing tests on OSX is getting tedious. All our CI and PR builds look like this: https://dev.azure.com/dnceng/internal/_build/results?buildId=563039&_a=summary&view=results. We lose history, and given that this is out-of-band testing, history can often help with regression investigation.

cc: @dotnet/dotnet-diag

Can we just change the top to /bin/bash?

Do you install a special version of bash on the macOS machines? By default these are not in BSD's bash 3.2.

Did we only recently stop installing python to the path? And therefore triggered this if statement for the first time ever? I know @MattGal has been dealing with some fallout from that in other queues.

That could cause this. If the newer macs don't have this directory populated then we would start hitting this case.

It is super weird that we are only noticing this now. Do we know when (if ever) this used to work? The code is 6 months old. :-( Unfortunately, because this is all done in code in arcade, we don't have good telemetry on this in Helix (because as far as Helix is concerned, everything is fine... we ran the workitem we were given).

When I made https://github.com/dotnet/arcade/pull/4617 it was working on OSX. I specifically tested it there because that is where it was failing.

I had been meaning to update helix for a long while now because this has never worked well for us. We used to get PermissionError: [Errno 13] Permission denied: '/Users/helix-runner/.azure-devops/python-sdk/cache/options.json', meaning it came down to luck which agent got our runs. I believe Ulises worked on this, and now I see some further changes for a virtual environment.

However now that I've had a chance to do all the changes necessary I only see the illegal option. This happens on any OSX run and on ARM64 windows machines for different reasons.

I tried it on 10.13, 10.14, and 10.15, and even with bash as the shell, I get the illegal option -T for mv.

I'm not sure how this worked... what is "-T" supposed to do so that we can change it to something else?

-T means 'don't treat the destination as a directory to move the source into'. mv /a/b/c /b does something different depending on whether /b exists. If /b exists and is a directory, it moves c into it, so you get /b/c. If /b doesn't exist, it renames /a/b/c to /b.
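The behavior described above can be approximated without -T using only options that BSD mv also accepts. This is a minimal sketch, not the fix that landed in arcade: the function name move_as_rename is hypothetical, and unlike a single rename it is not atomic, because the destination is removed in a separate step before the move.

```sh
#!/bin/sh
# Hypothetical helper (not from run.sh): rename SRC to DST without ever
# nesting SRC inside an existing DST directory.
move_as_rename() {
    src=$1
    dst=$2
    if [ -d "$dst" ]; then
        # With an existing directory, plain mv would create "$dst/<name of src>".
        # rmdir succeeds only on an empty directory, so a populated
        # destination makes the helper fail instead of being clobbered.
        rmdir "$dst" || return 1
    fi
    mv "$src" "$dst"
}

# Example use (paths taken from the log above):
#   move_as_rename /Users/helix-runner/.azdo-env-tmp /Users/helix-runner/.azdo-env
```

Refusing to touch a non-empty destination is a deliberate choice here; a script that wants to replace a populated directory would need to remove it explicitly first.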

So all we have to do is "mkdir $ENV_PATH" before the mv? That seems easy enough. :-)

This was fixed in #5090
