Nextflow: Broken container environment when enabling metrics

Created on 23 Aug 2019 · 15 Comments · Source: nextflow-io/nextflow

Bug report

Expected behavior and actual behavior

When updating to the latest release (19.07.0), all flowcraft-generated pipelines suddenly stopped working. Regardless of the process, they all terminate with bash errors (ps, uname, grep and date not found).

After some time, I traced the error to the nxf_container_env() in the .command.run file. All flowcraft generated pipelines use the same nextflow.config file, that includes the env scope.
For some reason, the paths defined there show up with escaped quotes.

```
nxf_container_env() {
cat << EOF
export PYTHONPATH=\"\$baseDir/templates:\$PYTHONPATH\"
export PATH=\"/home/ines/temp/flowcraft_test/v19/bin:\$baseDir/templates:\$PATH\"
EOF
}
```

When I manually remove the `\` in front of the quotes, I'm able to execute the `.command.run` with no issues.
So far, every attempt to change this through the nextflow.config file, by altering how the paths are declared, has failed. :/

I would expect them to show up in the `.command.run` file as:

```
nxf_container_env() {
cat << EOF
export PYTHONPATH="\$baseDir/templates:\$PYTHONPATH"
export PATH="/home/ines/temp/flowcraft_test/v19/bin:\$baseDir/templates:\$PATH"
EOF
}
```
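To make the difference between the two variants concrete, here is a minimal sketch that reproduces it outside of Nextflow. The function names and `DEMO_VAR` are hypothetical stand-ins, and the sketch assumes the heredoc output ends up being evaluated by a shell; it does not claim to mirror exactly how the real wrapper consumes `nxf_container_env`.

```shell
# Hypothetical stand-in for the 19.07.0 output: quotes are backslash-escaped.
env_escaped() {
cat << EOF
export DEMO_VAR=\"/opt/tools/bin:/usr/bin\"
EOF
}

# Hypothetical stand-in for the expected output: plain quotes.
env_plain() {
cat << EOF
export DEMO_VAR="/opt/tools/bin:/usr/bin"
EOF
}

# Evaluate each variant in a subshell and capture the resulting value.
escaped_value=$(eval "$(env_escaped)"; printf '%s' "$DEMO_VAR")
plain_value=$(eval "$(env_plain)"; printf '%s' "$DEMO_VAR")

printf 'escaped: %s\n' "$escaped_value"
printf 'plain:   %s\n' "$plain_value"
```

In an unquoted heredoc the `\"` sequence is emitted as-is, so the evaluated assignment stores the quote characters as part of the value in the first case, while in the second case the quotes only delimit it.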

Steps to reproduce the problem

I've created a simple one-component (fastqc) nextflow pipeline for testing purposes. It is available here: https://github.com/cimendes/nextflow_issue
Test data is not included

Program output

from the .command.log:

/home/ines/temp/flowcraft_test/v19/work/78/2250444327559d6f2b9e8e8c8df4db/.command.run: line 186: uname: command not found
/home/ines/temp/flowcraft_test/v19/work/78/2250444327559d6f2b9e8e8c8df4db/.command.run: line 138: grep: command not found
/home/ines/temp/flowcraft_test/v19/work/78/2250444327559d6f2b9e8e8c8df4db/.command.run: line 139: grep: command not found
/home/ines/temp/flowcraft_test/v19/work/78/2250444327559d6f2b9e8e8c8df4db/.command.run: line 200: date: command not found
Command 'ps' required by nextflow to collect task metrics cannot be found

Environment

  • Nextflow version: 19.07.0.5106
  • Java version: openjdk version "10.0.2" 2018-07-17
  • Operating system: Linux

Additional context

None at the moment

All 15 comments

I can reproduce the problem in my own pipeline.
By removing the env block from the config, it works again.

@cimendes Nice one! Out of curiosity, how did you trace it to that function?

With a lot of tears, @huguesfontenelle ! ;)
I had environments with different versions of nextflow and ran the same pipeline in each. Then I compared the .command.run files from those runs, taking things out and adding things (sometimes out of desperation) to see if it would run. It ran when I removed the $(nxf_container_env) from the docker command, so that clued me in to the "source" of the problem.

It looks like a side effect of #1146

Digging into this, but I'm not able to replicate the issue. I get the quoted paths as mentioned, but it's not causing any problem.

Looking more closely at your error report, it seems strange that there is an unresolved $baseDir variable in the wrapper. Are you sure you are not escaping that variable in the nextflow config file?

Hello! What do you mean? When you run the test pipeline you don't have the same issue? Is it the same nextflow version I mentioned? I don't have this problem when I downgrade nextflow. The $baseDir variable is not escaped in the config file, no. :/

Hi, I have the same type of problem with this kind of variable: OMP_NUM_THREADS=1. When I downgrade nextflow, the variable is taken into account, whereas with nextflow 19.07 it is not usable. In the .command.run the variable looks like OMP_NUM_THREADS=\"1\"

Hi,

In bash, having literal " characters surrounding the value of the PATH environment variable results in an invalid PATH.

I haven't gone too deep into it (it seems that some paths, like /bin, are always being searched), but you can reproduce it like this:

t0rrant@marvin:~$ echo $PATH
/home/t0rrant/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
t0rrant@marvin:~$ ls bin 
terraform
t0rrant@marvin:~$ which terraform
/home/t0rrant/bin/terraform
t0rrant@marvin:~$ PATH=\"$PATH\"
t0rrant@marvin:~$ echo $PATH
"/home/t0rrant/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games"
t0rrant@marvin:~$ which terraform
t0rrant@marvin:~$ PATH=/home/t0rrant/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
t0rrant@marvin:~$ which terraform
/home/t0rrant/bin/terraform
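The observation that some directories still seem to be searched can be checked with a small, self-contained sketch. Everything here (the scratch directories, the `tool_*` scripts, and the `lookup` helper) is hypothetical and only serves the demonstration. Because PATH is split on `:`, literal quotes around the whole value attach only to the first and last entries.

```shell
# Hypothetical setup: three scratch directories, each holding one tiny tool.
work=$(mktemp -d)
for name in first middle last; do
  mkdir "$work/$name"
  printf '#!/bin/sh\nexit 0\n' > "$work/$name/tool_$name"
  chmod +x "$work/$name/tool_$name"
done

# Report whether a command resolves under a given PATH.
# The subshell keeps the caller's own PATH untouched.
lookup() {
  (
    PATH=$1
    if command -v "$2" >/dev/null 2>&1; then echo found; else echo missing; fi
  )
}

good_path="$work/first:$work/middle:$work/last"
quoted_path="\"$good_path\""   # same value wrapped in literal quotes

report=""
for name in first middle last; do
  report="${report}${name}: clean=$(lookup "$good_path" "tool_$name") quoted=$(lookup "$quoted_path" "tool_$name")
"
done
printf '%s' "$report"
rm -rf "$work"
```

With the quoted value, the first entry starts with `"` and the last one ends with `"`, so neither names an existing directory, while the middle entry is still valid. That matches the session above: `/home/t0rrant/bin` is the first entry, so `terraform` disappears, but directories in the middle of the list keep working.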

@rickerp suggested an idea for the #1146 issue: instead of surrounding all variable values with \", surround only those that contain one or more spaces.

This could be extended to any value containing one or more special characters that always need quoting.
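As a rough illustration of that idea, the sketch below quotes a value only when it contains a character outside a conservative "safe" set. `render_assignment` and the chosen character set are assumptions made for this example; this is not the patch that was eventually applied to Nextflow. Note that it treats values as literals, so a value such as `$baseDir/templates:$PATH` would be single-quoted and not expanded.

```shell
# Hypothetical helper: emit NAME=VALUE, quoting only when the value needs it.
render_assignment() {
  name=$1
  value=$2
  case $value in
    *[!A-Za-z0-9_./:@%+=,-]*)
      # Value has a character outside the safe set: wrap it in single
      # quotes, rewriting each embedded ' as '\'' so it survives evaluation.
      escaped=$(printf '%s' "$value" | sed "s/'/'\\\\''/g")
      printf "%s='%s'\n" "$name" "$escaped"
      ;;
    *)
      # Only safe characters (or empty): no quoting required.
      printf '%s=%s\n' "$name" "$value"
      ;;
  esac
}

render_assignment OMP_NUM_THREADS 1
render_assignment MYVAR 'hi github'
```

Evaluating the emitted line in a shell yields the original value in both cases, without any quote characters leaking into it.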

I'm also running into this issue.

Escaped quotes (and others) are stored in the environment variable.

$ MYVAR="hi github" env | grep MYVAR
MYVAR=hi github
$ MYVAR=hi\ github env | grep MYVAR
MYVAR=hi github
$ MYVAR=\"hi\ github\" env | grep MYVAR
MYVAR="hi github"

If this variable is used in bash, it may be fine, depending on if you quote it.
However, other tools don't do any quote escaping - I'm using a python script right now that simply does open(os.environ.get("MY_CONFIG_FILE")) - and the difference between trying open('file.json') and open('"file.json"') matters quite a bit. I'm sure there are other places in many tools that expect normal, un-quoted environment variables.
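For tools that read the variable verbatim, one defensive option is to strip a surrounding pair of literal quotes before the value is used. The sketch below is only a stopgap under that assumption; `strip_outer_quotes` is a hypothetical helper, not something Nextflow provides.

```shell
# Hypothetical helper: drop one pair of surrounding double quotes,
# but only when the value both starts and ends with one.
strip_outer_quotes() {
  value=$1
  case $value in
    \"*\")
      value=${value#\"}
      value=${value%\"}
      ;;
  esac
  printf '%s\n' "$value"
}

# Simulate the broken environment, then clean the value up.
MY_CONFIG_FILE='"file.json"'
MY_CONFIG_FILE=$(strip_outer_quotes "$MY_CONFIG_FILE")
export MY_CONFIG_FILE
printf '%s\n' "$MY_CONFIG_FILE"
```

Values that were never wrapped in quotes pass through unchanged, so the helper is safe to apply whether or not the affected Nextflow version is in use.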

Edit: In the meantime, I'm doing:

input:
val CONFIG_FILE from 'config.json'

script:
"""
export "CONFIG_FILE=${CONFIG_FILE}"
# rest of script...
"""

I've finally managed to fix this issue. As mentioned, this was introduced by #1146. The problem was that the environment within the container was not resolved properly because of the \ escaping the quotes that delimit the variable value. To make it even trickier, it was happening only with a specific configuration (i.e. when enabling execution metrics).

I've pushed a patch that solves the problem. Thanks for reporting the problem and the troubleshooting.

Good job 👍

Thanks @pditommaso !! :)

Well done! =)

Just flagging a potential gotcha for anyone else that ends up here searching this error:

Command 'ps' required by nextflow to collect task metrics cannot be found

Seems pretty obvious, but make sure the ps cmd is available in the base image you want to collect metrics for! 😅

Just encountered this error as well. I don't know if this deserves to be a separate issue. I think it would be nice if these metrics didn't depend on whether ps is installed...

@toniher The bug above doesn't have to do with ps being installed or not, despite the error message saying that it did.
It had to do with "environment within the container was not resolved properly because of the \ escaping the quote variable value delimiter" (as reported in the above comment).

ps is required for all processes, whether or not you enable the metrics, as it is a core requirement for nextflow to work.
If your image doesn't have ps, you must install it (with apt-get install -y procps for example)
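A quick way to tell a genuinely missing command apart from a broken PATH is to check for the tools directly inside the image. The sketch below is a minimal example; `missing_tools` is a hypothetical helper, and the tool list is simply the set of commands named in the .command.log excerpt of this report.

```shell
# Hypothetical helper: print the name of every listed command that
# cannot be resolved on the current PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# The commands named in the .command.log excerpt of this report.
missing=$(missing_tools ps uname grep date)
if [ -n "$missing" ]; then
  # Unquoted on purpose: prints one "missing:" line per tool name.
  printf 'missing: %s\n' $missing
else
  echo "all required commands found"
fi
```

Running this inside the container with a clean environment shows whether the image itself lacks a tool (in which case installing a package such as procps helps) or whether the commands are present and the lookup only fails once the wrapper's environment is applied.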
