Nextflow: Environment activation fails with recent versions of Conda

Created on 18 Jun 2019 · 23 Comments · Source: nextflow-io/nextflow

Bug report

Expected behavior and actual behavior

If the user tries to execute a Nextflow pipeline that contains processes which run in conda environments, the pipeline fails with:

Command wrapper:
  .command.run: line 266: activate: No such file or directory

Steps to reproduce the problem

  • Ensure that you do not already have a conda environment active.
  • Ensure that a recent version of the conda utility is installed (tested with 4.6.12 and 4.6.14)
  • Try to run any Nextflow pipeline that contains a process with a conda directive.

Environment

$ nextflow info
Version: 19.04.1 build 5072
Modified: 03-05-2019 12:29 UTC (14:29 CEST)
System: Linux 4.4.0-116-generic
Runtime: Groovy 2.5.6 on OpenJDK 64-Bit Server VM 1.8.0_212-8u212-b03-0ubuntu1.16.04.1-b03
Encoding: UTF-8 (UTF-8)

$ conda --version
conda 4.6.12

Additional context

The Anaconda project has recently been working on changing the way in which conda environments are activated and deactivated. The activate and deactivate shell scripts are no longer put in the user's path by default — only the conda utility itself is in the user's path.

Activating a conda environment is now no longer done by sourcing the activate shell script, but with conda activate.

Nextflow constructs a .command.run script that tries to activate the conda environment using source activate, which fails due to the activate script not being in the path.

It should be possible to fix by replacing source activate with conda activate.

help wanted

All 23 comments

Any idea since which version conda activate has been available? Changing it could potentially break things for users with old Conda versions.

Yes, I understand that it can be tricky. :(

According to the conda release notes, conda activate was introduced in 4.4.0 (2017-12-20):

https://conda.io/projects/conda/en/latest/release-notes.html#id149

More than one year old. Seems fair to expect it as a minimum requirement.

Thank you!

@wjv Have you successfully tested this change?

For me this does not work with conda 4.7.10. I ran conda init in my shell and there conda activate works, but when conda activate is run in Nextflow jobs conda complains about a command not found error and that init needs to be run first. Maybe this is a problem with our executor specifically though.

I just tried this on a different cluster with a different executor (SLURM) and a fresh install of conda, and I have the same problem as above.

So after some more looking into this, the problem seems to be that Conda uses Bash functions for conda activate, and these are not exported by the executor. This is not specific to Nextflow; it is a problem in any script that is submitted to an executor. In my case, even the -V option to qsub, which causes the executor to export all env vars, does not help.
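The behaviour described here can be reproduced without conda at all. The following is a minimal sketch, assuming bash is available; `my_conda` is a hypothetical stand-in for the shell function that conda defines, not a real conda name.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the shell function that conda defines.
my_conda() { echo "activated $1"; }

# A child bash process does not inherit the function by default:
# `type -t` prints nothing and returns non-zero when the name is unknown.
before="$(bash -c 'type -t my_conda' || true)"

# After an explicit export, the child process can see it.
export -f my_conda
after="$(bash -c 'type -t my_conda' || true)"

echo "before export: '${before}'"
echo "after export:  '${after}'"
```

Whether a given batch scheduler forwards exported functions to the job is a separate question; the comments in this thread suggest that at least some do not.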

The "solution" is to paste the code block that conda init adds to the .bashrc file into the work script. Adding this code to getCondaActivateSnippet() in BashWrapperBuilder.groovy works, too, as a test. How to actually implement this, I have no idea.

This sounds ugly. What is supposed to be the code block to be added?

@pditommaso I’m currently travelling and have very limited internet access. I’m therefore not able to test the change. :-(

What @Shellfishgene says is how I understand it too: Modern versions of conda rely on certain shell functions being defined, and provide conda init to add a block to your shell startup file.

The block that conda init adds to your shell startup doesn’t actually define the required functions; instead it sources a shell script that exists in the Anaconda installation which contains these functions. It may be possible to determine the location of this script programmatically and source it from within Nextflow’s execution environment…?

I’d love to help out with this, but without access to my development servers and network I can’t really do that right now. :-(

> The block that conda init adds to your shell startup doesn’t actually define the required functions; instead it sources a shell script that exists in the Anaconda installation which contains these functions. It may be possible to determine the location of this script programmatically and source it from within Nextflow’s execution environment…?

So this is, for example, what conda adds to my .bashrc, but it depends on the OS and shell, I guess.
```
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/home/user/miniconda3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/home/user/miniconda3/etc/profile.d/conda.sh" ]; then
        . "/home/user/miniconda3/etc/profile.d/conda.sh"
    else
        export PATH="/home/user/miniconda3/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<
```

After some discussion on the bioconda channel I also tried to run export -f conda and then use the -V scheduler option to export all vars, but that did not work for me either.
Nextflow could of course find this block in the user's .bashrc and add it to the work scripts, but that does not feel very elegant.
Actually, just adding source ~/.bashrc might work, but I'm not sure whether that would have other unintended consequences.

Keep in mind not everyone uses bash. My Conda initialisation is in my ~/.zshrc :)

> Keep in mind not everyone uses bash

That's not a problem, Bash is a minimal requirement for NF. I'm more worried about env initialisation. It seems that conda init is only available in recent conda versions.

> That's not a problem, Bash is a minimal requirement for NF. I'm more worried about env initialisation. It seems that conda init is only available in recent conda versions.

I don't think conda init is needed. It only checks whether it needs to modify files such as .bashrc; it only needs to be run once per user, and you need to source .bashrc after you run it. If I run it in a job script, it sees that the files are already modified and exits without doing anything. It's the code block I pasted above that modifies the env.

The code that conda injects into the .bashrc you posted above is not portable. How would NF know the Conda installation path, for example?

Yes, I know that's not an option. I guess you could parse the whole thing out of the .bashrc (or .zshrc), but that sounds terrible. I just wanted to point out that I don't think nextflow running conda init helps here. But I clearly don't know much about all this anyway...
What's also weird, at least in our case with NQSII, is that .bashrc is actually run by the scheduler, and it does indeed create the env vars that are in there. However, it does not seem to create the functions from that Conda code, which I don't understand.

Quite a mess. I'll leave this to a Conda guru to figure out how to activate an environment without the need to modify the .bashrc file.

Adding Conda docs for reference.

I think I will roll back the current change.

I guess you could have the user supply the path to conda.sh, which is the script that exports all the functions needed, which is located in /home/user/miniconda3/etc/profile.d/conda.sh, for example. Still not very nice.
Actually you can get the path from $CONDA_PREFIX, too...
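As a rough illustration of the "user supplies the path to conda.sh" idea, here is a minimal sketch. The function name `load_conda_functions` and the `CONDA_SH` variable are hypothetical and are not part of Nextflow or conda; only the `etc/profile.d/conda.sh` location comes from the thread.

```bash
#!/usr/bin/env bash
# Source conda.sh from a user-supplied location so that the shell
# functions it defines become available in the current shell.
# Usage: load_conda_functions /path/to/etc/profile.d/conda.sh
# Falls back to the hypothetical CONDA_SH variable if no argument is given.
load_conda_functions() {
    local conda_sh="${1:-${CONDA_SH:-}}"
    if [ -z "$conda_sh" ] || [ ! -f "$conda_sh" ]; then
        echo "conda.sh not found: '${conda_sh}'" >&2
        return 1
    fi
    # shellcheck disable=SC1090
    . "$conda_sh"
}
```

A job script could then call, for example, `load_conda_functions /home/user/miniconda3/etc/profile.d/conda.sh` before running `conda activate`.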

I figured out why conda complains about not being initialized properly. It's because the processes are run inside their own bash subprocess. By default, shell functions are not exported to subprocesses, so they don't know about the shell functions defined by conda (__conda_activate, __conda_reactivate, etc). There are two options. The first option is to explicitly export the shell functions defined by conda. I do this by putting the snippet below into a shell script that gets executed on activation of the base environment for conda
~~~bash
# content of ~/miniconda3/etc/conda/activate.d/export_conda_functions.sh
export -f conda
export -f __conda_activate
export -f __conda_reactivate
export -f __conda_hashr
~~~
Another option is to evaluate the output of conda shell.bash hook inside nxf_main, before calling conda activate .... conda shell.bash hook produces the shell code to define the shell functions used by conda. IMO, this is the more robust solution because the functions defined by conda might change in the future. I found this solution in a related issue here
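The hook-based option could look roughly like the sketch below. The function name `activate_with_hook` is hypothetical and this is not Nextflow's actual implementation; it also assumes that a `conda` command is reachable from the job's environment, which may not hold on every setup.

```bash
#!/usr/bin/env bash
# Define conda's shell functions in the current shell, then activate
# the requested environment (by name or by path).
activate_with_hook() {
    local env_name="$1"
    if ! command -v conda >/dev/null 2>&1; then
        echo "conda not found on PATH" >&2
        return 127
    fi
    # `conda shell.bash hook` prints shell code that defines the
    # functions `conda activate` relies on; eval it in this shell.
    eval "$(conda shell.bash hook)"
    conda activate "$env_name"
}
```

Because the function definitions come from conda itself at run time, this does not depend on the exact set of internal function names.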

The activate script still exists in the same directory where the actual conda binary is located and can still be used. This is how it's done in Snakemake.

The rough steps:

  • run conda info --json. The json contains an entry conda_prefix that points to the conda installation. [[ref](https://bitbucket.org/snakemake/snakemake/src/7a9299de29912544fa07d4a59004a89630b9f2b5/snakemake/conda.py#lines-303)]
  • The activate binary is at ${conda_prefix}/bin/activate [[ref](https://bitbucket.org/snakemake/snakemake/src/7a9299de29912544fa07d4a59004a89630b9f2b5/snakemake/conda.py#lines-363)]
  • Use the activate binary in the .run script. [[ref](https://bitbucket.org/snakemake/snakemake/src/7a9299de29912544fa07d4a59004a89630b9f2b5/snakemake/conda.py#lines-364)]
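The rough steps above can be sketched in bash as follows. The function names are hypothetical, and the JSON is parsed with sed purely to keep the sketch dependency-free; that assumes the prefix path contains no double quotes or escape sequences, so a real JSON parser would be more robust.

```bash
#!/usr/bin/env bash
# Print the "conda_prefix" value from `conda info --json` output read on stdin.
conda_prefix_from_json() {
    sed -n 's/.*"conda_prefix": *"\([^"]*\)".*/\1/p' | head -n 1
}

# Activate an environment using the activate script that lives
# in the bin directory of the conda installation.
activate_via_prefix() {
    local env_name="$1" prefix
    prefix="$(conda info --json | conda_prefix_from_json)"
    if [ -z "$prefix" ]; then
        echo "could not determine conda_prefix" >&2
        return 1
    fi
    # shellcheck disable=SC1090,SC1091
    source "${prefix}/bin/activate" "$env_name"
}
```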

I started working on a PR. I'm not so familiar with the code-base, though, so maybe the maintainers can finalize it, now that 'conda pieces' are provided.

In the meanwhile, a workaround is to symlink the activate binary to your PATH, e.g.

ln -s /apps/miniconda/bin/activate /usr/bin/activate

Hi Paolo and Nextflow community

I am having a similar issue - Nextflow jobs fail when you do

"conda activate"

and give message

"you have to do conda init <shell>"

I saw something similar related to conda on the Conda GH issue tracker

Just wanted to check if this issue has been resolved.

Thanks,

Hello,

I have the same problem. The activation is failing with .command.run: line 260: activate: No such file or directory. As @haqle314 mentioned, I think it is because the process is running in a sub-process.

Modifying this line to conda activate does not resolve the issue for me.

The only way I managed to resolve it was by replacing the line:
```
source activate /home/ubuntu/anaconda3/envs/map-gen
```

with

```
source ~/anaconda3/etc/profile.d/conda.sh
conda activate map-gen
```

It worked like a charm. However, I do not know how to change this in nxf_main() so that, for the other processes I have (>10), I do not need to go manually into the .command.run after it fails and replace the line.

My conda version is 4.7.12.

Thanks!
