I have a command that relies on cupy. When I run it with dvc run ..., it raises this error and fails:
/tmp/_MEIeW2y4B/libstdc++.so.6: version `GLIBCXX_3.4.20' not found
If I run it without dvc run, no error is raised. I think this is because dvc run uses p = subprocess.Popen(self.cmd, cwd=self.cwd, shell=True), and that is messing up the environment variables. Would passing the environment variables in explicitly solve this problem?
import os, subprocess
env = os.environ  # forward the parent's environment unchanged
proc = subprocess.Popen(args, env=env)  # args being the stage command
Hi @yukw777 !
Thank you for the analysis, great catch! We should definitely not clean the env before running the command. I will prepare a patch shortly.
Hm, looking into it more closely, it seems like I am not able to reproduce this: the env seems to be preserved across dvc run. Could you please run these commands and check whether they output the same env:
$ printenv
$ dvc run -f printenv.dvc printenv
and also, just to make sure, could you please try these as well:
$ echo $0
$ dvc run -f sh.dvc 'echo $0'
You can then safely remove sh.dvc and printenv.dvc.
Hmm, the env vars might be a red herring...
Here are the minimal repro steps:
# inside a git repo with dvc initialized
virtualenv venv -p python3
source venv/bin/activate
# now we're in the virtualenv
pip install cupy
python -c "import cupy" # this succeeds
dvc run 'python -c "import cupy"' . # this fails
Error message:
Using 'Dvcfile' as a stage file
Reproducing 'Dvcfile':
python -c "import cupy"
Traceback (most recent call last):
File "/awsnas/peter/test/venv/lib/python3.6/site-packages/cupy/__init__.py", line 11, in <module>
from cupy import core # NOQA
File "/awsnas/peter/test/venv/lib/python3.6/site-packages/cupy/core/__init__.py", line 1, in <module>
from cupy.core import core # NOQA
File "cupy/core/core.pyx", line 1, in init cupy.core.core
File "/awsnas/peter/test/venv/lib/python3.6/site-packages/cupy/cuda/__init__.py", line 4, in <module>
from cupy.cuda import compiler # NOQA
File "/awsnas/peter/test/venv/lib/python3.6/site-packages/cupy/cuda/compiler.py", line 12, in <module>
from cupy.cuda import function
File "cupy/cuda/memory.pxd", line 7, in init cupy.cuda.function
ImportError: /tmp/_MEIUoawzf/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by /awsnas/peter/test/venv/lib/python3.6/site-packages/cupy/cuda/memory.cpython-36m-x86_64-linux-gnu.so)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/awsnas/peter/test/venv/lib/python3.6/site-packages/cupy/__init__.py", line 32, in <module>
six.reraise(ImportError, ImportError(msg), exc_info[2])
File "/awsnas/peter/test/venv/lib/python3.6/site-packages/six.py", line 692, in reraise
raise value.with_traceback(tb)
File "/awsnas/peter/test/venv/lib/python3.6/site-packages/cupy/__init__.py", line 11, in <module>
from cupy import core # NOQA
File "/awsnas/peter/test/venv/lib/python3.6/site-packages/cupy/core/__init__.py", line 1, in <module>
from cupy.core import core # NOQA
File "cupy/core/core.pyx", line 1, in init cupy.core.core
File "/awsnas/peter/test/venv/lib/python3.6/site-packages/cupy/cuda/__init__.py", line 4, in <module>
from cupy.cuda import compiler # NOQA
File "/awsnas/peter/test/venv/lib/python3.6/site-packages/cupy/cuda/compiler.py", line 12, in <module>
from cupy.cuda import function
File "cupy/cuda/memory.pxd", line 7, in init cupy.cuda.function
ImportError: CuPy is not correctly installed.
If you are using wheel distribution (cupy-cudaXX), make sure that the version of CuPy you installed matches with the version of CUDA on your host.
Also, confirm that only one CuPy package is installed:
$ pip freeze
If you are building CuPy from source, please check your environment, uninstall CuPy and reinstall it with:
$ pip install cupy --no-cache-dir -vvvv
Check the Installation Guide for details:
https://docs-cupy.chainer.org/en/latest/install.html
original error: /tmp/_MEIUoawzf/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by /awsnas/peter/test/venv/lib/python3.6/site-packages/cupy/cuda/memory.cpython-36m-x86_64-linux-gnu.so)
Failed to run command: Stage 'Dvcfile' cmd python -c "import cupy" failed
Actually the env vars are different:
$ printenv
...
LD_LIBRARY_PATH=/usr/local/cuda/lib64
...
$ dvc run -f printenv.dvc printenv
...
LD_LIBRARY_PATH=/tmp/_MEI4nThRR:/usr/local/cuda/lib64
LD_LIBRARY_PATH_ORIG=/usr/local/cuda/lib64
...
The shells are different too.
$ echo $0
-bash
$ dvc run -f sh.dvc 'echo $0'
Reproducing 'sh.dvc':
echo $0
/bin/sh
I still think this is the culprit.
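In case it's useful, here's a rough sketch (purely illustrative, not dvc's actual code) of spawning the child with LD_LIBRARY_PATH restored from the LD_LIBRARY_PATH_ORIG entry shown in the printenv output above:
import os
import subprocess

env = os.environ.copy()
# Put back the pre-launch value recorded in LD_LIBRARY_PATH_ORIG so the child
# doesn't pick up the bundled /tmp/_MEI... libraries.
if "LD_LIBRARY_PATH_ORIG" in env:
    env["LD_LIBRARY_PATH"] = env.pop("LD_LIBRARY_PATH_ORIG")
proc = subprocess.Popen('python -c "import cupy"', shell=True, env=env)
proc.wait()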
Yes, that was my suspicion: you are not using the default shell, and that is why your env is different. That makes sense now. Specifying env explicitly should solve this issue. The fix will be included in 0.9.8, which should be released in a week or so.
Thanks,
Ruslan
Btw, as a workaround, could you make sure that your default shell matches the one you are using?
I.e. it looks like you are using bash, but for some reason the default shell for your user is /bin/sh. Could you try running chsh -s $(which bash) $USER and then check that echo $0 from the previous example shows bash in both cases? That should solve your issue while I look into solving it on the dvc side in the meantime.
EDIT: I'm wrong, /bin/sh is the default shell for Popen(shell=True) on Unix. Looking into solving this...
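For completeness, a minimal sketch showing that default:
import subprocess
# Popen(shell=True) on Unix always runs the command via /bin/sh -c <cmd>,
# regardless of the user's login shell:
subprocess.Popen("echo $0", shell=True).wait()  # prints: /bin/sh
# An interactive login bash, by contrast, reports -bash for $0.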
Hmm, the env vars are still different, so that didn't work :/
Another interesting thing I found:
$ dvc run echo $0
Using 'Dvcfile' as a stage file
Reproducing 'Dvcfile':
echo -bash
-bash
$ dvc run 'echo $0'
Using 'Dvcfile' as a stage file
Reproducing 'Dvcfile':
echo $0
/bin/sh
It probably has to do with how a command given as a single string is run vs. one given as an array of strings.
@yukw777 thank you for trying that out! It turns out I was wrong (see edit above). We should use the executable arg for Popen to specify the same shell the user is running when executing the command. Preparing a patch right now.
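Roughly, the idea is something like this (just a sketch of the approach, not the actual patch; detecting the user's shell via the SHELL env var is only one possible way to do it):
import os
import subprocess

cmd = 'echo $0'
# Keep shell=True, but point Popen at the shell the user is running
# instead of the /bin/sh default.
user_shell = os.environ.get("SHELL", "/bin/sh")
subprocess.Popen(cmd, shell=True, executable=user_shell).wait()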
Another interesting thing I found:
Yep, that is caused by the single quotes around the command itself. In the first case dvc receives the command echo -bash, because $0 is evaluated by your shell before it is passed to dvc; in the second case dvc receives echo $0, because the shell passes the quoted command along as a literal string.
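A throwaway script that just prints its arguments (argv.py is a made-up name here) makes the difference visible, assuming an interactive bash session where $0 is -bash:
import sys
# Pretend this script is dvc and show the command it was handed:
#   python argv.py echo $0    ->  ['echo', '-bash']  (bash expanded $0 first)
#   python argv.py 'echo $0'  ->  ['echo $0']        (quotes kept it literal)
print(sys.argv[1:])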