Dvc: error on dvc get: GLIBC_2.18 not found. installed with fedora/centos

Created on 6 Sep 2019  ·  21 Comments  ·  Source: iterative/dvc

[hadoop@hostname ~]$ dvc get \
>     "https://x-token-auth:$GITHUB_TOKEN@$MODEL_REPO_URL" \
>     experiments/prod/model.pickle \
>     -o /mnt/repo/model.pickle
ERROR: failed to get 'experiments/prod/model.pickle' from 'https://x-token-auth:[email protected]/WPMedia/REPOURL.git' - Failed to clone repo 'https://x-token-auth:[email protected]/WPMedia/REPOURL.git' to '/tmp/tmpxfz0sxdedvc-repo': Cmd('git') failed due to: exit code(128)
  cmdline: git clone --no-single-branch -v https://x-token-auth:[email protected]/WPMedia/REPOURL.git /tmp/tmpxfz0sxdedvc-repo
  stderr: 'Cloning into '/tmp/tmpxfz0sxdedvc-repo'...
/usr/libexec/git-core/git-remote-https: /lib64/libc.so.6: version `GLIBC_2.18' not found (required by /usr/lib/dvc/libstdc++.so.6)
'

Please provide information about your setup
DVC version(i.e. dvc --version), Platform and method of installation (pip, homebrew, pkg Mac, exe (Windows), DEB(Linux), RPM(Linux))

  • version: 0.58.1
  • used fedora/centos install (https://dvc.org/rpm/dvc.repo)
  • on standard emr cluster
bug c8-full-day p0-critical

All 21 comments

it worked with pip. must be an issue with the fedora/centos install in particular

sudo yum remove dvc -y
pip install dvc

Hi @AlJohri !

Could you show echo $LD_LIBRARY_PATH output, please?

Some backstory: our rpm/deb packages are built with PyInstaller, which sets LD_LIBRARY_PATH to a custom value so that the libraries it bundles are used instead of the system ones. Whenever we run an external program from dvc, we restore the original LD_LIBRARY_PATH, and I think that is what went wrong here when we spawned git. I'm investigating right now.
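For readers following along, here is a minimal sketch of what "restoring the original LD_LIBRARY_PATH" can look like. It is not dvc's actual code. It assumes the PyInstaller bootloader behaviour of saving the pre-launch value in LD_LIBRARY_PATH_ORIG, and the helper name restore_library_path is hypothetical.

```python
def restore_library_path(env):
    """Return a copy of env with LD_LIBRARY_PATH as it was before the bundle started.

    Assumption: the PyInstaller bootloader stored the original value in
    LD_LIBRARY_PATH_ORIG. If that variable is absent, the variable was not
    set before launch, so it is removed entirely.
    """
    restored = dict(env)  # never mutate the caller's mapping
    original = restored.get("LD_LIBRARY_PATH_ORIG")
    if original is not None:
        restored["LD_LIBRARY_PATH"] = original
    else:
        restored.pop("LD_LIBRARY_PATH", None)
    return restored


# Illustrative values only: /usr/lib/dvc is the bundle directory from the error above.
bundled = {"PATH": "/usr/bin", "LD_LIBRARY_PATH": "/usr/lib/dvc"}
clean = restore_library_path(bundled)
```

The result is meant to be handed to something that replaces the child environment wholesale, such as subprocess.run(cmd, env=clean). A library that merges the mapping into its own stored environment would behave differently.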

Ok, found the cause. We forgot to restore original env in https://github.com/iterative/dvc/blob/0.58.1/dvc/external_repo.py#L58 . Will prepare a patch ASAP. Thanks for reporting!

@AlJohri 0.59.0 is out, please give it a try and let us know if it worked for you :slightly_smiling_face: Thanks for the feedback!

I'm still getting this error:

  stderr: 'Cloning into '/tmp/tmpif97r_sqdvc-repo'...
/usr/libexec/git-core/git-remote-https: /lib64/libc.so.6: version `GLIBC_2.18' not found (required by /usr/lib/dvc/libstdc++.so.6)

You can replicate by starting a vanilla EMR cluster (or perhaps an EC2 with Amazon Linux) and running this command:

sudo wget https://dvc.org/rpm/dvc.repo -O /etc/yum.repos.d/dvc.repo
sudo yum install dvc -y

Am I missing a dependency?

@AlJohri Oh, that's bad. Are you sure you are using the newest version? We'll get back to investigating this ASAP. Thank you for the feedback.

I'm not sure if I was using the latest version (I'll check soon) but I definitely installed it on a fresh cluster so I got whatever version https://dvc.org/rpm/dvc.repo defines.

@AlJohri Got it. I'll also try to reproduce in the morning with some older docker image. :slightly_smiling_face:

Sorry, other stuff got in the way. Will try to get back to this ASAP.

@AlJohri I am trying to reproduce the problem using docker and an AWS instance, but I am unable to. Could you ssh to your instance and check which Linux it is running?
Something like cat /etc/os-release
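If it helps, the details asked for in this thread (OS release, glibc version, LD_LIBRARY_PATH) can be collected in one go with a small script. This is only a sketch using the Python standard library; the function name describe_runtime is made up for illustration.

```python
import os
import platform


def describe_runtime():
    """Collect the facts relevant to a 'GLIBC_x.y not found' report."""
    info = {
        # e.g. ('glibc', '2.17'); may be ('', '') where detection is not possible
        "libc": platform.libc_ver(),
        # None means the variable is not set at all
        "ld_library_path": os.environ.get("LD_LIBRARY_PATH"),
        "os_release": None,
    }
    try:
        with open("/etc/os-release") as fobj:
            info["os_release"] = fobj.read()
    except OSError:
        # Older distributions may not ship /etc/os-release
        pass
    return info


runtime_info = describe_runtime()
for key, value in runtime_info.items():
    print(key, "=", value)
```

It should be run with the system Python rather than from inside the packaged dvc binary, since the bundle changes LD_LIBRARY_PATH for its own process.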

Was able to reproduce with the following Dockerfile:

FROM centos:6

RUN yum install wget -y
RUN wget https://dvc.org/rpm/dvc.repo -O /etc/yum.repos.d/dvc.repo
RUN yum install dvc -y
RUN dvc get https://github.com/iterative/dvc scripts/innosetup/dvc.ico

The latest amazonlinux image does not suffer from this problem. The EMR cluster does not use amazonlinux but the Amazon Linux AMI. Investigating the difference.

@pared Just to be clear, that one reproduces a similar problem, not the same one; it might just be caused by our binary being built on a newer machine than centos6. The one in the original issue is about the git binary that we are calling.

Ah, right, didn't notice, thanks @efiop

I was able to reproduce the problem on vanilla EMR cluster machine.

@AlJohri another patch on the way.

To add some context for the future:
We used fix_env to delete LD_LIBRARY_PATH from env if it was empty before PyInstaller did its magic.

The problem is that gitpython did not "use" the provided env directly, but used it to update its own git.cmd._environment.
If there was no "LD_LIBRARY_PATH" in the provided env, its value was not updated.

So in theory the fix env["LD_LIBRARY_PATH"]=None should work.
Surprisingly it did not, though env["LD_LIBRARY_PATH"]="" did.

Need to investigate why the first version does not work; it might be another bug on our side, or something wrong with gitpython.
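The difference between "use this env" and "update with this env" can be shown with plain dictionaries. This is a simplified model, not gitpython's code: the helper effective_env is hypothetical, and it assumes that keys absent from the provided env keep whatever value the running process already has. The None variant mentioned above is deliberately not modelled, since its behaviour was still unexplained.

```python
def effective_env(process_env, provided_env):
    """Model 'update' semantics: provided keys win, missing keys are inherited."""
    merged = dict(process_env)
    merged.update(provided_env)
    return merged


# Environment of the running PyInstaller bundle (illustrative values).
process_env = {"PATH": "/usr/bin", "LD_LIBRARY_PATH": "/usr/lib/dvc"}

# Deleting the key from the provided env changes nothing under update semantics:
# the bundled library path is still inherited by the child process.
leaky = effective_env(process_env, {"PATH": "/usr/bin"})

# Setting the key to an empty string explicitly overrides the bundled path.
fixed = effective_env(process_env, {"PATH": "/usr/bin", "LD_LIBRARY_PATH": ""})

print(leaky["LD_LIBRARY_PATH"])
print(repr(fixed["LD_LIBRARY_PATH"]))
```

Under this model, removing a variable is invisible to the merge, while an explicit empty value is not, which matches what was observed with the "" workaround.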

@AlJohri The patch is merged to master, and I was unable to reproduce the error after the patch. A new release should be available soon.

@AlJohri 0.61.0 is out, please feel free to upgrade :slightly_smiling_face:

Thanks!!
