Dvc: Segfault on `dvc status`

Created on 10 Jun 2020  ·  5 Comments  ·  Source: iterative/dvc

Bug Report

Please provide information about your setup

Output of dvc version:

$ dvc version
DVC version: 1.0.0a10
Python version: 3.7.3
Platform: Linux-4.4.0-1107-aws-x86_64-with-Ubuntu-16.04-xenial
Binary: False
Package: pip
Supported remotes: http, https, s3
Cache: reflink - not supported, hardlink - supported, symlink - supported
Filesystem type (cache directory): ('ext4', '/dev/xvda1')
Repo: dvc, git
Filesystem type (workspace): ('ext4', '/dev/xvda1')

Additional Information (if any):

If applicable, please also provide the --verbose output of the command, e.g. `dvc add --verbose`.

$ dvc status --verbose > dvc.log
Segmentation fault (core dumped)

https://gist.github.com/colllin/7439222da069c36d6501671cae63ea28

awaiting response

All 5 comments

Hi @colllin ! Anything special about the files/dirs you have? Could you try to run status with specific targets to possibly pinpoint specific artifact that is causing this?
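One way to act on this suggestion is to run the status command once per target and record which invocation is killed by a signal. The following is a minimal sketch, not something posted in the thread. It assumes that the targets of interest are the `*.dvc` files in the repository and that `dvc` is on `PATH`; the helper names `find_targets`, `describe` and `status_per_target` are hypothetical and not part of DVC.

```python
import shutil
import signal
import subprocess
from pathlib import Path


def find_targets(root="."):
    """Collect *.dvc files under root (assumption: these are the targets to check)."""
    return sorted(str(p) for p in Path(root).rglob("*.dvc") if p.is_file())


def describe(returncode):
    """Translate a subprocess return code into a short label.

    On POSIX, a negative return code -N means the child was killed by signal N.
    """
    if returncode < 0:
        try:
            return "killed by " + signal.Signals(-returncode).name
        except ValueError:
            return f"killed by signal {-returncode}"
    return "ok" if returncode == 0 else f"exit code {returncode}"


def status_per_target(targets, base_cmd=("dvc", "status")):
    """Run the status command once per target and record how each run ended."""
    results = {}
    for target in targets:
        proc = subprocess.run(
            [*base_cmd, target],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        results[target] = describe(proc.returncode)
    return results


if __name__ == "__main__":
    if shutil.which("dvc") is None:
        print("dvc not found on PATH")
    else:
        for target, outcome in status_per_target(find_targets()).items():
            print(f"{target}: {outcome}")
```

A target reported as killed by SIGSEGV would be the one to inspect first. If every target crashes, the problem is more likely in the environment than in a particular artifact.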

Also, are you running dvc in a virtualenv? If so, I would re-create it from scratch, just in case, as we've seen virtualenvs suddenly and violently crash after a Python version upgrade.

This is not specific to 1.0.0a*. I was seeing it on 0.94.0 and only just upgraded, hoping that a newer version might fix it. I think I did the same thing in the past (I was on a version before 0.94.0 and upgraded hoping it would fix it).

If it's failing on a particular file, maybe I can help narrow it down by moving that out of the way and trying again. But I need help figuring out which file it's failing on. I already tried deleting a few log files based on error messages I was seeing, but I'm not sure they were the source of the issue.

@efiop I don't think my files are special. There are tensorboard logs, model checkpoints, a csv and an html file. My best guess is that it's failing on a large log file. Possibly related to logging images?

My venv is using pipenv so it's strictly tied to the python version I created it with... But I have no problem recreating it. I'll try it now.

Interesting... it looks like I did upgrade my python3 version behind the scenes at some point... it's now creating the venv tied to 3.7.7. Give me a few mins to see what happens after rebuilding & running dvc status.
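A version drift like this can be checked before rebuilding. The sketch below is an illustration under assumptions, not a step taken in the thread: it assumes the environment contains a `pyvenv.cfg` file with a `version` key (as written by the standard-library `venv` module) or a `version_info` key (as some virtualenv releases write). Older virtualenv releases may not create this file at all. The function names are hypothetical.

```python
import sys
from pathlib import Path


def parse_pyvenv_cfg(text):
    """Parse the simple 'key = value' lines of a pyvenv.cfg file into a dict."""
    cfg = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            cfg[key.strip().lower()] = value.strip()
    return cfg


def recorded_version(cfg):
    """Return the version tuple recorded when the venv was created, or None."""
    raw = cfg.get("version") or cfg.get("version_info")
    if not raw:
        return None
    try:
        # Keep at most major.minor.micro; drops suffixes such as ".final.0".
        return tuple(int(part) for part in raw.split(".")[:3])
    except ValueError:
        return None


def venv_is_stale(cfg_text, running=None):
    """True if the recorded version differs from the interpreter now running."""
    running = tuple(running or sys.version_info[:3])
    recorded = recorded_version(parse_pyvenv_cfg(cfg_text))
    return recorded is not None and recorded != running[: len(recorded)]


if __name__ == "__main__":
    # Run this with the environment's own interpreter so sys.prefix points at it.
    cfg_path = Path(sys.prefix) / "pyvenv.cfg"
    if cfg_path.is_file():
        print("stale venv:", venv_is_stale(cfg_path.read_text()))
    else:
        print("no pyvenv.cfg under", sys.prefix)
```

For the situation described here, an environment recorded as 3.7.3 and now running on 3.7.7 would be reported as stale, which is a hint to recreate it.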

> My venv is using pipenv so it's strictly tied to the python version I created it with... But I have no problem recreating it. I'll try it now.

Please do! 🙏

In the logs, the last file is `artifacts/train/logs/version_0/tf/events.out.tfevents.1589480533.ip-172-31-36-85.96554.0`. How big is it?

Also, did it generate a core dump? If you are familiar with analyzing core dumps, take a look yourself, as it should pinpoint what specifically went wrong. Otherwise, please share the core dump with us.
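A lighter-weight complement to a core dump is Python's built-in `faulthandler`, which prints the Python-level traceback to stderr when the interpreter receives a fatal signal such as SIGSEGV. The sketch below was not suggested in the thread. It assumes DVC is installed in the current interpreter and can be started with `python -m dvc`; `run_with_faulthandler` is a hypothetical helper name.

```python
import subprocess
import sys


def run_with_faulthandler(args, python=sys.executable):
    """Run `python -X faulthandler <args...>` and capture its stderr.

    With faulthandler enabled, CPython dumps the Python traceback to stderr
    on SIGSEGV, SIGFPE, SIGABRT, SIGBUS and SIGILL.
    """
    cmd = [python, "-X", "faulthandler", *args]
    proc = subprocess.run(
        cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        text=True,
    )
    return proc.returncode, proc.stderr


if __name__ == "__main__":
    # Assumption: DVC is importable here and runnable as a module.
    code, err = run_with_faulthandler(["-m", "dvc", "status", "--verbose"])
    print("return code:", code)
    print(err[-2000:])  # the traceback, if any, is at the end of stderr
```

The traceback only shows which Python frame was active when the crash happened. A fault inside a compiled extension still needs the core dump to see the native stack.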

This is pretty unusual, so my first guess is that there is something wrong with your environment/python/libraries it uses/etc.

It worked! Rebuilding the venv. Thank you @efiop!! 🙏