UPDATE: See description in https://github.com/iterative/dvc/pull/2160 instead.
OLD: Some introductory descriptions for DVC packages (from Discord):
You have a project named `awesome_project`, which has `awesome_data` in it
`pkg` will allow you to "import" the project
`dvc pkg import awesome_project://awesome_data` (syntax may be incorrect)
- Any DVC repo is a _package_. Outputs inside packages can be called _artifacts_ (that we'll also be able to work with)
- There will be a set of commands to import/export these _packages_ across different repositories and from a repository to a file system (just a data artifact)
- Basic use cases include things like: dataset registry (reuse a single dataset in multiple projects, while having a single point of responsibility for the dataset itself), easy mechanism to pull a data artifact w/o doing `git clone` + `dvc pull data.dvc`, etc
- In the future (no need to focus on this for now) any set of DVC files will be considered a package, thus making pipelines reusable
Main references found in core repo:
Also, funny: `dvc pkg` doesn't have a `-h` option. Is that normal? Opened https://github.com/iterative/dvc/issues/2065
@jorgeorpinel Sorry for the confusion, the PR that this relates to is https://github.com/iterative/dvc/pull/2012 , which is not merged yet. I've just created this task in advance.
Oh. @shcheklein asked me to give this one priority. Are you saying I should wait for that PR to close first, Ruslan? Makes sense to me 😃 although I already opened a WIP PR for this 😋 (#388)
@jorgeorpinel this is a priority :) we will see if we merge it before pkg code is merged or not. It will also take some time to get the docs for it done end-to-end. It's a big feature.
`import` was moved to `import-url`.