It would be nice to have some options (or even a tutorial) on how to inject/map DVC pipelines onto popular frameworks, so that ideally there won't be any need to work with the "production" code directly.
The main problem here, in my humble opinion, is mapping and unifying parameters and paths.
It is not very clear how to achieve it in an elegant way in the current state of dvc. We will have to take a closer look at it to see if there are any additions for dvc that will help with the integration. Going to investigate it pretty soon.
Thank you for the feedback!
True, and I am not terribly convinced this SHOULD be part of DVC per se (you know better though), but I would love to see/think/talk out loud about it.
I do intend to work on this, but it would be done outside of DVC.
DVC is pretty agnostic about how things get executed. I imagine it wouldn't be too complex to convert a DVC DAG into an Airflow DAG of bash operators that just do `dvc repro` on output files. I'm guessing this would be a DagBuilder function that doesn't quite fit in a PR for either project.
I have my hands full at the moment improving my Airflow AWS stack, but perhaps I'll have some time by the end of the year.
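To make the idea above concrete, here is a minimal sketch of the conversion step, not an actual DVC or Airflow integration. It assumes the pipeline has already been read into a plain dict mapping each output file to the outputs it depends on. The `STAGE_DEPS` contents, the `task_id_for` naming scheme, and the `build_tasks` helper are all hypothetical. It uses only the standard library (Python 3.9+ for `graphlib`) and stops at producing task descriptions.

```python
import re
import shlex
from graphlib import TopologicalSorter

# Hypothetical pipeline: output file -> output files it depends on.
# In practice this mapping would have to be extracted from the DVC stage files.
STAGE_DEPS = {
    "data/prepared.csv": [],
    "data/features.csv": ["data/prepared.csv"],
    "models/model.pkl": ["data/features.csv"],
    "metrics.json": ["models/model.pkl", "data/features.csv"],
}


def task_id_for(target):
    """Turn an output path into an identifier-like task id (assumed naming scheme)."""
    return "repro_" + re.sub(r"\W+", "_", target)


def build_tasks(stage_deps):
    """Return (task_id, bash_command, upstream_task_ids) tuples, dependencies first."""
    tasks = []
    # TopologicalSorter expects node -> predecessors, which is the shape of stage_deps.
    for target in TopologicalSorter(stage_deps).static_order():
        command = "dvc repro " + shlex.quote(target)
        upstream = sorted(task_id_for(dep) for dep in stage_deps.get(target, []))
        tasks.append((task_id_for(target), command, upstream))
    return tasks


if __name__ == "__main__":
    for task_id, command, upstream in build_tasks(STAGE_DEPS):
        print(task_id, "|", command, "| upstream:", upstream)
```

Each tuple could then be turned into an Airflow `BashOperator` with the given `task_id` and `bash_command`, with the upstream ids wired as task dependencies. That last step is left out here so the sketch runs without Airflow installed.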
I like DVC and I think this could be a very integral feature for DVC. I vote for it.
It seems like the issue might not be as straightforward as one would assume at first, because `dvc repro` or `dvc run` would keep re-running previous parts of the code for each step if the steps are not in the same environment or new commits are not made every time. This is also being discussed in #2212.