I'm trying to create a Kedro plugin, and it took me a long time to figure out that a plugin is a _separate_ package that exists outside of any Kedro project. I kept modifying `setup.py` within my Kedro project, following the JSON example in the documentation, only to find that the `kedro to_json` command didn't do anything. In retrospect, it makes sense that Kedro plugins are separate Python packages, but this is not obvious to new users from the documentation.
New users would benefit from a more complete Kedro plugin example.
Add more detail to the Kedro plugin tutorial. Also add a complete plugin example as its own repository, similar to the kedro-examples repo.
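For illustration, here is a minimal sketch of what "a separate package" could look like, assuming the entry-point group `kedro.project_commands` and the `kedrojson` names from the documentation's JSON example. The directory layout and the `describe` helper are hypothetical and only show how the declaration in the plugin's own `setup.py` points at a module inside the plugin package rather than inside a Kedro project.

```python
from importlib.metadata import EntryPoint

# Assumed layout of the standalone plugin package (names are illustrative):
#
#   kedrojson/            <- its own repository, NOT inside a Kedro project
#       setup.py          <- passes ENTRY_POINTS below to setup(entry_points=...)
#       kedrojson/
#           __init__.py
#           plugin.py     <- assumed to define the click group `commands`

# The entry_points argument that the plugin's own setup.py would declare.
ENTRY_POINTS = {
    "kedro.project_commands": ["kedrojson = kedrojson.plugin:commands"],
}


def describe(entry_points):
    """Return (group, name, module, attribute) for each declared entry point."""
    rows = []
    for group, specs in entry_points.items():
        for spec in specs:
            # Each spec has the form "name = package.module:attribute".
            name, value = (part.strip() for part in spec.split("=", 1))
            ep = EntryPoint(name=name, value=value, group=group)
            rows.append((ep.group, ep.name, ep.module, ep.attr))
    return rows


for row in describe(ENTRY_POINTS):
    print(row)
```

Under these assumptions, the command only becomes visible once the plugin package itself is installed (for example with `pip install .` from the plugin's directory) into the same environment as Kedro.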
Hi @benjaminjack! Thank you so much for flagging this. We'll shortly be releasing our Kedro-Docker and Kedro-Viz plugins as examples of how Kedro plugins are built, but you have flagged an important concept that our documentation does not make clear. We'll get to work on a fix for this.
@yetudada good to know, thanks for the update. Do you have any information on when we can expect Kedro-Airflow as well? I'm working through a prototype and would love to be able to push something that "feels" more productionized than essentially running a script (kedro run) on a remote server.
@njgerner Kedro-Airflow will be out soon, perhaps next week? I'll keep you up to date on this. We're just running through our final checks of the plugin. In the meantime, you can check out Kedro-Docker, which we released today: https://github.com/quantumblacklabs/kedro-docker
Our teams use the Docker functionality to schedule batch-processed runs of Kedro pipelines. Do you have an example of your workflow that you could show us?
@benjaminjack Kedro-Docker is out: https://github.com/quantumblacklabs/kedro-docker
Our teams use the Docker functionality to schedule batch-processed runs of Kedro pipelines. Do you have an example of your workflow that you could show us?
@yetudada it's going to depend heavily on how exactly the Kedro-Airflow plugin works. My expectation is the tool will build an Airflow DAG out of the pipelines where each node is an operator.
If that is the case, then we'll be able to throw this into a typical CI/CD pipeline that will deploy to Airflow running on AWS.
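To make that expectation concrete, here is a minimal sketch of the idea, not of how Kedro-Airflow actually works. It assumes a hypothetical `Node` record (a name plus the datasets it reads and writes) and derives, for each node, the upstream nodes that an Airflow-style DAG would have to run first. No Airflow or Kedro API is used; the node and dataset names are made up.

```python
from collections import namedtuple

# Hypothetical stand-in for a pipeline node: a name plus the datasets it reads and writes.
Node = namedtuple("Node", ["name", "inputs", "outputs"])


def task_dependencies(nodes):
    """Map each node name to the sorted names of the nodes that produce its inputs."""
    # Record which node produces each dataset.
    producer = {out: n.name for n in nodes for out in n.outputs}
    # A node depends on the producers of its inputs; inputs nobody produces are external data.
    return {
        n.name: sorted({producer[i] for i in n.inputs if i in producer})
        for n in nodes
    }


nodes = [
    Node("split", ["raw"], ["train", "test"]),
    Node("fit", ["train"], ["model"]),
    Node("score", ["model", "test"], ["metrics"]),
]

deps = task_dependencies(nodes)
for task, upstream in deps.items():
    print(task, "<-", upstream)
```

In an Airflow DAG, each key would become one operator and each listed name one of its upstream tasks.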
Also, as I write this I realize building out a Kedro-Glue plugin would be useful as well. Converting Kedro pipelines to Glue jobs should be possible via CodeGenNode & related types (https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-etl-script-generation.html).
That being said, thanks for the info, I'll be on the lookout for Kedro-Airflow.
@njgerner The Kedro-Glue plugin totally makes sense and is a very nice idea. Feel free to take a stab at it if you find yourself in need of such a plugin. Do you mind opening a separate issue for the Kedro-Glue plugin, so that potential contributors can find the suggestion more easily?
@idanov ref #57
@njgerner The kedro-airflow plugin should be out this week! We're very excited about it and thank you for raising #57.
I'm going to close this issue because we used #109 to rectify this.