Kedro: Kedro with custom execution engine? (Ray)

Created on 7 Aug 2020 · 3 Comments · Source: quantumblacklabs/kedro

Potential user here. I'm interested in using Kedro, but we use Ray Distributed instead of PySpark for our execution engine. Do your pipelines support this?

crypdick

All 3 comments

Hi, thank you for your interest in Kedro! We don't natively support a Ray DataSet or a RayRunner so far, but you can add both as a custom dataset and a custom runner. See https://kedro.readthedocs.io/en/latest/06_nodes_and_pipelines/02_pipelines.html?#using-a-custom-runner for using a custom runner, and https://kedro.readthedocs.io/en/latest/07_extend_kedro/01_custom_datasets.html for writing a custom dataset.

921kiyo on 7 Aug 2020

👍2

Update: I confirmed that Ray works fine :)

crypdick on 9 Aug 2020

🚀2

Just a quick note on this: @crypdick followed this tutorial from @dataengineerone, except that instead of using multiprocessing inside each node, he used Ray.

Thanks for investigating this @crypdick ✨
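
For anyone landing here later, a minimal sketch of what "using Ray inside a node" could look like. This is not @crypdick's actual code, which isn't shown in this thread. The function names `_square` and `process_items` are hypothetical, and the fallback to a plain loop when Ray isn't installed is an assumption added only so the snippet runs anywhere.

```python
from typing import List

try:
    import ray
except ImportError:
    # Assumption for this sketch: degrade to sequential execution without Ray.
    ray = None


def _square(x: int) -> int:
    """Hypothetical per-item work; stands in for any expensive computation."""
    return x * x


def process_items(items: List[int]) -> List[int]:
    """Node function: a plain Python function from inputs to outputs.

    The parallelism lives inside the node, so the pipeline itself does not
    need a Ray-specific runner.
    """
    if ray is None:
        return [_square(x) for x in items]

    # Start (or reuse) a local Ray instance; safe to call more than once.
    ray.init(ignore_reinit_error=True)

    # Wrap the plain function as a Ray remote function and fan out the work.
    remote_square = ray.remote(_square)
    futures = [remote_square.remote(x) for x in items]

    # ray.get on a list returns results in the same order as the futures.
    return ray.get(futures)
```

Since `process_items` is an ordinary function, it should be possible to wire it into a pipeline the usual way, e.g. `node(process_items, inputs="items", outputs="squares")`, where the dataset names `items` and `squares` are made up for illustration. A dedicated RayRunner or Ray DataSet, as suggested above, would be a separate and larger piece of work.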