Kedro: Kedro with custom execution engine? (Ray)

Created on 7 Aug 2020 · 3 Comments · Source: quantumblacklabs/kedro

Potential user here. I'm interested in using Kedro, but we use Ray Distributed instead of PySpark for our execution engine. Do your pipelines support this?

Question

All 3 comments

Hi, thank you for your interest in Kedro! We don't natively support a Ray DataSet or a RayRunner so far, but you can add both as a custom dataset and a custom runner. See https://kedro.readthedocs.io/en/latest/06_nodes_and_pipelines/02_pipelines.html?#using-a-custom-runner for using a custom runner, and https://kedro.readthedocs.io/en/latest/07_extend_kedro/01_custom_datasets.html for a custom dataset.

Update: I confirmed that Ray works fine :)

Just a quick note on this. @crypdick followed this tutorial from @dataengineerone, except that instead of using multiprocessing inside each node, he used Ray.

Thanks for investigating this @crypdick ✨
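
For anyone landing here later, this is a rough sketch of what "Ray inside a node" could look like. It is not the code @crypdick ran, and the tutorial's code is not reproduced here. The names `score_chunk` and `score_all_chunks`, and the `use_ray` flag, are made up for illustration. The sketch assumes Ray is installed, and falls back to a plain loop if it is not, so the node can still be debugged locally.

```python
from typing import List, Sequence

try:
    import ray  # optional: only needed for the parallel path
except ImportError:  # keep the sketch runnable without Ray installed
    ray = None


def score_chunk(chunk: Sequence[float]) -> float:
    """Hypothetical per-chunk work; stands in for whatever the node computes."""
    return sum(x * x for x in chunk)


def score_all_chunks(chunks: Sequence[Sequence[float]], use_ray: bool = True) -> List[float]:
    """Plain Python function intended to be used as a Kedro node.

    Fans the chunks out to Ray tasks and gathers the results, or runs
    them sequentially when Ray is unavailable or use_ray is False.
    """
    if not use_ray or ray is None:
        return [score_chunk(chunk) for chunk in chunks]

    if not ray.is_initialized():
        ray.init()  # starts a local Ray instance with default settings

    remote_score = ray.remote(score_chunk)  # wrap the function as a Ray task
    futures = [remote_score.remote(chunk) for chunk in chunks]
    return ray.get(futures)  # blocks; results are in submission order
```

Because the parallelism lives entirely inside the function, Kedro sees an ordinary node. It could be registered with something like `node(score_all_chunks, inputs="chunks", outputs="scores")` (dataset names are placeholders) and run with the default runner, with no custom runner or dataset.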
