Kedro: [KED-2131] Incomplete documentation about the Spaceflights tutorial

Created on 27 Sep 2020 · 6 Comments · Source: quantumblacklabs/kedro

Description

The docs for the Spaceflights tutorial are incomplete, which makes it harder to finish the tutorial successfully.

This has been partially discussed in https://github.com/quantumblacklabs/kedro-examples/issues/58 (including how to reproduce the issue), so this report is meant to complement that discussion.

Context

There is apparently an ongoing internal issue about improving the organization and sync between the repos kedro-examples, kedro-training, and kedro-starter-spaceflights (https://github.com/quantumblacklabs/kedro-training/pull/1).

As I understand it, the full Spaceflights repo is moving from kedro-examples/kedro-tutorial to kedro-training/kedro/exercises/spaceflight.

I'm not sure about what is being tracked internally, so I will list what I've found related to Spaceflights' requirements.txt:

Actionable

  1. kedro-examples/kedro-tutorial's requirements.txt should be updated to contain kedro[pandas.CSVDataSet,pandas.ExcelDataSet].
  2. The docs should state that kedro[pandas.CSVDataSet,pandas.ExcelDataSet] is required. The "Set up the spaceflights project#Install project dependencies" section is probably the right place.
  3. The same applies to the kedro-training docs, at "Create a new project#kedro install".
  4. The latest and stable Kedro docs point to kedro-examples as the full source of the Spaceflights project. I'm not sure, but I guess this will (or should) eventually change to kedro-training/kedro-exercises/spaceflight.

I could work on 1., 2., and 3. if that makes sense (note that they are in three different repos).
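As a rough illustration of items 1 to 3, the sketch below checks whether a requirements file declares the two Kedro extras named above. The helper name `missing_kedro_extras` and the version pins in the usage example are hypothetical and only serve the illustration; the sketch does simple line matching and is not a full requirements parser.

```python
import re

# The extras this issue asks for in the Spaceflights requirements.
REQUIRED_EXTRAS = {"pandas.CSVDataSet", "pandas.ExcelDataSet"}


def missing_kedro_extras(requirements_text):
    """Return the required extras that no `kedro[...]` line declares.

    Hypothetical helper: it only looks at lines that start with `kedro[`.
    """
    declared = set()
    for raw_line in requirements_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        match = re.match(r"kedro\[([^\]]*)\]", line)
        if match:
            declared.update(extra.strip() for extra in match.group(1).split(","))
    return REQUIRED_EXTRAS - declared


if __name__ == "__main__":
    # Illustrative pins only; use whatever Kedro version the project targets.
    print(missing_kedro_extras("kedro==0.16.5\n"))
    print(missing_kedro_extras("kedro[pandas.CSVDataSet,pandas.ExcelDataSet]==0.16.5\n"))
```

A plain `kedro` line reports both extras as missing, while a line such as `kedro[pandas.CSVDataSet,pandas.ExcelDataSet]` reports none.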

PS: Sorry for the cross-repo references everywhere. This seemed like the best place to report it.

Starter Bug Report Sprint Activity good first issue

All 6 comments

Thank you for reporting it! We will address your feedback in the docs and spaceflight example code.

Thanks for addressing it, @921kiyo.

If there is still time, I'd like to add three more notes regarding the docs and kedro-training:

shuttles:
  type: kedro_tutorial.io.xls_local.ExcelLocalDataSet
  filepath: data/01_raw/shuttles.xlsx
  layer: raw

Hey there @falcaopetri.

Thanks for reporting this. It is not too late at all!

I will add your additional comments to our ticket. In the meantime, if you'd like to fix the issues, feel free to make a PR and work on it 😄 We truly appreciate it!

Chat soon.

From @falcaopetri's comment:

The docs' "Data science pipeline#Update dependencies" section adds the scikit-learn dependency to src/requirements.txt and then runs kedro install. Shouldn't it be src/requirements.in + kedro build-reqs && kedro install instead?

I have the same question. Adding it to src/requirements.in makes more sense to me too.

@guludo Apologies, I just noticed this has been merged into develop, which is not visible on ReadTheDocs. Indeed, it should be in src/requirements.in; the next version will have the correct docs.
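The workflow agreed on above could be sketched roughly as follows. This assumes Kedro 0.16.x, where `kedro build-reqs` compiles src/requirements.in into src/requirements.txt and `kedro install` installs the result. The helper name `add_dependency` and its `run` parameter are hypothetical and not part of Kedro.

```python
import subprocess
from pathlib import Path


def add_dependency(project_dir, requirement, run=subprocess.run):
    """Add `requirement` to src/requirements.in, then rebuild and install.

    Hypothetical helper; `run` is injectable so the commands can be faked.
    """
    req_in = Path(project_dir) / "src" / "requirements.in"
    existing = req_in.read_text().splitlines() if req_in.exists() else []
    if requirement not in existing:
        # Edit the .in file, not the compiled requirements.txt.
        req_in.write_text("\n".join(existing + [requirement]) + "\n")
    # Assumed Kedro 0.16.x CLI commands, run from the project root.
    for command in (["kedro", "build-reqs"], ["kedro", "install"]):
        run(command, cwd=project_dir, check=True)


if __name__ == "__main__":
    # Run from the root of a Kedro project with Kedro installed.
    add_dependency(".", "scikit-learn")
```

The point of the sketch is the order of operations: the dependency is declared in src/requirements.in, and src/requirements.txt is treated as generated output.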

Closing this as resolved through linked PRs/issues, as well as https://github.com/quantumblacklabs/kedro/commit/589d6a7a329f453ac91662814ec044dc41d1063c and https://github.com/quantumblacklabs/kedro/commit/0fd6b623bdaa2f69623ad4fdacb022a5862fb590. Please feel free to open a new issue if there are other observations!
