CSVBlobDataSet class in kedro.contrib.io.azure.CSVBlobDataSet does not have versioning implemented and it would be great if we could also enable it in this dataset.
kedro.io
@921kiyo I'm happy to pick this up
Here's the link for the pull request - #76
@921kiyo - on https://kedro.readthedocs.io/en/latest/04_user_guide/07_advanced_io.html?#versioning it mentions kedro.io.core.FilepathVersionMixin but I can't find it in kedro.io.core. Should it be AbstractDataSet?
@evanmiller29
I think the statement is out of date. We released 0.15.0 just yesterday and merged FilepathVersionMixIn and S3VersionMixIn into one abstract class, AbstractVersionedDataSet. See the "Breaking changes to the API" section for 0.15.0 in RELEASE.md. I will update the docs.
@evanmiller29 It seems the docs did not build properly. We have just rebuilt them manually, so they should now be up to date for 0.15.0 (your browser might have cached the old version) :)
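For anyone following along, here is a rough sketch of what path-based versioning amounts to. It assumes the layout `<filepath>/<version>/<filename>` and a millisecond UTC timestamp as the version string; `make_version` and `versioned_path` are hypothetical helper names for illustration, not part of AbstractVersionedDataSet.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath

# Assumed timestamp format for a version string (illustration only).
VERSION_FORMAT = "%Y-%m-%dT%H.%M.%S.%fZ"


def make_version(now=None):
    """Return a version string with millisecond precision (hypothetical helper)."""
    now = now or datetime.now(tz=timezone.utc)
    stamp = now.strftime(VERSION_FORMAT)
    # Drop the last three microsecond digits but keep the trailing "Z".
    return stamp[:-4] + stamp[-1:]


def versioned_path(filepath, version):
    """Build <filepath>/<version>/<filename> (hypothetical helper)."""
    path = PurePosixPath(filepath)
    return path / version / path.name


version = make_version(datetime(2019, 8, 7, 12, 30, 45, 123456, tzinfo=timezone.utc))
print(versioned_path("test.csv", version))
```

A versioned blob dataset would then read from or write to that resolved path instead of the bare `filepath`.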
Ahhh cool. No worries - I thought I'd just raise it anyway so you know.
I'll give it a read. It's looking good to me so far
@921kiyo - I'm looking at the CSVBlobDataSet docs and I'm wondering how you pass in the account credentials to kedro to enable saving a CSV to Azure?
A simple code stub would be great. Once this is all done I'd be happy to add it to the docs.
Evan
EDIT: Also wondering if adding it to the example catalog.yml might be a good idea too
@evanmiller29 We have documentation on how to feed credentials to a dataset at https://kedro.readthedocs.io/en/latest/04_user_guide/04_data_catalog.html#feeding-in-credentials :)
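As a rough stub of the idea on that page: the catalog entry refers to credentials by name, and the name is swapped for the actual secrets before the dataset is built. The sketch below is plain Python so it runs without kedro or Azure. The entry name `cars`, the key `azure_creds`, the `account_name`/`account_key` fields and the helper `resolve_credentials` are all assumptions for illustration, not kedro's own code.

```python
from copy import deepcopy

# What a parsed credentials.yml might contain (key and field names assumed).
credentials = {
    "azure_creds": {"account_name": "myaccount", "account_key": "<secret>"},
}

# What a parsed catalog.yml entry might look like (entry name assumed).
catalog = {
    "cars": {
        "type": "kedro.contrib.io.azure.CSVBlobDataSet",
        "filepath": "test.csv",
        "container_name": "test_container",
        "credentials": "azure_creds",  # a reference by name, not the secret itself
        "save_args": {"index": False},
    },
}


def resolve_credentials(entry, credentials):
    """Replace a credentials name with the matching dict (hypothetical helper)."""
    resolved = deepcopy(entry)  # leave the original catalog entry untouched
    name = resolved.get("credentials")
    if isinstance(name, str):
        if name not in credentials:
            raise KeyError(f"No credentials found under the key '{name}'")
        resolved["credentials"] = credentials[name]
    return resolved


dataset_kwargs = resolve_credentials(catalog["cars"], credentials)
print(dataset_kwargs["credentials"]["account_name"])
```

Keeping only the name in the catalog means the secrets can live in a separate, uncommitted file.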
@921kiyo - just an FYI:
import pandas as pd
from kedro.contrib.io.azure import CSVBlobDataSet

data = pd.DataFrame({'col1': [1, 2], 'col2': [4, 5],
                     'col3': [5, 6]})
data_set = CSVBlobDataSet(filepath="test.csv",
                          bucket_name="test_bucket",
                          load_args=None,
                          save_args={"index": False})
TypeError: __init__() got an unexpected keyword argument 'bucket_name'
I think bucket_name should be container_name
@evanmiller29 Thanks for letting us know. It is a typo in the docstring. We will fix it :)
It looks like this hasn't been updated in a while, so I'm happy to make a PR (I have a copy in my fork https://github.com/mzjp2/kedro/commit/8db124b1a7c2301e09965c652e2aae8c42cf7873) but equally, also happy to defer to Evan if he's back. :)
Hey Zain,
Please go ahead. I'm super snowed under at the moment.
Evan
Thanks @evanmiller29 ! :)