Say I want to save a file to S3 using a specific account:
df.to_csv('s3://foo/bar/temp.csv')
where my accounts are listed in ~/.aws/credentials:
[default]
aws_access_key_id = XXXX
aws_secret_access_key = XXXX
[foo]
aws_access_key_id = XXXX
aws_secret_access_key = XXXX
[bar]
aws_access_key_id = XXXX
aws_secret_access_key = XXXX
What's the best or recommended way to do this with Pandas 0.20.2?
Is there a way to specify which account/profile to use when multiple of them are configured?
Perhaps related: Does Pandas use boto or boto3?
As of 0.20, pandas uses s3fs (http://s3fs.readthedocs.io/en/latest/) for S3 access.
I believe you should be able to do:
import pandas as pd
import s3fs
fs = s3fs.S3FileSystem(profile_name='foo')
f = fs.open("my-bucket/file.csv", "wb")
df.to_csv(f)
Could you try that out, and if it works make a pull request for the documentation? I don't have a test bucket handy at the moment.
I know this post is quite old at this point, but @TomAugspurger's solution certainly works. For py3, I made the small change of using 'w' instead of 'wb'.
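For anyone landing here later, here is a minimal sketch of that py3 variant, using a context manager so the file handle is closed after writing. It mirrors the snippet above (including the profile_name keyword), and the bucket/key and the 'foo' profile are placeholders assumed to exist in ~/.aws/credentials:
import pandas as pd
import s3fs

# assumes a [foo] profile in ~/.aws/credentials
fs = s3fs.S3FileSystem(profile_name='foo')

# open in text mode ('w'), since to_csv writes str on py3
with fs.open('my-bucket/file.csv', 'w') as f:
    df.to_csv(f)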
Would a solution to this be allowing a dask-style storage_options parameter on pandas' read_* functions (sketched below)? It's a little frustrating not being able to pass these things through directly; most frequently I'm trying to pass in credentials rather than let boto search my system for them.
Yes, I think that request has come up in a few places. I'd be happy to see something like that.
Ref similar issue: https://github.com/pandas-dev/pandas/issues/33639
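For illustration, here is roughly what that dask-style API looks like with the storage_options parameter that later pandas releases (1.2+) added to the read_*/to_* functions; it is not available in 0.20.2, and the bucket paths and the 'foo' profile below are placeholder assumptions:
import pandas as pd

# storage_options is forwarded to s3fs.S3FileSystem, so the profile/credentials
# never have to be discovered from the environment
df = pd.read_csv(
    's3://my-bucket/file.csv',
    storage_options={'profile': 'foo'},  # assumes a [foo] profile in ~/.aws/credentials
)

df.to_csv('s3://my-bucket/out.csv', storage_options={'profile': 'foo'})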