Keras: Keras with data directory in AWS S3?

Created on 4 Jan 2017 · 6 Comments · Source: keras-team/keras

I want to try a Keras model on an AWS EC2 instance with the dataset in AWS S3.

However, I cannot point flow_from_directory directly at an S3 URL. Is there an alternative way to do it?

Thanks!

stale

parkerzf

Most helpful comment

S3 latency is actually fairly low, and if you have multiple processes fetching data, it is not a problem (we've done this in other contexts).

We had experimented with this over a year ago, so I misremembered slightly in that TensorFlow was not directly doing the S3 fetches. We experimented with presenting S3 files as a file system (like https://fullstacknotes.com/mount-aws-s3-bucket-to-ubuntu-file-system/). Our data size was small enough to fit on an EBS volume, though, so ultimately we went that route to remove a piece in the architecture. Keep in mind that EBS reads are also going over the network just like S3 reads.

jerheff on 7 Jan 2017

👍4 👀1

All 6 comments

You need to download the files first. It would be way too slow to read directly from S3.

marcj on 4 Jan 2017

👎10
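
The download-first route can be sketched as follows. This is a minimal sketch, assuming boto3 is installed and AWS credentials are configured; `download_prefix`, the bucket name, and the local paths are hypothetical, and the optional `client` argument exists only so the function can be exercised without AWS access.

```python
import os


def download_prefix(bucket, prefix, dest_dir, client=None):
    """Copy every object under s3://bucket/prefix into dest_dir, keeping subfolders."""
    if client is None:
        import boto3  # assumption: boto3 is installed and credentials are set up
        client = boto3.client("s3")

    downloaded = []
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            rel = key[len(prefix):].lstrip("/")
            if not rel or key.endswith("/"):  # skip "folder" placeholder objects
                continue
            local_path = os.path.join(dest_dir, *rel.split("/"))
            os.makedirs(os.path.dirname(local_path), exist_ok=True)
            client.download_file(bucket, key, local_path)
            downloaded.append(local_path)
    return downloaded


# Hypothetical usage; "my-bucket" and the paths are placeholders:
# download_prefix("my-bucket", "data/train/", "/tmp/train")
# from keras.preprocessing.image import ImageDataGenerator
# batches = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
#     "/tmp/train", target_size=(224, 224), batch_size=32)
```

Because the class subfolders under the prefix are preserved, the local copy has the one-folder-per-class layout that flow_from_directory expects.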

Thanks!

parkerzf on 4 Jan 2017

We used TensorFlow's feature to form training batches from distributed files and fed these batches to Keras through the generator capability.

jerheff on 5 Jan 2017

How do you do that? Could you provide a code sample?

parkerzf on 5 Jan 2017
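
No code was posted in reply. As an illustration of the generator side only, here is a minimal sketch of a Python generator that Keras can consume. It is not jerheff's implementation and does not use TensorFlow's batching feature; `batch_generator`, `load`, and `stack` are hypothetical names, and `load` stands in for whatever reads and decodes one sample (from S3, a mounted bucket, or local disk).

```python
import random


def batch_generator(samples, load, stack, batch_size=32, shuffle=True, seed=None):
    """Yield (inputs, targets) batches forever, as Keras expects of a generator.

    samples: list of (key, label) pairs, e.g. S3 object keys with class ids.
    load:    callable mapping a key to one decoded sample (hypothetical).
    stack:   callable turning a list into an array, e.g. numpy.array.
    """
    rng = random.Random(seed)
    order = list(samples)
    while True:  # Keras stops after a fixed number of batches per epoch
        if shuffle:
            rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            chunk = order[start:start + batch_size]
            inputs = stack([load(key) for key, _ in chunk])
            targets = stack([label for _, label in chunk])
            yield inputs, targets


# Hypothetical usage; argument names differ between Keras versions
# (Keras 1.x used samples_per_epoch and nb_epoch):
# import numpy as np
# gen = batch_generator(samples, load=load_image, stack=np.array)
# model.fit_generator(gen, steps_per_epoch=len(samples) // 32, epochs=10)
```

Keeping `load` as a parameter leaves the storage decision open: the same generator works whether samples are fetched over the network or read from a local volume.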

S3 latency is actually fairly low, and if you have multiple processes fetching data, it is not a problem (we've done this in other contexts).

jerheff on 7 Jan 2017

👍4 👀1
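
The point about concurrent fetches hiding per-request latency can be sketched as below. This is an assumption-laden illustration, not the setup described in the comment: it uses a thread pool rather than multiple processes, `fetch_all` and `make_s3_fetch` are hypothetical helpers, and boto3 is assumed to be installed when no client is passed in.

```python
from concurrent.futures import ThreadPoolExecutor


def make_s3_fetch(bucket, client=None):
    """Return a function that reads one object's bytes from the given bucket."""
    if client is None:
        import boto3  # assumption: boto3 is installed and credentials are set up
        client = boto3.client("s3")

    def fetch(key):
        return client.get_object(Bucket=bucket, Key=key)["Body"].read()

    return fetch


def fetch_all(keys, fetch, max_workers=16):
    """Fetch many objects concurrently; results are returned in the order of keys."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, keys))


# Hypothetical usage; "my-bucket" and the keys are placeholders:
# fetch = make_s3_fetch("my-bucket")
# raw_images = fetch_all(["data/train/cats/1.jpg", "data/train/dogs/2.jpg"], fetch)
```

With many requests in flight at once, the time for a batch is closer to the latency of the slowest request than to the sum of all of them.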

🙏🏽 jeremy