As TensorFlow has already been added as a backend, can we add MXNet as a new one?
It seems MXNet is faster than Theano and has a smaller memory footprint!
MXNet is cool, but it doesn't look like it would be a good match for Keras. What's the rationale here? "It's fast" doesn't cut it.
@fchollet could you explain a bit more why MXNet is a worse match for Keras than TensorFlow? Would you accept an MXNet backend PR if someone submitted it?
I think MXNet implements both a frontend and a backend. The backend could probably be hooked up to the Keras frontend.
Sounds nice.
I think MXNet is a good choice because it supports multiple GPUs and multiple nodes.
The problem is the API. If you can figure out a way to write the MXNet backend so that it has autodiff, lazy graph compilation, etc. (just like TensorFlow and Theano), it would be possible. In other words, it's not just a matter of rewriting all the Keras layers with MXNet; it has to fit the Keras backend API. It was doable for TensorFlow because TensorFlow and Theano have similar guiding principles.
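To make "fit the Keras backend API" concrete, here is a minimal sketch of the kind of symbolic operations a backend must expose, using the keras.backend names from Keras 1.x; an MXNet backend would need an equivalent for each of these.

```python
import numpy as np
from keras import backend as K

x = K.placeholder(shape=(2, 3))            # symbolic input; nothing computed yet
w = K.variable(np.random.random((3, 1)))   # trainable state
y = K.mean(K.dot(x, w))                    # builds a lazy graph

grads = K.gradients(y, [w])                # autodiff over the symbolic graph
f = K.function([x], [y] + grads)           # compiles the graph into a callable
out, dw = f([np.ones((2, 3))])             # actual computation happens only here
```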
That being said, MXNet is going to be even bigger now that it is getting official support from industry. I'd strongly recommend somebody make a Keras-like frontend for MXNet to make it easy for Keras users to use MXNet in production, on AWS, etc. Even if it doesn't fit here in the main project, it would be well worth having as a standalone. See Elephas, for example: it makes it easy for Keras users to work with Scala.
The big question, of course, is who will write this new backend. If you can write it or create a team that will, then go for it.
Yeah, that is what I mean, @fchollet. It will be hard work. But if they are really interested in getting people to do that, I'm sure the community will be happy to help them.
> I think MXNet is a good choice because it supports multiple GPUs and multiple nodes.
This is the most important thing Keras is missing, IMHO. Keras should have a simple interface that lets you specify GPU IDs upfront and have it handle the rest of the backend work. This is a huge advantage that MXNet has, and probably the main reason to switch from Keras to MXNet. There's no use having a powerful multi-GPU box when you can only use one GPU.
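For reference, the up-front device specification MXNet offers looks roughly like this; a sketch against the classic mx.model.FeedForward API, where `net` and `train_iter` are assumed to be defined elsewhere, and exact arguments may differ across MXNet versions.

```python
import mxnet as mx

# Declare the devices once; MXNet splits each batch across them.
devices = [mx.gpu(0), mx.gpu(1)]

model = mx.model.FeedForward(
    symbol=net,    # `net` is a previously defined mx.sym network (assumed)
    ctx=devices,   # data-parallel training across both GPUs
    num_epoch=10,
)
model.fit(X=train_iter)  # `train_iter` is an assumed mx.io data iterator
```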
@pGit1 there is #3582 for that as well.
But regarding extra backends, I've actually seen people complain about the opposite as well. They argued that the fact that Keras doesn't just pick one backend and focus on it is holding it back. Deep learning is moving really fast, and we can't have both multiple backends and stay up to date with modern layers. Either way is fine, but we have to have realistic expectations. For example, with multiple backends we will end up with a bunch of features carrying warnings like "this only works on backend XXX", as we already have, and people complaining about bugs...
@EderSantana I will check out the link. I am not advocating for another backend, by the way. I just want to utilize the full power of a GPU box without worrying about GPU workload assignments and memory usage.
Multi-GPU support will improve drastically in the near future. However, this support will largely target TF only.
> will largely target TF only.
Well, I'd better rewrite all my code for the TF backend (kind of sucks; I hear people complain about how RNNs are much better on Theano). Thankfully, I think the only thing that will change for me is the image input dimension ordering for Conv layers.
If you are using Keras then you don't need to "rewrite" any code in order to use the TF backend. Any code that works with Theano-Keras will run on TF-Keras.
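One caveat worth knowing: the dimension-ordering difference mentioned above can be handled in backend-agnostic code by branching on the configured ordering. A minimal sketch, using the Keras 1.x keras.backend API:

```python
from keras import backend as K

# 'th' ordering is (channels, rows, cols); 'tf' is (rows, cols, channels).
if K.image_dim_ordering() == 'th':
    input_shape = (3, 224, 224)   # Theano-style, channels first
else:
    input_shape = (224, 224, 3)   # TensorFlow-style, channels last
```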
I am working on this. I feel it is a worthwhile endeavor, since Keras will benefit from MXNet's multi-device/multi-node support and MXNet will get broader exposure along with some updates to its API capabilities. Keras's well-written backend interface and tests have been very helpful so far; I have been able to get off the ground quickly and handle the low-hanging APIs. I am simultaneously working with MXNet to add the missing ones. I would say around 25% of the APIs are covered so far. You can monitor the progress at the following locations:
Keras: https://github.com/shivarajugowda/keras
MXNET: https://github.com/dmlc/mxnet/issues/4173
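For anyone following along, the shared backend test suite mentioned above can be pointed at the new backend roughly like this; both the backend name "mxnet" and the test path (taken from the Keras 1.x repo layout) are assumptions.

```python
import os
os.environ['KERAS_BACKEND'] = 'mxnet'  # assumed name; set before importing keras

import pytest
# Run the shared backend tests against the active backend.
pytest.main(['tests/keras/backend/test_backends.py', '-q'])
```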
Keras with MXNet backend...
Multi-GPU - Check
Insanely Fast - Check
@pGit1
Multi-Machine - Check
Linear Scaling to 256 GPUs - Check
@shivarajugowda that's very cool. Feel free to close this issue and open a new one to track progress on this new backend and to raise any issues you encounter in the process.
@fchollet thanks, will do that.
@smolix
That's just insanity... 👍
@shivarajugowda/@fchollet
Thanks for the valuable contributions!!
OK, the MXNet backend now runs. You can download it from https://github.com/dmlc/keras
or
pip install git+https://github.com/dmlc/keras
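Once the fork is installed, a quick smoke test might look like the sketch below; the backend name "mxnet" is an assumption about what the fork registers, and the rest is standard Keras 1.x API.

```python
import os
os.environ['KERAS_BACKEND'] = 'mxnet'  # assumed name; set before importing keras

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Compiling and fitting a tiny model exercises placeholders, autodiff,
# and graph compilation end to end on the new backend.
model = Sequential([Dense(4, input_dim=8, activation='relu'),
                    Dense(1, activation='sigmoid')])
model.compile(optimizer='sgd', loss='binary_crossentropy')
model.fit(np.random.random((32, 8)),
          np.random.randint(2, size=(32, 1)),
          nb_epoch=1, verbose=0)
```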
@yajiedesign
Can you please provide two examples?
1. A simple ConvNet on MNIST utilizing multiple GPUs (2 or more); hopefully this can demonstrate the linear scaling across GPUs on a single machine.
2. Using a pre-trained model with the MXNet backend.