Test script:
# embedding.py
import mxnet as mx

data = mx.sym.Variable('data')
embed = mx.sym.Embedding(
    data=data,
    input_dim=4000,
    output_dim=128,
    name='embed'
)
executor = embed.simple_bind(
    ctx=mx.gpu(1),
    data=(8, 30, 100),
)
executor.forward(is_train=True)
# not done yet, another line of code to execute, see below
Run ipython -i embedding.py, then try to run the backward pass:
In [1]: executor.backward(out_grads = executor.outputs[0])
[20:07:28] /home/zhenghuabin/mxnet/dmlc-core/include/dmlc/logging.h:235: [20:07:28] src/operator/./embedding-inl.h:86: Check failed: (req[embedding::kData]) == (kNullOp) Embedding layer doesn't support calculate data gradient
[20:07:28] /home/zhenghuabin/mxnet/dmlc-core/include/dmlc/logging.h:235: [20:07:28] src/engine/./threaded_engine.h:306: [20:07:28] src/operator/./embedding-inl.h:86: Check failed: (req[embedding::kData]) == (kNullOp) Embedding layer doesn't support calculate data gradient
An fatal error occurred in asynchronous engine operation. If you do not know what caused this error, you can try set environment variable MXNET_ENGINE_TYPE to NaiveEngine and run with debugger (i.e. gdb). This will force all operations to be synchronous and backtrace will give you the series of calls that lead to this error. Remember to set MXNET_ENGINE_TYPE back to empty after debugging.
terminate called after throwing an instance of 'dmlc::Error'
what(): [20:07:28] src/engine/./threaded_engine.h:306: [20:07:28] src/operator/./embedding-inl.h:86: Check failed: (req[embedding::kData]) == (kNullOp) Embedding layer doesn't support calculate data gradient
An fatal error occurred in asynchronous engine operation. If you do not know what caused this error, you can try set environment variable MXNET_ENGINE_TYPE to NaiveEngine and run with debugger (i.e. gdb). This will force all operations to be synchronous and backtrace will give you the series of calls that lead to this error. Remember to set MXNET_ENGINE_TYPE back to empty after debugging.
Aborted
Based on MXNet's newest version as of today (2016.11.26).
@WarBean I tried it on the NNVM branch and the code ran well, but I had missed the final line, executor.backward(out_grads = executor.outputs[0]). With that line included, the problem exists in the new branch as well.
@WarBean In fact, I think it may not be appropriate to call this a bug, since we cannot propagate gradients to the data. We need to set the grad_req of data to 'null' in this case.
@sxjscience Any idea about the possible cause? I suspect the problem was not caused by the implementation of Embedding itself, because it worked well before.
@sxjscience
1. How do I manually set grad_req in the Python API?
2. I have used Embedding before in the same way as in the test script above, and everything was OK.
@WarBean I suggest using bind instead of simple_bind when dealing with the embedding layer, i.e., setting the grad_req of data to 'null'.
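For reference, here is a minimal sketch of the bind approach (not code from this thread): the shapes are taken from the test script, embed_weight is the weight name used above, and my understanding is that arguments omitted from the args_grad dict get no gradient buffer, which is equivalent to a 'null' grad_req.
import mxnet as mx

data = mx.sym.Variable('data')
embed = mx.sym.Embedding(
    data=data,
    input_dim=4000,
    output_dim=128,
    name='embed'
)
ctx = mx.gpu(0)
# Allocate the input and the weight; data holds integer indices into the table.
args = {
    'data': mx.nd.zeros((8, 30, 100), ctx=ctx),
    'embed_weight': mx.nd.zeros((4000, 128), ctx=ctx),
}
# Gradient buffer only for the weight; 'data' is deliberately left out,
# so no data gradient is requested.
args_grad = {
    'embed_weight': mx.nd.zeros((4000, 128), ctx=ctx),
}
executor = embed.bind(ctx=ctx, args=args, args_grad=args_grad)
executor.forward(is_train=True)
executor.backward(out_grads=executor.outputs[0])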
Or you can explicitly set grad_req in simple_bind, like the following:
import mxnet as mx

data = mx.sym.Variable('data')
embed = mx.sym.Embedding(
    data=data,
    input_dim=4000,
    output_dim=128,
    name='embed'
)
executor = embed.simple_bind(
    ctx=mx.gpu(0),
    data=(8, 30, 100),
    grad_req={'data': 'null', 'embed_weight': 'write'}
)
executor.forward(is_train=True)
executor.backward(out_grads=executor.outputs[0])
It works. I hadn't used the grad_req option of simple_bind before. Thank you.
@WarBean However, this solution is not that good if your network contains lots of parameters. In that case, it would be better to use the bind API directly and simply not include data when setting up the gradient arrays.
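A hedged sketch of that for the many-parameter case, building the argument and gradient dicts programmatically with list_arguments and infer_shape (the zero initialization here is only for illustration):
import mxnet as mx

embed = mx.sym.Embedding(
    data=mx.sym.Variable('data'),
    input_dim=4000,
    output_dim=128,
    name='embed'
)
ctx = mx.gpu(0)
# Infer every argument shape from the data shape.
arg_names = embed.list_arguments()
arg_shapes, _, _ = embed.infer_shape(data=(8, 30, 100))
# Allocate all arguments, but create gradient buffers only for the
# non-data arguments, so no data gradient is requested.
args = {name: mx.nd.zeros(shape, ctx=ctx)
        for name, shape in zip(arg_names, arg_shapes)}
args_grad = {name: mx.nd.zeros(shape, ctx=ctx)
             for name, shape in zip(arg_names, arg_shapes)
             if name != 'data'}
executor = embed.bind(ctx=ctx, args=args, args_grad=args_grad)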
Some modification as a workaround for the lots-of-parameters case, assuming all other grad_req values should be 'write':
import mxnet as mx

data = mx.sym.Variable('data')
embed = mx.sym.Embedding(
    data=data,
    input_dim=4000,
    output_dim=128,
    name='embed'
)
executor = embed.simple_bind(
    ctx=mx.gpu(0),
    data=(8, 30, 100),
    grad_req={
        name: 'null' if name == 'data' else 'write'
        for name in embed.list_arguments()
    },
)
Is this appropriate?
@WarBean Yes, it works.