For a simple HybridBlock, saving and deserializing the symbol fails with mxnet 1.3 when an embedding layer is used multiple times. This used to work with mxnet 1.2
It may or may not be related to this issue: https://github.com/apache/incubator-mxnet/issues/12783
----------Python Info----------
Version : 3.6.3
Compiler : GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)
Build : ('default', 'Mar 20 2018 21:25:13')
Arch : ('64bit', '')
------------Pip Info-----------
Version : 18.1
Directory : /Users/.../.pyenv/versions/3.6.3/Python.framework/Versions/3.6/lib/python3.6/site-packages/pip
----------MXNet Info-----------
Version : 1.3.0
Directory : /Users/.../.pyenv/versions/3.6.3/Python.framework/Versions/3.6/lib/python3.6/site-packages/mxnet
Commit Hash : b3be92f4a48bce62a5a8424271871c2f81c8f7f1
----------System Info----------
Platform : Darwin-16.7.0-x86_64-i386-64bit
system : Darwin
node : ...
release : 16.7.0
version : Darwin Kernel Version 16.7.0: Thu Jun 21 20:07:39 PDT 2018; root:xnu-3789.73.14~1/RELEASE_X86_64
----------Hardware Info----------
machine : x86_64
processor : i386
b'machdep.cpu.extfeatures: SYSCALL XD 1GBPAGE EM64T LAHF LZCNT PREFETCHW RDTSCP TSCI'
b'machdep.cpu.leaf7_features: SMEP ERMS RDWRFSGS TSC_THREAD_OFFSET BMI1 HLE AVX2 BMI2 INVPCID RTM SMAP RDSEED ADX IPT SGX FPU_CSDS MPX CLFSOPT'
b'machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 PCLMULQDQ DTES64 MON DSCPL VMX SMX EST TM2 SSSE3 FMA CX16 TPR PDCM SSE4.1 SSE4.2 x2APIC MOVBE POPCNT AES PCID XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C'
b'machdep.cpu.brand_string: Intel(R) Core(TM) i7-7660U CPU @ 2.50GHz'
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0440 sec, LOAD: 0.9637 sec.
Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.0922 sec, LOAD: 1.0902 sec.
Timing for Gluon Tutorial(cn): https://zh.gluon.ai, DNS: 0.0733 sec, LOAD: 0.8020 sec.
Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0545 sec, LOAD: 0.6035 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0400 sec, LOAD: 1.1772 sec.
Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0443 sec, LOAD: 0.2266 sec.
Package used (Python/R/Scala/Julia):
I'm using python
(Paste the complete error message, including stack trace.)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-8-628e9346e9c4> in <module>()
21 test_op.export('/tmp/bla')
22
---> 23 mx.gluon.SymbolBlock.imports('/tmp/bla-symbol.json', param_file='/tmp/bla-0000.params', input_names=['data0', 'data1'])
~/.pyenv/versions/3.6.3/Python.framework/Versions/3.6/lib/python3.6/site-packages/mxnet/gluon/block.py in imports(symbol_file, input_names, param_file, ctx)
1021 input_names = [input_names]
1022 inputs = [symbol.var(i) for i in input_names]
-> 1023 ret = SymbolBlock(sym, inputs)
1024 if param_file is not None:
1025 ret.collect_params().load(param_file, ctx=ctx)
~/.pyenv/versions/3.6.3/Python.framework/Versions/3.6/lib/python3.6/site-packages/mxnet/gluon/block.py in __init__(self, outputs, inputs, params)
1049 row_sparse_storage = ndarray.ndarray._STORAGE_TYPE_STR_TO_ID['row_sparse']
1050 for i in out:
-> 1051 for j in i.get_internals():
1052 assert(j.attr("__storage_type__") != str(row_sparse_storage)), \
1053 "SymbolBlock doesn't support Parameter '%s' because its storage " \
~/.pyenv/versions/3.6.3/Python.framework/Versions/3.6/lib/python3.6/site-packages/mxnet/symbol/symbol.py in <genexpr>(.0)
91 <Symbol _plus0>
92 """
---> 93 return (self[i] for i in self.list_outputs())
94
95 def __add__(self, other):
~/.pyenv/versions/3.6.3/Python.framework/Versions/3.6/lib/python3.6/site-packages/mxnet/symbol/symbol.py in __getitem__(self, index)
515 if name == index:
516 if idx is not None:
--> 517 raise ValueError('There are multiple outputs with name \"%s\"' % index)
518 idx = i
519 if idx is None:
ValueError: There are multiple outputs with name "testop1_embedding0_fwd_output"
import mxnet as mx
from mxnet import gluon
class TestOp(gluon.HybridBlock):
def __init__(self, n_in, n_out):
super().__init__()
with self.name_scope():
self.embed = mx.gluon.nn.Embedding(n_in, n_out)
def hybrid_forward(self, F, x, y):
a = self.embed(x)
b = self.embed(y)
return a + b
test_op = TestOp(n_in=5, n_out=2)
test_op.initialize()
test_op.hybridize()
test_op(mx.nd.array([0,1,2]), mx.nd.array([1,2,3]))
test_op.export('/tmp/bla')
gluon.SymbolBlock.imports(
'/tmp/bla-symbol.json',
param_file='/tmp/bla-0000.params',
input_names=['data0', 'data1'])
Run the code.
@srochel @lupesko
@mxnet-label-bot [Bug, Gluon]
This seems similar to #12783
Looks like the problem occurs with Dense as well, so the issue probably lies in the +:
import mxnet as mx
class MyBlock(mx.gluon.HybridBlock):
def __init__(self):
super().__init__()
with self.name_scope():
self.model = mx.gluon.nn.Dense(units=5)
def hybrid_forward(self, F, x, y):
return self.model(x) + self.model(y)
block = MyBlock()
block.initialize()
block.hybridize()
output = block(mx.nd.random_normal(shape=(100,)), mx.nd.random_normal(shape=(100,)))
block.export(path="./model", epoch=0)
symbol = mx.gluon.SymbolBlock.imports(
symbol_file="./model-symbol.json",
input_names=["data0", "data1"],
param_file="./model-0000.params",
ctx=mx.Context.default_ctx
)
gives
Traceback (most recent call last):
File "2018-10-11-very-weird-issue.py", line 23, in <module>
ctx=mx.Context.default_ctx
File "[...]/lib/python3.6/site-packages/mxnet/gluon/block.py", line 1023, in imports
ret = SymbolBlock(sym, inputs)
File "[...]/lib/python3.6/site-packages/mxnet/gluon/block.py", line 1051, in __init__
for j in i.get_internals():
File "[...]/lib/python3.6/site-packages/mxnet/symbol/symbol.py", line 93, in <genexpr>
return (self[i] for i in self.list_outputs())
File "[...]/lib/python3.6/site-packages/mxnet/symbol/symbol.py", line 517, in __getitem__
raise ValueError('There are multiple outputs with name \"%s\"' % index)
ValueError: There are multiple outputs with name "myblock0_dense0_fwd_output"
I observed that the issue occurs when 2 inputs pass through the same block. Trying to understand the root cause for this as it works fine with mxnet 1.2.1
https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/gluon/block.py#L1050
this is the new addition after 1.2.1 which calls output symbol.get_internals() and then finds duplicate names and fails, exposing the duplicate names issue from 1.3 onwards.
Duplicate output names when we have 2 inputs passing through the same block was always the case, I am not very sure why this has not created issues for our users.
Trying to root cause and solution with the help of @safrooze and @zhreshold
@szha
@lostella while the issue is being root caused, one work around in this case would be to use different blocks with shared parameters:
class MyBlock(mx.gluon.HybridBlock):
def __init__(self):
super().__init__()
with self.name_scope():
self.model0 = mx.gluon.nn.Dense(units=5)
self.model1 = mx.gluon.nn.Dense(units=5, params=self.model0.collect_params())
def hybrid_forward(self, F, x, y):
return self.model0(x) + self.model1(y)
i use mxnet 1.3.0 also meet this problem, In my code ,it was caused by code ' sym.get_internals()', after I deleted it , then it can run .
Need to check and see if issue is resolved in https://github.com/apache/incubator-mxnet/issues/14619#issuecomment-504249531
The "Minimum reproducible example" works for me on 1.5 and current master. This can probably be closed?