Pysyft: Error in Introduction-to-TrainConfig tutorial

Created on 12 Jun 2019 · 8 Comments · Source: OpenMined/PySyft

Describe the bug
I am trying to run the tutorial Introduction to TrainConfig, but I get an error at the following step:
Step --> Send TrainConfig to worker

Actual Error --> TypeError: Cannot serialize <torch._C.Function object at 0x133459af0>

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-11-dbfb7bd00645> in <module>
      1 # Send train config
----> 2 train_config.send(alice)

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/syft-0.1.18-py3.7.egg/syft/federated/train_config.py in send(self, location)
    117 
    118         # Send loss function
--> 119         self.loss_fn_ptr, self._loss_fn_id = self._wrap_and_send_obj(self.loss_fn, location)
    120 
    121         # Send train configuration itself

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/syft-0.1.18-py3.7.egg/syft/federated/train_config.py in _wrap_and_send_obj(self, obj, location)
     94         """Wrappers object and send it to location."""
     95         obj_with_id = pointers.ObjectWrapper(id=sy.ID_PROVIDER.pop(), obj=obj)
---> 96         obj_ptr = self.owner.send(obj_with_id, location)
     97         obj_id = obj_ptr.id_at_location
     98         return obj_ptr, obj_id

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/syft-0.1.18-py3.7.egg/syft/workers/base.py in send(self, obj, workers, ptr_id, local_autograd, preinitialize_grad)
    333             pointer = obj
    334         # Send the object
--> 335         self.send_obj(obj, worker)
    336 
    337         return pointer

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/syft-0.1.18-py3.7.egg/syft/workers/base.py in send_obj(self, obj, location)
    504                 receive the object.
    505         """
--> 506         return self.send_msg(codes.MSGTYPE.OBJ, obj, location)
    507 
    508     def request_obj(self, obj_id: Union[str, int], location: "BaseWorker") -> object:

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/syft-0.1.18-py3.7.egg/syft/workers/base.py in send_msg(self, msg_type, message, location)
    218 
    219         # Step 1: serialize the message to simple python objects
--> 220         bin_message = sy.serde.serialize(message)
    221 
    222         # Step 2: send the message and wait for a response

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/syft-0.1.18-py3.7.egg/syft/serde.py in serialize(obj, simplified, force_no_compression, force_no_serialization, force_full_simplification)
    131         return simple_objects
    132     else:
--> 133         binary = msgpack.dumps(simple_objects)
    134 
    135     # 3) Compress

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/msgpack/__init__.py in packb(o, **kwargs)
     44     See :class:`Packer` for options.
     45     """
---> 46     return Packer(**kwargs).pack(o)
     47 
     48 

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/msgpack/fallback.py in pack(self, obj)
    898     def pack(self, obj):
    899         try:
--> 900             self._pack(obj)
    901         except:
    902             self._buffer = StringIO()  # force reset

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/msgpack/fallback.py in _pack(self, obj, nest_limit, check, check_type_strict)
    885                 self._pack_array_header(n)
    886                 for i in xrange(n):
--> 887                     self._pack(obj[i], nest_limit - 1)
    888                 return
    889             if check(obj, dict):

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/msgpack/fallback.py in _pack(self, obj, nest_limit, check, check_type_strict)
    885                 self._pack_array_header(n)
    886                 for i in xrange(n):
--> 887                     self._pack(obj[i], nest_limit - 1)
    888                 return
    889             if check(obj, dict):

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/msgpack/fallback.py in _pack(self, obj, nest_limit, check, check_type_strict)
    885                 self._pack_array_header(n)
    886                 for i in xrange(n):
--> 887                     self._pack(obj[i], nest_limit - 1)
    888                 return
    889             if check(obj, dict):

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/msgpack/fallback.py in _pack(self, obj, nest_limit, check, check_type_strict)
    885                 self._pack_array_header(n)
    886                 for i in xrange(n):
--> 887                     self._pack(obj[i], nest_limit - 1)
    888                 return
    889             if check(obj, dict):

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/msgpack/fallback.py in _pack(self, obj, nest_limit, check, check_type_strict)
    894                 default_used = 1
    895                 continue
--> 896             raise TypeError("Cannot serialize %r" % (obj, ))
    897 
    898     def pack(self, obj):

TypeError: Cannot serialize <torch._C.Function object at 0x133459af0>
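
The traceback shows the packer recursing through nested containers until it reaches an object it cannot encode. A minimal sketch of how one might locate such an object in a nested payload before packing is given below. It assumes that only plain Python scalars, lists, tuples, and dicts are packable; the helper `find_unpackable` and the `FakeFunction` stand-in are hypothetical names for illustration and are not part of PySyft or msgpack.

```python
# Leaf types a msgpack-style packer can usually encode without a custom hook.
# Assumption: this list is illustrative and is not PySyft's exact rule set.
PACKABLE_LEAVES = (type(None), bool, int, float, str, bytes)


def find_unpackable(obj, path="root"):
    """Return (path, value) pairs for every leaf that is not a plain packable type."""
    if isinstance(obj, PACKABLE_LEAVES):
        return []
    if isinstance(obj, (list, tuple)):
        found = []
        for i, item in enumerate(obj):
            found.extend(find_unpackable(item, f"{path}[{i}]"))
        return found
    if isinstance(obj, dict):
        found = []
        for key, value in obj.items():
            found.extend(find_unpackable(key, f"{path}.key({key!r})"))
            found.extend(find_unpackable(value, f"{path}[{key!r}]"))
        return found
    # Anything else would need a custom serializer.
    return [(path, obj)]


class FakeFunction:
    """Hypothetical stand-in for an object such as torch._C.Function."""


# A nested payload with one unpackable leaf buried several levels deep.
payload = (1, ("loss_fn", [42, (b"id", FakeFunction())]))
bad = find_unpackable(payload)
for where, value in bad:
    print(where, type(value).__name__)
```

Running a check like this on the simplified message would point at the exact position of the offending object instead of only its type.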

To Reproduce
Follow the steps in the tutorial sequentially.

Reference Screenshots

[Screenshot: Introduction-to-TrainConfig notebook, captured 2019-06-12]

Desktop (please complete the following information):

 System Version:    macOS 10.14.4 (18E226)
 Kernel Version:    Darwin 18.5.0
 Boot Volume:   Coyote_HD
 Boot Mode: Normal
 torch                1.1.0   
 torchvision          0.3.0
 syft                0.1.18
 python           Python 3.7.3 (default, Mar 27 2019, 16:54:48)

Thank you!

Source

akaanirban

All 8 comments

Hey @akaanirban,

There is a bug in torch 1.1 (see https://github.com/pytorch/pytorch/issues/20017 for details). If you downgrade torch to 1.0.1, it should work as expected :blush:.

mari-linhares on 12 Jun 2019

I can confirm this works with torch 1.0.1. Thanks a lot for the reply.

However, a fresh installation of pysyft 0.1.18 depends on torch>=1.1.0, so anyone trying this with a new install may run into the same issue. :smiley:
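
To make the conflict concrete, here is a minimal sketch of why the working version cannot satisfy the declared requirement. The version parsing is deliberately simplified (leading integer components only) and is not a full PEP 440 implementation; `parse_version` and `satisfies_minimum` are hypothetical helper names.

```python
def parse_version(text):
    """Turn '1.0.1' or '0.2.2.post3' into a tuple of its leading integers."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as 'post3'
        parts.append(int(piece))
    return tuple(parts)


def satisfies_minimum(installed, minimum):
    """True if `installed` meets a '>=' requirement on `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


SYFT_TORCH_MINIMUM = "1.1.0"  # from the thread: syft 0.1.18 requires torch>=1.1.0
WORKING_TORCH = "1.0.1"       # from the thread: version reported to work

print(satisfies_minimum(WORKING_TORCH, SYFT_TORCH_MINIMUM))
print(satisfies_minimum("1.1.0", SYFT_TORCH_MINIMUM))
```

Because 1.0.1 is below the declared minimum, an installer that honours the requirement will pull in torch 1.1.0 again unless the downgrade is done explicitly afterwards.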

akaanirban on 13 Jun 2019

@mari-linhares, the vanilla tutorial worked flawlessly. However, I tried modifying the script to train an MNIST model where the remote device already holds the dataset, instead of sending/federating data to the workers from the central scheduler.

I am on Torch 1.0.1, where the vanilla TrainConfig tutorial works. But when I try to use a CNN model, I get the following error:

[Screenshot: TrainConfigMnist notebook, captured 2019-06-13]

Any idea why this is happening?

Thanks in advance.

akaanirban on 13 Jun 2019

Hey @akaanirban, from this part of the notebook I'm not sure what is wrong. Can you send the file so I can try to execute it?

Also, maybe @midokura-silvia can help; she's the author of this PR that uses MNIST with TrainConfig for async federated training: https://app.reviewnb.com/OpenMined/PySyft/pull/2217/files/

mari-linhares on 13 Jun 2019

@mari-linhares and @midokura-silvia I saw the PR. I am so excited about it. That was exactly what I was trying to do (though not the async part).

Now, @midokura-silvia's branch requires torch>=1.1.0 and torchvision>=0.3.0. I tried both combinations: torch 1.1.0 + torchvision 0.3, and torch 1.0.1 + torchvision 0.2.2 (because of the bug mentioned in this tracker). I get an error with both. The results of the two cases are as follows:

1. Torch 1.1.0, torchvision 0.3 ==> I get the error reported in this issue (TypeError: Cannot serialize <torch._C.Function object at 0x134a7c2b0>). In addition, compiling the loss function with torch.jit.script fails with the following error:

# Loss function 
@torch.jit.script
def loss_fn(output, target):
    return F.nll_loss(output, target)
type(loss_fn)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
~/anaconda3/envs/pysyft/lib/python3.7/site-packages/torch/jit/annotations.py in parse_type_line(type_line)
     94     try:
---> 95         arg_ann = eval(arg_ann_str, _eval_env)
     96     except (NameError, SyntaxError) as e:

<string> in <module>

NameError: name 'Optional' is not defined

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
<ipython-input-14-9fce5d7056f9> in <module>
      1 # Loss function
----> 2 @torch.jit.script
      3 def loss_fn(output, target):
      4     return F.nll_loss(output, target)
      5 

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/torch/jit/__init__.py in script(obj, optimize, _frames_up, _rcb)
    822     else:
    823         ast = get_jit_def(obj)
--> 824         fn = torch._C._jit_script_compile(ast, _rcb, get_default_args(obj))
    825         # Forward docstrings
    826         fn.__doc__ = obj.__doc__

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/torch/jit/annotations.py in get_signature(fn)
     53         return None
     54 
---> 55     return parse_type_line(type_line)
     56 
     57 

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/torch/jit/annotations.py in parse_type_line(type_line)
     95         arg_ann = eval(arg_ann_str, _eval_env)
     96     except (NameError, SyntaxError) as e:
---> 97         raise RuntimeError("Failed to parse the argument list of a type annotation: {}".format(str(e)))
     98 
     99     if not isinstance(arg_ann, tuple):

RuntimeError: Failed to parse the argument list of a type annotation: name 'Optional' is not defined

The following is the file for this configuration: Jupyter Notebook LINK. I have put it on Google Colab together with the results; the results in the notebook are from running it on my laptop.

2. Torch 1.0.1, torchvision 0.2.2 ==> To work around the problem described in the original report, I downgraded to 1.0.1, and the vanilla tutorial then works. However, an MNIST CNN throws the following error:

traced_model = torch.jit.trace(model, data)
---------------------------------------------------------------------------
PureTorchTensorFoundError                 Traceback (most recent call last)
~/anaconda3/envs/pysyft/lib/python3.7/site-packages/syft-0.1.18-py3.7.egg/syft/frameworks/torch/tensors/interpreters/native.py in handle_func_command(cls, command)
    198             new_args, new_kwargs, new_type, args_type = syft.frameworks.torch.hook_args.hook_function_args(
--> 199                 cmd, args, kwargs, return_args_type=True
    200             )

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/syft-0.1.18-py3.7.egg/syft/frameworks/torch/hook/hook_args.py in hook_function_args(attr, args, kwargs, return_args_type)
    157         # Run it
--> 158         new_args = args_hook_function(args)
    159 

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/syft-0.1.18-py3.7.egg/syft/frameworks/torch/hook/hook_args.py in <lambda>(x)
    341 
--> 342     return lambda x: f(lambdas, x)
    343 

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/syft-0.1.18-py3.7.egg/syft/frameworks/torch/hook/hook_args.py in three_fold(lambdas, args, **kwargs)
    511     return (
--> 512         lambdas[0](args[0], **kwargs),
    513         lambdas[1](args[1], **kwargs),

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/syft-0.1.18-py3.7.egg/syft/frameworks/torch/hook/hook_args.py in <lambda>(i)
    319         # Last if not, rule is probably == 1 so use type to return the right transformation.
--> 320         else lambda i: forward_func[type(i)](i)
    321         for a, r in zip(args, rules)  # And do this for all the args / rules provided

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/syft-0.1.18-py3.7.egg/syft/frameworks/torch/hook/hook_args.py in <lambda>(i)
     50     if hasattr(i, "child")
---> 51     else (_ for _ in ()).throw(PureTorchTensorFoundError),
     52     torch.nn.Parameter: lambda i: i.child

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/syft-0.1.18-py3.7.egg/syft/frameworks/torch/hook/hook_args.py in <genexpr>(.0)
     50     if hasattr(i, "child")
---> 51     else (_ for _ in ()).throw(PureTorchTensorFoundError),
     52     torch.nn.Parameter: lambda i: i.child

PureTorchTensorFoundError: 

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
<ipython-input-10-1e5257ea3eb4> in <module>
      1 # Create the trace jit version
----> 2 traced_model = torch.jit.trace(model, data)

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/torch/jit/__init__.py in trace(func, example_inputs, optimize, check_trace, check_inputs, check_tolerance, _force_outplace)
    634     var_lookup_fn = _create_interpreter_name_lookup_fn(0)
    635     module._create_method_from_trace('forward', func, example_inputs,
--> 636                                      var_lookup_fn, _force_outplace)
    637 
    638     # Check the trace against new traces created from user-specified inputs

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    485             hook(self, input)
    486         if torch._C._get_tracing_state():
--> 487             result = self._slow_forward(*input, **kwargs)
    488         else:
    489             result = self.forward(*input, **kwargs)

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/torch/nn/modules/module.py in _slow_forward(self, *input, **kwargs)
    475         tracing_state._traced_module_stack.append(self)
    476         try:
--> 477             result = self.forward(*input, **kwargs)
    478         finally:
    479             tracing_state.pop_scope()

<ipython-input-7-ca434fff0989> in forward(self, x)
     10     def forward(self, x):
     11         x = F.relu(self.conv1(x))
---> 12         x = F.max_pool2d(x, 2, 2)
     13         x = F.relu(self.conv2(x))
     14         x = F.max_pool2d(x, 2, 2)

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/syft-0.1.18-py3.7.egg/syft/frameworks/torch/hook/hook.py in overloaded_func(*args, **kwargs)
    703             cmd_name = f"{attr.__module__}.{attr.__name__}"
    704             command = (cmd_name, None, args, kwargs)
--> 705             response = TorchTensor.handle_func_command(command)
    706             return response
    707 

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/syft-0.1.18-py3.7.egg/syft/frameworks/torch/tensors/interpreters/native.py in handle_func_command(cls, command)
    224             # in the execute_command function
    225             if isinstance(args, tuple):
--> 226                 response = eval(cmd)(*args, **kwargs)
    227             else:
    228                 response = eval(cmd)(args, **kwargs)

~/anaconda3/envs/pysyft/lib/python3.7/site-packages/syft-0.1.18-py3.7.egg/syft/frameworks/torch/tensors/interpreters/native.py in <module>

AttributeError: module 'torch._jit_internal' has no attribute 'native_fn'

The following is the file related to this version. Jupyter Notebook LINK.

@midokura-silvia, since you submitted the PR, it would be very helpful if you could let me know which configuration should work (i.e. which combination of torch, torchvision, and syft versions). I tried both and got these errors.

For relevance, this is my system information:

System Version: macOS 10.14.4 (18E226)
 Kernel Version:    Darwin 18.5.0
 Boot Volume:   Coyote_HD
 Boot Mode: Normal
 syft                0.1.18
 python           Python 3.7.3 (default, Mar 27 2019, 16:54:48)

Thanks a lot. You folks are doing an excellent job here 😃 Please let me know if we can take this offline, since it makes little sense to continue this chain on a closed issue. Also, I apologise if these errors are due to some stupid mistake of mine that you can spot.

akaanirban on 14 Jun 2019

I am using torch==1.0.1 and torchvision==0.2.2.post3; torch==1.1.0 will not work. I will take a look at your issue later, Monday at the latest.
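
For anyone comparing their own environment against these versions, the following is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8+). The `KNOWN_GOOD` table only restates the versions named above; `installed_version` and `check_pins` are hypothetical helper names, and the comparison is an exact string match, so builds that report a local suffix would show up as mismatches.

```python
from importlib import metadata

# Versions reported to work in this thread.
KNOWN_GOOD = {"torch": "1.0.1", "torchvision": "0.2.2.post3"}


def installed_version(name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Map each package whose installed version differs from its pin to (found, wanted)."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (found, wanted)
    return mismatches


# Report every package that deviates from the known-good combination.
for name, (found, wanted) in check_pins(KNOWN_GOOD).items():
    print(f"{name}: found {found}, expected {wanted}")
```

The `lookup` parameter exists so the check can be exercised with a plain dictionary instead of the real environment.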

midokura-silvia on 14 Jun 2019

@midokura-silvia @mari-linhares It is working now. I installed pysyft from the pull request commit in midokura's fork. It seems there were several other changes in the pull request that were not in the dev version but were needed for the example to run, which is probably why it was not working.

It'll be great when it is merged into the master branch. Thanks a lot for your help.

akaanirban on 16 Jun 2019

Should this issue be reopened? I still get TypeError: can not serialize 'torch._C.Function' object with Torch 1.1.
Or does TrainConfig still only work with Torch 1.0.1? @midokura-silvia @mari-linhares

Thanks!