```
import numpy as np
import torch
import torch.utils.data as data
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
from torchvision.datasets.mnist import MNIST
import pdb
import syft as sy


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5, 1)
        self.conv2 = nn.Conv2d(20, 50, 5, 1)
        # 28x28 input -> 4x4 feature maps with 50 channels after two conv/pool stages
        self.fc1 = nn.Linear(4 * 4 * 50, 500)
        self.fc2 = nn.Linear(500, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = x.view(-1, 4 * 4 * 50)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)


# Hooking torch before moving the model to CUDA triggers the error reported below
hook = sy.TorchHook(torch)
device = torch.device("cuda")
model = Net().to(device)
print(model)
```
Can you add the stacktrace?
One good-first-issue would be the CPU-only case: if you have only a CPU and call .to(device), the resulting error is due solely to us not serializing devices. This could be handled easily.
_The part with the gpu and cuda is not part of the good-first-issue._
Traceback (most recent call last):
  File "test.py", line 37, in <module>
    model = Net().to(device)
  File "/network/home/maloneyj/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 381, in to
    return self._apply(convert)
  File "/network/home/maloneyj/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 187, in _apply
    module._apply(fn)
  File "/network/home/maloneyj/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 193, in _apply
    param.data = fn(param.data)
  File "/network/home/maloneyj/PySyft/syft/frameworks/torch/hook.py", line 339, in data
    self.native_param_data.set_(new_data) # .wrap()
RuntimeError: Expected object of backend CPU but got backend CUDA for argument #2 'source'
In data(), new_data is on CUDA whereas native_param_data is on the CPU.
The fix should be to move new_data to the CPU before setting native_param_data.
Please correct me if I'm wrong.
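A minimal sketch of the fix proposed above, written in plain PyTorch rather than against the actual hook.py code. The helper name `set_param_data` is hypothetical; it moves `new_data` onto whatever device `native_param_data` already lives on before calling `set_()`. Note that this only avoids the backend mismatch: the parameter stays on its original device, so it does not by itself move the model to CUDA.

```python
import torch


def set_param_data(native_param_data, new_data):
    """Hypothetical helper (not PySyft's API): align devices, then call set_()."""
    if new_data.device != native_param_data.device:
        # Copy the incoming data onto the device of the existing parameter data
        new_data = new_data.to(native_param_data.device)
    # set_() makes native_param_data share new_data's storage, size and strides
    native_param_data.set_(new_data)
    return native_param_data


native = torch.zeros(3)           # stands in for native_param_data (on the CPU)
incoming = torch.ones(2, 2)
if torch.cuda.is_available():
    incoming = incoming.cuda()    # stands in for a CUDA new_data
set_param_data(native, incoming)
```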
FYI, for the code snippet above I'm able to fix the error by adding this line after importing torch:
torch.set_default_tensor_type(torch.cuda.FloatTensor)
Not sure if this is ideal for every case...
@mari-linhares nice finding. This changes the default tensor type and should be considered a workaround for now. What we are truly looking for is to do data transfer over CUDA.
@bhushan23 do you know why changing the tensor to FloatTensor works? I'm not using my work computer right now, but I assume since there's no error that the data is transferred over to cuda. I'll check when I get home.
@mari-linhares
Try replacing torch.set_default_tensor_type(torch.cuda.FloatTensor) with torch.set_default_tensor_type(torch.FloatTensor) and you will be able to reproduce the original error.
set_default_tensor_type sets the type used for newly created tensors, and I think setting it to the CUDA float tensor ensures that the hook also uses torch.cuda.FloatTensor; hence there is no error, as both new_data and native_param_data are on CUDA.
Ref: https://pytorch.org/docs/stable/torch.html#torch.set_default_tensor_type
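To illustrate the explanation above: the default tensor type decides where tensors created without an explicit device are allocated. The sketch below guards the call so that it also runs on a CPU-only machine; it says nothing about how the hook behaves internally.

```python
import torch

# Only switch the default to CUDA when a GPU is actually present
if torch.cuda.is_available():
    torch.set_default_tensor_type(torch.cuda.FloatTensor)
else:
    torch.set_default_tensor_type(torch.FloatTensor)

# Tensors created without an explicit device follow the default type
t = torch.zeros(2, 3)
print(t.type())
```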
This issue is breaking tutorial 8 at the moment when used with CUDA.
[Python 3.7, PySyft from master branch]
I've tried to apply @mari-linhares's workaround, without success so far:
IndexError: list index out of range
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
/home/xxx/PySyft/venv/lib/python3.7/site-packages/syft-0.1.14a1-py3.7.egg/syft/frameworks/torch/hook/hook_args.py in hook_function_args(attr, args, kwargs, return_args_type)
135 # TODO rename registry or use another one than for methods
--> 136 hook_args = hook_method_args_functions[attr]
137 get_tensor_type_function = get_tensor_type_functions[attr]
KeyError: 'torch.set_default_tensor_type'
During handling of the above exception, another exception occurred:
IndexError Traceback (most recent call last)
<ipython-input-6-fb50db14b868> in <module>
3 if device.type == "cuda":
4 os.environ["CUDA_VISIBLE_DEVICES"] = "0"
----> 5 torch.set_default_tensor_type(torch.cuda.FloatTensor)
/home/xxx/PySyft/venv/lib/python3.7/site-packages/syft-0.1.14a1-py3.7.egg/syft/frameworks/torch/hook/hook.py in overloaded_func(*args, **kwargs)
691 cmd_name = f"{attr.__module__}.{attr.__name__}"
692 command = (cmd_name, None, args, kwargs)
--> 693 response = TorchTensor.handle_func_command(command)
694 return response
695
/home/xxx/PySyft/venv/lib/python3.7/site-packages/syft-0.1.14a1-py3.7.egg/syft/frameworks/torch/tensors/interpreters/native.py in handle_func_command(cls, command)
191 # Note that we return also args_type which helps handling case 3 in the docstring
192 new_args, new_kwargs, new_type, args_type = syft.frameworks.torch.hook_args.hook_function_args(
--> 193 cmd, args, kwargs, return_args_type=True
194 )
195 # This handles case 3: it redirects the command to the appropriate class depending
/home/xxx/PySyft/venv/lib/python3.7/site-packages/syft-0.1.14a1-py3.7.egg/syft/frameworks/torch/hook/hook_args.py in hook_function_args(attr, args, kwargs, return_args_type)
141 except (IndexError, KeyError, AssertionError): # Update the function in case of an error
142 args_hook_function, get_tensor_type_function = build_hook_args_function(
--> 143 args, return_tuple=True
144 )
145 # Store the utility functions in registries
/home/xxx/PySyft/venv/lib/python3.7/site-packages/syft-0.1.14a1-py3.7.egg/syft/frameworks/torch/hook/hook_args.py in build_hook_args_function(args, return_tuple)
171 # Build a function with this rule to efficiently the child type of the
172 # tensor found in the args
--> 173 get_tensor_type_function = build_get_tensor_type(rule)
174 return args_hook_function, get_tensor_type_function
175
/home/xxx/PySyft/venv/lib/python3.7/site-packages/syft-0.1.14a1-py3.7.egg/syft/frameworks/torch/hook/hook_args.py in build_get_tensor_type(rules, layer)
392
393 if first_layer:
--> 394 return lambdas[0]
395 else:
396 return lambdas
IndexError: list index out of range
import torch, data distribution fails with RuntimeError: expected type torch.FloatTensor but got torch.cuda.FloatTensor
RuntimeError Traceback (most recent call last)
<ipython-input-8-4085cd6569bc> in <module>
5 transforms.Normalize((0.1307,), (0.3081,))
6 ]))
----> 7 .federate((bob, alice)), # <-- NEW: we distribute the dataset across all the workers, it's now a FederatedDataset
8 batch_size=args.batch_size, shuffle=True, **kwargs)
9
/home/xxx/PySyft/venv/lib/python3.7/site-packages/syft-0.1.14a1-py3.7.egg/syft/frameworks/torch/federated/dataset.py in dataset_federate(dataset, workers)
89 datasets = []
90 data_loader = torch.utils.data.DataLoader(dataset, batch_size=data_size)
---> 91 for dataset_idx, (data, targets) in enumerate(data_loader):
92 worker = workers[dataset_idx % len(workers)]
93 logger.debug("Sending data to worker %s", worker.id)
/home/xxx/PySyft/venv/lib/python3.7/site-packages/torch/utils/data/dataloader.py in __next__(self)
613 if self.num_workers == 0: # same-process loading
614 indices = next(self.sample_iter) # may raise StopIteration
--> 615 batch = self.collate_fn([self.dataset[i] for i in indices])
616 if self.pin_memory:
617 batch = pin_memory_batch(batch)
/home/xxx/PySyft/venv/lib/python3.7/site-packages/torch/utils/data/dataloader.py in <listcomp>(.0)
613 if self.num_workers == 0: # same-process loading
614 indices = next(self.sample_iter) # may raise StopIteration
--> 615 batch = self.collate_fn([self.dataset[i] for i in indices])
616 if self.pin_memory:
617 batch = pin_memory_batch(batch)
/home/xxx/PySyft/venv/lib/python3.7/site-packages/torchvision/datasets/mnist.py in __getitem__(self, index)
93
94 if self.transform is not None:
---> 95 img = self.transform(img)
96
97 if self.target_transform is not None:
/home/xxx/PySyft/venv/lib/python3.7/site-packages/torchvision/transforms/transforms.py in __call__(self, img)
58 def __call__(self, img):
59 for t in self.transforms:
---> 60 img = t(img)
61 return img
62
/home/xxx/PySyft/venv/lib/python3.7/site-packages/torchvision/transforms/transforms.py in __call__(self, tensor)
161 Tensor: Normalized Tensor image.
162 """
--> 163 return F.normalize(tensor, self.mean, self.std, self.inplace)
164
165 def __repr__(self):
/home/xxx/PySyft/venv/lib/python3.7/site-packages/torchvision/transforms/functional.py in normalize(tensor, mean, std, inplace)
206 mean = torch.tensor(mean, dtype=torch.float32)
207 std = torch.tensor(std, dtype=torch.float32)
--> 208 tensor.sub_(mean[:, None, None]).div_(std[:, None, None])
209 return tensor
210
/home/xxx/PySyft/venv/lib/python3.7/site-packages/syft-0.1.14a1-py3.7.egg/syft/frameworks/torch/hook/hook.py in overloaded_native_method(self, *args, **kwargs)
637 except BaseException as e:
638 # we can make some errors more descriptive with this method
--> 639 raise route_method_exception(e, self, args, kwargs)
640
641 else: # means that there is a wrapper to remove
/home/xxx/PySyft/venv/lib/python3.7/site-packages/syft-0.1.14a1-py3.7.egg/syft/frameworks/torch/hook/hook.py in overloaded_native_method(self, *args, **kwargs)
631 try:
632 if isinstance(args, tuple):
--> 633 response = method(*args, **kwargs)
634 else:
635 response = method(args, **kwargs)
RuntimeError: expected type torch.FloatTensor but got torch.cuda.FloatTensor
This might also break other tutorials when using GPUs, have not tried yet.
_Disclaimer: I needed this running (asap) to train a large-ish dataset on GPU, so it's more like a hacky workaround than an actual solution, but I'll be happy if it helps anyone with the same problem. Or can be used to build up a proper fix._
So I was experiencing the same problem as @jopasserat. I think the problem was that the function that handles function commands in hooks.py needed to convert a _non-tensor_ to a _torch.tensor_, but this case was not handled. So basically I am catching the IndexError that happens in those cases.
Apart from that I'm using torch.set_default_tensor_type(torch.cuda.FloatTensor). I tried to convert the non-CUDA tensors to CUDA, but for some reason the .to() method of the wrapped tensors doesn't seem to work for me. So unless I set it as the default I get the problem @bhushan23 was mentioning. Plus, I need to set num_workers=0 (I think the method has been overwritten so it does not run anymore) and pin_memory=False (as that will only work with dense CPU tensors).
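The DataLoader settings mentioned above, as a minimal sketch. The TensorDataset filled with random data is only a placeholder for the real dataset, and the batch size is arbitrary.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder data standing in for the real dataset
features = torch.randn(8, 1, 28, 28)
labels = torch.randint(0, 10, (8,))
dataset = TensorDataset(features, labels)

loader = DataLoader(
    dataset,
    batch_size=4,
    num_workers=0,     # load in the main process, no worker subprocesses
    pin_memory=False,  # pinning only applies to dense CPU tensors
)

batch_x, batch_y = next(iter(loader))
```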
I am so glad to see this, it really solved my problem of worrying all afternoon
I found a workaround that worked for my needs. The trick is to hook pysyft after you move your model to cuda.
What do you mean? Hook the worker? But if you hook after these, how can you make a federated dataset? For example, in the tutorial example > advanced > CIFAR10. Can you post your code?