The program fails to create an iterator for a DataLoader object when the dataset is LSUN and the number of workers is greater than zero. I do not get this error when working with other datasets. I suspect the issue is caused by lmdb. I am running on Windows 10 with CUDA 10.
Code:
import torch.utils.data
import torchvision.datasets as dset
import torchvision.transforms as transforms
dataset = dset.LSUN(root='D:/bedroom_train_lmdb', classes=['bedroom_train'],
                    transform=transforms.Compose([
                        transforms.Resize((64, 64)),
                        transforms.ToTensor(),
                        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
                    ]))
dataloader = torch.utils.data.DataLoader(dataset, batch_size=128,
                                         shuffle=True, num_workers=4)
for data in dataloader:
    print(data)
Error:
Traceback (most recent call last):
  File "C:/Users/x/.PyCharm2018.3/config/scratches/scratch.py", line 15, in <module>
    for data in dataloader:
  File "C:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 819, in __iter__
    return _DataLoaderIter(self)
  File "C:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 560, in __init__
    w.start()
  File "C:\Anaconda3\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "C:\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle Environment objects
This seems to be a Windows-specific issue.
But note that even if we address this particular issue (I have no idea how to do it though), you would probably hit another issue further on, which is https://github.com/pytorch/vision/issues/619
This issue also appears on Linux. The reason is that an opened lmdb env cannot be pickled.
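A minimal sketch of why worker start-up fails, using only the standard library. A `threading.Lock` stands in for the open lmdb `Environment` (an assumption for illustration: both are handles that pickle refuses to serialize), and the dicts play the role of the dataset's `__dict__`. The helper name `is_picklable` is hypothetical.

```python
import pickle
import threading


def is_picklable(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


# Stand-in for a dataset's attributes while it holds an open handle.
state_with_handle = {"db_path": "bedroom_train_lmdb", "env": threading.Lock()}

# The same attributes with the handle dropped before pickling.
state_without_handle = {"db_path": "bedroom_train_lmdb", "env": None}

print(is_picklable(state_with_handle))     # False
print(is_picklable(state_without_handle))  # True
```

With the spawn start method the dataset has to be pickled to reach each worker, so the first case corresponds to the TypeError in the traceback above.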
@Santiago810 Do you know how to diagnose the issue of an un-pickleable lmdb env?
I have the same issue with DataLoader when I am not using an lmdb dataset.
I think this is a limitation of LMDB in Python (and of LSUN, which uses LMDB internally), and unfortunately I don't think there is much we can do on the torchvision side.
I implemented my own LMDB dataset and had the same issue when using LMDB with num_workers > 0 and torch multiprocessing set to spawn.
It is very similar to this project's LSUN implementation. In my case the issue was with this line:
https://github.com/pytorch/vision/blob/master/torchvision/datasets/lsun.py#L18
When the start method is fork it works fine, but with spawn it seems to try to pickle the dataset object, whose self.env attribute is an lmdb Environment.
The workaround is to use the environment in __init__ and discard the reference, then open it again in __getitem__ and store that reference on the instance.
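The workaround described in this comment could look roughly like the sketch below. `LazyLMDBDataset` and `_open_env` are hypothetical names, not torchvision's LSUN code, and the class returns raw bytes without decoding images. A real dataset would normally subclass `torch.utils.data.Dataset`; the base class is left out here only so the sketch has no torch dependency.

```python
class LazyLMDBDataset:
    """Map-style dataset that opens its LMDB environment on first access.

    Hypothetical class for illustration; it is not torchvision's LSUN code.
    """

    def __init__(self, db_path):
        self.db_path = db_path
        self.env = None  # no open handle is kept after __init__
        # Open the environment only long enough to list the keys.
        env = self._open_env()
        with env.begin(write=False) as txn:
            self.keys = list(txn.cursor().iternext(keys=True, values=False))
        env.close()

    def _open_env(self):
        import lmdb  # imported lazily; requires the lmdb package
        return lmdb.open(self.db_path, readonly=True, lock=False,
                         readahead=False, meminit=False)

    def __len__(self):
        return len(self.keys)

    def __getitem__(self, index):
        if self.env is None:
            # First access in this process, e.g. inside a DataLoader worker.
            self.env = self._open_env()
        with self.env.begin(write=False) as txn:
            return txn.get(self.keys[index])
```

Because `self.env` is still `None` when the DataLoader starts its workers, the object can be pickled, and each worker opens its own environment on its first `__getitem__` call. One caveat: reading an item in the parent process first sets `self.env` there, after which the object can no longer be pickled.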
@4knahs if you think you could send a PR fixing the LSUN implementation it would be great!
I saw a solution somewhere else by adding __getstate__ and __setstate__.
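import os
import lmdb

# Methods to add to the dataset class.
def __getstate__(self):
    # Copy the dict so the live object keeps its open transaction.
    state = self.__dict__.copy()
    state["db_txn"] = None  # the transaction itself cannot be pickled
    return state

def __setstate__(self, state):
    self.__dict__ = state
    # Reopen the environment in the process that unpickles the dataset.
    env = lmdb.open(self.db_path, subdir=os.path.isdir(self.db_path),
                    readonly=True, lock=False,
                    readahead=False, meminit=False,
                    map_size=1099511627776 * 2)
    self.db_txn = env.begin(write=False)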
This also doesn't save self.env; it saves only the txn, which is dropped before pickling and reopened after unpickling.
Solution: open lmdb in worker_init_fn of torch.utils.data.DataLoader
Could you elaborate or give an example @Santiago810 ?
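One way the `worker_init_fn` suggestion could be applied is sketched below. It assumes a PyTorch version that provides `torch.utils.data.get_worker_info`, which may be newer than the version in the traceback above. `WorkerOpenedLMDBDataset`, `open_env`, `open_lmdb_in_worker` and `build_loader` are hypothetical names, and the caller is assumed to supply the list of keys.

```python
class WorkerOpenedLMDBDataset:
    """Dataset whose LMDB environment is opened by worker_init_fn.

    Hypothetical class for illustration; keys are passed in by the caller.
    """

    def __init__(self, db_path, keys):
        self.db_path = db_path
        self.keys = list(keys)
        self.env = None  # stays None in the parent so the object pickles

    def open_env(self):
        import lmdb  # imported lazily; requires the lmdb package
        self.env = lmdb.open(self.db_path, readonly=True, lock=False,
                             readahead=False, meminit=False)

    def __len__(self):
        return len(self.keys)

    def __getitem__(self, index):
        if self.env is None:
            raise RuntimeError("call open_env() before reading items")
        with self.env.begin(write=False) as txn:
            return txn.get(self.keys[index])


def open_lmdb_in_worker(worker_id):
    """worker_init_fn: runs once inside each worker process."""
    import torch.utils.data
    # get_worker_info().dataset is this worker's own copy of the dataset.
    torch.utils.data.get_worker_info().dataset.open_env()


def build_loader(dataset, num_workers=4):
    """Build a DataLoader that opens LMDB in each worker, not in the parent."""
    import torch.utils.data
    return torch.utils.data.DataLoader(dataset, batch_size=128, shuffle=True,
                                       num_workers=num_workers,
                                       worker_init_fn=open_lmdb_in_worker)
```

With `num_workers=0` the DataLoader never calls `worker_init_fn`, so in that case `dataset.open_env()` has to be called directly before iterating.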