cd examples/text-classification
./run_pl.sh
Trains for 1 epoch, runs 12 of 13 test steps, then fails with the following traceback:
/home/shleifer/.conda/envs/nb/lib/python3.7/site-packages/pytorch_lightning/utilities/warnings.py:18: UserWarning: The dataloader, test dataloader 0, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` in the `DataLoader` init to improve performance.
warnings.warn(*args, **kwargs)
Testing: 92%|████████████████████████████████████ | 12/13 [00:00<00:00, 14.67it/s]
Traceback (most recent call last):
File "run_pl_glue.py", line 195, in <module>
trainer.test(model)
File "/home/shleifer/.conda/envs/nb/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 894, in test
self.fit(model)
File "/home/shleifer/.conda/envs/nb/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 704, in fit
self.single_gpu_train(model)
File "/home/shleifer/.conda/envs/nb/lib/python3.7/site-packages/pytorch_lightning/trainer/distrib_parts.py", line 477, in single_gpu_train
self.run_pretrain_routine(model)
File "/home/shleifer/.conda/envs/nb/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 819, in run_pretrain_routine
self.run_evaluation(test_mode=True)
File "/home/shleifer/.conda/envs/nb/lib/python3.7/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 374, in run_evaluation
eval_results)
File "/home/shleifer/.conda/envs/nb/lib/python3.7/site-packages/pytorch_lightning/trainer/logging.py", line 107, in process_output
for k, v in output.items():
AttributeError: 'NoneType' object has no attribute 'items'
I encountered this bug
Removing the `test_end` method in `BaseTransformer` (file: https://github.com/huggingface/transformers/blob/master/examples/lightning_base.py) solved the issue for me.
From my understanding, `test_end` is called at the end of testing and just returns `None`, which is why we get the error.
`test_end` is now deprecated in favor of `test_epoch_end`. `run_pl_glue.py` defines `test_epoch_end`, whereas `lightning_base.py` defines `test_end`. When both are present, I think only `test_end` is called. If we remove it, `test_epoch_end` is called instead and gives the correct result.
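To make the explanation above concrete, here is a minimal plain-Python sketch of the suspected mechanism. It does not import PyTorch Lightning: `LegacyBase`, `GlueModel`, `FixedGlueModel`, and `run_test_hooks` are hypothetical stand-ins, and the dispatch order (the deprecated hook wins when both exist) is an assumption taken from the comment, not checked against the Lightning source.

```python
def average_loss(outputs):
    """Average the per-step losses collected during testing."""
    return sum(o["loss"] for o in outputs) / len(outputs)


class LegacyBase:
    """Hypothetical stand-in for a base class that still defines test_end."""

    def test_end(self, outputs):
        # Like the hook described above, this returns nothing.
        return None


class GlueModel(LegacyBase):
    """Defines test_epoch_end but inherits the deprecated test_end."""

    def test_epoch_end(self, outputs):
        return {"avg_test_loss": average_loss(outputs)}


class FixedGlueModel:
    """Same test_epoch_end, with no test_end anywhere in the hierarchy."""

    def test_epoch_end(self, outputs):
        return {"avg_test_loss": average_loss(outputs)}


def run_test_hooks(model, outputs):
    """Assumed dispatch: prefer the deprecated hook when it is present."""
    if hasattr(model, "test_end"):
        result = model.test_end(outputs)
    else:
        result = model.test_epoch_end(outputs)
    # Same access pattern as the failing line in the traceback.
    return {k: v for k, v in result.items()}


sample_outputs = [{"loss": 0.25}, {"loss": 0.75}]

if __name__ == "__main__":
    try:
        run_test_hooks(GlueModel(), sample_outputs)
    except AttributeError as err:
        print("with test_end:", err)
    print("without test_end:", run_test_hooks(FixedGlueModel(), sample_outputs))
```

Under these assumptions, the model that inherits `test_end` fails with the same `AttributeError` as in the traceback, while the one without it returns the metrics dictionary from `test_epoch_end`.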
@divkakwani Thank you! Perfect!
@divkakwani Nice catch! I will remove `test_end` from `BaseTransformer` when I submit the PR for #4494.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
It looks like this particular issue has already been fixed: `test_end` is no longer in master. The fix did not come from the PR suggested in https://github.com/huggingface/transformers/issues/4214#issuecomment-632988310 but from elsewhere, so this issue can be closed.