Pytorch-lightning: No progress bar when training on Google Colab

Created on 10 Mar 2020 · 20 Comments · Source: PyTorchLightning/pytorch-lightning

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

  1. Go to https://colab.research.google.com/drive/1W-_30tbOBMz_t0_yozzwJzlcu6m3xd8W
  2. Run the Trainer section of the MNIST
  3. It downloads the MNIST dataset and keeps spinning for a while, and that's it: no progress bar or anything.

Environment

Google Colab, with current github version of pytorch-lightning installed.
PyTorch version: 1.4.0
Is debug build: No
CUDA used to build PyTorch: 10.1

OS: Ubuntu 18.04.3 LTS
GCC version: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
CMake version: version 3.12.0

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 10.0.130
GPU models and configuration: GPU 0: Tesla K80
Nvidia driver version: 418.67
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5

Versions of relevant libraries:
[pip3] numpy==1.17.5
[pip3] pytorch-lightning==0.7.1
[pip3] torch==1.4.0
[pip3] torchsummary==1.5.1
[pip3] torchtext==0.3.1
[pip3] torchvision==0.5.0
[conda] Could not collect

bug / fix help wanted

All 20 comments

Hi! Thanks for your contribution! Great first issue!

we test this very rigorously... look at the docs and the MNIST example then check your code.

I'm also having issues with the progress bar. Instead of a progress bar, I got

HBox(children=(FloatProgress(value=1.0, bar_style='info', layout=Layout(flex='2'), max=1.0), HTML(value='')), …

This happened to me only when using TPUs with num_tpu_cores=8 (1 TPU core works just as expected). Interestingly, it is just the epoch progress bar; the validation progress bar shows as intended.
To reproduce:

  1. Open the mnist tpu notebook
  2. Factory reset to clear all saved states
  3. Run all

@williamFalcon I literally took the "MNIST on TPU" example from the docs page and ran it on Google Colab, and it showed no progress bar or anything.

I am having the same issue on Sagemaker too. No Progress bar, just the HBox(children=(FloatProgress(value=1.0, bar_style='info', layout=Layout(flex='2'), max=1.0), HTML(value='')) text.

@jwallat and @lezwon I suspect the issue happens upstream, in tqdm itself. It seems to be related to the cancelling (or failing) of a running operation. They suggest the following workaround:

https://github.com/tqdm/tqdm/issues/548#issuecomment-457291936

We could implement that in Lightning, but other issues such as this one make me pessimistic about it working.

Any ideas?

@luiscape I have another notebook (without PL) wherein I use tqdm. The progress bar seems to be working fine there. Not sure why it isn't working with PL.
Used it in the following way:
for bi, data in tqdm(enumerate(data_loader), total=int(len(dataset) / data_loader.batch_size)):
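A side note on the `total` argument in the loop above: `int(len(dataset) / data_loader.batch_size)` rounds down, so the bar can end one step short when the last batch is partial. The following is a minimal sketch of the batch count, assuming a map-style dataset and PyTorch's default batching behaviour; the helper name `num_batches` is hypothetical and not part of any library.

```python
import math


def num_batches(dataset_len: int, batch_size: int, drop_last: bool = False) -> int:
    """Number of batches a loader yields for a dataset of the given length.

    Hypothetical helper: with drop_last=False the final partial batch is
    still yielded, so the count rounds up rather than down.
    """
    if drop_last:
        return dataset_len // batch_size
    return math.ceil(dataset_len / batch_size)


# 10 samples in batches of 4 give batches of sizes 4, 4 and 2.
print(num_batches(10, 4))        # counts the partial batch
print(num_batches(10, 4, True))  # partial batch dropped
```

In practice, passing `total=len(data_loader)` should give the same number without any manual arithmetic.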

Check the following fix, it should help: #1093

@Borda I tried it on NextJournal, it still shows HBox(children=(FloatProgress(value=0.0, description='Validating', layout=Layout(flex='2'), max=157.0, style=Pr…


I don't really know what this is either. This seems to be a Colab thing.
Just restart the environment.

I see this issue when:

  1. I'm training
  2. Colab times out, or I stop execution or something
  3. I then restart training.

It goes away when I reset the environment.

This is not a Lightning issue though... it might just be tqdm or Colab.

@williamFalcon I have tried this on SageMaker and Nextjournal. tqdm works fine if I run it myself. When using Lightning, though, it shows the HBox text. I have tried restarting the kernel, upgrading tqdm, etc. None of it seems to work.

This seems to be a tqdm issue.
If I do from tqdm import tqdm, it seems to work fine. Lightning, however, imports tqdm via from tqdm.auto import tqdm, which in turn pulls in the notebook version via from .notebook import tqdm, trange. When I run tqdm via the notebook import, I get the HBox text.
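The two import routes described in this comment can be tried side by side. This is a minimal sketch, not Lightning's actual code: the wrapper name `progress` and its `notebook` flag are hypothetical, and only the two tqdm import paths come from the comment. It assumes that a missing tqdm or ipywidgets surfaces as an `ImportError`.

```python
def progress(iterable, notebook=False, **kwargs):
    """Wrap an iterable in a tqdm bar, choosing the import route explicitly.

    Hypothetical helper for comparing the plain-text bar with the
    widget-based notebook bar.
    """
    try:
        if notebook:
            # Widget-based bar; this is the route that can print raw
            # "HBox(...)" text when the frontend cannot render widgets.
            from tqdm.notebook import tqdm
        else:
            # Plain-text bar written to stderr.
            from tqdm import tqdm
        return tqdm(iterable, **kwargs)
    except ImportError:
        # tqdm (or ipywidgets, for the notebook bar) is not available:
        # fall back to iterating without a progress bar.
        return iterable


for _ in progress(range(3), desc="plain"):
    pass
```

Running the loop once with `notebook=False` and once with `notebook=True` in the affected environment should show whether the widget route alone produces the HBox text.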

Even though I'm using from tqdm import tqdm, I'm facing the same issue. Are there any other suggestions from anyone?
Thanks in advance😊

Mind upgrading to 0.9 and trying again? :]

Hi, I am facing the same problem, no progress bar on Jupyter Lab.

HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validation sanity check', layout=Layout…

I am using pytorch_lightning 1.0.2;
Jupyter Lab 2.2.6;
Torch: 1.6.0;
Python: 3.6;

@czrcbl are you using sagemaker? or running jupyter lab locally?

I am running the Jupyter Lab in an Amazon ECS instance and connecting to it through ssh port forwarding.

I think this is because the ipywidgets notebook extension has not been enabled. Can you execute the instructions mentioned here before installing lightning? Let us know if it works :)
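As a first check for this diagnosis, it may help to confirm that the ipywidgets Python package is importable in the kernel at all. This is a minimal sketch using only the standard library; the function names are hypothetical. Note that it only covers the Python side: the package can be importable while the notebook frontend extension is still disabled.

```python
import importlib.util


def widgets_available() -> bool:
    """Return True if the ipywidgets package can be found by the interpreter."""
    return importlib.util.find_spec("ipywidgets") is not None


def diagnose() -> str:
    """Return a short, human-readable hint (hypothetical helper)."""
    if widgets_available():
        return ("ipywidgets is importable; if the bar still shows as HBox text, "
                "the frontend extension may not be enabled")
    return "ipywidgets is not installed in this kernel's environment"


print(diagnose())
```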

This worked for me
Thanks @lezwon
