Ignite: Rerun tutorials on Colab

Created on 5 Nov 2020 · 7 Comments · Source: pytorch/ignite

🚀 Feature

We currently offer our users the following list of tutorials that can be run on Google Colab:

https://github.com/pytorch/ignite#tutorials

The idea is to ensure that they run without exceptions caused by API changes in the latest release (v0.4.2) and that they do not use any deprecated API.
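One way to check a rerun notebook for both kinds of problem is to scan its saved outputs after downloading it from Colab. The sketch below is only an illustration, not part of the Ignite tooling: the helper name `find_problems` is hypothetical, and it assumes the notebook was saved with its outputs in the standard nbformat v4 JSON layout. It reports cells that raised an exception and stderr lines that mention a deprecation.

```python
import json


def find_problems(notebook_path):
    """Return (cell_index, kind, message) tuples for a saved .ipynb file.

    Hypothetical helper: flags error outputs and stderr lines that
    mention a deprecation, assuming the nbformat v4 JSON layout.
    """
    with open(notebook_path, encoding="utf-8") as f:
        nb = json.load(f)

    problems = []
    for index, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        for output in cell.get("outputs", []):
            kind = output.get("output_type")
            if kind == "error":
                # The cell raised an exception when it was executed.
                message = f"{output.get('ename')}: {output.get('evalue')}"
                problems.append((index, "error", message))
            elif kind == "stream" and output.get("name") == "stderr":
                text = output.get("text", "")
                if isinstance(text, list):
                    text = "".join(text)
                for line in text.splitlines():
                    # Python warnings are printed to stderr.
                    if "deprecat" in line.lower():
                        problems.append((index, "deprecation", line.strip()))
    return problems
```

Calling `find_problems("some_tutorial.ipynb")` on a downloaded copy returns an empty list when nothing was flagged. A warning that was suppressed or never triggered will not show up, so this complements reading the notebook rather than replacing it.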

Note:
The CycleGAN training tutorials (Training Cycle-GAN on Horses to Zebras with Nvidia/Apex, Another training Cycle-GAN on Horses to Zebras with Native Torch CUDA AMP) can take more than 12h. It is sufficient to verify that training can complete a single epoch.

For PyDataGlobal contributors: feel free to ask questions if any details are unclear, and say so if you would like to tackle the issue.
Please take a look at the CONTRIBUTING guide. This issue can be assigned to multiple people.

enhancement good first issue help wanted


vfdev-5

All 7 comments

Take

abdulelahsm on 15 Nov 2020

Hey @vfdev-5, it's taking me more than 6 hours to train 200 epochs of CycleGAN_with_ignite_and_torch_cuda_amp.ipynb. Maybe we should consider reducing it to 20 or 50. I think 200 epochs is a bit too much for a tutorial.

abdulelahsm on 15 Nov 2020

Running for 20 or 50 epochs won't give a proper model that correctly translates one image to another. However, our idea for the tutorial is to reproduce the same training as the original CycleGAN. Maybe we can add a note that it could run for more than 12h. For this issue, it is sufficient to run for 2-5 epochs as a smoke test.
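A minimal sketch of how a notebook could keep the full 200-epoch training as the default while allowing a short smoke-test run. The `SMOKE_TEST` environment variable and the `resolve_max_epochs` helper are assumptions made for illustration. They are not part of the existing notebooks.

```python
import os

FULL_EPOCHS = 200  # full training length discussed in this thread
SMOKE_EPOCHS = 2   # lower end of the 2-5 epochs suggested above


def resolve_max_epochs(env=None):
    """Return the smoke-test epoch count when SMOKE_TEST is truthy.

    SMOKE_TEST is a hypothetical switch, not an Ignite or Colab setting.
    """
    env = os.environ if env is None else env
    flag = env.get("SMOKE_TEST", "").strip().lower()
    return SMOKE_EPOCHS if flag in {"1", "true", "yes"} else FULL_EPOCHS


max_epochs = resolve_max_epochs()
print(f"Training for {max_epochs} epochs")
```

The result would then be passed to the engine, for example `trainer.run(train_loader, max_epochs=max_epochs)`, where `trainer` and `train_loader` stand for whatever engine and data loader the notebook defines.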

vfdev-5 on 16 Nov 2020

@vfdev-5 great approach

abdulelahsm on 16 Nov 2020

👍1

Hey, I'm facing some issues with TensorBoard in the Cloud TPU notebook. I think nothing is getting uploaded to "/tmp/tb_logs".

I'm not familiar with TensorBoard, so I'll try to learn it in order to fix this, which might take me until this weekend.
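A quick way to confirm whether anything was written is to look for TensorBoard event files, whose names start with `events.out.tfevents.`, under the log directory. This is only a diagnostic sketch: the helper name `list_event_files` is hypothetical, and the default path is the one quoted above.

```python
from pathlib import Path


def list_event_files(log_dir="/tmp/tb_logs"):
    """Return TensorBoard event files found under log_dir, searched recursively."""
    root = Path(log_dir)
    if not root.is_dir():
        # The directory was never created, so no logger wrote to it.
        return []
    return sorted(p for p in root.rglob("events.out.tfevents.*") if p.is_file())


for path in list_event_files():
    print(path, path.stat().st_size, "bytes")
```

If this prints nothing, the logger never created any files. If it prints files that stay very small during training, the logger was created but few or no scalars were written to it.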

abdulelahsm on 18 Nov 2020

@abdulelahsm thanks for the feedback! Which TPU notebook are you running? Let me also check it on my side to understand the issue.

vfdev-5 on 18 Nov 2020

@vfdev-5 examples/notebooks/MNIST_TPU.ipynb

abdulelahsm on 18 Nov 2020

👍1
