Pytorch-lightning: Model summarize displayed twice before training starts

Created on 11 Sep 2020 · 15Comments · Source: PyTorchLightning/pytorch-lightning

🐛 Bug

Model summarize is called in two spots, which results in duplication in the logs:

It's called once in the training loop: https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pytorch_lightning/trainer/training_loop.py#L160

And I'm unsure how it's called again

119083522_2815247905426328_3914141584028152470_n

The first time is with the logger.info(...) since it's formatted. The second time it appears to be logged with print(...) since the table formatting doesn't handle newlines

To Reproduce

Run any training with weights_summary set on the trainer

Expected behavior

We only call summarize once

bug / fix good first issue help wanted

Source

ananthsub

👍1

Most helpful comment

Tested, I am good on my side

JVGD on 1 Oct 2020

🚀1 👍1

All 15 comments

Tried on master. Working fine. Is it happening with ddp backend only?

rohitgr7 on 12 Sep 2020

Yes this happens with DDP backend

ananthsub on 12 Sep 2020

on my windows machine, single gpu, happens too. but the logs also appear twice, including all warnings.

TPU available: False, using: 0 TPU cores
TPU available: False, using: 0 TPU cores
CUDA_VISIBLE_DEVICES: [0]
CUDA_VISIBLE_DEVICES: [0]

¨
which means it is logging related, not model summary.
I will investigate

awaelchli on 12 Sep 2020

👍2

Just started migration of our codebase to Lightning and the first trainer I write have this issue... Could this be something related with duplicated handlers in logger? It might be that the logger.getLogger() which is a singleton is called twice, so it might be that 2 handlers are initialized, the root handler and another stream handler

This is my case:

JVGD on 22 Sep 2020

yes, I think that's the case. does my PR not resolve the problem for you? which os are you on?

awaelchli on 22 Sep 2020

I am using Ubuntu 18.04.

I tried you PR but unfortunately it does not solve the issue. Maybe I am doing something wrong?

I think this is what might be happening: https://jdhao.github.io/2020/06/20/python_duplicate_logging_messages/

JVGD on 22 Sep 2020

@JVGD could you try again with my updated branch. sorry, I don't have much experience with this python logging, for me the duplicate messages are resolved locally. I need you as the tester 😄
I tried my windows and ubunut multi-gpu

awaelchli on 22 Sep 2020

Unfortunately is still showing duplicated logs. I am going to do further testing, but as far as I know I think the pytroch_lightning logger adds an extra StreamHandler to the base root logger. Maybe that is why there are duplicates

JVGD on 22 Sep 2020

👍1

indeed, if we remove the streamhandler, this also solves the problem (for me). can you confirm?
https://github.com/PyTorchLightning/pytorch-lightning/pull/3514/files

awaelchli on 22 Sep 2020

❤2

Amazing! it was just deleting that line and now it works like a charm. By the way, I did it in the pytorch-lightning==0.9.0 version, I was in my development environment. So I haven't been able to test it on your branch. But tomorrow I will!

Thank you for the support. Now it looks great:

I am not sure if that change alone suffices for avoiding duplicated logging in other platforms, but in my case it is indeed.

JVGD on 22 Sep 2020

👍1

@JVGD this is all really strange. I checked again with a different machine (also ubuntu 18) and now It's not printing the model summary, while on my notebook it did.
I need to go through the python versions and see if there is a difference. ugh :(

awaelchli on 23 Sep 2020

Mmm really strange @ashwinb, I think we almost got this bug hunted down.

I got an hypothesis: maybe you don't see any logs because the root logger does not have any handlers. This means, pytorch_lightning did not add any handlers (because we removed it), and app you are running does not add any other handlers.

I think it works in my case because my app already adds a StreamHandler at the beginning. So there is at least one StreamHandler.

If it helps, in my python project I check explicitly that there is only one StreamHandler in the root base logger. Maybe doing something like this could help. The idea is to add a StreamHandler only if there is none, but if it is at least one, then use that one (instead of adding another that will print duplicated logs).

        # Checking if there is already a stream
        # handler for avoinding log repetition
        console_hanlder_configured = any(
            [isinstance(handler, logging.StreamHandler)
             for handler in root_logger.handlers])

        # If console handler not configured, then add it
        if not console_hanlder_configured:
            # Handler for logging to stdout
            console_handler = logging.StreamHandler()
            console_handler.setFormatter(formatter)
            console_handler.setLevel(level)

            # Adding handler to logger
            root_logger.addHandler(console_handler)

JVGD on 23 Sep 2020

👍1

are we good on this? i also made a bunch of stability fixes to ddp including duplicate logging and so on