Pytorch-lightning: Understanding Metrics + DDP

Created on 24 Jun 2020 · 8 comments · Source: PyTorchLightning/pytorch-lightning

To my understanding, the new metrics package aggregates metrics across processes when using DDP. So if I use DDP with 2 GPUs, validation_epoch_end will be called twice, each time on a subset of the validation data. If I calculate the F1 score, for example, this gives me 2 different scores.
Now, if I use from pytorch_lightning.metrics.functional import f1_score, I'd expect it to internally aggregate the F1 score across both processes (at least that's what I think it does). But I still get a different F1 score for each process.

This is my code:

import torch
import pytorch_lightning as pl
from sklearn.metrics import f1_score
from pytorch_lightning.metrics.functional import f1_score as f1_score_sync

class MyModel(pl.LightningModule):
    ...
    def validation_epoch_end(self, outputs):
        pred = torch.cat([x["pred"] for x in outputs])
        target = torch.cat([x["target"] for x in outputs])

        # This will get printed 2 times if gpus=2 and DDP
        print(f1_score(target.cpu(), pred.cpu(), average="macro"))  # this is different for each process
        print(f1_score_sync(pred, target))  # I expect this to be the same for both processes, but it is different

Do I have to use f1_score_sync differently?

Labels: Metrics, question, won't fix

All 8 comments

Hi! Thanks for your contribution, great first issue!

Mind having a look @justusschock @SkafteNicki? ^^

So the functional interface does not come with DDP support; these are just native torch implementations of the respective metrics. The modular interface does come with DDP support. There you have two options, either the native or the sklearn backend. For the F1 score you can do:

from pytorch_lightning.metrics import F1  # native backend
from pytorch_lightning.metrics.sklearn import F1 # sklearn backend

Then, in the __init__ of your model, you initialize it like any other module: self.f1_metric = F1(). In validation_epoch_end you can then call it as in your code.
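For illustration, a minimal sketch of that modular usage, assuming the metric object is callable with (pred, target) like the functional version:

import torch
import pytorch_lightning as pl
from pytorch_lightning.metrics import F1  # native backend, DDP-aware

class MyModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        # registered like any other submodule
        self.f1_metric = F1()

    def validation_epoch_end(self, outputs):
        pred = torch.cat([x["pred"] for x in outputs])
        target = torch.cat([x["target"] for x in outputs])
        # the modular metric handles the cross-process aggregation under DDP
        print(self.f1_metric(pred, target))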

I see. Thank you!
What would be the best way to aggregate the validation loss across all processes?

Can you elaborate on what you mean by aggregate? Do you mean over multiple batches?

If I am using 2 GPUs with DDP, I get two processes, and each calculates its own loss on its share of the data. Process 1 gives me the loss for the first half of the validation data and process 2 gives me the loss for the second half. Now I want the mean of both, to get one final loss for the entire validation data.
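For reference, one way to take that mean is an all-reduce over the per-process losses; a minimal sketch with plain torch.distributed, assuming DDP has already initialized the process group (mean_across_processes is a hypothetical helper, not a Lightning API):

import torch
import torch.distributed as dist

def mean_across_processes(loss: torch.Tensor) -> torch.Tensor:
    # average a scalar loss tensor over all DDP processes;
    # with the NCCL backend the tensor must live on the GPU
    if dist.is_available() and dist.is_initialized():
        loss = loss.clone()
        dist.all_reduce(loss, op=dist.ReduceOp.SUM)
        loss /= dist.get_world_size()
    return loss

# e.g. in validation_epoch_end:
#   val_loss = torch.stack([x["val_loss"] for x in outputs]).mean()
#   val_loss = mean_across_processes(val_loss)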

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

This issue has been automatically marked as stale because it hasn't had any recent activity. This issue will be closed in 7 days if no further activity occurs. Thank you for your contributions, Pytorch Lightning Team!
