Pytorch-lightning: Add support to log hparams and metrics to tensorboard?

Created on 24 Mar 2020 · 50 comments · Source: PyTorchLightning/pytorch-lightning

How can I log metrics (_e.g._ validation loss of best epoch) together with the set of hyperparameters?

I have looked through the docs and through the code.
It seems like an obvious thing, so maybe I'm just not getting it.

Currently, the only way that I found was to extend the logger class:

# imports needed for the snippet below (exact import paths may differ slightly between pytorch-lightning versions)
from pytorch_lightning.loggers import TensorBoardLogger
from pytorch_lightning.utilities import rank_zero_only
from torch.utils.tensorboard.summary import hparams


class MyTensorBoardLogger(TensorBoardLogger):

    def __init__(self, *args, **kwargs):
        super(MyTensorBoardLogger, self).__init__(*args, **kwargs)

    def log_hyperparams(self, *args, **kwargs):
        pass

    @rank_zero_only
    def log_hyperparams_metrics(self, params: dict, metrics: dict) -> None:
        params = self._convert_params(params)
        exp, ssi, sei = hparams(params, metrics)
        writer = self.experiment._get_file_writer()
        writer.add_summary(exp)
        writer.add_summary(ssi)
        writer.add_summary(sei)
        # some alternative should be added
        self.tags.update(params)

And then I'm writing the hparams with metrics in a callback:

def on_train_end(self, trainer, module):
    module.logger.log_hyperparams_metrics(module.hparams, {'val_loss': self.best_val_loss})
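
For context, this is roughly how that logger and callback get wired into a Trainer (a sketch; the callback class name, directories, and epoch count are placeholders):

from pytorch_lightning import Trainer

logger = MyTensorBoardLogger(save_dir='logs', name='my_experiment')
callback = BestValLossCallback()  # hypothetical Callback subclass that defines the on_train_end above
trainer = Trainer(logger=logger, callbacks=[callback], max_epochs=100)
trainer.fit(model)                # model is the LightningModule being tuned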

But that doesn't seem right.
Is there a better way to write some metric together with the hparams as well?

Environment

  • OS: Ubuntu 18.04
  • conda 4.8.3
  • pytorch-lightning==0.7.1
  • torch==1.4.0
Labels: discussion, help wanted, question, won't fix

Most helpful comment

yes, let's officially make this a fix! @mRcSchwering want to submit a PR?

All 50 comments

Hi! Thanks for your contribution, great first issue!

It seems to be a duplicate, please continue in #1225.

Actually #1225 is not related. That issue is about providing a Namespace as hparams. Here, it's about logging a metric such as validation accuracy together with a set of hparams.

at which point would you like to log that? a) at each training step b) at the end of training c) something else?

Did you try this? In training_step, for example:

self.hparams.my_custom_metric = 3.14
self.logger.log_hyperparams(self.hparams)

Yes I tried it. It seems that updating the hyperparams (writing a second time) doesn't work. That's why I override the original log_hyperparams to do nothing, and then only call my own implementation log_hyperparams_metrics at the very end of the training. (log_hyperparams is called automatically by the pytorch-lightning framework at some point during the start of the training.)

I want to achieve b).
_E.g._ I run 10 random hparams sampling rounds, then I want to know which hparam set gave the best validation loss.

Maybe I am missing something about your use-case... you are running some hyperparameter search with just one logger for all results? Not sure if storing everything in a single logger run is a good idea :rabbit:

Usually I am training and validating a model with different sets of hyperparameters in a loop. After each round the final output is often something like "val-loss". This would be the validation loss of the best epoch achieved in this particular round.
Eventually, I have a number of "best validation losses" and each of these represents a certain set of hyperparameters.

After the last training round I am looking at the various sets of hyperparameters and compare them to their associated best validation loss. Tensorboard already provides tools for that: Visualize the results in TensorBoard's HParams plugin.

In your TensorBoardLogger you are already using the hparams function to summarize the hyperparameters. So, you are almost there. This function can also take metrics as a second argument. However, in your current implementation you always pass a {}. That's why I had to overwrite your original implementation. Furthermore you are writing this summary once in the beginning of the round. But the metrics are only known at the very end.
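
To illustrate that point, here is a standalone sketch of what passing metrics into the hparams helper looks like at the summary level (made-up values; the writer calls mirror the custom logger above):

from torch.utils.tensorboard import SummaryWriter
from torch.utils.tensorboard.summary import hparams

params = {'lr': 1e-4, 'batch_size': 32}     # example hyperparameters
metrics = {'best/val-loss': 0.5}            # example run metric
exp, ssi, sei = hparams(params, metrics)    # three summaries consumed by the HParams plugin

writer = SummaryWriter(log_dir='logs/run_0')
file_writer = writer._get_file_writer()
file_writer.add_summary(exp)
file_writer.add_summary(ssi)
file_writer.add_summary(sei)
# the metric key also needs a scalar logged under the same tag so the HParams tab has a value to display
writer.add_scalar('best/val-loss', 0.5)
writer.close()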

I wonder, how do you compare different hyperparameter sets? Maybe there is a functionality that I didn't find...

I am also seeing this same issue. No matter what I write with log_hyperparams, no data is output to TensorBoard. I only see a line in TensorBoard for my log, with no data for each log. The input I am using is a dict with values filled. I tried both before and after trainer.fit() and got no results.

So it appears like Trainer.fit() is calling run_pretrain_routine, which checks if the trainer has the hparams attribute. Since this attribute is defined in the __init__ function of Trainer, even though it is set to None by default, Lightning will still write out the empty hyperparameters to the logger. In the case of TensorBoard, this causes all subsequent writes to the hyperparameters to be ignored. You can solve this in one of two ways:

  1. Call delattr(model, "hparams") on your LightningModule before trainer.fit() to ensure the hyperparameters are not automatically written out.
  2. Use @mRcSchwering's code above. This will not write initially since his override of the original log_hyperparams is a no-op.
class MyTensorBoardLogger(TensorBoardLogger):

    def __init__(self, *args, **kwargs):
        super(MyTensorBoardLogger, self).__init__(*args, **kwargs)

    def log_hyperparams(self, *args, **kwargs):
        pass

    @rank_zero_only
    def log_hyperparams_metrics(self, params: dict, metrics: dict) -> None:
        from torch.utils.tensorboard.summary import hparams
        params = self._convert_params(params)
        exp, ssi, sei = hparams(params, metrics)
        writer = self.experiment._get_file_writer()
        writer.add_summary(exp)
        writer.add_summary(ssi)
        writer.add_summary(sei)
        # some alternative should be added
        self.tags.update(params)

Tip: I was only able to get metric values to display when they matched the tag name of a scalar I had logged previously in the log.
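
Putting option 1 and that tip together, a rough sketch could look like this (MyLightningModule, the log directory, and the tracked best_val_loss are placeholders; behaviour may differ between Lightning versions):

from pytorch_lightning import Trainer

model = MyLightningModule(hparams)   # hypothetical module; hparams is the usual dict/Namespace
delattr(model, 'hparams')            # the pre-train routine now skips the automatic (empty) log_hyperparams
logger = MyTensorBoardLogger('logs')
trainer = Trainer(logger=logger)
trainer.fit(model)

# afterwards, write the hparams together with metrics whose tags match scalars logged during training
logger.log_hyperparams_metrics(hparams, {'val_loss': best_val_loss})  # best_val_loss tracked e.g. in a callback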

umm. so weird. my tb logs hparams automatically when they are attached to the module (self.hparams=hparams).

and metrics are logged normally through the log key.

did you try that? what exactly are you trying to do?

What log key are you using? I was never able to get metrics to show up in the hyperparameters window without using the above method. I can log the metrics to the scalars tab but not the hparams tab. I am going for something like this in my hparams tab:
[screenshot: TensorBoard HParams table with hyperparameter columns alongside accuracy and loss metric columns]
I was able to get this using the above code. Setting the hparams attribute gave me the first column with the hyperparameters but I could not figure out how to add the accuracy and loss columns without the modified logger. Thanks!

Maybe a full simple example in Colab could be easier for this discussion?

@SpontaneousDuck I tried your fix with pytorch-lightning 0.7.4rc7 but it did not work for me.
I got the hparams in TB, but the metric columns never get values.

and metrics are logged normally through the log key.

Can you please give a simple example?

@mRcSchwering

  1. if the cluster (or you) kills your job, how would you log the parameters?
  2. What if I need to know the params AS the job is running so that i can see the effect each has on the loss curve?

These are the reasons we do this in the beginning of training... but agree this is not optimal. So, this seems to be a fundamental design problem with tensorboard unless I'm missing something obvious?

FYI:
@awaelchli , @justusschock

@williamFalcon indeed, both cases won't work with my solution.
For now, I have just written a function that writes the set of hparams with the best loss metric into a csv and updates it after every round.

It's not really nice because I have all the epoch-wise metrics nicely organized in the tensorboard, but the hparams overview is in a separate .csv without all the tensorboard features.

Just a question: If you log them multiple times, would they be overwritten on tensorboard? Because then we could probably also log them each epoch.

From some comments I get the feeling we might be talking about 2 things here.
There are 2 levels of metrics here. Let me clarify with an example:

Say I have a model which has the hyperparameter learning rate, it's a classifier, I want to train it for 100 epochs, I want to try out 2 different learning rates (say 1e-4 and 1e-5).

On the one hand, I want to describe the training progress. I will do that with a training/validation loss and accuracy, which I write down for every epoch. Let me call these metrics epoch-wise metrics.

On the other hand, I want to have an overview of which learning rate worked better.
So maybe 1e-4 had the lowest validation loss at epoch 20 with 0.6, and 1e-5 at epoch 80 with 0.5.
This summary might get updated every epoch, but it only summarizes the best epoch achieved in each training run. In this summary I can see what influence the learning rate has on the training. Let me call these run metrics.

What I have been talking about the whole time is the second case, the run metrics; the first case (epoch-wise metrics) works fine for me.
In TensorBoard there is an extra tab for this (4_visualize_the_results_in_tensorboards_hparams_plugin).

Okay, so just to be sure: for your run metrics, you would like to log metrics and hparams in each step?

I think that's a bit overkill to add as default behaviour. If it's just about the learning rate, maybe #1498 could help you there. Otherwise I'm afraid I can't think of a generic implementation here that wouldn't add much overhead for most users :/

Yes, one solution would be to log every epoch. And no, I think it should not happen automatically, because the kind of metrics you are interested in depends heavily on the whole experiment.

I just described an easy example above with 1 hyperparameter (learning rate). In reality you have many hyperparameters, each of which can have different reasonable values. Eventually, you are looking for the best set of values. For that it is important to have an overview where you can see how each set of values influences the training (_e.g._ the best achieved training loss in a training run).

As I described here the feature is _almost_ already in pytorch lightning.
Personally, I think it would be cool if log_hyperparams would already take the metrics argument. Then, I can inject that using a callback class which updates the _best achieved loss_ every epoch.
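
Such a callback could look roughly like this (a sketch; it assumes the log_hyperparams_metrics workaround from above and that 'val_loss' ends up in trainer.callback_metrics):

from pytorch_lightning.callbacks import Callback

class BestValLossCallback(Callback):
    """Re-writes the hparams summary whenever a new best validation loss is seen."""

    def __init__(self):
        self.best_val_loss = float('inf')

    def on_validation_end(self, trainer, module):
        current = trainer.callback_metrics.get('val_loss')
        if current is not None and float(current) < self.best_val_loss:
            self.best_val_loss = float(current)
            # assumes a metrics-aware logger, e.g. the log_hyperparams_metrics workaround above
            module.logger.log_hyperparams_metrics(
                module.hparams, {'best/val-loss': self.best_val_loss})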

Maybe we could add an optional argument for that

Although I'm afraid that it will be hard to change the default timing of logging hparams (since I think @williamFalcon has a point there).

So I think we can definitely update the logger, but you would still have to handle the timing of logging manually. Maybe together we can also think of a better solution to automate the timing of logging...

If the hyperparameters can be updated multiple times throughout a run, log_hyperparams accepting metrics is enough I think. The user can define a callback class that updates the appropriate things at the appropriate time.

I'm currently investigating the timing issue. But I think updating the logger is a step in the right direction.

Hi, unfortunately it doesn't work yet.
I tried out the change and logged hparams like this:

current_loss = float(module.val_loss.cpu().numpy())
if current_loss < self.best_loss:
    self.best_loss = current_loss
    metrics = {'hparam/loss': self.best_loss}
    module.logger.log_hyperparams(params=module.hparams, metrics=metrics)

The parameter appears as a column in the _hparams_ tab but it doesn't get a value.

[screenshot: HParams tab showing the hparam/loss column without a value]

Btw, if I use the high level API from tensorboard, everything works fine:

with SummaryWriter(log_dir=log_dir) as w:
    w.add_hparams(module.hparams, metrics)
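
For reference, a self-contained version of that high-level call (placeholder values). Note that add_hparams internally opens a new writer in a timestamped sub-directory of log_dir, which is why each call produces its own event file, as discussed further below:

from torch.utils.tensorboard import SummaryWriter

example_hparams = {'lr': 1e-4, 'batch_size': 32}   # placeholder hyperparameters
example_metrics = {'hparam/loss': 0.5}             # placeholder run metric
with SummaryWriter(log_dir='logs/run_0') as w:
    w.add_hparams(example_hparams, example_metrics)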

Is there a reason why you don't use the high-level API in the TensorBoard logger?

Sorry, that was my mistake. I thought this was all handled by pytorch itself. Can you maybe try #1647 ?

So theoretically it works. log_hyperparams can log metrics now.
However, every call to log_hyperparams creates a new event file. So, I would still have to use my hack in order to make sure log_hyperparams is only called once.

but how do you do this if you kill training prematurely or the cluster kills your job prematurely?

your solution assumes the user ALWAYS gets to training_end

The current log_hyperparams now follows the default SummaryWriter behaviour (creating a separate TensorBoard file for each call to log_hyperparams). The problem we are seeing here is that this default does not match our use case or give us the flexibility we need. PyTorch Lightning saw this problem, which is why they did not use this implementation in TensorBoardLogger. It breaks the link to all the other metrics you logged for the training session, so you end up with one file containing all your training logs and a separate one with just the hyperparameters.

It also looks like the Keras way of doing this is writing the hyperparameters at the beginning of training and then writing the metrics and status message at the end. Not sure if that is possible in PyTorch right now though...

The best solution to this, I believe, is just allowing the user more control. The code @mRcSchwering wrote above mostly does this; having metrics be an optional parameter would solve it. If you call the modified TensorBoardLogger with metrics at the beginning of training, using dummy values whose tags match the scalars you log later, TensorBoard will automatically update those metrics with the most recently logged data. Example steps below (a combined sketch follows the list):

  1. tb_logger.log_hyperparams_metrics(parameters, {"val/accuracy": 0, "val/loss": 1})
  2. In pl.LightningModule, log metrics with matching tags:
     def validation_epoch_end(self, outputs):
         ...
         return {"log": {"val/accuracy": accuracy, "val/loss": loss}}
  3. Tensorboard will pull the most recent value for these metrics from training. This gets around the issue of early stopping and still allows the user to log metrics.
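
Putting those three steps together, a sketch might look like this (MyTensorBoardLogger is the modified logger from above; the hparams object and the _aggregate helper are placeholders, and the rest of the LightningModule is omitted):

from pytorch_lightning import LightningModule, Trainer

tb_logger = MyTensorBoardLogger('logs', name='my_experiment')
# step 1: register the metric columns once, with dummy values, before training starts
tb_logger.log_hyperparams_metrics(hparams, {'val/accuracy': 0, 'val/loss': 1})

class MyModel(LightningModule):
    # training_step, configure_optimizers, dataloaders etc. omitted

    # step 2: log scalars under the same tags; the HParams tab then shows
    # the most recently logged values for them (step 3)
    def validation_epoch_end(self, outputs):
        accuracy, loss = self._aggregate(outputs)  # hypothetical aggregation helper
        return {'val_loss': loss, 'log': {'val/accuracy': accuracy, 'val/loss': loss}}

trainer = Trainer(logger=tb_logger)
trainer.fit(MyModel(hparams))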

I found that if the many files written by log_hyperparams are in the same directory (and have the same set of hyperparameters), TensorBoard also correctly interprets them as a metric that was updated.
Here is an implementation using callbacks for reporting metrics, a module that collects results, and the logger workaround.

Here is an implementation using callbacks for reporting metrics, a module that collects results, and the logger workaround.

Would it be possible to extend your pattern to collect Trainer.test results? If you define on_test_start and test_epoch_end in MetricsAndBestLossOnEpochEnd, would it create new event files? Will the test metrics appear in the same row as the best/val log and simply have columns for the test metrics?

@reactivetype actually, now I just got what @SpontaneousDuck meant. Here is an example. In the beginning I write all hyperparameter metrics that I want. In my case I use this logger. I do this with a module base class which writes a hyperparameter metric best/val-loss at the beginning of the training run.
During the training I can update this by just adding the appropriate key to the log dictionary of the return value. _E.g._

    def validation_epoch_end(self, train_steps: List[dict]) -> dict:
        loss = self._get_loss(train_steps)
        log = {
            'loss/val': loss,
            'best/val-loss': self._update_best_loss(loss)}
        return {'val_loss': loss, 'log': log}

Then, it doesn't matter where you do this. You could also do this in the test_epoch_end. Everything is written into one file.
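
For completeness, the _update_best_loss helper used above could be as simple as this (a hypothetical implementation; the real one lives in the linked base module):

    def _update_best_loss(self, loss: float) -> float:
        # keep the smallest validation loss seen so far on the module instance
        if not hasattr(self, '_best_val_loss') or loss < self._best_val_loss:
            self._best_val_loss = float(loss)
        return self._best_val_loss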

During the training I can update this by just adding the appropriate key to the log dictionary of the return value

Does this mean we can further simplify the callback pattern in your examples?

During the training I can update this by just adding the appropriate key to the log dictionary of the return value

Does this mean we can further simplify the callback pattern in your examples?

Yes, I'm not using callbacks anymore. Everything is inherited (a base module class). I usually prefer callbacks, but in this case the on_epoch_end callback function doesn't get the predictions and targets of the epoch, so I would have to write them onto a module attribute. The *_epoch_end hooks on the module, however, get all the targets and predictions as an argument.

Are there any plans to include this into lightning? Would be really nice to be able to use metrics inside of Tensorboard hparams without any hacking.

yes, let's officially make this a fix! @mRcSchwering want to submit a PR?

@mRcSchwering so it is solved, right? :raccoon:

@williamFalcon I just took a look at it. Wouldn't the pre-train routine be the place where the initial metrics have to be logged?

One could add another attribute to the lightning module which would be added as metrics to the call.
That would also mean that in both the pre-train routine and the LoggerCollection one would have to check the logger type (only adding the metrics if it is the TensorBoard logger).
That solution looks messier to me than just extending the logger (as discussed above).

Is there a reason why log_hyperparams is called in the pre-train routine and not in on_train_start?
That would make changing the log_hyperparams behavior by the user more transparent.

@Borda yes, with https://github.com/PyTorchLightning/pytorch-lightning/issues/1228#issuecomment-620558981 it is possible to do it, currently.

Thanks all for your contributions in solving this issue that I have also been struggling with.

Would anyone be able to summarize what the current recommended approach is and maybe edit the documentation? https://pytorch-lightning.readthedocs.io/en/latest/loggers.html

My current strategy is still quite hacky.

For those trying to solve this, here's another proposed solution: https://github.com/PyTorchLightning/pytorch-lightning/issues/1778#issuecomment-653733246
I think the solution in https://github.com/PyTorchLightning/pytorch-lightning/issues/1228#issuecomment-620558981 is still a cleaner way of solving this, but not sure

Following the idea from @SpontaneousDuck I found the following way to bypass the problem without modifying the framework code: add a dummy call to log_hyperparams with a metrics placeholder before Trainer.run_pretrain_routine is called, for example:

class MyModule(LightningModule):
    # set up 'test_loss' metric before fit routine starts
    def on_fit_start(self):
        metric_placeholder = {'test_loss': 0}
        self.logger.log_hyperparams(self.hparams, metrics=metric_placeholder)

    # at some method later
    def test_epoch_end(self, outputs):
        metrics_log = {'test_loss': something}
        return {'log': metrics_log}

The last metric will show in the TensorBoard HPARAMS table as desired, albeit the metric graph will include the first dummy point.

Thanks @gwiener for the quick workaround.
Though, with this method, dummy points are added every time the training is restored from checkpoints - still looking for a solution here.

Likewise still looking for a solution here. And the solutions provided above did not work for me.

we're discussing this in #2974

This issue has been automatically marked as stale because it hasn't had any recent activity. This issue will be closed in 7 days if no further activity occurs. Thank you for your contributions, Pytorch Lightning Team!

