Log unreduced results as histogram with EvalResult

Hi,

I have recently adopted EvalResult with version 0.9.0 and really like it so far :+1:. Thanks for the great work on it and on PyTorch Lightning overall!

I am using result.log("val/accuracy", acc, on_epoch=True) to log an aggregated version of the validation accuracy for each epoch. I was wondering whether it is possible to additionally log the unreduced validation results as a histogram to one's favorite logger (e.g. TensorBoard) for each epoch as well. This way one can get a more in-depth view of the validation results.
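To illustrate why the unreduced values are worth keeping (the numbers below are made up, not from my actual model): a scalar mean can hide an outlier batch that a histogram of the raw values would immediately expose.

```python
import torch

# Made-up per-batch accuracies with one outlier batch.
acc = torch.tensor([0.95, 0.96, 0.40, 0.97, 0.94])

# The aggregated scalar that gets logged hides the outlier...
print(acc.mean())

# ...while binning the unreduced values (what a histogram shows) exposes it:
# four batches land in the top bin, one in the middle of the range.
print(torch.histc(acc, bins=5, min=0.0, max=1.0))
```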

I have implemented a validation_epoch_end method, which by itself works well:

    def validation_epoch_end(self, validation_step_output_result):
        # Log non-aggregated validation results as a histogram.
        # This gives us a more in-depth view of the results.
        for k, v in validation_step_output_result.items():
            if "val" in k:
                self.logger.experiment.add_histogram(k, v, self.global_step)
        return validation_step_output_result

But it only works for the pre-training sanity check and the first epoch, and then crashes in logging.py at line 110 (the last line of the metrics_to_scalars function). The exact error is ValueError: only one element tensors can be converted to Python scalars.

From looking at lines 486-495 of evaluation_loop.py, it seems like I would need to implement the reduction methods myself, which would be redundant with PyTorch Lightning's own implementation.

So I am wondering whether there is a way of (additionally) logging evaluation values as histograms while still using PyTorch Lightning's built-in aggregation and logging?

The reduce function built into the Result class only gets called if the epoch_end methods aren't overridden, and that check (whether the epoch_end method is overridden) happens in evaluation_loop.py. So I don't see a good way of getting around this while also using PL's built-in aggregation, other than overriding the epoch_end method and doing the aggregation yourself.


Thank you for your reply!

Just to recap, to make sure I understand the implementation correctly: if I override this method, the results are gathered (I am using DDP) by __gather_epoch_end_eval_results and then passed to my _epoch_end method, which has to reduce them itself.
If the _epoch_end methods are not overridden, __auto_reduce_result_objs is called, which loops over the results of each dataloader, reduces them, and returns the reduced results as a list.
Either way, a list of reduced results must be returned, correct?

Do you think it would be possible to just reuse the reduction implementation from the EvalResult class, by returning something like return EvalResult.SOMEMAGICMETHOD(validation_step_output_result)?

Currently I am using the following, although it relies on the ugly hack that all values are named val/...:

    def validation_epoch_end(self, validation_step_output_result):
        """log non-aggregated validation results as histogram. This gives us a more in-depth view at the results."""
        for k, v in validation_step_output_result.items():
            if "val" in k:
                self.logger.experiment.add_histogram(k, v, self.global_step)
                validation_step_output_result[k] = v.mean()
        return validation_step_output_result

And as a follow-up question: is there an _epoch_end method which operates on all DDP results gathered from the individual processes? That is, the analog of validation_step_end, but executed after the epoch has finished?

You probably could, by accessing whatever reduce function you specified when you added the value to the result for any given step.
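If it helps, here is a rough sketch of that idea in plain PyTorch (the names are illustrative, not Lightning internals): remember the reduce function you passed when logging (result.log takes a reduce_fx argument, torch.mean by default, if I remember correctly), log the histogram from the raw tensor, then apply the reduction yourself.

```python
import torch

# Illustrative registry: metric name -> the reduce_fx you'd pass to result.log.
reduce_fns = {"val/accuracy": torch.mean, "val/loss": torch.mean}

def reduce_outputs(outputs):
    """Reduce raw per-step tensors, with a spot to log histograms first."""
    reduced = {}
    for name, values in outputs.items():
        # Here one would call
        # self.logger.experiment.add_histogram(name, values, step)
        # before collapsing the raw tensor to a scalar.
        reduced[name] = reduce_fns.get(name, torch.mean)(values)
    return reduced

out = reduce_outputs({"val/accuracy": torch.tensor([0.8, 0.9])})
print(out["val/accuracy"])
```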

Do you definitely need to do this gathering at epoch end? Is there any way you can refactor to gather the results in the step-end methods instead? Otherwise, I'm not sure how to gather other than using PyTorch's distributed API directly.
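In case you do go the distributed-API route, a minimal sketch of collecting per-process tensors with all_gather (this assumes the process group is already initialized, which Lightning does for you under DDP; the single-process gloo setup below exists only so the snippet runs standalone). Note that all_gather expects the tensor to have the same shape on every rank.

```python
import os
import torch
import torch.distributed as dist

def gather_all(local_vals):
    """Collect the local tensor from every rank and concatenate them."""
    buffers = [torch.zeros_like(local_vals) for _ in range(dist.get_world_size())]
    dist.all_gather(buffers, local_vals)
    return torch.cat(buffers)

# Single-process demo setup (Lightning normally initializes the group for you).
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29501")
dist.init_process_group("gloo", rank=0, world_size=1)

gathered = gather_all(torch.tensor([0.8, 0.9]))
print(gathered)

dist.destroy_process_group()
```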