Logging a tensor

Hi all,

I am using an LSTM autoencoder for time series data. I want to return the input and the reconstructed output and visualize them on my own. I tried to add them to the log, but it is not working; I keep getting the error

ValueError: only one element tensors can be converted to Python scalars

The self.log functionality of LightningModule only supports logging scalar values, so that it stays compatible with all of the loggers that Lightning supports. If, for example, you know that you will be using TensorBoard, you can access the underlying SummaryWriter through self.trainer.logger.experiment and then use any of the methods of torch.utils.tensorboard.writer.SummaryWriter.
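
For the autoencoder case above, a minimal sketch (assuming the Trainer uses a TensorBoardLogger; x and x_hat are hypothetical names for the input and its reconstruction) could look like this:

import matplotlib.pyplot as plt

def validation_step(self, batch, batch_idx):
    x, _ = batch
    x_hat = self(x)
    # log only the first batch to avoid flooding TensorBoard
    if batch_idx == 0:
        tensorboard = self.trainer.logger.experiment  # a SummaryWriter
        fig = plt.figure()
        plt.plot(x[0].cpu().numpy(), label="input")
        plt.plot(x_hat[0].detach().cpu().numpy(), label="reconstruction")
        plt.legend()
        tensorboard.add_figure("reconstruction", fig, global_step=self.global_step)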

Follow-up question when trying to avoid this error.

Is there a way to check if a Metric is loggable? I’m iterating over a dict of metrics to log them, but I’m also hit with the ValueError: only one element tensors can be converted to Python scalars.
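
The loop looks roughly like this (a sketch; self.metrics holds torchmetrics-style Metric objects):

for metric_name, metric in self.metrics["metric_val"].items():
    metric(predictions, targets)
    self.log(f"{metric_name}/val_step", metric)  # raises the ValueError for ConfusionMatrix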

I checked the insides:

> model.metrics["metric_val"]["confmat"].__dict__
{'training': False,
 '_parameters': OrderedDict(),
 '_buffers': OrderedDict([('confmat',
               tensor([[0., 0., 0., 0., 0., 0.],
                       [0., 0., 0., 0., 0., 0.],
                       [0., 0., 0., 0., 0., 0.],
                       [0., 0., 0., 0., 0., 0.],
                       [0., 0., 0., 0., 0., 0.],
                       [0., 0., 0., 0., 0., 0.]], device='cuda:0'))]),
 '_non_persistent_buffers_set': set(),
 '_backward_hooks': OrderedDict(),
 '_forward_hooks': OrderedDict(),
 '_forward_pre_hooks': OrderedDict(),
 '_state_dict_hooks': OrderedDict(),
 '_load_state_dict_pre_hooks': OrderedDict(),
 '_modules': OrderedDict(),
 'dist_sync_on_step': False,
 'compute_on_step': True,
 'process_group': None,
 '_to_sync': True,
 'update': <function pytorch_lightning.metrics.classification.confusion_matrix.ConfusionMatrix.update(preds: torch.Tensor, target: torch.Tensor)>,
 'compute': <function pytorch_lightning.metrics.classification.confusion_matrix.ConfusionMatrix.compute() -> torch.Tensor>,
 '_computed': tensor([[4., 5., 4., 1., 6., 2.],
         [3., 1., 3., 1., 3., 4.],
         [0., 2., 1., 4., 2., 1.],
         [1., 3., 3., 5., 1., 2.],
         [1., 4., 2., 2., 4., 0.],
         [0., 2., 2., 3., 0., 3.]], device='cuda:0'),
 '_forward_cache': tensor([[0., 1., 0., 0., 1., 0.],
         [0., 0., 0., 0., 0., 0.],
         [0., 0., 0., 1., 0., 0.],
         [0., 0., 0., 0., 0., 0.],
         [0., 1., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0., 1.]], device='cuda:0'),
 '_reductions': {'confmat': <function pytorch_lightning.metrics.utils.dim_zero_sum(x)>},
 '_defaults': {'confmat': tensor([[0., 0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0., 0.]])},
 'num_classes': 6,
 'normalize': None,
 'threshold': 0.5,
 '_cache': {'confmat': tensor([[4., 5., 4., 1., 6., 2.],
          [3., 1., 3., 1., 3., 4.],
          [0., 2., 1., 4., 2., 1.],
          [1., 3., 3., 5., 1., 2.],
          [1., 4., 2., 2., 4., 0.],
          [0., 2., 2., 3., 0., 3.]], device='cuda:0')}}

However, a solution like this proved not to be robust:

# only log scalars (e.g. not confusion matrix)
if metric._computed.numel() == 1:
    self.log(f"{metric_name}/{phase}_step", metric)

Currently there is no solution for logging more than one value, which means that trying to log a confusion matrix will fail.

> Currently there is no solution for logging more than one value, which means that trying to log a confusion matrix will fail.

Yes, so that’s why I want to check whether a Metric is loggable or not before calling self.log("metric_name", some_metric).
However, I’m not sure what’s the best way to check if a Metric computes a scalar or not.

You could do

if isinstance(your_value, torch.Tensor) and your_value.numel() == 1:

as a check.
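
Applied to the dict loop from above, that could become something like this (a sketch; log_scalar_metrics is a hypothetical helper):

import torch

def log_scalar_metrics(self, metric_values: dict, phase: str):
    # skip anything that self.log cannot handle, i.e. non-scalar tensors
    for name, value in metric_values.items():
        if isinstance(value, torch.Tensor) and value.numel() == 1:
            self.log(f"{name}/{phase}_step", value)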

@awaelchli On which Tensor inside a Metric would you call this check? E.g. _computed fails:

if isinstance(metric._computed, torch.Tensor) and metric._computed.numel() == 1:
    self.log(f"{metric_name}/val_step", metric)
# Fails: metric._computed is None

Logs nothing, because in validation_step the value has not been computed yet.

I also can’t blindly check all the tensors inside a Metric (e.g. in metric._defaults), because even if those are not all scalars, the computed value might be.

To log, you need to compute the metric, i.e. value = metric.compute(). Then you can do the check before logging.
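
In other words, something along these lines (a sketch combining the two suggestions):

value = metric.compute()  # force the metric to produce its result
if isinstance(value, torch.Tensor) and value.numel() == 1:
    self.log(metric_name, value)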

What can be done for now (which is a bit hacky), since the confusion matrix is the only metric that returns non-scalar tensors:

import matplotlib.pyplot as plt
import pytorch_lightning as pl

for m in self.metrics:
    val = m(preds, target)
    if not isinstance(m, pl.metrics.ConfusionMatrix):
        self.log("metric_name", m)
    else:
        # a confusion matrix cannot be logged as a scalar, so log it as a figure
        fig = plt.figure()
        plt.imshow(val.cpu().numpy())
        self.logger.experiment.add_figure("confmat", fig)

> To log, you need to compute the metric, i.e. value = metric.compute(). Then you can do the check before logging.

I was relying on PL to do this automatically: PL sets the on_step/on_epoch defaults when you call self.log in e.g. validation_step, which also takes care of calling compute().

Let me try whether it is possible to do the compute and the check manually in validation_epoch_end.

@SkafteNicki
I was not aware that val = m(preds, target) returns a value (I thought this only happened when .compute() was called).

With this knowledge, I think the following is the easiest:

def validation_step(self, batch, batch_idx):
    # ...
    for metric_name, metric in self.metrics["metric_val"].items():
        val = metric(predictions, targets)
        # only log metrics whose forward result is a scalar
        if val.numel() == 1:
            self.log(f"{metric_name}/val_step", metric)

# EDIT:
def validation_epoch_end(self, outs):
    # compute full confusion matrix
    confmat_tensor = self.metrics["metric_val"]["confmat"].compute()
    # turn confusion matrix into a figure (Tensor cannot be logged as a scalar)
    fig = plt.figure()
    plt.imshow(confmat_tensor.cpu().numpy())
    # log figure
    self.logger.experiment.add_figure('epoch_confmat', fig, global_step=self.global_step)

This seems to work, thank you all for your input!

Edit: As I’m not interested in a confusion matrix of a single step, I only call the ConfusionMatrix-specific logging code in validation_epoch_end.
Now I only log scalars automatically, but I can still run custom logging code on non-scalars.