Monitored quantity in ModelCheckpoint does not exist

Hi,

I’m using PL v0.9 with pl.callbacks.ModelCheckpoint to monitor a quantity named val_loss, so that only the best models are saved.

However, after updating my code base, there is no longer a quantity named val_loss in my validation loop because I changed its name, yet my callback still saves checkpoints!

In this case, where the monitored quantity does not exist, which quantity does the callback use to decide whether to save?

Thank you in advance!

You did not show how you are creating the ModelCheckpoint. Assuming it is set to monitor=None, it will try to save every epoch.

You can set the monitor argument to your quantity name in order to track it instead.

Hi @carmocca
Thank you for your response.
Actually, I create the ModelCheckpoint in the following way:

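import pytorch_lightning as pl

checkpoint_callback = pl.callbacks.ModelCheckpoint(
    monitor="val_loss",  # name of the logged quantity to track
    mode="min",          # lower is better
    save_last=True,      # also keep the most recent checkpoint
    save_top_k=5,        # keep the 5 best checkpoints
    verbose=False,
)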

However, there is no metric called val_loss, yet ModelCheckpoint still saves the models, and only 5 at a time plus the last one.

5 at a time plus the last one is correct, considering you have save_top_k=5 and save_last=True.
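To illustrate why those two settings give "5 + last", here is a rough, library-independent sketch of top-k retention. It is not Lightning's actual implementation: keep_top_k and the loss values are made up for the example, and it only assumes that the callback keeps the k best scores seen so far and drops the worst one when a better one arrives.

```python
def keep_top_k(scores, k, mode="min"):
    """Return the epochs whose checkpoints would survive (hypothetical helper)."""
    kept = {}  # epoch -> monitored score
    for epoch, score in enumerate(scores):
        kept[epoch] = score
        if len(kept) > k:
            # In "min" mode the worst checkpoint is the one with the highest score,
            # in "max" mode it is the one with the lowest score.
            if mode == "min":
                worst = max(kept, key=kept.get)
            else:
                worst = min(kept, key=kept.get)
            del kept[worst]
    return sorted(kept)


# Made-up validation losses, one per epoch.
val_losses = [0.9, 0.7, 0.8, 0.6, 0.65, 0.5, 0.55, 0.75]

best_epochs = keep_top_k(val_losses, k=5, mode="min")
last_epoch = len(val_losses) - 1  # save_last=True keeps this one as well

print("top-k epochs:", best_epochs)
print("last epoch:", last_epoch)
```

With these numbers the five lowest losses belong to epochs 1, 3, 4, 5 and 6, and the final epoch (7) is kept separately, so six checkpoints remain on disk.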

There is this check to stop the run if the monitor is not found:


So you must be actually monitoring something.

Can you share your training and validation step?
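In the meantime, one way to narrow this down is to compare the monitor name with the metric names that are actually logged. Below is a minimal sketch of such a check. check_monitor and the metric names are hypothetical and this is not the library's own code. In a real run the dictionary would probably come from something like trainer.callback_metrics, but please verify that for your version.

```python
def check_monitor(monitor, logged_metrics):
    """Raise if `monitor` is not among the logged metric names (hypothetical helper)."""
    if monitor is None:
        # Nothing is monitored, so there is nothing to validate.
        return None
    if monitor not in logged_metrics:
        available = ", ".join(sorted(logged_metrics)) or "<none>"
        raise KeyError(
            f"monitor={monitor!r} not found; available metrics: {available}"
        )
    return logged_metrics[monitor]


# Made-up metrics, as if the validation loss had been renamed.
logged = {"train_loss": 0.51, "valid_loss": 0.42}

try:
    check_monitor("val_loss", logged)
    found = True
except KeyError as err:
    found = False
    print(err)
```

If a check like this passes with monitor="val_loss", then something in the training or validation step is still logging under that name.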