How to get the checkpoint path?

Hi,
could you tell me how I can get the checkpoint path in this function? Thanks.

def on_save_checkpoint(self, checkpoint: Dict[str, Any]) -> None:
    save_path = checkpoint['filepath']  # ??? where is the path ???
    self.model.save_pretrained(save_path)
    self.tokenizer.save_pretrained(save_path)

The checkpoint path will be whatever is specified by the ModelCheckpoint callback. By default this will be lightning_logs/version_{version number}/epoch_{epoch number}.ckpt.
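If you want the path to be predictable, one option is to configure the ModelCheckpoint yourself with an explicit dirpath. A minimal sketch; the directory name is just an example, and the exact argument names vary slightly across Lightning versions:

from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import ModelCheckpoint

# checkpoints will land under checkpoints/ instead of lightning_logs/version_*/
checkpoint_callback = ModelCheckpoint(dirpath="checkpoints/")
trainer = Trainer(callbacks=[checkpoint_callback])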

Hi, I need a checkpoint callback that is called 5 times during training. How would I know, inside ModelCheckpoint, which iteration this is? Thanks. I'd also appreciate an example of how to save the model every k steps/epochs.

To save a model every k epochs, use the period argument of ModelCheckpoint together with save_top_k=-1, which keeps every checkpoint instead of only the best ones.
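A minimal sketch, assuming a Lightning version where ModelCheckpoint still accepts period (newer releases renamed it to every_n_epochs):

from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import ModelCheckpoint

# write a checkpoint every 5 epochs and keep all of them (save_top_k=-1 disables pruning)
checkpoint_callback = ModelCheckpoint(period=5, save_top_k=-1)
trainer = Trainer(callbacks=[checkpoint_callback])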

Hi,
I need the path to the place where the checkpoint is saved, inside the callback. Could you tell me how to get that checkpoint path? Thanks.

So when you say “version_{version number}/epoch_{epoch number}”, how do you get the epoch number and version number inside the callback? Thanks.

FYI, it’s now
lightning_logs/version_{version number}/epoch_{epoch number}-step_{global_step}.ckpt

and to access them:

version_number -> trainer.logger.version
epoch_number -> trainer.current_epoch
global_step -> trainer.global_step

You can also explore the ModelCheckpoint attributes; for example, best_model_path if you want the path of the best checkpoint.

In addition to what @goku said, you can get the log directory + version number with: trainer.logger.log_dir, so if you add what you want as a callback:

from pytorch_lightning.callbacks import Callback

class OnCheckpointSomething(Callback):
    def on_save_checkpoint(self, trainer, pl_module):
        # reconstruct the path the ModelCheckpoint callback writes to for this epoch
        save_path = f"{trainer.logger.log_dir}/checkpoints/epoch={trainer.current_epoch}.ckpt"

Also, ModelCheckpoint has a method called format_checkpoint_name that is actually called when saving checkpoints and does the overall formatting. The callback itself can be accessed via trainer.checkpoint_callback.

As an example, if you want to save the weights of your model before training, you can add the following hook to your LightningModule:

def on_train_start(self):
    # build the checkpoint filename the same way ModelCheckpoint would, then save
    ckpt_path = self.trainer.checkpoint_callback.format_checkpoint_name(dict(epoch=0, step=0))
    self.trainer.save_checkpoint(ckpt_path)

I believe this still does not answer the original question.
When on_save_checkpoint is called, how do I tell if the checkpoint will be saved in self.best_model_path or self.last_model_path?
(In the very common case where we are saving both the best and the last model)

For my use-case, I’m using the following:

def on_save_checkpoint(self, checkpoint):
    # pull the local variable `filepath` out of a calling frame
    filepath = reach("filepath")

where reach is defined as:

import inspect

def reach(name):
    # walk up the call stack and return the first local variable matching `name`
    for f in inspect.stack():
        if name in f[0].f_locals:
            return f[0].f_locals[name]
    return None

As seen in: https://stackoverflow.com/questions/15608987/how-can-i-access-variables-from-the-caller-even-if-it-isnt-an-enclosing-scope

The best model path and the last model path will be different if your best model is not your last model.

The ModelCheckpoint callback will save the models under a path like my/path/epoch=0-step=10.ckpt.
Once training is completed, you can access the locations of the best and last models using these attributes:

checkpoint_callback = ModelCheckpoint(dirpath='my/path/')
# after training, these attributes hold the paths of the saved checkpoints
print(checkpoint_callback.best_model_path)
print(checkpoint_callback.last_model_path)
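For completeness, a minimal sketch of how that callback might be wired up; save_last=True is assumed here so that last_model_path is actually populated, and model stands in for your LightningModule:

from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import ModelCheckpoint

# keep the best checkpoint and also write a "last" checkpoint
checkpoint_callback = ModelCheckpoint(dirpath="my/path/", save_last=True)

trainer = Trainer(max_epochs=10, callbacks=[checkpoint_callback])
trainer.fit(model)  # `model` is a placeholder LightningModule

print(checkpoint_callback.best_model_path)
print(checkpoint_callback.last_model_path)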