I want to save checkpoints to Azure Blob Storage using one of the available tools for interfacing with storage containers (e.g., AzCopy).
Every time a model checkpoint is saved locally by a ModelCheckpoint callback, I would like to run a subroutine that copies the checkpoints/logs to the external container.
The method on_checkpoint_save seems to run before the checkpoint is created, so I do not think it is the right place for this: it would upload an old checkpoint instead of the new one. Do you have any suggestions on how I should do this?
I thought of defining a new custom callback that inherits from ModelCheckpoint and, at the end of the save_checkpoint method, adds a line that calls my subroutine to save to blob storage. Do you think this is a good approach, or would there be a better option?
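
To make this concrete, here is a rough sketch of the upload subroutine I have in mind. The function names and the runner parameter are placeholders I made up for illustration, and destination_url is assumed to be a container URL that already carries valid credentials (for example a SAS token). It only shells out to `azcopy copy`. The idea would be to call upload_to_blob from the subclass right after the parent save method returns, so the file already exists on disk.

```python
import subprocess
from pathlib import Path


def build_azcopy_command(local_path, destination_url):
    """Build an `azcopy copy` command for a single file or a directory.

    Both arguments are hypothetical: local_path is the checkpoint (or log
    directory) on disk, destination_url is the target container URL.
    """
    source = Path(local_path)
    command = ["azcopy", "copy", str(source), destination_url]
    # Directories need --recursive so their contents are copied too.
    if source.is_dir():
        command.append("--recursive")
    return command


def upload_to_blob(local_path, destination_url, runner=subprocess.run):
    """Copy local_path to the container; raise if AzCopy exits non-zero.

    runner defaults to subprocess.run and can be swapped out for testing.
    """
    command = build_azcopy_command(local_path, destination_url)
    return runner(command, check=True)
```

In a multi-process run I would presumably call this from a single process only, so the same checkpoint is not uploaded several times.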
Thanks.