I don’t understand how to resume training from the last checkpoint.
The following:
trainer = pl.Trainer(gpus=1, default_root_dir=save_dir)
saves checkpoints but does not resume from the last one.
The following code also starts training from scratch, although I read that it should resume:
logger = TestTubeLogger(save_dir=save_dir, name="default", version=0)
trainer = pl.Trainer(gpus=1, default_root_dir=save_dir, logger=logger)
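For context, one approach that might work is to locate the newest checkpoint file under save_dir and pass its path to the trainer explicitly, rather than relying on default_root_dir or the logger version to trigger a resume. Below is a minimal sketch of the lookup step only. It assumes checkpoints are written as *.ckpt files somewhere under save_dir; the helper name find_last_checkpoint is hypothetical and not part of Lightning. Which keyword accepts the path depends on the Lightning version: older releases took resume_from_checkpoint in pl.Trainer, newer ones take ckpt_path in trainer.fit.

```python
from pathlib import Path
from typing import Optional


def find_last_checkpoint(save_dir: str) -> Optional[str]:
    """Return the most recently modified *.ckpt file under save_dir, or None.

    Hypothetical helper: assumes checkpoints use the .ckpt extension and that
    modification time is a good proxy for "last checkpoint".
    """
    # Search recursively, since loggers usually nest checkpoints in
    # name/version subdirectories.
    checkpoints = list(Path(save_dir).rglob("*.ckpt"))
    if not checkpoints:
        return None
    newest = max(checkpoints, key=lambda p: p.stat().st_mtime)
    return str(newest)


# Possible usage (not run here; pick the form matching the installed version):
#   last_ckpt = find_last_checkpoint(save_dir)
#   # older Lightning releases:
#   trainer = pl.Trainer(gpus=1, default_root_dir=save_dir,
#                        resume_from_checkpoint=last_ckpt)
#   # newer Lightning releases:
#   trainer.fit(model, ckpt_path=last_ckpt)
```

If no checkpoint exists yet, the helper returns None, which in both forms above should mean training starts from scratch.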