Assigning to hparams not recommended
Apparently assigning directly to `self.hparams` is not recommended (and was nearly removed from PyTorch Lightning), according to the discussion found here: https://github.com/PyTorchLightning/pytorch-lightning/pull/4417#discussion_r514582193
I have the following transfer learning situation:
- Pre-trained weights come from PyTorch (without Lightning), and therefore do not contain the right hparams
- I want to allow for the following use-cases:
  1. Training the model from scratch in PL
  2. Loading the PyTorch weights and fine-tuning the model on a new task
  3. Loading newly trained PL weights WITHOUT needing access to the original PyTorch weights
Based on the sample rate of the audio, some hyperparameters have to be set to specific values for the pre-trained weights (case 2). Currently I’ve encoded this in the model, so that the user only has to provide the sample rate to correctly set the hparams (case 2), but it should still be possible to set any hparam value when training the model from scratch (case 1).
Currently my code looks like this:
```python
class Model(pl.LightningModule):
    def __init__(self, transfer_learning: bool, sample_rate: int, **kwargs):
        super().__init__()
        self.save_hyperparameters()
        # if using transfer learning
        if transfer_learning:
            self.set_pretrained_hparams_and_load_weights()

    def set_pretrained_hparams_and_load_weights(self):
        if self.hparams["sample_rate"] == 8000:
            self.hparams["window_size"] = 256
            self.hparams["hop_size"] = 80
        elif self.hparams["sample_rate"] == 16000:
            self.hparams["window_size"] = 512
            self.hparams["hop_size"] = 160
        ...
        # code to automatically set path to correct weights matching sample rate
        # code to load path with weights
```
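As an aside, the sample-rate-to-hparams mapping could also be factored out of the LightningModule into a plain function (a sketch with a name I made up; the values are taken from the code above), which keeps the mapping testable without touching `self.hparams`:

```python
# Hypothetical helper (name is my own): the hparam values that the
# pre-trained weights require for a given sample rate, mirroring
# set_pretrained_hparams_and_load_weights above.
PRETRAINED_HPARAMS = {
    8000: {"window_size": 256, "hop_size": 80},
    16000: {"window_size": 512, "hop_size": 160},
}


def pretrained_hparams(sample_rate: int) -> dict:
    """Return a copy of the hparams matching the pre-trained weights."""
    if sample_rate not in PRETRAINED_HPARAMS:
        raise ValueError(f"no pre-trained weights for sample rate {sample_rate}")
    return dict(PRETRAINED_HPARAMS[sample_rate])
```

The model's `__init__` could then merge this dict into its arguments in the transfer-learning case, so the derived values are known before anything is saved.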
With the above code, `self.save_hyperparameters()` will set `self.hparams["sample_rate"]`, but not e.g. `self.hparams["window_size"]`, which is only assigned afterwards.
As for case 3 (after performing case 2), I would load the model (on a system without the original PyTorch weights) with `Model.load_from_checkpoint(...)`, as this avoids needing access to the PyTorch weights (we have the PL weights after all).
Hparams like `self.hparams["window_size"]` still exist in the PL checkpoint, as I’ve manually assigned them, so there is no problem here.
How would I support all 3 use-cases if assigning to `self.hparams` were removed?
Possible ideas (and issues)
- Working with e.g. settings `.json` files. This is arguably more complex and creates more moving parts, which makes portability harder and riskier (e.g. when also using the model from C++, you would need special code to take care of this).
- Forcing the user to always provide all required hparams. This is user-unfriendly, increases the chance of mistakes, and creates a higher barrier to using the model (there is more to understand). The example above does not show all hparams that need to be set.
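For concreteness, the settings-file idea from the first bullet would look roughly like this (a sketch; the file name and layout are my assumptions), which is where the extra moving parts come from — the file must be shipped, found, parsed, and validated alongside the checkpoint by every consumer:

```python
import json
import tempfile

# Hypothetical settings file holding the derived hparams (layout assumed).
settings = {"sample_rate": 8000, "window_size": 256, "hop_size": 80}

# Writing the file: this would be shipped next to the PL checkpoint.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(settings, f)
    path = f.name

# Every consumer (including any C++ deployment) now needs its own code
# to locate and parse this file in addition to loading the checkpoint.
with open(path) as f:
    loaded = json.load(f)
```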
Unfortunately there is (what I consider to be) a bug that when assigning to `self.hparams`, this change is not reflected in