How to save hparams when not provided as argument (apparently assigning to hparams is not recommended)?

Assigning to hparams is not recommended

Apparently assigning directly to self.hparams is not recommended (and was nearly removed from PyTorch Lightning) according to the discussion found here: https://github.com/PyTorchLightning/pytorch-lightning/pull/4417#discussion_r514582193

Use-cases

I have the following transfer learning situation:

  • Pre-trained weights come from plain PyTorch (without Lightning), and therefore do not contain the right hparams
  • Want to allow for (use-cases):
    1. Training model from scratch in PL
    2. Load PyTorch weights and fine-tune model on a new task
    3. Load the newly trained PL model WITHOUT needing access to the original PyTorch weights

Based on the sample rate of the audio, some hyperparameters have to be set to specific values to match the pre-trained weights (2). Currently I’ve encoded this in the model, so that the user only has to provide the sample rate to set the hparams correctly (2), but it should still be possible to set any hparam value when training the model from scratch (1).

Currently my code looks like this:

class Model(pl.LightningModule):
    def __init__(self, transfer_learning: bool, sample_rate: int, **kwargs):
        super().__init__()
        self.save_hyperparameters()

        # if using transfer learning
        if transfer_learning:
            self.set_pretrained_hparams_and_load_weights()

    def set_pretrained_hparams_and_load_weights(self):
        if self.hparams["sample_rate"] == 8000:
            self.hparams["window_size"] = 256
            self.hparams["hop_size"] = 80

        elif self.hparams["sample_rate"] == 16000:
            self.hparams["window_size"] = 512
            self.hparams["hop_size"] = 160
        ...
        # code to automatically set path to correct weights matching sample rate
        # code to load path with weights

With the above code self.save_hyperparameters() will set self.hparams["sample_rate"], but not e.g. self.hparams["window_size"].

As for case 3 (after performing 2), I would load the model (on a system without the original PyTorch weights) with:
Model.load_from_checkpoint(checkpoint_location, transfer_learning=False)
as this removes the need for access to the PyTorch weights (we have the PL weights after all).
The hparams like self.hparams["window_size"] still exist in the PL checkpoint, as I’ve manually assigned them, so there is no problem here.

Question

How would I support all 3 use-cases if assigning to self.hparams would be removed?

Possible ideas (and issues)

  • Working with e.g. settings stored in .json files. This is arguably more complex and creates more moving parts, which makes portability harder and riskier (e.g. when also using the model in C++, you would need special code to take care of this).
  • Forcing the user to always provide all required self.hparams. This is user-unfriendly, increases the chance of mistakes, and creates a higher barrier to using the model (the user needs to understand more). The above example does not show all hparams that need to be set.

Others

Unfortunately there is (what I consider to be) a bug that when assigning to self.hparams, this change is not reflected in hparams.yaml: https://github.com/PyTorchLightning/pytorch-lightning/issues/4316#issuecomment-719021302

pings

@awaelchli @goku @s-rog

Initializing self.hparams will be deprecated, but you can still access them using self.hparams and assign other params as well: self.hparams['some_param'] = some_value.

While I can confirm @goku's comment, I’d like to expand a bit on what hyperparameters are and what they are not. In Lightning, we define the hyperparameters as follows:

Hyperparameter: the set of arguments in the LightningModule’s init method.

This means the hyperparameters are exactly the parameters that we need to instantiate a LightningModule object.

By this definition, your window size is not a hyperparameter (and besides that it is also not a hyperparameter because you hardcode the value).
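To illustrate the definition above, here is a rough pure-Python sketch of "hyperparameters = the arguments of __init__". It is not how Lightning collects them internally; the helper name init_args and the plain Model class are hypothetical stand-ins, and the first parameter is assumed to be named self.

```python
import inspect


def init_args(cls, *args, **kwargs) -> dict:
    """Collect the arguments that cls.__init__ would receive (hypothetical helper)."""
    signature = inspect.signature(cls.__init__)
    bound = signature.bind(None, *args, **kwargs)  # None stands in for `self`
    bound.apply_defaults()
    collected = dict(bound.arguments)
    collected.pop("self")
    # Flatten a **kwargs parameter so its entries appear as individual arguments
    for name, param in signature.parameters.items():
        if param.kind is inspect.Parameter.VAR_KEYWORD:
            collected.update(collected.pop(name))
    return collected


class Model:  # plain class, used only to illustrate the idea
    def __init__(self, transfer_learning: bool, sample_rate: int, **kwargs):
        pass


hparams = init_args(Model, transfer_learning=True, sample_rate=8000)
```

Under this view, window_size only counts as a hyperparameter when the caller actually passes it to __init__ (here it would arrive through **kwargs).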

For these reasons, I recommend that you change your code in the following way:

def set_window_and_hop_size(self):
    if self.hparams["sample_rate"] == 8000:
        self.window_size = 256
        self.hop_size = 80

    elif self.hparams["sample_rate"] == 16000:
        self.window_size = 512
        self.hop_size = 160

It is important that we distinguish between hyperparameters (as defined above) and attributes of the LightningModule in general, or we land again in the pitfalls we crawled out of several months ago.

Sorry, I think I didn’t explain it clearly enough. window_size and hop_size can be supplied as arguments (and are therefore hyperparameters) when training the model from scratch. Only when using pre-trained weights are these values set automatically, to lessen the burden on the user.

So basically:

if self.hparams["sample_rate"] == 8000:
    self.hparams["window_size"] = 256
    self.hparams["hop_size"] = 80

can be seen as default values, which would normally be set with:

def __init__(self, transfer_learning: bool, sample_rate: int, window_size=256, hop_size=80)

Unfortunately, a function argument can only have one default value, and in my case the default depends on the given sample rate.


Think of calling the function like this:

Model(transfer_learning=True, sample_rate=8000)
# equals: Model(transfer_learning=True, sample_rate=8000, window_size=256, hop_size=80)
Model(transfer_learning=True, sample_rate=16000)
# equals: Model(transfer_learning=True, sample_rate=16000, window_size=512, hop_size=160)

# training from scratch, any value is acceptable
Model(transfer_learning=False, sample_rate=16000, window_size=1024, hop_size=512)
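One possible way to get defaults that depend on the sample rate is to use None as a sentinel and fill in the missing values before they are stored. This is only a pure-Python sketch of that idea, not Lightning code: resolve_hparams and PRETRAINED_DEFAULTS are hypothetical names, and only the 8000 and 16000 entries come from the example above. It also assumes that explicitly passed values should win over the table.

```python
from typing import Optional

# Values taken from the post; the table itself is a hypothetical construct.
PRETRAINED_DEFAULTS = {
    8000: {"window_size": 256, "hop_size": 80},
    16000: {"window_size": 512, "hop_size": 160},
}


def resolve_hparams(
    sample_rate: int,
    window_size: Optional[int] = None,
    hop_size: Optional[int] = None,
) -> dict:
    """Fill in values left as None using the defaults for this sample rate."""
    defaults = PRETRAINED_DEFAULTS.get(sample_rate, {})
    if window_size is None:
        window_size = defaults.get("window_size")
    if hop_size is None:
        hop_size = defaults.get("hop_size")
    # No explicit value and no table entry: the caller has to decide
    if window_size is None or hop_size is None:
        raise ValueError(
            f"window_size and hop_size must be given for sample_rate={sample_rate}"
        )
    return {
        "sample_rate": sample_rate,
        "window_size": window_size,
        "hop_size": hop_size,
    }


from_table = resolve_hparams(16000)
from_scratch = resolve_hparams(16000, window_size=1024, hop_size=512)
```

The resolved dictionary contains every value explicitly, so whatever is stored alongside the model no longer depends on the lookup table.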

If opting for a solution other than assigning to self.hparams, we would need to replicate the logic that self.hparams already provides, such as pickling and saving to hparams.yaml; overwriting some parameters when loading from a checkpoint would be a pain, and there are likely more issues. The whole reason to use Lightning is to avoid this extra engineering.

It’s fine that assigning to self.hparams is not the recommended way, but please don’t deprecate it.
Consider doing a community survey about how people are using assigning to self.hparams, because you might break more workflows than you imagine.

p.s. **kwargs should have been in the __init__, fixed this in the original post.

As @goku already said, we are aiming at deprecating the setter for self.hparams.
The getter (property) self.hparams will remain. This means you can do

# will always work
self.hparams["something"] = something

but

# will probably give you a warning in the future
# and will eventually be unsupported
self.hparams = something 
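The difference between the two cases can be seen with a small toy class. This is not Lightning's implementation, just a sketch of how a Python property behaves: ToyModule and its warning message are made up for illustration.

```python
import warnings


class ToyModule:
    """Stand-in for a LightningModule; NOT Lightning's actual code."""

    def __init__(self, **kwargs):
        self._hparams = dict(kwargs)

    @property
    def hparams(self) -> dict:
        # Getter: hands out the existing container
        return self._hparams

    @hparams.setter
    def hparams(self, value: dict) -> None:
        # Setter: only runs for `obj.hparams = ...`
        warnings.warn("assigning to hparams is deprecated", DeprecationWarning)
        self._hparams = dict(value)


model = ToyModule(sample_rate=8000)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Item assignment goes through the getter, then mutates the dict
    model.hparams["window_size"] = 256
    after_item = dict(model.hparams)
    item_warnings = len(caught)
    # Replacing the whole object goes through the setter
    model.hparams = {"sample_rate": 16000}
    setter_warnings = len(caught) - item_warnings
```

Item assignment never touches the setter, which is why deprecating the setter leaves self.hparams["something"] = something unaffected.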

We’re not breaking any fundamental functionality here, and we will not just remove something from one day to the next. self.save_hyperparameters() was introduced a while ago, so we are gradually updating the docs and examples to make people aware of the new, improved way.

Is there something in your code sample that isn’t working? Happy to help.

Aah, I was the one who misunderstood. By “setter”, I assumed you meant assigning values to anything in self.hparams. I was worried that self.hparams["something"] = something would stop working.
Thank you for clarifying this.

I agree that no longer allowing self.hparams = something is a good idea.


So if self.hparams["something"] = something is supported in PyTorch Lightning, is it then a bug that this (unlike self.save_hyperparameters()) does not update hparams.yaml?

I gave a reply on the GitHub issue.
