How to handle non-learnable params

In my model, I have a sub-module that contains some non-learnable, pre-calculated tensors. Without wrapping them in nn.Parameter, I can't get PyTorch Lightning to move them to the GPU automatically. However, since I don't want them to learn anything, I would have to set requires_grad=False, which also cuts the gradient at that point. How should I handle this? Thanks.

Use buffers:
https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.register_buffer

This will move the tensors automatically along with the rest of the module. It's just PyTorch, so it works with Lightning.
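
A minimal sketch of what that looks like (the module name, the `table` tensor, and the lookup in `forward` are just placeholders for your pre-calculated tensors):

```python
import torch
import torch.nn as nn


class LookupModule(nn.Module):
    def __init__(self):
        super().__init__()
        # Hypothetical pre-computed tensor. It is not an nn.Parameter,
        # so it never appears in parameters() and the optimizer never
        # touches it.
        table = torch.linspace(0.0, 1.0, steps=100)
        # Registering it as a buffer makes it part of the module's
        # state: it moves with .to(device) / .cuda() (and therefore
        # with Lightning's device placement) and is saved in the
        # state_dict by default.
        self.register_buffer("table", table)

    def forward(self, idx):
        # Access it like any attribute; using a buffer does not block
        # gradients flowing through other inputs that require grad.
        return self.table[idx]
```

If you don't want the tensor stored in checkpoints, `register_buffer` also accepts `persistent=False`, which keeps the device-moving behavior but leaves the buffer out of the state_dict.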