LightningModule __init__ vs. setup method

When should I use __init__ or setup when using a ddp backend? Currently I initialise my model in __init__ without obvious bad effects, is this reasonable? Or should I be doing it in setup?

There doesn’t seem to be much guidance in https://pytorch-lightning.readthedocs.io/en/latest/lightning-module.html#lightningmodule-api about when setup should be used over __init__ and what common use cases are.

Thanks

I think that the __init__ here follows the same distributed initialization logic as in nn.Module, since LightningModule subclasses nn.Module. That being said, I don’t think it should make much of a difference whether you initialize your model in __init__ or setup.

What’s your use-case? Could you provide a MWE?

If there is anything in particular that you want to do differently depending on the stage being run, setup can be used. For example, if you need to get the model ready before testing, that can be done in setup('test').

That’s exactly it, I don’t know what the use cases for this functionality are :sweat_smile:, I’m trying to understand when setup is an appropriate choice over just initialising in __init__

The model should be initialized in __init__, and it will be moved to all the devices automatically in a distributed environment. setup should be used to initialize anything else that is required in every sub-process in DDP mode. You can also alter the model in setup if you want to change it dynamically on each sub-process. If you initialize anything else in __init__, it will be initialized on the main process and you won’t be able to access it in the sub-processes under DDP.

So any nn.Module fields are handled automatically, but beyond that we need to use setup? E.g. say I have a CSV I read in using pandas in __init__, that field would only be present on the main process and not on the others?

Correct.

Yes, if you need it in the sub-process then you should define it in setup.

You can also use datamodules to decouple the data handling from your model whenever you want. The use of setup becomes clearer there.

DataModule:

  • setup
  • prepare_data
  • train_dataloader
  • val_dataloader
  • test_dataloader

LightningModule:

  • __init__
  • training_step
  • validation_step (optional)
  • test_step (optional)
  • configure_optimizers
  • forward (optional)