I need to use a custom DDP implementation. With Lightning, the API and docs are unclear about whether I have to extend LightningDistributedDataParallel or whether I can extend torch DistributedDataParallel directly.
The docs suggest configure_ddp should work with torch DistributedDataParallel: https://pytorch-lightning.readthedocs.io/en/latest/lightning-module.html#configure-ddp
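For context, this is roughly what I'd like to do, following the configure_ddp hook from that page (the (model, device_ids) signature and the find_unused_parameters flag are my assumptions based on the docs, not something I've verified end to end):

```python
from torch.nn.parallel import DistributedDataParallel
import pytorch_lightning as pl

class MyModel(pl.LightningModule):
    # ... model definition, training_step, validation_step, etc. ...

    def configure_ddp(self, model, device_ids):
        # Wrap with plain torch DDP (or my own DistributedDataParallel
        # subclass) instead of LightningDistributedDataParallel.
        model = DistributedDataParallel(
            model,
            device_ids=device_ids,
            find_unused_parameters=True,
        )
        return model
```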
However, there are spots in Lightning that rely on isinstance checks against the Lightning-specific overrides: https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pytorch_lightning/trainer/model_connector.py#L31-L34
LightningDistributedDataParallel also forwards calls to training_step/validation_step/test_step. Is that forwarding a requirement for custom DDP implementations used with Lightning?
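If subclassing the Lightning override is in fact required, is something like this the expected pattern? (The import path is just what I see in my current version; super().forward is my guess at how to keep the step-forwarding behavior intact.)

```python
from pytorch_lightning.overrides.data_parallel import LightningDistributedDataParallel

class MyCustomDDP(LightningDistributedDataParallel):
    def forward(self, *inputs, **kwargs):
        # Custom communication/gradient logic would go here.
        # Calling super().forward keeps Lightning's behavior of routing
        # calls to training_step/validation_step/test_step.
        return super().forward(*inputs, **kwargs)
```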