I am running into an issue training on a TPU with PyTorch Lightning. The work is for a competition hosted on DrivenData, where I am training a BERT-based classification model.

However, after the first epoch the kernel dies with an out-of-memory error. Could you please guide me? The notebook is here: https://www.kaggle.com/aninda/pytorch-lightning-genetic-transformers
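
To help narrow this down, one thing that could be checked is how peak host memory changes from one epoch to the next. Below is a minimal sketch using only the Python standard library. It is not taken from the notebook: `peak_rss_mb` and `log_epoch_memory` are hypothetical helper names, and the loop is only a stand-in for real training epochs. It measures host (CPU) RAM of the current process, not TPU device memory, and the `resource` module is only available on Unix-like systems.

```python
import resource
import sys


def peak_rss_mb() -> float:
    """Return the peak resident set size of this process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def log_epoch_memory(epoch: int, history: list) -> float:
    """Record and print the peak host memory seen so far (hypothetical helper)."""
    mb = peak_rss_mb()
    history.append((epoch, mb))
    print(f"epoch {epoch}: peak host RSS = {mb:.1f} MiB")
    return mb


history = []
for epoch in range(2):
    # Stand-in for one training epoch: allocate some data that stays alive
    # until the end of the epoch, as accumulated outputs would.
    held = [bytearray(1024 * 1024) for _ in range(10 * (epoch + 1))]
    log_epoch_memory(epoch, history)
```

In a Lightning setup, a call like `log_epoch_memory` could presumably go in the `on_train_epoch_end` hook. If the number jumps sharply at the end of the first epoch, that would point at something being accumulated on the host during the epoch (for example stored step outputs) rather than at the model size itself.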