Training hangs at Epoch 0 / 0% on TPU

Hi,

I am very new to PyTorch Lightning and to deep learning as well! I am converting a PyTorch project to Lightning. On Google Colab, when I run the trainer on CPU or GPU, it trains the model as expected. I haven't checked the output model yet, but training does run. It can find the batch_size and the initial learning rate, and fast_dev_run also runs smoothly.

But when I try to run it on TPU, it hangs at:

Epoch 0: 0% 0/2 [00:00<?, ?it/s]

I tried with and without fast_dev_run, with tpu_cores set to 1 and to 8, and with a batch_size of 32 and of 2, but it always hangs there. I let it run for 45 minutes and it was still stuck at the same point. How can I find out where the code is hanging and what I have to change?
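
For locating the hang, one option might be Python's standard `faulthandler` module, which can dump the stack of every thread if a call is still running after a timeout. Below is a minimal sketch under that assumption. The names `run_with_hang_watchdog`, `timeout_s`, and `log_path` are hypothetical and not part of Lightning, and the dump goes to a real file because notebook output streams may not expose a file descriptor.

```python
import faulthandler


def run_with_hang_watchdog(fn, timeout_s=60.0, log_path="hang_trace.log"):
    """Call fn(); while it is still running, dump all thread stacks to
    log_path every timeout_s seconds. (Hypothetical helper, not a Lightning API.)"""
    # faulthandler writes to the raw file descriptor, so the file must
    # stay open until the watchdog is cancelled.
    with open(log_path, "w") as log_file:
        faulthandler.dump_traceback_later(timeout_s, repeat=True, file=log_file)
        try:
            return fn()
        finally:
            # Stop the watchdog whether fn() returned or raised.
            faulthandler.cancel_dump_traceback_later()
```

For example, `run_with_hang_watchdog(lambda: trainer.fit(model), timeout_s=120)` would write a stack dump to `hang_trace.log` every two minutes while `fit` is still running, and the last frames listed show where the process is waiting. The watchdog only covers the process it is started in, so any child processes would not appear in the dump.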

Thank you very much for helping