LOCAL_RANK environment variable


I’m trying to run PyTorch Lightning (0.8.5) with Horovod on a multi-GPU machine.

The issue I’m facing is that rank_zero_only.rank is always zero in each process (4-GPU machine).

By inspecting the environment, I saw that the four processes do not have a LOCAL_RANK environment variable set; instead, they have OMPI_COMM_WORLD_LOCAL_RANK (0 to 3).

Is that the cause of rank_zero_only not working? Where should the LOCAL_RANK env var come from?
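As a diagnostic, a small helper can check the env vars that different launchers set: LOCAL_RANK comes from torch.distributed-style launchers, while Open MPI (used when launching Horovod over MPI) sets OMPI_COMM_WORLD_LOCAL_RANK. This is only a sketch of a fallback lookup, not Lightning’s actual implementation; the variable names checked here are the common ones, and the helper name is made up:

```python
import os

def detect_local_rank() -> int:
    """Hypothetical helper: return the local rank from whichever
    launcher env var is present, falling back to 0 (single process).

    - LOCAL_RANK: set by torch.distributed launch utilities
    - OMPI_COMM_WORLD_LOCAL_RANK: set by Open MPI (horovodrun over MPI)
    """
    for var in ("LOCAL_RANK", "OMPI_COMM_WORLD_LOCAL_RANK"):
        value = os.environ.get(var)
        if value is not None:
            return int(value)
    return 0  # no launcher env var found; assume rank zero

if __name__ == "__main__":
    print(detect_local_rank())
```

When Horovod is initialized, hvd.local_rank() from horovod.torch is the canonical way to get this value, rather than reading the MPI env var directly.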


Hello, my apologies for the late reply. We are slowly deprecating this forum in favor of the built-in GitHub version. Could we kindly ask you to recreate your question there: Lightning Discussions