Calling distributed functions in data module setup

dilip · December 6, 2020, 6:53pm

I’d like to call torch distributed functions in my data module’s setup function, but the process group hasn’t been initialized yet. Is there any PyTorch Lightning functionality that I can call to initialize the process group manually, but correctly, e.g. using the local ranks as assigned by PyTorch Lightning?

tchaton · December 8, 2020, 9:10am

Hey there,

Would you mind describing your use case?

Best,
T.C

dilip · December 8, 2020, 2:59pm

Sure. I have a large dataset of images from which I would like to extract patches. These patches will be the input to my model. I describe each patch by its coordinates and dimensions. I’d like to distribute this patch extraction across all the nodes I’m using, so ideally, in my setup function, I’d have something like:

node_specific_images = get_image_subset(images, local_rank)
local_patch_info = extract_patches(node_specific_images)
all_patch_info = allgather(local_patch_info, local_rank)
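
For illustration, here is a minimal Python sketch of what that pseudocode could look like. It rests on several assumptions that are not from the post: the images are split round-robin by global rank rather than local rank, extract_patches is a hypothetical stand-in that returns one fixed-size record per image, and the merge uses torch.distributed.all_gather_object. That call only works once the process group has been initialized, so the sketch falls back to the local list when it has not been. It does not answer the question of how to initialize the group from within Lightning.

```python
from typing import Any, List, Sequence

try:  # torch is optional here so the sketch also runs on a machine without it
    import torch.distributed as dist
except ImportError:
    dist = None


def get_image_subset(images: Sequence[Any], rank: int, world_size: int) -> List[Any]:
    """Round-robin shard: process `rank` takes every `world_size`-th image."""
    return list(images[rank::world_size])


def extract_patches(images: Sequence[Any]) -> List[tuple]:
    """Hypothetical stand-in: one (image, x, y, width, height) record per image."""
    return [(image, 0, 0, 32, 32) for image in images]


def gather_patch_info(local_patch_info: List[Any]) -> List[Any]:
    """Merge every process's list; return only the local list if no process group exists."""
    if dist is None or not dist.is_available() or not dist.is_initialized():
        return list(local_patch_info)
    # One slot per process; all_gather_object fills them with each rank's pickled list.
    gathered = [None] * dist.get_world_size()
    dist.all_gather_object(gathered, local_patch_info)
    return [item for per_rank in gathered for item in per_rank]
```

Inside setup, the three lines above would then become calls to get_image_subset(images, rank, world_size), extract_patches(...), and gather_patch_info(...), with rank and world_size taken from wherever the training environment exposes them.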