I’m trying to use the 4 GPUs on my machine to train a Hugging Face model for a project. Single-GPU training with 32-bit precision works without any problems (16-bit is not working, and I’ve asked a separate question about that here). Multi-GPU training with 32-bit precision just hangs. I’m running this in a Jupyter notebook, and I saw this error on the terminal:
Error: mkl-service + Intel(R) MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp-7c85b1e2.so.1 library. Try to import numpy first or set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it.
This seems to be a PyTorch error that is discussed here, and a couple of solutions are proposed, one of which is to import torch.multiprocessing. I am importing numpy first (and don’t actually import torch.multiprocessing myself), but I’m not sure how PL handles this internally.
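For reference, the workaround I tried follows the error message’s own advice. This is only a sketch of what I put at the very top of the notebook, before any other imports (the choice of `GNU` as the threading layer is my assumption; the message also mentions `MKL_SERVICE_FORCE_INTEL` as an alternative):

```python
import os

# Assumed workaround: pick the MKL threading layer explicitly
# before numpy/torch (and their MKL bindings) are imported.
os.environ["MKL_THREADING_LAYER"] = "GNU"

# Alternative suggested by the error message itself:
# keep the Intel layer but force it past the libgomp conflict.
# os.environ["MKL_SERVICE_FORCE_INTEL"] = "1"
```

Both variants only take effect if they run before the first `import numpy` / `import torch` in the process, which is why I’m unsure whether PL’s own imports defeat this in the multi-GPU case.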
I’m using PyTorch version 1.7.1 and PL version 1.1.2. Has anyone else run into this problem? Is there a solution to it?