How do you maximize GPU utilization in PyTorch, especially on shared servers or clusters where you don’t know exactly how much GPU RAM you have?
On my local GPU, I usually experiment with the number of workers and the batch size.
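
One way to make that trial and error systematic is to search for the largest batch size that still fits in memory. The following is only a minimal sketch under assumptions: `fits` is a hypothetical callable that runs one training step at the given batch size and returns False if it hits a CUDA out-of-memory error, and `largest_fitting_batch_size` with its `start` and `limit` parameters is an illustrative name, not a PyTorch API.

```python
from typing import Callable


def largest_fitting_batch_size(fits: Callable[[int], bool],
                               start: int = 1,
                               limit: int = 4096) -> int:
    """Return the largest batch size in [start, limit] for which fits() is True.

    Assumes fits() is monotonic: if one batch size fails, every larger one fails.
    Returns 0 if even `start` does not fit.
    """
    if not fits(start):
        return 0

    # Phase 1: keep doubling until a size fails or the limit is passed.
    low = start
    high = start * 2
    while high <= limit and fits(high):
        low = high
        high *= 2

    # `high` is the first size known (or, past the limit, assumed) not to fit.
    high = min(high, limit + 1)

    # Phase 2: binary search between the last success and the first failure.
    while high - low > 1:
        mid = (low + high) // 2
        if fits(mid):
            low = mid
        else:
            high = mid
    return low
```

In a real run, `fits` would wrap a forward and backward pass in a try/except and free cached memory between attempts; here it is left abstract so the search logic can be checked on its own, for example with `lambda b: b <= 100` standing in for a GPU that runs out of memory above 100 samples.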
My other question: when training on several GPUs, is the batch size we set per GPU or the total across all GPUs? Should we change the batch size when we increase the number of GPUs?
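
To make the two readings of that question concrete, here is the arithmetic as a small sketch. It does not claim which reading applies (that likely depends on how the GPUs are driven); `batch_sizes` and its parameters are hypothetical names used only for illustration.

```python
from typing import Tuple


def batch_sizes(configured: int, num_gpus: int,
                configured_is_per_gpu: bool) -> Tuple[int, int]:
    """Return (per_gpu, total) samples processed in one optimizer step."""
    if configured_is_per_gpu:
        # Each GPU sees `configured` samples, so the total grows with GPU count.
        return configured, configured * num_gpus
    # The configured value is the total and is split evenly across GPUs.
    return configured // num_gpus, configured
```

With a configured batch size of 64 on 4 GPUs, the first reading gives 64 per GPU and 256 in total, while the second gives 16 per GPU and 64 in total, which is why the answer matters for both memory use and the effective batch size.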