Apparently my code is moving a lot of stuff to the GPU, thus slowing it down.
Is there a tool or method to easily profile where GPU memory access is coming from (moving data to or from the GPU), at a fine-grained level, so I can fix the exact problem?
Attached is a graph from the wandb.ai dashboard.
From Lambda Labs: "GPU Memory Access %: This is an interesting one. It measures the percent of the time over the past sample period during which GPU memory was being read or written. We should keep this percent low because you want GPU to spend most of the time on computing instead of fetching data from its memory. In the above figure, the busy GPU has around 85% uptime accessing memory. This is very high and caused some performance problem. One way to lower the percent here is to increase the batch size, so data fetching becomes more efficient."
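For reference, here is the kind of fine-grained view I'm after. This is a minimal sketch using `torch.profiler` (assuming the training code is PyTorch, which the wandb dashboard suggests; the model and tensor shapes below are placeholders, not my actual code). Host-to-device transfers show up as `Memcpy HtoD` rows in the profiler table, and `with_stack=True` can attribute operations back to Python source lines:

```python
import torch
from torch.profiler import profile, ProfilerActivity

# Placeholder model and input; substitute your own training step.
model = torch.nn.Linear(512, 512)
x = torch.randn(64, 512)

# Profile CUDA activity only when a GPU is actually available.
activities = [ProfilerActivity.CPU]
if torch.cuda.is_available():
    model = model.cuda()
    activities.append(ProfilerActivity.CUDA)

with profile(
    activities=activities,
    profile_memory=True,   # track tensor allocations/frees
    record_shapes=True,    # record input shapes per op
    with_stack=True,       # attribute ops to Python source lines
) as prof:
    # The .cuda() call below is a host-to-device copy; on a GPU run it
    # appears in the profiler output as "Memcpy HtoD".
    inp = x.cuda() if torch.cuda.is_available() else x
    out = model(inp)
    out.sum().backward()

# Scan the table for "Memcpy HtoD" / "Memcpy DtoH" rows to locate
# host<->device transfers and their cost.
print(prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=20))
```

Outside of Python, `nsys profile python train.py` (NVIDIA Nsight Systems) gives a timeline of memcpy operations, but I was hoping for something that maps transfers back to lines of code more directly.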