LR Scheduler with interval="step" not working properly

Hi all,

I’m seeing some weird behavior with an ExponentialLR scheduler.
I want to perform a scheduler step every 200 training steps, so my configure_optimizers() returns a scheduler config like this:

    {"scheduler": scheduler, "interval": "step", "frequency": 200}
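
For context, here is a minimal sketch of how I build that config (the model, learning rate, and gamma below are illustrative placeholders, not my actual values):

```python
import torch

# Placeholder model just to have parameters to optimize
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Multiply the LR by gamma on every scheduler step
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.95)

lr_scheduler_config = {
    "scheduler": scheduler,
    "interval": "step",   # step the scheduler on training steps, not epochs
    "frequency": 200,     # intended: one scheduler step every 200 training steps
}
```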

It looks to me that the step counter used to perform the scheduler step is reset every epoch. In my case, every epoch is 216 steps long and what I observe is that the first scheduler step is performed at global step 200, the second at 416 (216 + 200), the third at 632 (216 + 216 + 200), etc.
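
The pattern I observe is reproduced by this pure-Python sketch, assuming the frequency check is done against a per-epoch batch counter instead of the global step:

```python
epoch_len = 216   # steps per epoch in my setup
frequency = 200   # scheduler "frequency" from the config
global_step = 0
scheduler_steps = []

for epoch in range(3):
    for batch_idx in range(epoch_len):
        global_step += 1
        # The per-epoch counter resets every epoch, so the check fires at
        # global steps 200, 416, 632, ... instead of 200, 400, 600, ...
        if (batch_idx + 1) % frequency == 0:
            scheduler_steps.append(global_step)

print(scheduler_steps)  # [200, 416, 632]
```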

Is that the intended behavior? Is it then not possible to perform a scheduler step every N training steps regardless of the epoch length?

Thanks in advance for any help!

It looks like the counter used for the check is trainer.batch_index rather than trainer.global_step (code).
That doesn’t make much sense to me.
