Can we train for a set number of iterations without setting `max_epochs`? I now know there are `max_steps` and `limit_train_batches` to control how many training steps are taken overall and per epoch, respectively. However, `max_steps` runs into issues if `max_steps * batch_size > len(train_dataset)`. What would break if epochs were an optional argument to the trainer? Is there an alternative, recommended way to convert training loops of this style over to Lightning?
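To make the failure mode concrete, here is a small pure-Python sketch (not Lightning itself; the dataset size, batch size, and the default cap of 1000 epochs are assumptions for illustration) of how the epoch cap can trigger before the requested step count:

```python
def steps_actually_run(dataset_len, batch_size, max_steps, max_epochs=1000):
    """Simulate a trainer loop that stops at whichever of
    max_steps / max_epochs is hit first (illustrative only)."""
    steps_per_epoch = dataset_len // batch_size
    global_step = 0
    for _epoch in range(max_epochs):
        for _batch in range(steps_per_epoch):
            global_step += 1
            if global_step >= max_steps:
                return global_step
    return global_step

# 100 samples, batch size 10 -> 10 steps per epoch.
# With the default epoch cap, training stops at 10_000 steps,
# short of the 20_000 steps we asked for.
print(steps_actually_run(100, 10, max_steps=20_000))  # 10000
```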
A super hacky solution would be to set `max_epochs` to an outrageously large value and set `max_steps` to my desired iteration count. Then we'd break from the train loop according to `max_steps`, since we'd hit it first instead of `max_epochs`.
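Sketched out, the workaround looks like this (the commented `Trainer` call is what it would be in Lightning; the loop below is a self-contained stand-in showing why the huge epoch cap makes `max_steps` the binding condition):

```python
# In Lightning the workaround would look roughly like:
#   trainer = pl.Trainer(max_epochs=10**9, max_steps=5_000)
#
# Illustrative loop: with an absurdly large epoch cap,
# max_steps is always the condition that fires first.
def run(steps_per_epoch, max_steps, max_epochs):
    step = 0
    for _ in range(max_epochs):
        for _ in range(steps_per_epoch):
            step += 1
            if step >= max_steps:
                return step
    return step

print(run(steps_per_epoch=10, max_steps=5_000, max_epochs=10**9))  # 5000
```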
as suggested by @awaelchli
Maybe the solution is simple:
- make both optional (default `None`)
- if both are unset: fall back to `max_epochs=1000` (the current default)
- if only `max_steps` is set: use it (keep `max_epochs=None`)
- if both are set: stop based on whichever condition is met first
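The rules above could be resolved along these lines (a sketch, not Lightning's actual implementation; the function names are hypothetical):

```python
def resolve_limits(max_epochs=None, max_steps=None):
    """Apply the proposed defaulting rules; return the
    effective (max_epochs, max_steps) pair."""
    if max_epochs is None and max_steps is None:
        # both unset: keep today's behaviour
        return 1000, None
    # otherwise honour whatever was passed; the loop stops
    # on whichever configured limit is reached first
    return max_epochs, max_steps

def should_stop(epoch, step, max_epochs, max_steps):
    """True once either configured limit is reached."""
    if max_epochs is not None and epoch >= max_epochs:
        return True
    if max_steps is not None and step >= max_steps:
        return True
    return False

print(resolve_limits())                 # (1000, None)
print(resolve_limits(max_steps=5_000))  # (None, 5000)
```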