Truncated BPTT: training monitoring and batching

Hi, I'm working with an RNN model on very long sequences, of which I don't have many. I've set the truncated_bptt_steps attribute, but the trouble is that my dataset is still very small as far as the Trainer is concerned, even though each sample is split into many smaller chunks. The number of splits doesn't seem to be tracked inside the Trainer, so I'm not getting loss updates in real time. Instead I have to wait for an entire sequence to be processed, which can be 10-20% of the whole dataset. Are there modifications I can make to the Trainer to make monitoring easier? Overall this feels a bit clunky, and I presume there must be a better way.
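
For reference, this is roughly the behaviour I'm after, written as a manual loop in plain PyTorch rather than through the Trainer. It is only a sketch: the helper names (tbptt_chunks, train_one_sequence), the GRU, the MSE loss and the tensor shapes are placeholders I made up for illustration, not anything from Lightning. The point is that the loss is available after every chunk, not only after the whole sequence.

```python
import torch
from torch import nn


def tbptt_chunks(seq, steps):
    """Yield consecutive chunks of at most `steps` time steps along dim 1."""
    for start in range(0, seq.size(1), steps):
        yield seq[:, start:start + steps]


def train_one_sequence(rnn, head, optimizer, x, y, steps):
    """Truncated BPTT over one long batch; returns the loss of every chunk."""
    hidden = None
    chunk_losses = []
    for x_chunk, y_chunk in zip(tbptt_chunks(x, steps), tbptt_chunks(y, steps)):
        optimizer.zero_grad()
        out, hidden = rnn(x_chunk, hidden)
        loss = nn.functional.mse_loss(head(out), y_chunk)
        loss.backward()
        optimizer.step()
        # Cut the graph here so gradients never flow past this chunk.
        # (A GRU hidden state is a single tensor; an LSTM returns a tuple.)
        hidden = hidden.detach()
        chunk_losses.append(loss.item())  # could be logged right away
    return chunk_losses


# Toy data (assumed shapes): batch of 2, 25 time steps, 3 features.
torch.manual_seed(0)
rnn = nn.GRU(input_size=3, hidden_size=8, batch_first=True)
head = nn.Linear(8, 1)
params = list(rnn.parameters()) + list(head.parameters())
optimizer = torch.optim.SGD(params, lr=0.01)
x = torch.randn(2, 25, 3)
y = torch.randn(2, 25, 1)

losses = train_one_sequence(rnn, head, optimizer, x, y, steps=10)
```

With 25 time steps and steps=10 this produces three chunks (10, 10 and 5 steps), so I get three loss values per sequence instead of one.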

Incidentally, are there out-of-the-box tools for truncated BPTT on long sequences that each have a different length and therefore need to be masked in some way?
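
To make the masking part concrete, here is a minimal sketch of what I would otherwise write by hand: pad the sequences to a common length, build a boolean mask from the true lengths, and slice the mask together with each chunk so padded steps don't contribute to the loss. The helpers length_mask and masked_mse are hypothetical names of my own, and the zero predictions just stand in for a model's output.

```python
import torch
from torch.nn.utils.rnn import pad_sequence


def length_mask(lengths, max_len):
    """Boolean mask (batch, max_len): True where a time step holds real data."""
    return torch.arange(max_len).unsqueeze(0) < lengths.unsqueeze(1)


def masked_mse(pred, target, mask):
    """Mean squared error averaged over unmasked time steps only."""
    per_step = ((pred - target) ** 2).mean(dim=-1)  # (batch, time)
    mask = mask.to(per_step.dtype)
    # clamp avoids division by zero when a chunk contains only padding
    return (per_step * mask).sum() / mask.sum().clamp(min=1)


# Two toy sequences of different length (7 and 4 steps, 3 features each).
torch.manual_seed(0)
seqs = [torch.randn(7, 3), torch.randn(4, 3)]
lengths = torch.tensor([len(s) for s in seqs])
target = pad_sequence(seqs, batch_first=True)  # (2, 7, 3), zero-padded
mask = length_mask(lengths, target.size(1))    # (2, 7)

# Slice data and mask with the same window for each truncated-BPTT chunk.
steps = 3
pred = torch.zeros_like(target)  # placeholder for model output
real_steps_per_chunk = []
chunk_losses = []
for start in range(0, target.size(1), steps):
    window = slice(start, start + steps)
    m = mask[:, window]
    real_steps_per_chunk.append(int(m.sum()))
    chunk_losses.append(masked_mse(pred[:, window], target[:, window], m))
```

Here real_steps_per_chunk comes out as [6, 4, 1], since the shorter sequence runs out during the second chunk. Note that this only masks the loss; the hidden state of a finished sequence would still be updated on padding unless that is handled separately.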

Thanks in advance.