Working with multiple datasets

Hello,

First off, I want to doff my hat to the PL devs. You guys have done an amazing job!

I am fairly new to PL and have a question about working with multiple datasets. I have read some of the other threads on this topic, such as "How to use multiple train dataloaders with different lengths",
but I am not sure whether that approach works for my use case:

I have 2 datasets, one much bigger than the other. For every batch of the big dataset, I want to sample a random batch from the smaller dataset (with repeats). Is using the ConcatDataset approach the best way to do this?
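To make the desired behaviour concrete, here is a rough sketch in plain Python rather than PL code. The helper name `paired_batches` and the toy lists are made up for illustration; in practice the two lists would be real datasets.

```python
import random


def paired_batches(big, small, batch_size, seed=None):
    """Yield (big_batch, small_batch) pairs.

    Hypothetical helper: walks the big dataset once, in order, and for
    every big batch draws a fresh random batch from the small dataset
    WITH replacement, so small items may repeat.
    """
    rng = random.Random(seed)
    for start in range(0, len(big), batch_size):
        big_batch = big[start:start + batch_size]
        # random.choices samples with replacement
        small_batch = rng.choices(small, k=batch_size)
        yield big_batch, small_batch


# Toy stand-ins for the two datasets
big = list(range(10))
small = ["a", "b", "c"]
pairs = list(paired_batches(big, small, batch_size=4, seed=0))
```

The number of pairs is driven by the big dataset only, and the small dataset is never "exhausted".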

Thanks in Advance,
Gautam

hey @gautamb85

You might want to check out multiple_trainloader_mode in Trainer.

Check out the docs here.
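As far as I understand it, the cycling mode of that option ("max_size_cycle" in the versions I have used) restarts the shorter loader whenever it runs out, until the longest loader is exhausted. The plain-Python sketch below only illustrates that idea and is not Lightning's implementation; `cycle_shorter` and the toy batches are hypothetical names for this example.

```python
def cycle_shorter(long_loader, short_loader):
    """Pair every batch of the long loader with a batch of the short one.

    Illustration only (assumes short_loader is non-empty): when the short
    loader is exhausted it is restarted, so its batches repeat until the
    long loader has been fully consumed.
    """
    short_iter = iter(short_loader)
    for long_batch in long_loader:
        try:
            short_batch = next(short_iter)
        except StopIteration:
            # Restart the short loader; a shuffling loader would
            # produce a new order at this point.
            short_iter = iter(short_loader)
            short_batch = next(short_iter)
        yield long_batch, short_batch


# Toy stand-ins for the batches two dataloaders would produce
long_batches = [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
short_batches = [["a", "b"], ["c", "d"]]
combined = list(cycle_shorter(long_batches, short_batches))
```

Note that cycling replays the small dataset pass by pass, which is not quite the same as drawing each small batch independently with repeats. If truly independent draws are needed, giving the small DataLoader a torch.utils.data.RandomSampler with replacement=True may be closer to what you describe.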

Also, we have moved the discussions to GitHub Discussions. You might want to check that out instead to get a quick response. The forums will be marked read-only soon.

Thank you