I’m running a DL model to try to classify some data into two categories (1/0), though I admit I’m not sure there is any underlying structure that would allow the classification to succeed. Nonetheless, I don’t understand why the validation score remains identical after each epoch.
Batch size = 1024
Train data = 900_000 rows
Val data = 100_000 rows
...
self.layers = nn.Sequential(
    nn.Linear(350, 1024 * 16),
    nn.LeakyReLU(),
    nn.Linear(1024 * 16, 1024 * 8),
    nn.LeakyReLU(),
    nn.Linear(1024 * 8, 1024 * 8),
    nn.LeakyReLU(),
    nn.Linear(1024 * 8, 1024 * 8),
    nn.LeakyReLU(),
    nn.Linear(1024 * 8, 1024 * 4),
    nn.LeakyReLU(),
    nn.Linear(1024 * 4, 1024 * 4),
    nn.LeakyReLU(),
    nn.Linear(1024 * 4, 256),
    nn.LeakyReLU(),
    nn.Linear(256, 1),
    nn.Sigmoid(),
)

def forward(self, x):
    return self.layers(x.float())

def training_step(self, batch, batch_idx):
    x, y = batch
    preds = self.layers(x.float())
    loss = self.criterion(preds, y.float())  # nn.BCELoss()
    acc = FM.accuracy(preds > 0.5, y)
    metrics = {'train_acc': acc.item(), 'train_loss': loss.item()}
    self.log_dict(metrics)
    return loss

def validation_step(self, batch, batch_idx):
    x, y = batch
    preds = self(x.float())
    loss = self.criterion(preds, y.float())  # nn.BCELoss()
    acc = FM.accuracy(preds > 0.5, y)
    metrics = {'val_acc': acc.item(), 'val_loss': loss.item()}
    self.log_dict(metrics)
    return metrics
The val_loss remains stable at 48.79 after each and every epoch (tested for up to 10 epochs; the same is true for val_acc, which doesn’t change), which is weird. I would expect some slight variation even if the model doesn’t have much to learn from the data. At the very least some overfitting should be possible, since the model has 300 million+ parameters in total.
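The 300 million+ figure can be double-checked directly from the layer widths above; here is a quick back-of-the-envelope count (pure Python, no torch needed): each nn.Linear(i, o) contributes i*o weights plus o biases.

```python
# Layer widths of the Sequential above, input to output.
sizes = [350, 1024 * 16, 1024 * 8, 1024 * 8, 1024 * 8,
         1024 * 4, 1024 * 4, 256, 1]

# Each consecutive pair (i, o) is one nn.Linear: i*o weights + o biases.
total = sum(i * o + o for i, o in zip(sizes, sizes[1:]))
print(f"{total:,}")  # 325,599,745 -> roughly 325M parameters
```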
However, the train_loss does vary from batch to batch.
So, in conclusion: I don’t know why the validation loss does not change from one epoch to the next and remains stable at 48.79. Am I missing something?
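For reference, this is the kind of sanity check I could run on a single batch (a stand-in one-layer model and random data here, since the real dataset isn’t shown) to see whether the sigmoid outputs are saturated — if min/max hug 0.0/1.0, the thresholded accuracy can stay frozen even while the weights move slightly:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for the real network: a single Linear + Sigmoid head on 350 features.
model = nn.Sequential(nn.Linear(350, 1), nn.Sigmoid())
x = torch.randn(1024, 350)  # fake batch matching batch_size=1024, 350 features

with torch.no_grad():
    preds = model(x)

# Shape and output range of the predictions; note preds is [1024, 1], not [1024].
print(preds.shape, preds.min().item(), preds.max().item())
```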