I am using
EvalResult to report metrics such as accuracy. In addition, I group the examples in each batch by the category they belong to and calculate the accuracy for each grouped subset.
As an example, if my output is
out=[0, 1, 1, 0] and if my ground truth is
gt=[1, 1, 1, 1] my accuracy across the batch is 50%. If my category vector is
cat=[A, B, B, A] my accuracy for category
A would be 0% and for category
B would be 100%.
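To make the numbers above concrete, here is a minimal plain-Python sketch of the grouping. The helper name `accuracy_by_category` is made up for illustration and is not part of Lightning. A category with no examples in the batch simply does not appear in the returned dict.

```python
from collections import defaultdict


def accuracy_by_category(out, gt, cat):
    """Return the overall accuracy and a dict of per-category accuracies."""
    hits = defaultdict(int)
    counts = defaultdict(int)
    for o, g, c in zip(out, gt, cat):
        counts[c] += 1
        hits[c] += int(o == g)
    overall = sum(hits.values()) / len(out)
    # Only categories that occur in this batch get an entry.
    per_category = {c: hits[c] / counts[c] for c in counts}
    return overall, per_category


overall, per_category = accuracy_by_category(
    out=[0, 1, 1, 0], gt=[1, 1, 1, 1], cat=["A", "B", "B", "A"]
)
print(overall)       # 0.5
print(per_category)  # {'A': 0.0, 'B': 1.0}
```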
Sometimes a batch contains no examples of a specific category (e.g.
cat=[A, A, A, A]) and thus there is no accuracy to compute (in this case for category
B). Therefore, I do not log it with
result.log("accuracy_groupX", ...). I believe skipping the log call is a better way to handle this than logging zero or repeating the last value.
But if some of the logged values in the
EvalResult object do not have the same number of entries as the others, this crashes the
weighted_mean operation in line 366 of the
reduce_on_epoch_end function in
This seems to be the case because the weight for the weighted mean (the
batch_sizes variable) assumes there is a datapoint to use in the weighted mean, but in reality there is none.
If len(result[k]) is unequal to the number of entries in batch_sizes, the
weighted_mean function will fail, crashing the entire training.
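The mismatch can be illustrated outside of Lightning with a small stand-in. Note that this `weighted_mean` is my own simplified plain-Python version, not the library's implementation, and the batch values are invented for the example. One metric has an entry for every batch, while the per-category metric is missing an entry for the batch in which it was not logged.

```python
def weighted_mean(values, weights):
    """Simplified stand-in: mean of values weighted by batch size."""
    if len(values) != len(weights):
        raise ValueError(
            f"got {len(values)} values but {len(weights)} weights"
        )
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


batch_sizes = [4, 4, 4]          # one weight per batch
accuracy = [0.5, 0.75, 0.25]     # logged in every batch
accuracy_group_b = [1.0, 0.5]    # not logged in the batch without category B

print(weighted_mean(accuracy, batch_sizes))  # 0.5

mismatch_error = None
try:
    weighted_mean(accuracy_group_b, batch_sizes)
except ValueError as err:
    mismatch_error = err
    print("reduction failed:", err)
```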
Is there a way to navigate around this bug or maybe even to fix it?