I understand that the input to the confusion matrix metric has to be in the shape of
batch_size x num_classes, e.g. for a binary classification problem
Sample 1: 0.75 0.25
Sample 2: 0.35 0.65
Also, the target is expected to be of int type rather than float, because torch.bincount is used internally.
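To make the integer requirement concrete, here is a minimal sketch of a bincount-based confusion matrix in plain PyTorch. It illustrates the general technique only and is not Ignite's actual implementation; the function name `confusion_matrix_sketch` is hypothetical.

```python
import torch


def confusion_matrix_sketch(y_pred, y, num_classes):
    """Hypothetical helper for illustration, not Ignite's implementation.

    y_pred: float tensor of shape (batch_size, num_classes)
    y:      integer tensor of shape (batch_size,)
    """
    # Reduce the per-class scores to one predicted label per sample.
    pred_labels = torch.argmax(y_pred, dim=1)
    # Encode each (target, prediction) pair as a single integer index.
    # torch.bincount only accepts non-negative integer tensors,
    # which is why a float target would fail here.
    indices = num_classes * y + pred_labels
    counts = torch.bincount(indices, minlength=num_classes ** 2)
    # Rows are true classes, columns are predicted classes.
    return counts.reshape(num_classes, num_classes)


y_pred = torch.tensor([[0.75, 0.25], [0.35, 0.65]])
y = torch.tensor([0, 1])
cm = confusion_matrix_sketch(y_pred, y, num_classes=2)
```

With the two samples above and assumed targets 0 and 1, both predictions land on the diagonal.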
I am wondering, if it would make sense to change the API so that input of the shape batch_size suffices, i.e.
Sample 1: 1
Sample 2: 0
Also a more consistent metric handling would be desirable because currently for some metrics like Accuracy the user has to manually round output before passing it to the metric whereas for others this is not necessary. I don't think it's good to clutter one's entire code with output_transform closures for this purpose.
I am wondering, if it would make sense to change the API so that input of the shape batch_size suffices
@CDitzel is your question about removing argmax or making it optional?
Maybe we should also keep in mind that CM can also be used for semantic segmentation, where y_pred.shape=(N, C, H, W) and y.shape=(N, H, W).
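As a rough sketch of why the class dimension matters for segmentation: taking the argmax over dim=1 works for (N, C) and (N, C, H, W) predictions alike and yields labels shaped like the target. The sizes below are made up for illustration.

```python
import torch

# Hypothetical sizes, chosen only for this example.
N, C, H, W = 2, 3, 4, 4
y_pred = torch.rand(N, C, H, W)     # per-pixel class scores
y = torch.randint(0, C, (N, H, W))  # per-pixel integer labels

# argmax over the class dimension removes C and leaves (N, H, W),
# which matches the shape of the target.
pred_labels = torch.argmax(y_pred, dim=1)

# Flattening both tensors gives one (target, prediction) pair per pixel,
# which is the form a bincount-style confusion matrix needs.
flat_pred = pred_labels.reshape(-1)
flat_y = y.reshape(-1)
```

A 1-D prediction of shape batch_size has no class dimension to reduce, so supporting it would need a separate code path.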
Also a more consistent metric handling would be desirable because currently for some metrics like Accuracy the user has to manually round output before passing it to the metric whereas for others this is not necessary.
Well, this is necessary for binary and multilabel cases and holds for Accuracy, Precision, Recall metrics.
If you have an idea how to make it more lean, I'm happy to discuss more about this :)
I specifically mean this line
why do we force users to have multi dimensional prediction tensors?
For a binary classification problem one currently has to define an output_transform, i.e. something along the following lines
import torch

def transform(out):
    # out: 1-D tensor holding the probability of class 1 for each sample
    res = torch.empty(out.shape[0], 2)
    for i, sample in enumerate(out):
        res[i, 0] = 1 - sample  # probability of class 0
        res[i, 1] = sample      # probability of class 1
    # dev is the target torch.device, defined elsewhere
    return res.to(dev)
prior to instantiating the confusion matrix metric to make the prediction tensor artificially two-dimensional
@CDitzel Okay, I see you would like to cover binary case. Maybe we can do something with it.
Concerning your transform code: first, it assumes that the output consists of probabilities (not logits), and in any case it can be written as a single line like this:
from ignite.utils import to_onehot
transform = lambda out: to_onehot((out > 0.5).long(), 2)
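For readers without Ignite at hand, the same idea can be sketched in plain PyTorch with torch.nn.functional.one_hot. This is only an approximation of the one-liner above, assuming `out` is a 1-D tensor of probabilities. The `from_logits` flag and the name `binary_to_two_class` are hypothetical additions for the case where the model emits logits.

```python
import torch
import torch.nn.functional as F


def binary_to_two_class(out, from_logits=False):
    """Hypothetical helper: turn 1-D binary outputs into (batch_size, 2) one-hot rows."""
    if from_logits:
        # Assumption: raw logits are mapped to probabilities with a sigmoid.
        out = torch.sigmoid(out)
    # Threshold at 0.5 to get integer class labels (one_hot needs int64).
    labels = (out > 0.5).long()
    return F.one_hot(labels, num_classes=2)


probs = torch.tensor([0.25, 0.65])
two_class = binary_to_two_class(probs)

logits = torch.tensor([-1.0, 2.0])
two_class_from_logits = binary_to_two_class(logits, from_logits=True)
```

Taking the argmax over dim=1 of the result recovers the thresholded labels.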
Hm, so my issue is literally based on my ignorance of the Ignite lib and of deep learning as a whole. Embarrassing. Thank you, sir.
No problems :) Feel free to close the issue if it answers your question. Thanks