System information
Describe the bug
Hello,
I would like to use tfa.metrics.CohenKappa from tensorflow_addons.
I ran into a problem when trying to use it. I have a function that creates a basic convolutional network, and I would like to use this metric with it.
However, when I do that, it raises an exception:
ValueError: Number of samples in `y_true` and `y_pred` are different
So I checked the code, and it seems that the shapes of the two tensors are not the same:
Tensor("Cast:0", shape=(None, None), dtype=int64)
Tensor("Cast_1:0", shape=(None, 5), dtype=int64)
I would like to know how I can specify the shape of y_pred so that it has the same shape as y_true.
Code to reproduce the issue
import tensorflow as tf
import tensorflow_addons as tfa


def convolution(categories=5, shape_x=224, shape_y=224, channels=3):
    model = tf.keras.models.Sequential([
        tf.keras.layers.Conv2D(10, kernel_size=(5, 5), strides=(1, 1), activation=tf.nn.relu, use_bias=True, input_shape=(shape_x, shape_y, channels)),
        tf.keras.layers.MaxPooling2D(pool_size=(2, 2), strides=(1, 1), padding='valid'),
        tf.keras.layers.Conv2D(10, kernel_size=(5, 5), strides=(1, 1), activation=tf.nn.relu, use_bias=True),
        tf.keras.layers.MaxPooling2D(pool_size=(2, 2), strides=(1, 1), padding='valid'),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation=tf.nn.relu),
        tf.keras.layers.Dense(categories, activation=tf.nn.softmax)
    ])
    model.compile(optimizer='adam', loss='mse', metrics=[tfa.metrics.CohenKappa(num_classes=5)])
    return model
Other info / logs
File "/home/rere/Project/Aptos/aptos2019-blindness-detection/model.py", line 38, in convolution
model.compile(optimizer='adam', loss='mse', metrics=[tfa.metrics.CohenKappa(num_classes=5)])
File "/home/rere/.local/lib/python3.6/site-packages/tensorflow_core/python/training/tracking/base.py", line 457, in _method_wrapper
result = method(self, *args, **kwargs)
File "/home/rere/.local/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 439, in compile
masks=self._prepare_output_masks())
File "/home/rere/.local/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 2004, in _handle_metrics
target, output, output_mask))
File "/home/rere/.local/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 1955, in _handle_per_output_metrics
metric_fn, y_true, y_pred, weights=weights, mask=mask)
File "/home/rere/.local/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_utils.py", line 1155, in call_metric_function
return metric_fn(y_true, y_pred, sample_weight=weights)
File "/home/rere/.local/lib/python3.6/site-packages/tensorflow_core/python/keras/metrics.py", line 196, in __call__
replica_local_fn, *args, **kwargs)
File "/home/rere/.local/lib/python3.6/site-packages/tensorflow_core/python/keras/distribute/distributed_training_utils.py", line 1135, in call_replica_local_fn
return fn(*args, **kwargs)
File "/home/rere/.local/lib/python3.6/site-packages/tensorflow_core/python/keras/metrics.py", line 179, in replica_local_fn
update_op = self.update_state(*args, **kwargs) # pylint: disable=not-callable
File "/home/rere/.local/lib/python3.6/site-packages/tensorflow_core/python/keras/utils/metrics_utils.py", line 76, in decorated
update_op = update_state_fn(*args, **kwargs)
File "/home/rere/.local/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 568, in __call__
result = self._call(*args, **kwds)
File "/home/rere/.local/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 615, in _call
self._initialize(args, kwds, add_initializers_to=initializers)
File "/home/rere/.local/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 497, in _initialize
*args, **kwds))
File "/home/rere/.local/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 2389, in _get_concrete_function_internal_garbage_collected
graph_function, _, _ = self._maybe_define_function(args, kwargs)
File "/home/rere/.local/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 2703, in _maybe_define_function
graph_function = self._create_graph_function(args, kwargs)
File "/home/rere/.local/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 2593, in _create_graph_function
capture_by_value=self._capture_by_value),
File "/home/rere/.local/lib/python3.6/site-packages/tensorflow_core/python/framework/func_graph.py", line 978, in func_graph_from_py_func
func_outputs = python_func(*func_args, **func_kwargs)
File "/home/rere/.local/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 439, in wrapped_fn
return weak_wrapped_fn().__wrapped__(*args, **kwds)
File "/home/rere/.local/lib/python3.6/site-packages/tensorflow_core/python/framework/func_graph.py", line 968, in wrapper
raise e.ag_error_metadata.to_exception(e)
ValueError: in converted code:
/home/rere/.local/lib/python3.6/site-packages/tensorflow_addons/metrics/cohens_kappa.py:122 update_state *
raise ValueError(
ValueError: Number of samples in `y_true` and `y_pred` are different
Thank you in advance for any help :)
Thanks for the report! I suppose you could remove the check directly. This is similar to #876 and #298. I will go through our metrics/losses over the weekend to check whether they are compatible with .compile().
Though that check is redundant (because it would be checked in tf.math.confusion_matrix), I think your example is not going to work with CohenKappa. In your last layer, the output would be shaped [num_samples, num_classes], where num_classes=5, while CohenKappa takes y_pred with shape [num_samples,] as argument.
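To make the shape mismatch concrete, here is a minimal, framework-free sketch. Plain Python lists stand in for tensors, and `argmax_rows` is a hypothetical helper written for this illustration, not part of tensorflow_addons. It shows how a `[num_samples, num_classes]` softmax output can be reduced to the `[num_samples,]` label vector that the metric expects.

```python
def argmax_rows(probs):
    """Return the index of the largest value in each row (hypothetical helper)."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


# Softmax-style output for 3 samples and 5 classes: shape (3, 5).
y_pred = [
    [0.10, 0.60, 0.10, 0.10, 0.10],
    [0.70, 0.10, 0.10, 0.05, 0.05],
    [0.05, 0.05, 0.10, 0.10, 0.70],
]

# Sparse integer labels for the same 3 samples: shape (3,).
y_true = [1, 0, 3]

# Reducing over the class axis yields one label per sample: shape (3,).
y_pred_labels = argmax_rows(y_pred)
print(y_pred_labels)  # [1, 0, 4]
```

After this reduction both sides hold one integer per sample, so a check on the number of samples passes.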
cc @AakashKumarNain
@WindQAQ I favor these checks because they prevent a lot of potential bugs where broadcasting can happen unknowingly, but I think there is no other way except removing them. At least, I am not aware of any other way to do proper checks.
@WindQAQ @facaiy I can think of the following scenarios for Kappa calculation:
Regression: the last layer has the shape (None, 1) while the shape of the true labels would be (num_samples,).
Classification: sigmoid/softmax activation, depending on whether it is binary classification or multi-class classification. In this case the last layer can have the shape (None, 1) or (None, num_classes). The true labels, on the other hand, can have different shapes. Here are all the scenarios I could think of:
##Regression##
y_true:(batch_size,)
y_pred:(batch_size, 1)
Round the predictions to get the predicted label. We can even include a parameter for the user to provide a custom threshold list.
##Classification##
Case1:
y_true: (None, num_classes) -> OHE
y_pred: (None, num_classes)
Use argmax on both tensors to find the labels and calculate kappa afterwards.
Case2:
y_true: (batch_size,) -> Using sparse labels instead of OHE
y_pred: (batch_size, num_classes)
Use argmax on the prediction tensor and calculate kappa afterwards.
All we need to do is include these checks and we are good to go IMO. Let me know what you think.
If it looks okay, I can push a fix.
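The scenarios above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: `to_labels` and `cohen_kappa` are hypothetical helpers, lists stand in for tensors, the kappa is the unweighted variant, and none of this is the tensorflow_addons implementation.

```python
from collections import Counter


def to_labels(values):
    """Convert one of the layouts above to a flat list of integer labels.

    - rows with several entries (one-hot or per-class scores): argmax
    - rows with a single entry, i.e. shape (batch_size, 1): round the value
    - bare numbers, i.e. shape (batch_size,): round the value
    Note: Python's round() rounds exact halves to the nearest even integer.
    """
    labels = []
    for v in values:
        if isinstance(v, (list, tuple)):
            if len(v) > 1:
                labels.append(max(range(len(v)), key=v.__getitem__))
            else:
                labels.append(int(round(v[0])))
        else:
            labels.append(int(round(v)))
    return labels


def cohen_kappa(y_true, y_pred):
    """Unweighted Cohen's kappa for two equal-length lists of labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    n = len(y_true)
    # Fraction of samples on which the two label lists agree.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Agreement expected by chance, from the marginal label counts.
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined in this degenerate case
    return (observed - expected) / (1.0 - expected)


# Classification, Case 1: one-hot y_true, per-class scores in y_pred.
y_true_ohe = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
y_pred_scores = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.3, 0.3, 0.4], [0.6, 0.3, 0.1]]
kappa_case1 = cohen_kappa(to_labels(y_true_ohe), to_labels(y_pred_scores))

# Classification, Case 2: sparse y_true, per-class scores in y_pred.
kappa_case2 = cohen_kappa(to_labels([0, 1, 2, 1]), to_labels(y_pred_scores))

# Regression: y_pred of shape (batch_size, 1) is rounded to labels.
regression_labels = to_labels([[0.2], [1.4], [2.9]])

print(round(kappa_case1, 3), round(kappa_case2, 3))  # 0.636 0.636
print(regression_labels)  # [0, 1, 3]
```

Both classification cases reduce to the same pair of label lists here, which is why they yield the same kappa.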
@WindQAQ Tzu-Wei, do you think is_compatible_with would work in this case?
@AakashKumarNain Hi, Aakash, I'm not sure I understand your question thoroughly. For tf.keras, I believe it uses different metrics to handle different label formats (e.g. AUC, BinaryAccuracy, CategoricalAccuracy, etc.)
I will ping you with the details on Gitter.
@WindQAQ Tzu-Wei, do you think is_compatible_with would work in this case?
Thank you, Facai, this should work!
@AakashKumarNain Hi, Aakash, I'm not sure I understand your question thoroughly. For tf.keras, I believe it uses different metrics to handle different label formats (e.g. AUC, BinaryAccuracy, CategoricalAccuracy, etc.)
This seems to be a good approach. If necessary, I vote for this solution.
Let's discuss this in detail. I think we aren't on the same page yet..lol
@WindQAQ @facaiy I have made a separate file for detailed discussion on this. You can find it here. Let me know what you think.
@AakashKumarNain Sorry for the delay, Aakash. Thanks for your detailed RFC, which looks really great. As said before, would it be possible to create a separate metric for each case you mentioned, for example CohenKappa, BinaryCohenKappa, CategoricalCohenKappa, etc. (similar to Accuracy, BinaryAccuracy, CategoricalAccuracy, ...)? What do you think? cc @WindQAQ @seanpmorgan
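A rough sketch of what the one-metric-per-label-format idea could look like is below. Everything in it is hypothetical: the class names follow the suggestion above, the `update_state`/`result` method names only mirror the Keras metric interface, plain lists stand in for tensors, and the kappa is unweighted. It is not the tensorflow_addons API.

```python
def _argmax(row):
    """Index of the largest entry in a row."""
    return max(range(len(row)), key=row.__getitem__)


class CohenKappa:
    """Hypothetical base metric: flat integer labels on both sides."""

    def __init__(self, num_classes):
        self.num_classes = num_classes
        # conf[i][j] counts samples with true label i and predicted label j.
        self.conf = [[0] * num_classes for _ in range(num_classes)]

    def _to_labels(self, y_true, y_pred):
        return list(y_true), list(y_pred)

    def update_state(self, y_true, y_pred):
        y_true, y_pred = self._to_labels(y_true, y_pred)
        if len(y_true) != len(y_pred):
            raise ValueError("Number of samples in y_true and y_pred are different")
        for t, p in zip(y_true, y_pred):
            self.conf[t][p] += 1

    def result(self):
        # Degenerate inputs (no samples, or chance agreement of 1) are not handled.
        n = sum(map(sum, self.conf))
        observed = sum(self.conf[i][i] for i in range(self.num_classes)) / n
        row_totals = [sum(r) for r in self.conf]
        col_totals = [sum(c) for c in zip(*self.conf)]
        expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
        return (observed - expected) / (1.0 - expected)


class BinaryCohenKappa(CohenKappa):
    """Hypothetical variant: y_pred holds sigmoid outputs of shape (batch, 1)."""

    def __init__(self, threshold=0.5):
        super().__init__(num_classes=2)
        self.threshold = threshold

    def _to_labels(self, y_true, y_pred):
        return list(y_true), [int(p[0] > self.threshold) for p in y_pred]


class CategoricalCohenKappa(CohenKappa):
    """Hypothetical variant: one-hot y_true and per-class scores in y_pred."""

    def _to_labels(self, y_true, y_pred):
        return [_argmax(r) for r in y_true], [_argmax(r) for r in y_pred]


categorical = CategoricalCohenKappa(num_classes=3)
categorical.update_state(
    [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]],
    [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.3, 0.3, 0.4], [0.6, 0.3, 0.1]],
)

binary = BinaryCohenKappa()
binary.update_state([1, 0, 1, 0], [[0.9], [0.2], [0.4], [0.1]])

print(round(categorical.result(), 3))  # 0.636
print(binary.result())  # 0.5
```

Each subclass only overrides how raw inputs become labels, so the shape handling lives in the class name the user picks rather than in runtime checks.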
Thanks @facaiy for the review. I discussed this with Francois as well. I will try to make it simpler in the coming weeks.