System information
Describe the feature and the current behavior/state.
Hamming score is of great interest in multilabel classification.
Will this change the current api? How?
Yes, it will add a new feature
Who will benefit with this feature?
Anyone working with multilabel classification
Any Other info.
Initial colab notebook: https://colab.research.google.com/drive/1Msuv5xUu7lu5wDH1ei-VOPB-UnBolDfB
Clarification:
Do we need to hold state information for this?
@SSaishruthi Yes, we'll need a running sum of hamming loss and count increment every time update_state is called. The result can return the average value of hamming loss.
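The running-sum-and-count bookkeeping described above can be sketched in plain Python, independent of Keras. The class name `RunningMean` is hypothetical and only mirrors the `update_state`/`result` naming used in this thread; it illustrates the idea and is not the Keras implementation.

```python
class RunningMean:
    """Hypothetical accumulator: keeps a running total and a sample count."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update_state(self, per_sample_values):
        # Add every per-sample value from the batch to the running total.
        self.total += sum(per_sample_values)
        self.count += len(per_sample_values)

    def result(self):
        # Average over all samples seen so far (0.0 before any update).
        return self.total / self.count if self.count else 0.0


metric = RunningMean()
metric.update_state([1.0, 1.0, 1.0])  # first batch, three samples
metric.update_state([0.0])            # second, smaller batch
print(metric.result())                # 0.75
```

Averaging the two batch means instead would give 0.5, because the batches differ in size. Tracking the count avoids that discrepancy.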
@Squadrick
Perfect, thanks for the clarification. Will submit a PR soon.
@SSaishruthi Use this: MeanMetricWrapper. Keras already has something that wraps a stateless function and does the aggregate.
@Squadrick
Thanks again for the links. Will keep you posted about the updates.
@Squadrick
I tried wrapping hamming metrics. Below are the observations.
Using the epoch count and dividing it by the current results did not provide the desired result. Reference: https://colab.research.google.com/drive/1Msuv5xUu7lu5wDH1ei-VOPB-UnBolDfB#scrollTo=DvlbepsGFZjj&line=4&uniqifier=1
Using a count variable to hold the number of data points in a particular epoch worked fine. I am not able to import MeanMetricWrapper, so I used Mean.
If this is fine, I will create a PR with all supporting scripts.
@seanpmorgan @facaiy @WindQAQ
We can't import MeanMetricWrapper using tf.keras.metrics.MeanMetricWrapper, but can be imported using tf.python.keras.metrics.MeanMetricWrapper. Is the latter fine, or should I open a PR for TF master to tf_export the API for MeanMetricWrapper (here).
Exposing MeanMetricWrapper will make the implementation much cleaner.
def hamming_loss(y_true, y_pred, mode='multiclass'):
    if mode not in ['multiclass', 'multilabel']:
        raise TypeError('mode must be: [multiclass, multilabel]')
    if mode == 'multiclass':
        nonzero = tf.cast(tf.math.count_nonzero(y_true * y_pred, axis=-1), tf.float32)
        return 1.0 - nonzero
    else:
        nonzero = tf.cast(tf.math.count_nonzero(y_true - y_pred, axis=-1),
                          tf.float32)
        return nonzero / y_true.get_shape()[-1]

class HammingLoss(tf.python.keras.metrics.MeanMetricWrapper):
    def __init__(self, name='hamming_loss', dtype=None, mode='multiclass'):
        super(HammingLoss, self).__init__(
            hamming_loss, name, dtype=dtype, mode=mode)
@Squadrick so tf.python is not a public API and we should avoid it. You can bring this up in this issue:
https://github.com/tensorflow/tensorflow/issues/28601 to see what tf-core devs recommend. It may be exposing the API as public or just copying it statically into Addons.
@seanpmorgan @Squadrick
Should I proceed with Mean till we get a response on this?
@seanpmorgan @Squadrick
Are we going to have a version of MeanMetricWrapper in addons?
I think so, see https://github.com/tensorflow/tensorflow/issues/28601#issuecomment-505098700
I've copied the implementation from core TF to TFA: #316. Once that's merged, @SSaishruthi can proceed with the implementation.
Looks like the PR got merged. I will start working on that.
@Squadrick @facaiy
Getting this error when trying to import tensorflow addons in colab.
Any comment on how to get rid of this?
NotFoundError: libtensorflow_framework.so.2: cannot open shared object file: No such file or directory
@SSaishruthi Could you link the colab notebook? Be sure to run !pip install tensorflow==2.0.0-beta1 first. This error likely means that you're running tf2-alpha or tf1.x
@seanpmorgan
Colab link: https://colab.research.google.com/drive/1Msuv5xUu7lu5wDH1ei-VOPB-UnBolDfB
Using tf2-beta1
Could you try resetting the runtime and running the cells in order again? I just created a copy and it's working:
https://colab.research.google.com/drive/1wKDdQCirA4LEHdx4bgkQHP1YZZSAT-5I
@seanpmorgan Thanks
I was only resetting the current runtime. After resetting all runtimes, it worked.
I am trying to import MeanMetricWrapper but am not able to. Only CohenKappa is available.
Please view the same notebook for reference. Not sure if I need to build from source.
Should I do anything from my side?
@seanpmorgan
from tensorflow_addons.metrics.utils import MeanMetricWrapper should work? If you're working in a Colab notebook, you may have to use !pip install tfa-nightly if it was added after the 0.4 release.
@Squadrick
I am trying to wrap hamming loss using MeanMetricWrapper as per the suggestion. I have some clarifications about the same.
Taking Mean over the total value was not yielding a proper result.
Using the mean method: https://colab.research.google.com/drive/1Msuv5xUu7lu5wDH1ei-VOPB-UnBolDfB#scrollTo=UKTf8PxceWDH
As you can see in the notebook, result does not match.
Whereas, if I hold the number of records in every epoch as state, I get the expected results.
Holding state: https://colab.research.google.com/drive/1Msuv5xUu7lu5wDH1ei-VOPB-UnBolDfB#scrollTo=bGBO5unx33xS
I am not sure if I am missing anything here. Can I use the regular method of using Metric?
Please suggest.
Also, for hamming distance metric, I think it is ok to have a function like below just like euclidean.
If this is fine, I can create a separate PR for this. This can be used as an alternate distance metric
def hamming_distance(actuals, predictions):
    result = tf.not_equal(actuals, predictions)
    not_eq = tf.reduce_sum(tf.cast(result, tf.float32))
    ham_distance = tf.math.divide_no_nan(not_eq, len(result))
    return ham_distance
def hamming_loss(y_true, y_pred, mode='multiclass'):
    if mode not in ['multiclass', 'multilabel']:
        raise TypeError('mode must be: [multiclass, multilabel]')
    if mode == 'multiclass':
        nonzero = tf.cast(tf.math.count_nonzero(y_true * y_pred, axis=-1), tf.float32)
        print(nonzero)  # debug output
        return 1.0 - nonzero
    else:
        nonzero = tf.cast(tf.math.count_nonzero(y_true - y_pred, axis=-1),
                          tf.float32)
        return nonzero / y_true.get_shape()[-1]

class HammingLoss(tf.python.keras.metrics.MeanMetricWrapper):
    def __init__(self, name='hamming_loss', dtype=None, mode='multiclass'):
        super(HammingLoss, self).__init__(
            hamming_loss, name, dtype=dtype, mode=mode)
This works for me. The idea is to have hamming_loss calculate the loss for each sample in the batch separately, and let MeanMetricWrapper do the aggregation.
So:
actuals = tf.constant([[0, 1, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0]],
                      dtype=tf.int32)
predictions = tf.constant([[1, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0]],
                          dtype=tf.int32)
print(hamming_loss(actuals, predictions, mode='multiclass').numpy())  # prints [1, 1, 1]
hamm = HammingLoss(mode='multiclass')
hamm.update_state(actuals, predictions)
print(hamm.result().numpy())  # prints 1.0
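For the multilabel branch, the same per-sample idea can be checked by hand. The sketch below is plain Python rather than the TensorFlow code above, and the helper name `multilabel_hamming_per_sample` is hypothetical. It assumes 0/1 labels, for which counting mismatched positions matches counting the nonzero entries of y_true - y_pred.

```python
def multilabel_hamming_per_sample(y_true, y_pred):
    """Fraction of label positions that differ, computed for each sample."""
    losses = []
    for true_row, pred_row in zip(y_true, y_pred):
        mismatches = sum(t != p for t, p in zip(true_row, pred_row))
        losses.append(mismatches / len(true_row))
    return losses


actuals = [[0, 1, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
predictions = [[1, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0]]

per_sample = multilabel_hamming_per_sample(actuals, predictions)
print(per_sample)                         # [0.5, 0.5, 0.5]
print(sum(per_sample) / len(per_sample))  # 0.5
```

Each sample differs in two of its four label positions, so every per-sample loss is 0.5. Averaging those values is the step the wrapper would take over.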
@Squadrick Thanks for the clarification. Got the idea now. Will create a PR
@Squadrick Also, can we have hamming distance separately as a distance metric?
@SSaishruthi You can call the file hamming.py or hamming_metrics.py and add: hamming_distance, hamming_loss and HammingLoss (as a tf.keras.metrics.Metric).
@Squadrick How would this look as a loss function instead of a metric?
@rjurney The only problem I see is that tf.count_nonzero is non-differentiable which could be solved by rewriting it with a close approximation, resulting in:
from tensorflow.keras import backend as K  # provides K.epsilon() and K.int_shape()

def hamming_loss(y_true, y_pred):
    diff = tf.cast(y_true - y_pred, dtype=tf.float32)
    # Counting non-zeros in a differentiable way
    epsilon = K.epsilon()
    nonzero = tf.reduce_sum(tf.math.abs(diff / (tf.math.abs(diff) + epsilon)))
    return tf.reduce_mean(nonzero / K.int_shape(y_pred)[-1])
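To see why this ratio behaves like a nonzero count, here is a small plain-Python check of the same expression on one sample. The name `soft_nonzero` is hypothetical, and `EPSILON = 1e-7` is an assumed stand-in for K.epsilon(); this is a numeric illustration only, not the TensorFlow code.

```python
EPSILON = 1e-7  # assumed value; stands in for K.epsilon()


def soft_nonzero(diff):
    """Close to 1.0 when diff is nonzero, exactly 0.0 when diff is zero."""
    return abs(diff) / (abs(diff) + EPSILON)


y_true = [0.0, 1.0, 0.0, 0.0]
y_pred = [1.0, 0.0, 0.0, 0.0]

soft_count = sum(soft_nonzero(t - p) for t, p in zip(y_true, y_pred))
print(soft_count / len(y_true))  # close to 0.5
```

Note that a small deviation such as 0.1 also yields a value very close to 1.0, so with non-binary predictions almost any difference counts as a full mismatch.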
@seanpmorgan why closed?
Hamming loss was merged in https://github.com/tensorflow/addons/pull/342.
Cool!