addons 🚀 - Add new metrics for tf.keras

Some of the other metrics/loss functions that could be added are for example pinball loss and SMAPE loss, both from the field of time-series.

armando-fandango on 3 Apr 2019
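For readers unfamiliar with the two time-series losses mentioned above, here is a minimal plain-Python sketch of the arithmetic. The function names are hypothetical and this is not an addons or tf.keras API. SMAPE has several definitions in the literature; the one below (mean of 2|y - f| / (|y| + |f|), scaled to percent) is an assumption.

```python
def pinball_loss(y_true, y_pred, tau=0.5):
    """Mean pinball (quantile) loss for a quantile level tau in (0, 1)."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    total = 0.0
    for y, f in zip(y_true, y_pred):
        diff = y - f
        # Under-prediction (diff > 0) costs tau * diff,
        # over-prediction (diff < 0) costs (1 - tau) * |diff|.
        total += max(tau * diff, (tau - 1.0) * diff)
    return total / len(y_true)


def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error, in percent (one common variant)."""
    total = 0.0
    for y, f in zip(y_true, y_pred):
        denom = abs(y) + abs(f)
        if denom:  # assumption: a 0/0 term counts as zero error
            total += 2.0 * abs(y - f) / denom
    return 100.0 * total / len(y_true)
```

With tau = 0.5 the pinball loss reduces to half the mean absolute error, which is a quick sanity check for any tensor-based port.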

Per our discussion in the monthly meeting this seems like it would be a good subpackage. To start we could begin by defining the API structure for the subpackage and an example metric. f1 score should be pretty easy to implement, seeing as precision and recall are already built in to tf.keras.metrics

seanpmorgan on 7 Apr 2019
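To illustrate why F1 follows directly once precision and recall are available, here is a minimal plain-Python sketch of the arithmetic. The helper name is hypothetical, and it works on raw counts rather than on tf.keras.metrics objects.

```python
def f1_from_counts(true_pos, false_pos, false_neg):
    """F1 score as the harmonic mean of precision and recall."""
    predicted_pos = true_pos + false_pos
    actual_pos = true_pos + false_neg
    # Guard against empty denominators; returning 0.0 here is a convention,
    # not something the thread specifies.
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)
```

A metric class would accumulate the three counts across batches and apply this formula only when the result is requested, since averaging per-batch F1 values does not generally give the same number.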

@AakashKumarNain Hi, Aakash, would you like to make a contribution? By the way, thank you for mentioning Cohen's Kappa, which was implemented by me, haha :-)

facaiy on 10 Apr 2019

@facaiy Sure. Actually, before I was aware of addons, I submitted a PR to TF repo. Check this https://github.com/tensorflow/tensorflow/pull/25051#event-2258495550

AakashKumarNain on 10 Apr 2019

@facaiy We need to import class Metric(Layer) in keras_utils.py so that we can directly import it in the metrics subpackage without defining the class explicitly. It will be similar to how LossFunctionWrapper class has been imported. What do you say?

AakashKumarNain on 10 Apr 2019

I would say that it sounds reasonable. cc @seanpmorgan Sean, what do you think?

facaiy on 11 Apr 2019

Yes, but it should also be included in our API issue sent to core. I'll be sure to include it and send the issue this week (was just waiting until we had a complete list... but that may never happen)

seanpmorgan on 11 Apr 2019

@seanpmorgan @facaiy any updates on this?

AakashKumarNain on 25 Apr 2019

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/metrics.py#L65 I believe it's public now? Sorry, I've built the list of exports we need, but I have to keep updating it as new APIs are released, and I want to make sure there are no alternatives.

seanpmorgan on 25 Apr 2019

Haha I haven't checked it yet @seanpmorgan but I will do it by tomorrow. Btw 1.14 cut was scheduled on 15th. So whatever is in that release, it should be public, right?

AakashKumarNain on 27 Apr 2019

Hi, Aakash. tf_addons depends on the tf 2.0 nightly version rather than a 1.x version, so I think it is safe to assume that the API is public :-)

@seanpmorgan I forgot about tf-2.0-alpha. Do we need to add a proxy which catches the import error and falls back to the private API?

facaiy on 29 Apr 2019

Hey @facaiy Thanks for the info. I have been quite busy since last week working on a competition whose deadline is approaching fast. I will look into it on the weekend. Sorry for the delay.

AakashKumarNain on 30 Apr 2019

Hi, Aakash. tf_addons depends on the tf 2.0 nightly version rather than a 1.x version, so I think it is safe to assume that the API is public :-)

@seanpmorgan I forgot about tf-2.0-alpha. Do we need to add a proxy which catches the import error and falls back to the private API?

Yes, that would be the fix, but seeing as we're probably not too far from some sort of tf2 release candidate, I can just add it to the list of API rollbacks we'll be doing on the release branches while we wait.

seanpmorgan on 30 Apr 2019
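The proxy idea discussed above (try the public import path, fall back to a private one on failure) can be sketched generically as follows. The helper name is hypothetical and the code uses only the standard library. For the case in this thread, the candidate paths would presumably be tensorflow.keras.metrics followed by tensorflow.python.keras.metrics, which is an assumption based on the file linked earlier.

```python
import importlib


def import_attr_with_fallback(name, module_paths):
    """Return attribute `name` from the first module path that provides it.

    `module_paths` is ordered by preference: public locations first,
    private fallbacks last.
    """
    last_error = None
    for path in module_paths:
        try:
            module = importlib.import_module(path)
            return getattr(module, name)
        except (ImportError, AttributeError) as err:
            # Module missing, or present without the attribute: try the next path.
            last_error = err
    raise ImportError(
        "could not import {!r} from any of {!r}".format(name, list(module_paths))
    ) from last_error


# Usage with standard-library modules standing in for the real paths:
OrderedDict = import_attr_with_fallback(
    "OrderedDict", ["module_that_does_not_exist", "collections"]
)
```

Keeping the fallback in a single utility module means it can be deleted in one place once the public API is available in every supported release.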

@seanpmorgan @facaiy I still don't see the Metric import in the utils file which I was talking about. Anyway, I made some changes in my metric implementation and I am getting a bunch of errors. Specifically, variable assignments aren't allowed, and I have to find a way to add a variable in my constructor using the add_weight method. The problem is that the variable would be used in the update_state method of that class, but it shouldn't be updated on each iteration/epoch. It is a constant and should (ideally) be initialized only once, in the constructor.

AakashKumarNain on 4 May 2019

Could you create a PR? Then we can take a look and check if there's something wrong there.

facaiy on 5 May 2019

Sure @facaiy. I will do that. Thank you

AakashKumarNain on 6 May 2019

@facaiy before I create a PR, can you please take a look at this notebook? https://colab.research.google.com/drive/1Dd97zdfJhGIrMTpzSKdR6rxnbDGuSBV0

Some of the ops aren't executed in eager mode, which is strange. If I don't inherit from Metric, though, the functions work well.

AakashKumarNain on 6 May 2019

@facaiy did you get a chance to look at the problem?

AakashKumarNain on 15 May 2019

@AakashKumarNain Sorry for the delay, Aakash, I've been busy and forgot to reply. I'll take a look this weekend.

By the way, would you mind creating an empty metric subpackage directory first? I just found that two pending issues depend on this work :-)

facaiy on 16 May 2019

@shashvatshahi1998 Welcome, Shashvat. Perhaps you'd like to discuss the F1 score implementation with @AakashKumarNain https://github.com/tensorflow/addons/issues/232

facaiy on 16 May 2019

Sorry for the delay @facaiy, I was busy with a competition. I have created a PR for the metrics module now.

AakashKumarNain on 16 May 2019

@ychervonyi @shashvatshahi1998 Hi, the metric module has been created in the master branch via #247 (thanks, @AakashKumarNain), so I think we can start the related work from here. Please contact us if any help is needed :-)

facaiy on 20 May 2019

before I create a PR, can you please take a look at this notebook? https://colab.research.google.com/drive/1Dd97zdfJhGIrMTpzSKdR6rxnbDGuSBV0
Some of the ops aren't executed in eager mode, which is strange. If I don't inherit from Metric, though, the functions work well.

@AakashKumarNain Aakash, after reading the tf-2.0 code I think you're right: the update_state of a custom metric is decorated by tf.function automatically, which is why you find that the code runs in graph mode. See https://github.com/tensorflow/tensorflow/commit/39a561786aad35c6c203ba698af501652d67da77 :

https://github.com/tensorflow/tensorflow/blob/2c2d508aa2947ede05cfa195139b176d6cdc9056/tensorflow/python/keras/metrics.py#L149-L152

Please correct me if I'm wrong, @pavithrasv Pavithra :-)

But we shouldn't worry about it too much, because in theory tf.function will take care of them (in both graph and eager mode). I tried to fix your Python case, and it works (although the result seems wrong). Note that we use nb_ratings = tf.shape(conf_mtx)[0] to retrieve its dynamic shape.

# -*- coding: utf-8 -*-
"""KappaforTF.ipynb

Automatically generated by Colaboratory.

Original file is located at
    https://colab.research.google.com/drive/1Dd97zdfJhGIrMTpzSKdR6rxnbDGuSBV0
"""

import tensorflow as tf
import tensorflow.keras.backend as K

tf.__version__

# These are the imports which we need in the utils file
from tensorflow.math import confusion_matrix
from tensorflow.keras.metrics import Metric

class CohensKappa(Metric):
    """Computes Kappa score between two raters.
    The score lies in the range [-1,1], where a score of -1 represents
    complete disagreement between two raters whereas a score of 1 represents
    complete agreement between the two raters. A value of 0 means agreement by chance.

    Args:
    y1 : array, shape = [n_samples]
        Labels assigned by the first annotator.
    y2 : array, shape = [n_samples]
        Labels assigned by the second annotator. The kappa statistic is
        symmetric, so swapping ``y1`` and ``y2`` doesn't change the value.
    labels : array, shape = [n_classes], optional
        List of labels to index the matrix. This may be used to select a
        subset of labels. If None, all labels that appear at least once in
        ``y1`` or ``y2`` are used.

    Returns
    kappa : float
        The kappa statistic, which is a number between -1 and 1. The maximum
        value means complete agreement; zero or lower means chance agreement.

    """
    def __init__(self, name='cohens_kappa', dtype=tf.float32,):
        super(CohensKappa, self).__init__(name=name, dtype=dtype)
        self.kappa_score = self.add_weight('kappa_score', initializer=None)

    def update_state(self, y_true, y_pred, sample_weight=None):
        y_true = tf.cast(y_true, dtype=tf.int32)
        y_pred = tf.cast(y_pred, dtype=tf.int32)

        # check the tensors
        print("Actuals: ", y_true)
        print("Predictions: ", y_pred)


        # Get the confusion matrix
        # This is where this function throws an error
        conf_mtx = confusion_matrix(labels=y_true, predictions=y_pred)

        # If you print the confusion matrix, you will see that
        # this isn't executed in eager mode. 
        print("Confusion Matrix:")
        print(conf_mtx)

        conf_mtx = K.cast(conf_mtx, dtype=tf.int32)
        nb_ratings = tf.shape(conf_mtx)[0]

        # 2. Create a weight matrix
        if sample_weight is None:
            weight_mtx = tf.ones([nb_ratings, nb_ratings], dtype=tf.int32)
            # Use the dynamic number of classes rather than a hard-coded 5
            diagonal = tf.zeros([nb_ratings], dtype=tf.int32)
            weight_mtx = tf.linalg.set_diag(weight_mtx, diagonal=diagonal)
            weight_mtx = tf.cast(weight_mtx, dtype=tf.float32)

        elif sample_weight=="linear": 
            weight_mtx = tf.ones([nb_ratings, nb_ratings], dtype=tf.int32)
            weight_mtx += tf.range(nb_ratings, dtype=tf.int32)
            weight_mtx = tf.cast(weight_mtx, dtype=tf.float32)
            weight_mtx = tf.abs(weight_mtx - K.transpose(weight_mtx))

        elif sample_weight=="quadratic":
            weight_mtx = tf.ones([nb_ratings, nb_ratings], dtype=tf.int32)
            weight_mtx += tf.range(nb_ratings, dtype=tf.int32)
            weight_mtx = tf.cast(weight_mtx, dtype=tf.float32)
            weight_mtx = K.pow((weight_mtx - K.transpose(weight_mtx)), 2)

        else:
            raise ValueError("Unknown kappa weighting type.")

        actual_ratings_hist = K.sum(conf_mtx, axis=1)
        predicted_ratings_hist = K.sum(conf_mtx, axis=0)

        out_prod = predicted_ratings_hist[..., None] * actual_ratings_hist[None, ...]

        conf_mtx = conf_mtx / K.sum(conf_mtx)
        out_prod = out_prod / K.sum(out_prod)

        numerator = K.sum(tf.cast(conf_mtx, tf.float32) * weight_mtx)
        denominator = K.sum(tf.cast(out_prod, tf.float32) * weight_mtx)          
        kp = 1-(numerator/denominator)
        return self.kappa_score.assign(kp)

    def result(self):
        return self.kappa_score

actuals = tf.convert_to_tensor([4, 4, 3, 4, 2, 4, 1, 1, 2, 0], dtype=tf.int32)
preds = tf.convert_to_tensor([4, 4, 3, 4, 4, 2, 1, 1, 2, 0], dtype=tf.int32)

print('confusion_matrix')
print(confusion_matrix(actuals, preds))

kp = CohensKappa()
print('CohensKappa')
print(kp.variables)
print('init')
print(kp.result())
kp.update_state(actuals, preds, sample_weight=None)
print('result')
print(kp.result())

facaiy on 23 May 2019

@facaiy Thanks for this information. I am on vacation till next Thursday. Once I am back, I will look into it and will correct it.

AakashKumarNain on 23 May 2019

No worries, you're welcome :-)

facaiy on 23 May 2019

@facaiy I fixed it. I have also included a simple function to test the implementation against the sklearn implementation. Please take a look again and see if it is fine.

https://colab.research.google.com/drive/1Dd97zdfJhGIrMTpzSKdR6rxnbDGuSBV0

AakashKumarNain on 30 May 2019

@AakashKumarNain Good news! Could you create a PR, Aakash?

facaiy on 31 May 2019

@facaiy I am almost done. I am stuck at the test case though. I am getting some errors. Can you please look into it again?
https://colab.research.google.com/drive/1Dd97zdfJhGIrMTpzSKdR6rxnbDGuSBV0

Towards the end you will find the Test Case section. Thank you

AakashKumarNain on 31 May 2019

Hi, @AakashKumarNain I think you are looking for

self.assertAllClose(score1, 0.68932) # could feed tensor/np-array/float and tolerance defaults to 1e-5

or

self.assertAlmostEqual(self.evaluate(score1), 0.68932, 5) # value only

Thanks for the contribution btw 😄

WindQAQ on 31 May 2019
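For reference, the third positional argument of assertAlmostEqual is `places`: the check passes when the difference, rounded to that many decimal places, equals zero. A minimal standard-library sketch follows; the class and function names are hypothetical and it uses unittest.TestCase rather than tf.test.TestCase.

```python
import unittest


class PlacesDemo(unittest.TestCase):
    def test_passes_at_five_places(self):
        # round(0.6893204 - 0.68932, 5) == 0, so this assertion holds.
        self.assertAlmostEqual(0.6893204, 0.68932, 5)

    def test_fails_at_seven_places(self):
        # round(0.6893204 - 0.68932, 7) is about 4e-07, not 0, so the inner check raises.
        with self.assertRaises(AssertionError):
            self.assertAlmostEqual(0.6893204, 0.68932, 7)


def run_places_demo():
    """Run the demo case programmatically and return the TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(PlacesDemo)
    result = unittest.TestResult()
    suite.run(result)
    return result
```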

Thanks @WindQAQ for the info. Actually, I tried both of them and they are both throwing errors as well.

1) self.assertAllClose(score1, 0.68932)

E0601 04:35:49.201909 140117080422272 test_util.py:1522] 2 root error(s) found.
  (0) Failed precondition: Error while reading resource variable kappa_score_3 from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist. (Could not find resource: localhost/kappa_score_3)
     [[node Identity_3/ReadVariableOp (defined at <ipython-input-27-433e0c86f80d>:17) ]]
     [[Identity_3/_1]]
  (1) Failed precondition: Error while reading resource variable kappa_score_3 from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist. (Could not find resource: localhost/kappa_score_3)
     [[node Identity_3/ReadVariableOp (defined at <ipython-input-27-433e0c86f80d>:17) ]]
0 successful operations.
0 derived errors ignored.

Errors may have originated from an input operation.
Input Source operations connected to node Identity_3/ReadVariableOp:
 kappa_score_3 (defined at <ipython-input-10-2938b575c3c5>:28)

Input Source operations connected to node Identity_3/ReadVariableOp:
 kappa_score_3 (defined at <ipython-input-10-2938b575c3c5>:28)

2) self.assertAlmostEqual(self.evaluate(score1), 0.68932, 5)

E0601 04:37:52.891778 140117080422272 test_util.py:1522] 2 root error(s) found.
  (0) Failed precondition: Error while reading resource variable kappa_score_4 from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist. (Could not find resource: localhost/kappa_score_4)
     [[node Identity_4/ReadVariableOp (defined at <ipython-input-30-c2339c53b877>:17) ]]
  (1) Failed precondition: Error while reading resource variable kappa_score_4 from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist. (Could not find resource: localhost/kappa_score_4)
     [[node Identity_4/ReadVariableOp (defined at <ipython-input-30-c2339c53b877>:17) ]]
     [[Identity_4/_1]]
0 successful operations.
0 derived errors ignored.

Errors may have originated from an input operation.
Input Source operations connected to node Identity_4/ReadVariableOp:
 kappa_score_4 (defined at <ipython-input-10-2938b575c3c5>:28)

Input Source operations connected to node Identity_4/ReadVariableOp:
 kappa_score_4 (defined at <ipython-input-10-2938b575c3c5>:28)

AakashKumarNain on 1 Jun 2019

Hi, after a rough test and digging into keras implementation, I found update_state is forced to execute in graph mode.

https://github.com/tensorflow/tensorflow/blob/a6e5f879e5e36cb33efaed25597cc254ed71bae5/tensorflow/python/keras/metrics.py#L212

And thus, it would be correct to return the update_op in update_state, that is, self.kappa_score.assign(kp).

class CohensKappa(...):
    def update_state(self, ...):
        ...
        return self.kappa_score.assign(kp)

And do tests like the following:

import numpy as np
import tensorflow as tf
from tensorflow.python.framework import test_util

# CohensKappa is the metric class shown above.
@test_util.run_all_in_graph_and_eager_modes
class CohensKappaTest(tf.test.TestCase):
    def test_config(self):
        kp_obj = CohensKappa(name='cohens_kappa')
        self.assertEqual(kp_obj.name, 'cohens_kappa')

    def test_kappa(self):
        actuals = np.array([4, 4, 3, 4, 2, 4, 1, 1], dtype=np.int32)
        preds = np.array([4, 4, 3, 4, 4, 2, 1, 1], dtype=np.int32)
        sample_weights = 'quadratic'
        kp_obj = CohensKappa()

        actuals = tf.convert_to_tensor(actuals, dtype=tf.int32)
        preds = tf.convert_to_tensor(preds, dtype=tf.int32)

        # self.evaluate(tf.compat.v1.initializers.variables(kp_obj.variables))
        # update_state returns the assign op; evaluate it before reading the result
        update_op = kp_obj.update_state(actuals, preds, sample_weight=sample_weights)
        self.evaluate(update_op)
        # score2 = cohen_kappa_score(actuals, preds, weights=sample_weights)

        # print(f"This implementation : {score2:>20}")
        self.assertAlmostEqual(self.evaluate(kp_obj.result()), 0.68932, 5)

I also found this function update_confusion_matrix_variables. Not sure if it could help you aggregate the stats or something.

Edit: revised notebook here: https://colab.research.google.com/drive/1NUnHu9IPZqgJAeXxXqcm3K1QXJQlmHQ_

WindQAQ on 1 Jun 2019
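As an independent cross-check of the expected value 0.68932 used in the test above, here is a framework-free reference sketch of weighted Cohen's kappa. The function name and the explicit num_classes argument are hypothetical and not part of the addons API; labels are assumed to be integers in [0, num_classes).

```python
def cohens_kappa_reference(y_true, y_pred, num_classes, weighting=None):
    """Plain-Python Cohen's kappa; weighting is None, "linear" or "quadratic"."""
    if weighting not in (None, "linear", "quadratic"):
        raise ValueError("Unknown kappa weighting type.")

    def weight(i, j):
        # Disagreement weight between class i and class j.
        if weighting is None:
            return 0.0 if i == j else 1.0
        if weighting == "linear":
            return float(abs(i - j))
        return float((i - j) ** 2)

    n = len(y_true)
    # Confusion matrix: rows are true labels, columns are predicted labels.
    conf = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        conf[t][p] += 1

    row_totals = [sum(row) for row in conf]
    col_totals = [sum(conf[i][j] for i in range(num_classes))
                  for j in range(num_classes)]

    classes = range(num_classes)
    observed = sum(weight(i, j) * conf[i][j]
                   for i in classes for j in classes) / n
    expected = sum(weight(i, j) * row_totals[i] * col_totals[j]
                   for i in classes for j in classes) / (n * n)
    # Note: expected is 0 when only one class occurs; that case is not handled here.
    return 1.0 - observed / expected


score = cohens_kappa_reference(
    [4, 4, 3, 4, 2, 4, 1, 1], [4, 4, 3, 4, 4, 2, 1, 1],
    num_classes=5, weighting="quadratic")
print(round(score, 5))  # 0.68932
```

For this data the weighted observed disagreement is 8/8 = 1 and the weighted expected disagreement is 206/64, so kappa = 1 - 64/206, which is approximately 0.68932. Class 0 never occurs here, so including it in num_classes does not change the result.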

Perfect. Thanks @WindQAQ

AakashKumarNain on 1 Jun 2019

Good to close this? Seems like we have a few separate metrics issues with some overlap?

seanpmorgan on 12 Jun 2019

Yup, we can track metrics under #265.

Squadrick on 12 Jun 2019

Yeap. Good to close.

AakashKumarNain on 12 Jun 2019
