I find, in most of test cases of Addons / tf.contrib, we are used to write a duplicated numpy function to verify the correction of our implementations. It's straightforward, but the drawbacks are:
Hence I propose to remove all numpy implementations, and replace them by explicit numerical value (and add necessary comments to describe how to figure it out, say intermediate result of calculation) in test cases.
# No
def np_add(a, b):
return a + b
self.assertEqual(np_add(1, 2), tf_add(1, 2))
# Yes
# Add some comments if necessary:
# c = a + b = 1 + 2 = 3
self.assertEqual(3, tf_add(1, 2))
cc @seanpmorgan @karmel
By the way, do we need to post those discussion needed issues to mailing list? Perhaps we could gain more attention, and collect more feedback there.
Some initial thoughts...
I also think that emailing dicussion needed tags is a great idea. It could probably be made a bot if someone was willing.
I think the chances of hitting such a bug are quite low. (Of course as long as you stay with the „standard“ functions and don’t do something crazy)
Implementing something wrong twice might happen. But it’s still better then implementing it once, since you have to express the algo in two different apis.
Also the amount of Tests you can create without numpy might be quite limited.
Of course the dependencie to numpy it self is bad, but the usefulness of numpy is quite big.
Greets :)
If an input X is given, then its output Y has been determinate. We should explicitly claim what we want, and what the result should be.
Say, Y = f(X) = X + 1, then Y is always 6 when X is 5. It looks strange that you create another function Z = g(X) = X + 1, then assert f(X) == g(X) when X = 5. Worse still, we have to maintain two functions here: f(X) and g(X).
Comparing keras.MeanSquaredErrorTest with addons.LiftedStructLossTest, I think keras.MeanSquaredErrorTest is more easy to understand, less error-prone.
My point is: we don't need to write a numpy function to calculate the result. Use assertEqual(Y, f(X)), rather than assertEqual(g(X), f(X)).
@Smokrow Hi, Moritz:
Also the amount of Tests you can create without numpy might be quite limited.
Would you mind giving an example to help us understand why? Oh, don't get me wrong, I think we can use numpy library in the test case. But we shouldn't write an duplicated numpy implementation like https://github.com/tensorflow/addons/blob/f29c204ef40d2dd84bf92436de438463fcfb38c5/tensorflow_addons/losses/python/triplet_test.py#L56
Let's make it clear: take triplet_loss as an example, we implement it with tf ops. And numpy implementation meanstriplet_loss_np function we created by ourselves in its test python file.
Here the term "numpy implementation" is used for those functions created by Addons (our loss, layer, optimizer ,etc ), rather than for native tensorflow / numpy ops (like tf.exp and np.exp).
@facaiy
Hey Yan.
I think it is okay to use a function like that as long as you make sure that every subpart of the function is already tested.
For example if you have some transformation followed by a multiplication:
You could write tests for both parts (Multiplication and transformation) and make sure each part works correctly.
Afterwards I combine these parts into one separate numpy function to make it easier to test higher level functions for larger scale computations, edge cases or other weird stuff. Especially randomized initialization can become a problem here. I for example used a hybrid approach at normalizations where I initialised the weights with tf and used numpy to calculate the correct output afterwards.
But i get your point. It does not make sense to implement everything 2 times since both implementations could be wrong
The "correct" answer here may depend on context. In general, we tend towards just putting in numbers to avoid the tests getting cluttered. That said, that only works if the numbers are principled/understood in some way. That is, a classic bad habit is for a dev to run the code once, use the output as the test, and never validate that the produced output is actually the technically correct output. So, I would argue, as long as the derivation of the numbers is obvious enough that they can be re-derived, numbers are fine. If the way to derive the numbers in question is to run the same through numpy, may as well include that code for future devs, rather than having magic numbers in there. That said, all of this is opinion, and either way has advantages/disadvantages.
@facaiy Does it seem reasonable to replace numpy calls when the expected values are understood (e.g. simple matrix math), but to continue to use numpy when it is a pre-requiste to getting an answer? (e.g. large matrix math or as an input to a model evaluation)
Thanks for the great input, @Smokrow @karmel :-)
@Smokrow Yes, random data/variable is a good example, because its result is non-deterministic, so we have to write python function to calculate the result dynamically. Make sense for me, thank you :-)
@karmel
a classic bad habit is for a dev to run the code once, use the output as the test, and never validate that the produced output is actually the technically correct output.
Similarly, another bad habit here is that, developer writes a tf implementation and a corresponding numpy implementation, and then just check whether its results are the same by feeding random data. I don't think it's fair to impute bad habit to method/tool.
And to avoid magic number, how about adding detailed comments to describe how to figure it out (say intermediate result of calculation) ? That's what I learned from https://github.com/tensorflow/tensorflow/pull/15443/files#diff-e37fe49087784a68c4a0d4417d941382R6730
What I think is not a good habit is, to use a numpy custom implementation with random data to verify the correctness of tf implementation, for example, TripletSemiHardLossTest. Those tests are what I propose to revise.
And I agree with @karmel @seanpmorgan @Smokrow: it depends on context. And the criterion is easy to understand and maintain.
Thank you all for your great feedback.
That's all what I can think of after the discussion:
I don't know if this is overkill, but the test-numpy function could be implemented by a different developer. In that case we would have 2 different opinions of the same paper. The test implementation would be written first and afterwards somebody else implements the paper in Tensorflow. (Maybe even without looking at the test case)
Of course the drawback would be a lot of extra work. (both on organization and coding)
would be a lot of extra work. (both on organization and coding)
We have to take into account maintainability. If the numpy implementation is so good, I'm afraid it would be better to contribute it to numpy community.
We have to take into account maintainability. If the numpy implementation is so good, I'm afraid it would be better to contribute it to numpy community.
True 😄
Most helpful comment
The "correct" answer here may depend on context. In general, we tend towards just putting in numbers to avoid the tests getting cluttered. That said, that only works if the numbers are principled/understood in some way. That is, a classic bad habit is for a dev to run the code once, use the output as the test, and never validate that the produced output is actually the technically correct output. So, I would argue, as long as the derivation of the numbers is obvious enough that they can be re-derived, numbers are fine. If the way to derive the numbers in question is to run the same through numpy, may as well include that code for future devs, rather than having magic numbers in there. That said, all of this is opinion, and either way has advantages/disadvantages.