From #242 we found that the tfa.image transforms are not compatible with dataset mapping. This is a very likely usecase for tfa.image so we need to address this. Here is a minimal example showing the differences in tfa.rotate(pi/2) and tf.image.rot90:
https://colab.research.google.com/drive/1ZDhnGrorvSf04wzS-0utztZ6Nnzusvl7
Problems seem to occur because Dataset mapping runs in graph mode and this check fails:
https://github.com/tensorflow/addons/blob/master/tensorflow_addons/image/transform_ops.py#L323
By comparison here is tf.image.rot90:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/image_ops_impl.py#L514
Will think about this more tonight
According to https://github.com/tensorflow/tensorflow/issues/27811, TF2 transforms Data mappings to subgraphs for better performance.
Here is a non-performatic workaround to force Eager Mode and use tfa.image until there is a solution:
image_ds = path_ds.map(lambda path: tf.py_function(func=load_and_preprocess_image,inp=[path],Tout=tf.float32))
I noticed tfa.image.rotate is running on the CPU, so it is running very slowly, is this the expected behaviour?
Thanks for the temporary workaround! Yeah gpu kernels are on the way #118
Two thoughts:
transform: https://github.com/tensorflow/addons/blob/master/tensorflow_addons/image/transform_ops.py#L76 Is there a strong technical reason why this restriction exists?rot90() code seems to assume that if no shape information is available, one can assume that the rank is 3. Do we think that's safe for our transformations as well?
- I opened #244 to have a (failing) test case for this issue
Thank you, @kyleabeauchamp!
- The requirement of fixed and static image rank is not only in rotate, but upstream in
transform: https://github.com/tensorflow/addons/blob/master/tensorflow_addons/image/transform_ops.py#L76 Is there a strong technical reason why this restriction exists?
I think it is because the cpp implementation here requires a static rank of 4 and each dimension will be fetched out for later computation.
- The
rot90()code seems to assume that if no shape information is available, one can assume that the rank is 3. Do we think that's safe for our transformations as well?
Well, I do not think it is very safe though. I'd suppose it is designed for tf.dataset. (Not so sure about this). Hi, @facaiy, would you mind taking a look at this?
@WindQAQ Thanks for ping me, Tzu-Wei.
Well, I do not think it is very safe though. I'd suppose it is designed for tf.dataset.
Yeah, it's not safe, and we can relax the restriction if we document it clearly: data must be 4-D.
@mrry Hi, Derek, could you take a look? The rank of tensor seems unknown even in eager mode.
@karmel Karmel, do you know who could answer those questions about tf.data and tf.image? Thank you.
@jsimsa -- can you advise?
One option is to fix the addons logic as discussed in this thread. Another option is to replace decode_image with decode_png. The problem arises because decode_image is executed by tf.data in graph-mode and shape inference fails to determine which one of the decode utilities is used (which seems to trip some addons logic).
@jsimsa thank you for the answer and apologies for the slow reply back. I confirmed that switching to decode_png does corrrectly infer the shape which is great. One lingering question though:
1) It's true that the addons logic got tripped up by the None shape coming out of decode_image... but in tf.image.rot90 None shape is handled by defaulting to a 3D shape. Is there a reason this is safe to do? We can copy the same logic but I'm not sure I'm following the rationale.
I tried digging through git history to find the person who might have the answer for you and as far as I could tell is should be @aselle who introduced that logic in 2016.
Hi @aselle Do you happen to have any comment on why its safe to default to 3D rotate when shape is None?
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/image_ops_impl.py#L511
It probably is just so that if you don't know anything about the shape you can still proceed. None of ndims means the rank is not known. so it's just a fallback.
So we should handle unknown rank, because it's valid in graph mode. And it seems that there are at least two solutions:
Most helpful comment
Will think about this more tonight