Scikit-image: Fixes for data types and other common sources of errors

Created on 9 Apr 2018  路  27Comments  路  Source: scikit-image/scikit-image

Description

I recently mentioned on the mailing list that a large proportion of our errors come from new (and old!) users surprised by the behavior of our data type conversions.

Although they are documented, this documentation is not prominent enough. Indeed, we barely even cover these in our scikit-image tutorials! Additionally, our own code base really is not careful enough to guide users along.

I want this issue to compile both common errors due to data types and proposed solutions, including a to-do list of consensus solutions.

Common errors

(Please add links here if you find them.)

  • User has a 12-bit image that is encoded in a uint16 array. When this is converted to uint8 or float, the result is an image that has much lower contrast (16x) than expected.
  • User has a float image with values falling outside the range [0, 1] or [-1, 1]. Some scikit-image functions fail altogether with this kind of input, while others warn. In almost all cases, these warnings are unnecessary.
  • User wants to apply scikit-image functions to an integer array. This array, by default, is of type int32 or int64. scikit-image converts the values in the array to float by dividing by 2^31, resulting in tiny values (which for example are impossible to encode in common image formats)

Proposed solutions

  • Figure out image metadata format and handling. Having correct metadata (bit_depth=12, for example) would alleviate many issues.
  • Clearer separation of data viewing and data processing. See email by @ds604. Specifically, we could tweak io.imshow to clearly display characteristics of the input data, including text info in the figure title/subtitle, colorbar, dtype, etc. Perhaps revisiting skimage.viewer as an integral part of skimage is warranted.

Rejected proposals

  • Magically infer the range of input images (objections here and here.)

To-do list

  • [ ] Add an FAQ section to the website homepage.
  • [ ] Audit the code base to make sure that float images work unimpeded wherever it's possible. (Search for img_as_float calls.) (Detailed suggestion by @grlee77 here.)
  • [ ] Remove automatic range conversion for int32 and int64 images.
  • [ ] Warn on conversion of uint16 images where no value exceeds 4096 (possible 12-bit image)
  • [ ] Add bit-depth option to in_range and out_range parameters for exposure.rescale_intensity.
discussion

Most helpful comment

One firm data point on the insufficiency of uint16 label images: We take microscope images of human tissue samples and segment individual cells. Images are in the multi-gigapixel range and contain on the order of 1 million cells. We absolutely require uint32 for the label masks!

All 27 comments

Remove automatic range conversion for int32 and int64 images.

How would we do this, given that we typically convert to floating point images before processing?

For metadata, perhaps switching to a single I/O backend per format would help.

@stefanv

Remove automatic range conversion for int32 and int64 images.
How would we do this, given that we typically convert to floating point images before processing?

Something along the lines of preserve_range=True implicitly if the dtype is int higher than uint16. And before you complain about the magic of this, we are already magically treating uint32 images differently from uint16 ones!

Personally, I'm against constantly checking bounds. If I want to perform convolutions and scaling on my images, that should be my choice. most algorithms are scale invariant, or require some kind of normalization anyway.

the speed of the algorithms shouldn't be impeded by range checks on my giant matricies multiple times per calls.

what is the point of imshow really? what is deficient in matplotlib.pyplot.imshow? performance, yes, but it autoscales and allows you to do things like add colorbars when you want to. Using it also empowers the user to learn a useful other tool so that they can expand on their skillset later.

I think visualization should be left to the user and not to the processing library.

this kinda relates to my PR #3062 where I dont think we should impose 64 bit floats on users.

I would suggest that if you want to make it new user proof, that you change the image load functions. Those happen once at the beginning and probably have had little processing applied to them.

Many of the functions first call img_as_float. loading images should arguably just convert it to the float by default, alleviating many issues with subsequent conversions.

Could we consider dropping range conversion altogether (like for 1.0 version)?

To answer that, we need to figure out all the places where the assumed range is useful. There is at least one case: RGB float images being displayed by matplotlib are clipped to [0, 1]:

In [1]: import numpy as np

In [2]: import matplotlib.pyplot as plt

In [3]: image = np.random.random((10, 10, 3)) * 255

In [4]: plt.imshow(image)
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Out[4]: <matplotlib.image.AxesImage at 0x7f204a2e0a90>

I don't want to coordinate a deprecation with matplotlib.

It also matters when downcasting: what do you do when you have out-of-range values? I don't want to trade rescaling error reports for overflow error reports.

OK thanks. Do we do a lot of downcasting?

float -> uint8 is very very very common. I presume that anyone that does uint16 (e.g. 12-bit image) -> float will end up saving their image as uint8.

Probably you are right that rescale_ utility functions is the right answer for this.

I agree but we don't do that in scikit-image code, right? So utility functions should do the job

Still needs to be checked though.

It could be that lab2rgb code is also suffering from data range coercion, see https://github.com/scikit-image/scikit-image/issues/3078

A small update about this issue. I hope to start tackling this more systematically in the coming months. I asked @batson, author of noise2self, for any friction he might have encountered using skimage in the paper. His response (after being very complimentary, I should note! =):

one thing which killed was that the denoising functions (median filter especially, but maybe the others) were not type consistent. it would take in an image as float and return uint8, etc
so i would try to compare PSNR with the output and get something crazy
also NL-means could take in uint8 and return floats, but floats on the 0-255 scale
so not images according to skimage

The last bit is particularly concerning, and I can indeed see that we just do .astype(float64) for input images in the NL means code:

https://github.com/scikit-image/scikit-image/blob/a8692ec9b8c33df9cc6601cb42080a0fb051fbf3/skimage/restoration/_nl_means_denoising.pyx#L203-L206

An algorithm should never return a type of lower precision than the input, unless it implicitly throws away information (thresholding, e.g.).

I don't think we can solve the problem of int types being mapped to float. That is often the only way to produce an accurate answer.

The last bit is particularly concerning, and I can indeed see that we just do .astype(float64) for input images in the NL means code

You have to use floats internally for the denoising methods, but I think the inconsistency is that denoise_nl_means is not doing the conversion via img_as_float while all other denoising functions are. So int8 input to most denoising functions results in float output in the range [0, 1] while for nl-means there is no rescaling. My vote is to add img_as_float to denoise_nl_means, although that will result in a change to the output amplitude for non-float inputs so may need a deprecation cycle?

Deprecate 32 and 64 bits integer types support

I think this discussion needs to take place in the context of #3009.

I am linking here to #4442. My feeling is that we can avoid many bugs/errors by simply advising the user to not use 32 and 64 bits integers...

  • If the user needs fixed points and knows that his data range fits 16 or 8 bits advise him to use respectively (u)int16 or (u)int8,
  • if he doesn't need fixed points, advise casting to float, normalized or not.

I don't see any application requiring more then 16bits integers... If any one knows on,e I would really be happy to here about it...

For raw image data, I think the biggest necessary integer scale we need in medical imaging today is 16 bit signed integer to fit the extended range of CT data (Hounsfield units, in particular of metal on the positive side, can get to around +30k while air is around -1k).

The main reason I personally have needed more integer depth before was in the domain of oversegmentation, to uniquely specify all of the resulting regions. If your algorithm could give you thousands of individual distinct subregions, you run out of space in uint8 and uint12 fairly quickly in volumetric data or larger 2d images. Even so, 65k labels of 16 bit has thus far never limited me. However, it might be worth noting that in a 50 megapixel image your lower bound of average sub-region size in a 16 bit result would be 763 pixels (50 Mpx / 2**16 regions), which is fairly large.

So from the _raw image data_ standpoint I don't know of any current needs for integer images greater than 16 bits. For algorithms which accept _arrays representing subregion labels_, those inputs should probably continue to work with higher integer bit depths.

@JDWarner yes, absolutely, this would not deprecate (u)int32 and (u)int64 for label images. In fact, those types might be very good to reserve for label images. Precisely in the same vein, you would never want to rescale label images: you would want to exactly preserve the type there.

In [2]: np.float32(np.uint32(2**24)) == np.float32(np.uint32(2**24+1))
Out[2]: True

:+1: for not deprecating (u)int32 and (u)int64 for label images, but what about the image like arguments?

@jni

I generally advocate for deprecating implicit rescaling in the library altogether. In that case, it doesn't make sense to deprecate those dtypes.

I agree, I am against implicit rescaling. The preserve_range argument is a good tool for giving the user control on the behavior of the functions.

I opened #4442 because I faced some confusion in #4424 to clearly define what should be the range of an alpha channel when dtype is not float, uint8 or uint16. Is the "junk in -> junk out" the right answer in this case? Isn't at least a warning more helpful?
I propose in #4442 the more radical approach of deprecating int32 and int64 because I don't think that supporting these data types gives much value in practice.

I do not know of any active use - or see a significant limitation - by outright deprecating int32/64 for image data. However, I'd like to reach out to people in astrophysics, high energy physics, and analytical/physical chemistry to get a sense of the data they often engage with and how it comes off their sensors. High precision/bit depth 2D spectrophotometers might, depending on how they are configured, need this sort of range - though in most cases they probably either directly or indirectly output floats today.

@Cadair how does your solar observatory data come in? Do you use high range ints (int32, int64), or know anyone who does?

I will investigate and get back to you. My gut reaction is that such data exists, but I am not sure.

One firm data point on the insufficiency of uint16 label images: We take microscope images of human tissue samples and segment individual cells. Images are in the multi-gigapixel range and contain on the order of 1 million cells. We absolutely require uint32 for the label masks!

This is an interesting discussion.

One of the difficulties in the PR I helped with some weeks ago was caused by the desire to limit floating-point images to the [0,1] range.

IMO, there is no natural requirement for an image to have a bounded range. But my point of view is often different because I come to the field from the measurement sciences angle. To me a pixel is a measurement, and unless you know what and how it's measured, you can't say anything about what the valid range of values is. Usually images are recorded by some process that involves quantization and digitization, which leads to integer pixel values in the range [0,255] or [0,1023] or [0,4095] or whatever the quantizer does. But these values represent some continuous physical process, and we can sometimes recover the values recorded. Isn't it convenient then for the pixel value to be in those original units, whatever they were (number of photons, temperature, pressure)? If we limit the values to the [0,1] range, the only unit you can represent is "fraction of the maximum level accepted by the quantizer". That's not a physical measurement...

Also, how do you guys deal with the Fourier Transform in your library? By definition, the DFT has a value of sum(image) at the origin. If image is in the range [0,1], then the value of the DFT at the origin is in the range [0,number of pixels]. You can't normalize that to [0,1] without making some things really complicated...

And there is not a single algorithm that I know that requires the input image to have a limited range of gray values. The only place where such a limit is important is in the display and when saving to file. There is no need to limit ranges anywhere else. In the DIPlib project we solve these issues as follows:

  • For display we have a dedicated tool. By default the image is shown with its min value as black and its max value as white (even for color images). But the tool allows the user to examine the original pixel values, as well as select different mapping modes. Basically, no choice is made for the user, the user is in control, but the default is chosen as the "least surprise" method (users tend to be most surprised by an all-black display).

  • For saving we offer a default file format (ICS) that stores floating-point values. If the user wants something different, they have to discretize (i.e. cast to integer), at which point they have to make a decision about how to do so. Again, no choice is made for the user, they're in control.

I think the display solution is simple and effective. The saving one is not. A sane alternative would be to cast to 8-bit unsigned integer without scaling, but print a warning if the [0,255] range is exceeded or not filled. The warning would contain a suggestion for a suitable mapping function that the user could use to manually cast to integer.

I think scikit-image is mostly aligned with your point of view @crisluengo and thank you very much for sharing it here.

In fact, one main rule in skimage is to not touch users input data values, unless the called method requires it... Some times the library deviates from this principle, that's why many issues are related to this topic and are as much as possible addressed...

Concerning image saving and display, I tend to think that skimage should let that to other libraries (imageio, tifffile, matplotlib, napari...) that are dedicated to that and will do it much better then we could... This strategy has also the other advantages of reducing skimage number of dependencies, reduce maintenance burden, and distribute development workload over different opensource packages...

I like the idea of doing one thing (image processing), and doing it well.

Was this page helpful?
0 / 5 - 0 ratings