Vision: [feature request] Image Histogram Transformation

Created on 10 Sep 2018 · 9 comments · Source: pytorch/vision

It is often useful (especially in the field of astronomy) to transform the histogram of images. I would like to suggest an image histogram transformation function (under torchvision.transforms) that transforms the histogram of an image to match that of a template image as closely as possible. For instance, consider the following function:

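import numpy as np

def match_histogram(source, template):
    """Adjust the pixel values of `source` so that its histogram matches that of `template`."""
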
    source   = np.asanyarray(source)
    template = np.asanyarray(template)
    oldshape = source.shape
    source   = source.ravel()
    template = template.ravel()

    # get the set of unique pixel values and their corresponding indices and
    # counts
    s_values, bin_idx, s_counts = np.unique(source, return_inverse=True,
                                            return_counts=True)
    t_values, t_counts = np.unique(template, return_counts=True)

    # take the cumsum of the counts and normalize by the number of pixels to
    # get the empirical cumulative distribution functions for the source and
    # template images (maps pixel value --> quantile)
    s_quantiles  = np.cumsum(s_counts).astype(np.float32)
    s_quantiles /= s_quantiles[-1]
    t_quantiles  = np.cumsum(t_counts).astype(np.float32)
    t_quantiles /= t_quantiles[-1]

    # interpolate linearly to find the pixel values in the template image
    # that correspond most closely to the quantiles in the source image
    interp_t_values = np.interp(s_quantiles, t_quantiles, t_values)

    return interp_t_values[bin_idx].reshape(oldshape)

The function above is not optimal, since it recomputes the template image's information on every call. It does not bin float-valued images, so it only works well for coarsely discretized images such as 8-bit PNGs (bins 0-255). It also performs poorly when the number of distinct pixel values is too low, which might be fixed by adding a small amount of noise.
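One way to avoid recomputing the template information is to cache the template's empirical CDF once and reuse it for every source image. The following is only a minimal sketch of that idea; the class name `HistogramMatcher` is hypothetical and not part of torchvision or of the function above.

```python
import numpy as np


class HistogramMatcher:
    """Hypothetical helper: caches the template CDF so it is computed once."""

    def __init__(self, template):
        template = np.asanyarray(template).ravel()
        # Unique template values and their empirical CDF, computed a single time.
        self.t_values, t_counts = np.unique(template, return_counts=True)
        self.t_quantiles = np.cumsum(t_counts).astype(np.float64)
        self.t_quantiles /= self.t_quantiles[-1]

    def __call__(self, source):
        source = np.asanyarray(source)
        # Only the source statistics are recomputed per call.
        s_values, bin_idx, s_counts = np.unique(
            source.ravel(), return_inverse=True, return_counts=True)
        s_quantiles = np.cumsum(s_counts).astype(np.float64)
        s_quantiles /= s_quantiles[-1]
        interp_t_values = np.interp(s_quantiles, self.t_quantiles, self.t_values)
        return interp_t_values[bin_idx].reshape(source.shape)
```

A matcher built as `HistogramMatcher(template)` can then be called on any number of source images, which is the usual shape of a transform object.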

enhancement help wanted

All 9 comments

Thanks for the issue!

I think we could provide a histogram transformation functionality in torchvision.
Maybe one possibility would be to allow the user to pass in directly the target histogram, instead of passing the image, and provide a simple functionality to compute the histogram of an image.

Also, apart from np.interp, all the other functions have torch equivalents, so maybe we could make it use torch functions whenever possible?
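For the np.interp step, linear interpolation can be built from a sorted search plus element-wise arithmetic. The sketch below is written in NumPy so that it can be checked against np.interp directly; the function name `interp_via_searchsorted` is hypothetical. It assumes `xp` is strictly increasing, which holds for the quantile arrays above because every unique value has a count of at least one. A torch port would presumably swap in `torch.searchsorted` and `torch.clamp`, assuming the installed PyTorch version provides them.

```python
import numpy as np


def interp_via_searchsorted(x, xp, fp):
    """Linear interpolation matching np.interp for strictly increasing xp."""
    x = np.asarray(x, dtype=np.float64)
    xp = np.asarray(xp, dtype=np.float64)
    fp = np.asarray(fp, dtype=np.float64)
    if xp.size == 1:
        # A single knot: every query takes its value.
        return np.full_like(x, fp[0])
    # Index of the right-hand knot of the interval containing each x,
    # clamped so that idx - 1 and idx are both valid indices.
    idx = np.clip(np.searchsorted(xp, x, side="right"), 1, len(xp) - 1)
    x0, x1 = xp[idx - 1], xp[idx]
    f0, f1 = fp[idx - 1], fp[idx]
    # Clamp x to the knot range so out-of-range inputs take the end values.
    t = (np.clip(x, xp[0], xp[-1]) - x0) / (x1 - x0)
    return f0 + t * (f1 - f0)
```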

Also, could you send a PR?

Thank you @fmassa for your reply. The function above was written outside the context of PyTorch. That's why it's all NumPy. If you are going to integrate my code, I need some time (~ 1 month) to write it properly. As you said, I will use torch functions whenever possible and pass a target histogram instead. That's easy.
But the user should be able to

  • feed different histograms for different channels,
  • or the same one for all channels,
  • or flatten the input image and apply the histogram transformation and reshape it to the original shape (which is the case for the code above).

My biggest worry is applying a histogram transform to floats. I need to see how other people do this (if you know any references, please send them my way). Otherwise, the way I would do it is to divide the range of the input image/tensor into $b$ intervals, where $b$ equals the number of bins of the provided histogram. We then need to truncate the values by choosing min/max limits. These limits can be found by locating the values below/above which a small portion $p$ of the data falls. By implementing this we could address image processing for non-standard formats such as RAW in photography and FITS in astronomy, as well as sound files.
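The float-handling idea described above can be sketched roughly as follows: clip the data at the $p$ and $1 - p$ quantiles, then split the clipped range into $b$ equal-width bins. The function name `quantize_floats` and the default `p=0.01` are assumptions made for illustration, not something specified in this thread.

```python
import numpy as np


def quantize_floats(values, b, p=0.01):
    """Clip to the [p, 1 - p] quantile range and split it into b equal bins."""
    values = np.asarray(values, dtype=np.float64)
    # Limits below/above which a fraction p of the data falls.
    lo, hi = np.quantile(values, [p, 1.0 - p])
    clipped = np.clip(values, lo, hi)
    edges = np.linspace(lo, hi, b + 1)
    # Use interior edges only, so bin indices fall in 0 .. b - 1.
    bin_idx = np.digitize(clipped, edges[1:-1])
    counts = np.bincount(bin_idx.ravel(), minlength=b)
    return bin_idx, edges, counts
```

The returned `bin_idx` has the same shape as the input, so the discretized image can be fed to a histogram-matching routine that expects a finite set of levels.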

By putting more effort into it, we could provide template histograms eliminating the need for gamma correction, contrast adjustments, and mean and standard deviation shifting.

Do you think I should send the pull request before addressing the big issues above or after (I'm still new to the GitHub community)?

About points 1-2:
What about the following: if the user provides a 1d histogram, the same transformation is applied to all channels (as if the histogram were broadcast across channels), and if it is a 2d histogram, each channel uses one of the provided histograms. One limitation of this approach is that all histograms must have the same number of elements, but this is usually fine for uint8 images.
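A rough sketch of that dispatch is shown below. It assumes a channel-first `(C, H, W)` image and a histogram given as counts only, where bin `i` stands for pixel value `i` (as for uint8 images). Both function names are hypothetical.

```python
import numpy as np


def _match_to_counts(channel, t_counts):
    """Match one channel to a target histogram whose bin i has value i."""
    t_counts = np.asarray(t_counts, dtype=np.float64)
    t_values = np.arange(len(t_counts))
    keep = t_counts > 0  # empty bins carry no quantile mass
    t_quantiles = np.cumsum(t_counts[keep]) / t_counts.sum()
    s_values, bin_idx, s_counts = np.unique(
        channel.ravel(), return_inverse=True, return_counts=True)
    s_quantiles = np.cumsum(s_counts) / channel.size
    matched = np.interp(s_quantiles, t_quantiles, t_values[keep])
    return matched[bin_idx].reshape(channel.shape)


def match_channels(image, hist):
    """image: (C, H, W); hist: (bins,) shared, or (C, bins) per channel."""
    image = np.asarray(image)
    hist = np.asarray(hist)
    if hist.ndim == 1:
        # Broadcast the single histogram to every channel.
        hist = np.broadcast_to(hist, (image.shape[0], hist.shape[0]))
    if hist.ndim != 2 or hist.shape[0] != image.shape[0]:
        raise ValueError("hist must have shape (bins,) or (C, bins)")
    return np.stack([_match_to_counts(c, h) for c, h in zip(image, hist)])
```

Storing the per-channel histograms as rows of one 2d array is what forces them to share a bin count, which is the limitation mentioned above.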

If we pass the bins of the histogram as well, we don't need to worry if the image is floating point or not, so we would pass not only the counts but also what value each count accounts for. I think this would solve your comments, right?
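To illustrate passing both the bin values and the counts, here is a minimal NumPy sketch. The name `hist_transform_sketch` is hypothetical, and the function assumes `hist_bin` is sorted in ascending order and holds one representative value per bin (for example, bin centers). Because the target is described by explicit values, the input image can be of any numeric dtype.

```python
import numpy as np


def hist_transform_sketch(image, hist_bin, hist_count):
    """Map `image` so its distribution follows the (hist_bin, hist_count) target."""
    image = np.asarray(image)
    hist_bin = np.asarray(hist_bin, dtype=np.float64)
    hist_count = np.asarray(hist_count, dtype=np.float64)
    keep = hist_count > 0  # drop empty bins so the target CDF is strictly increasing
    t_quantiles = np.cumsum(hist_count[keep]) / hist_count.sum()
    s_values, bin_idx, s_counts = np.unique(
        image.ravel(), return_inverse=True, return_counts=True)
    s_quantiles = np.cumsum(s_counts) / image.size
    matched = np.interp(s_quantiles, t_quantiles, hist_bin[keep])
    return matched[bin_idx].reshape(image.shape)


# Computing a target histogram from a template image with np.histogram,
# using bin centers as the representative value of each bin.
rng = np.random.default_rng(0)
template = rng.uniform(5.0, 10.0, size=1000)
counts, edges = np.histogram(template, bins=20)
centers = (edges[:-1] + edges[1:]) / 2
result = hist_transform_sketch(rng.normal(size=(16, 16)), centers, counts)
```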

I don't have any experience with RAW or FITS images, but please feel free to send a PR with what we have discussed. Also, raising any issues you might see with it is definitely valuable!

I like that! Then I'll start writing the function hist_transform(input_tensor, hist_bin, hist_count), all torch tensors and inform you when done.

Hello @fmassa again. So I finished polishing and testing the code as we discussed. It is very inconvenient to switch back and forth between torch.tensor and numpy.ndarray, so I decided to do everything in numpy.
I am very new to git and GitHub and I don't know how to properly open a pull request. If you can help me with that, I'd appreciate it. See this gist for the code, which includes a module test. The main code is the part above the "# For tests and demonstration" comment.

I have made a pull request here. Also, see this gist for tests and proof of concept.

Hey 😃

I was browsing through the vision issues and found this one. It turns out I did some work on histogram specification some time ago.
Something like this:

[image: D89b8U6XsAEVk4n]

I wrote it as a CUDA module because I was running the transform in an optimisation loop and needed it to be fast. The code is available here, in case it is useful: https://github.com/pierre-wilmot/NeuralTextureSynthesis/
Happy to help clean it up if you think it's worth adding to the vision repo.

@gheaeckkseqrz Thanks for the proposal! I think efficient GPU-accelerated transforms could be a nice addition, but first we need a reference implementation, and I have to find the time to review the PR in #796.

@gheaeckkseqrz's proposal would also really help with style transfer and potentially GAN related tasks as well. Support for histogram matching on GPU with tensors can be extremely useful for style transfer. https://github.com/ProGamerGov/neural-style-pt/issues/46#issuecomment-563005587
