I noticed that the formula of the BrightnessContrast operator in DALI is
out = brightness_shift * output_range + brightness * (grey + contrast * (in - grey))
I can't figure out why DALI uses this formula rather than the usual one:
out = brightness_shift + contrast * in
Is there some optimization behind this choice?
Hello, @simonJJJ
DALI aims to be a replacement for multiple vision processing frameworks - and they tend to follow different definitions.
Gluon CV (a computer vision library for MXNet) has additive brightness and multiplicative contrast.
Torchvision has black-centered multiplicative brightness and gray-centered multiplicative contrast.
To emulate both, DALI has an operator with both multiplicative and additive brightness.
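As a rough illustration of how one formula can cover both conventions, here is a minimal NumPy sketch of the formula quoted in the question. The function name `dali_formula` is hypothetical, and `grey=0.5` with `output_range=1.0` are assumptions chosen for float images in [0, 1]; this is not DALI's actual code.

```python
import numpy as np

def dali_formula(x, brightness=1.0, contrast=1.0, brightness_shift=0.0,
                 grey=0.5, output_range=1.0):
    """Formula from the question; grey and output_range are assumed values."""
    return brightness_shift * output_range + brightness * (grey + contrast * (x - grey))

x = np.linspace(0.0, 1.0, 5)

# Additive brightness + multiplicative contrast: keep brightness at 1
# and use only brightness_shift.
additive = dali_formula(x, contrast=1.5, brightness_shift=0.1)

# Multiplicative brightness + grey-centered contrast: keep brightness_shift at 0
# and use only brightness.
multiplicative = dali_formula(x, brightness=1.2, contrast=1.5)

# With all defaults the operation is the identity.
identity = dali_formula(x)
```

Leaving one of the two brightness parameters at its neutral value (1 for `brightness`, 0 for `brightness_shift`) reduces the formula to one convention or the other.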
The actual operation that's performed on the pixels is simply a fused multiply-add, out = fma(x, alpha, beta), i.e. out = x * alpha + beta,
where:
x - input pixel intensity
alpha = brightness * contrast
beta = brightness_shift * output_range + brightness * (1 - contrast) * grey
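To check that alpha and beta above really reproduce the longer formula from the question, here is a small NumPy sketch comparing the two forms. The function names and the sample parameter values (including `grey=128` and `output_range=255`, picked as if for 8-bit images) are assumptions for illustration only.

```python
import numpy as np

def expanded_form(x, brightness, contrast, brightness_shift, grey, output_range):
    # Formula as written in the question.
    return brightness_shift * output_range + brightness * (grey + contrast * (x - grey))

def fma_form(x, brightness, contrast, brightness_shift, grey, output_range):
    # Same operation folded into a single multiply-add per pixel.
    alpha = brightness * contrast
    beta = brightness_shift * output_range + brightness * (1 - contrast) * grey
    return x * alpha + beta

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 255.0, size=(8, 8))
params = dict(brightness=1.3, contrast=0.7, brightness_shift=0.05,
              grey=128.0, output_range=255.0)

a = expanded_form(x, **params)
b = fma_form(x, **params)
max_abs_diff = np.max(np.abs(a - b))
```

Expanding brightness * (grey + contrast * (x - grey)) gives brightness * contrast * x + brightness * (1 - contrast) * grey, which is where alpha and the second term of beta come from. Since alpha and beta depend only on the parameters, they can be computed once per image rather than per pixel.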
Hi, @mzient
Another question is why you use a fixed grey value determined by the data type, while torchvision uses an rgb_to_grayscale function. Is that for speed?
That said, the two different ways of computing grey won't make much difference for data augmentation.
@simonJJJ To be perfectly honest, I didn't know that their meaning of "solid grey" is "mean brightness of the image". Even if we had known that, it's unlikely we'd have gone with this implementation, since it requires a reduction (calculating the mean), which is quite expensive. Also, it's not suitable for videos, because the grey level would shift from frame to frame.
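The difference between the two grey definitions can be sketched in NumPy as follows. Both function names are hypothetical, `grey=0.5` assumes float images in [0, 1], and the mean-based variant is only a simplified stand-in for the "mean brightness of the image" idea, not torchvision's actual implementation.

```python
import numpy as np

def contrast_fixed_grey(img, contrast, grey=0.5):
    # Pivot is a constant: a purely per-pixel operation, no reduction needed.
    return grey + contrast * (img - grey)

def contrast_mean_grey(img, contrast):
    # Pivot is the image mean: requires a reduction over the whole image first.
    grey = img.mean()
    return grey + contrast * (img - grey)

# Two hypothetical "video frames": a dark one and a bright one.
frame_a = np.linspace(0.0, 0.4, 5)
frame_b = np.linspace(0.6, 1.0, 5)

# With the mean-based variant, the pivot differs from frame to frame.
pivot_a = frame_a.mean()
pivot_b = frame_b.mean()

out_fixed_a = contrast_fixed_grey(frame_a, 2.0)
out_mean_a = contrast_mean_grey(frame_a, 2.0)
```

With the mean-based pivot, each frame is stretched around its own average, so the effective grey level moves whenever the scene brightness changes. With a fixed pivot, every frame goes through the same affine map.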