Dali: emulating torchvision.transforms.RandomGrayscale

Created on 25 Feb 2019 · 18 Comments · Source: NVIDIA/DALI

I'm trying to emulate the PyTorch torchvision.transforms.RandomGrayscale operation, and I think I need to create a custom op, but first I wanted to ask whether there is a clever way to use existing DALI ops instead.

The goal is to convert an image to grayscale with some probability p, but to retain 3 channels in the image.

The torchvision function uses a PIL transform to convert the image to grayscale, but the important thing is that their function retains the number of channels in the image, such that the resulting image still has 3 channels, but with identical values in the R, G, and B channels.
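To make the target behaviour concrete, here is a minimal standalone C++ sketch of it (not torchvision or DALI code). It assumes an interleaved 8-bit RGB buffer and the ITU-R 601-2 luma weights (0.299, 0.587, 0.114) that PIL documents for its "L" conversion; the name `random_grayscale` and the rounding are assumptions made for this sketch, so results may differ from PIL by one gray level.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: with probability p, replace every RGB pixel by its
// luma value copied into all three channels, so the image keeps 3 channels.
// The buffer is interleaved RGB (R0 G0 B0 R1 G1 B1 ...).
// Returns true if the image was converted.
template <typename Rng>
bool random_grayscale(std::vector<std::uint8_t> &rgb, double p, Rng &rng) {
  assert(rgb.size() % 3 == 0);
  std::bernoulli_distribution convert(p);
  if (!convert(rng)) return false;  // image passes through untouched
  for (std::size_t i = 0; i < rgb.size(); i += 3) {
    // Integer form of 0.299 R + 0.587 G + 0.114 B, rounded to nearest.
    const unsigned gray =
        (299u * rgb[i] + 587u * rgb[i + 1] + 114u * rgb[i + 2] + 500u) / 1000u;
    rgb[i] = rgb[i + 1] = rgb[i + 2] = static_cast<std::uint8_t>(gray);
  }
  return true;
}
```

The draw happens once per image, so an image is either fully converted or left as it was.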

As an approximation in DALI, I tried using a CoinFlip with probability = .8, and tried to use the output of the coinflip as an input to the saturation parameter in the ColorTwist op (20% of the time it would be a zero, and 80% of the time a one). Unfortunately, the saturation parameter requires a float but CoinFlip returns an int. I was not able to use the Cast op to cast the output of the CoinFlip to float (DALI threw an error).

So my question is, am I missing some relatively simple way to do RandomGrayscale with some probability while preserving 3 channels in the image? (I'm replicating a published study that used this torchvision method, so I'm trying to stay as close to that original study as possible).

Thanks!

enhancement external contribution welcome

grez72

All 18 comments

Hi,
Currently I don't see any very straightforward way to do this.
You can try to create a custom color twist operator with randomness built in, or you can create a custom random generator based on coin flip that returns float (see dali/pipeline/operators/support/random/coin_flip.cc); putting something like this inside should do:

  float * out_data = output.template mutable_data<float>();

  for (int i = 0; i < batch_size_; ++i) {
    out_data[i] = dis_(rng_) ? 1 : 0;
  }
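
The same idea can be tried outside DALI. The sketch below is plain standard C++; `float_coin_flip` and its parameters are hypothetical names, and it only illustrates drawing one Bernoulli sample per batch element and storing it as a float.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-in for a float-valued coin flip: one draw per sample in
// the batch, stored as 1.0f with the given probability and 0.0f otherwise.
std::vector<float> float_coin_flip(int batch_size, double probability,
                                   std::mt19937 &rng) {
  assert(batch_size >= 0 && probability >= 0.0 && probability <= 1.0);
  std::bernoulli_distribution dis(probability);
  std::vector<float> out(static_cast<std::size_t>(batch_size));
  for (float &v : out) {
    v = dis(rng) ? 1.0f : 0.0f;  // bool draw converted explicitly to float
  }
  return out;
}
```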

JanuszL on 26 Feb 2019

Hi,

Thanks for the suggestion. I tried to create a custom coin_flip as you suggest but ran into trouble.

I have limited C programming experience so I started with the tutorial custom op example, and tried to merge coin_flip.cc example with that one, but that didn't go so well. What I wrote compiled when I used the SampleWorkspace as in the tutorial, but not if I used the SupportWorkspace (as is used in dali/pipeline/operators/support/random/coin_flip.cc). Compiling with SampleWorkspace didn't work because the op isn't a "Support Op" (required by color twist).

I've included my coinflip.h and coinflip.cc files in case you spot the fix easily, but it seems likely to me that starting from the "CustomDummy" example might not have been the best approach. By any chance can you point me towards other custom op examples, especially ones that modify an existing op like this?

Thanks!

coinflip.h

#ifndef EXAMPLE_COINFLIP_H_
#define EXAMPLE_COINFLIP_H_

#include <random>

#include "dali/pipeline/operators/operator.h"

#define CUSTOM_USE_OPERATOR_MEMBERS()              \
  using ::dali::OperatorBase::spec_;               \
  using ::dali::OperatorBase::num_threads_;        \
  using ::dali::OperatorBase::batch_size_

namespace other_ns {

template <typename SupportBackend>
class CoinFlip : public ::dali::Operator<SupportBackend> {
 public:
  inline explicit CoinFlip(const ::dali::OpSpec &spec) :
    ::dali::Operator<SupportBackend>(spec),
    dis_(spec.GetArgument<float>("probability")),
    rng_(spec.GetArgument<int64_t>("seed")) {}

  inline ~CoinFlip() override = default;

  DISABLE_COPY_MOVE_ASSIGN(CoinFlip);

  CUSTOM_USE_OPERATOR_MEMBERS();

  protected:
      void RunImpl(::dali::Workspace<SupportBackend> * ws, const int idx) override;

  private:
      std::bernoulli_distribution dis_;
      std::mt19937 rng_;

};

}  // namespace other_ns

#endif  // EXAMPLE_COINFLIP_H_

coinflip.cc

#include "coinflip.h"

namespace other_ns {

template<>
void CoinFlip<::dali::CPUBackend>::RunImpl(::dali::SampleWorkspace * ws, const int idx) {
    DALI_ENFORCE(idx == 0, "CoinFlip does not support multiple input sets.");
    auto &input = ws->Input<::dali::CPUBackend>(idx);
    auto output = ws->Output<::dali::CPUBackend>(idx);
    output->Resize({batch_size_});

    float * out_data = output->template mutable_data<float>();

    for (int i = 0; i < batch_size_; ++i) {
        out_data[i] = dis_(rng_) ? 1.0 : 0.0;
    }
}

}  // namespace other_ns

DALI_REGISTER_OPERATOR(CustomCoinFlip, ::other_ns::CoinFlip<::dali::CPUBackend>, ::dali::CPU);

DALI_SCHEMA(CustomCoinFlip)
  .DocStr("Produce tensor filled with 0.0s and 1.0s - results of random coin flip cast as float,"
      " usable as an argument for select ops.")
  .NumInput(0)
  .NumOutput(1)
  .AddOptionalArg("probability",
      R"code(Probability of returning 1.)code", 0.5f);

When I try to change SampleWorkspace to SupportWorkspace I get the following error:

/customcoinflip/coinflip.cc:6:6: error: template-id ‘RunImpl<>’ for ‘void other_ns::CoinFlip<dali::CPUBackend>::RunImpl(dali::SupportWorkspace*, int)’ does not match any template declaration
 void CoinFlip<::dali::CPUBackend>::RunImpl(::dali::SupportWorkspace * ws, const
      ^
/customcoinflip/coinflip.cc:6:88: note: saw 1 ‘template<>’, need 2 for specializing a member function template
 ip<::dali::CPUBackend>::RunImpl(::dali::SupportWorkspace * ws, const int idx) {

grez72 on 26 Feb 2019

@grez72
I've adjusted your example so that it compiles:

coinflip.h

#ifndef EXAMPLE_COINFLIP_H_
#define EXAMPLE_COINFLIP_H_

#include <random>
#include "dali/pipeline/operators/operator.h"

#define CUSTOM_USE_OPERATOR_MEMBERS()           \
    using ::dali::OperatorBase::spec_;          \
    using ::dali::OperatorBase::num_threads_;   \
    using ::dali::OperatorBase::batch_size_

namespace other_ns {

class CoinFlip : public ::dali::Operator<::dali::SupportBackend> {
 public:
    inline explicit CoinFlip(const ::dali::OpSpec &spec) :
        ::dali::Operator<::dali::SupportBackend>(spec),
        dis_(spec.GetArgument<float>("probability")),
        rng_(spec.GetArgument<int64_t>("seed")) {}

    inline ~CoinFlip() override = default;

    DISABLE_COPY_MOVE_ASSIGN(CoinFlip);
    CUSTOM_USE_OPERATOR_MEMBERS();

 protected:
    void RunImpl(::dali::Workspace<::dali::SupportBackend> *ws, const int idx) override;

 private:
    std::bernoulli_distribution dis_;
    std::mt19937 rng_;
};

}  // namespace other_ns

#endif  // EXAMPLE_COINFLIP_H_

coinflip.cc

#include "coinflip.h"

namespace other_ns {

void CoinFlip::RunImpl(::dali::SupportWorkspace * ws, const int idx) {
    DALI_ENFORCE(idx == 0, "CoinFlip does not support multiple input sets.");
    auto &output = ws->Output<::dali::CPUBackend>(idx);
    output.Resize({batch_size_});
    output.set_type(::dali::TypeInfo::Create<float>());

    float *out_data = output.template mutable_data<float>();

    for (int i = 0; i < batch_size_; ++i) {
        out_data[i] = dis_(rng_) ? 1.0 : 0.0;
    }
}

}  // namespace other_ns

DALI_REGISTER_OPERATOR(CustomCoinFlip, ::other_ns::CoinFlip, ::dali::Support);

DALI_SCHEMA(CustomCoinFlip)
  .DocStr("Produce tensor filled with 0.0s and 1.0s - results of random coin flip cast as float,"
          " usable as an argument for select ops.")
  .NumInput(0)
  .NumOutput(1)
  .AddOptionalArg("probability",
    R"code(Probability of returning 1.)code", 0.5f);

Let me know if it helps
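
Why the specialization in the earlier attempt could not match can be shown with a toy model. The types below are stand-ins declared for this sketch, not DALI headers; the backend-to-workspace pairing is inferred from the compiler output quoted in this thread (CPUBackend pairs with SampleWorkspace) and from the adjusted code above (SupportBackend pairs with SupportWorkspace).

```cpp
#include <cassert>
#include <type_traits>

// Toy stand-ins (assumptions for illustration, not the DALI definitions).
struct CPUBackend {};
struct SupportBackend {};
struct SampleWorkspace {};
struct SupportWorkspace {};

// Each backend is tied to exactly one workspace type.
template <typename Backend> struct WorkspacePolicy;
template <> struct WorkspacePolicy<CPUBackend> { using type = SampleWorkspace; };
template <> struct WorkspacePolicy<SupportBackend> { using type = SupportWorkspace; };
template <typename Backend>
using Workspace = typename WorkspacePolicy<Backend>::type;

template <typename Backend>
struct Operator {
  virtual ~Operator() = default;
  virtual int RunImpl(Workspace<Backend> *ws, int idx) = 0;
};

// Deriving from Operator<SupportBackend> makes RunImpl take SupportWorkspace*.
// An override written for Operator<CPUBackend> must take SampleWorkspace*,
// so pairing CPUBackend with SupportWorkspace* has nothing to match.
struct ToyCoinFlip : Operator<SupportBackend> {
  int RunImpl(SupportWorkspace *ws, int idx) override {
    assert(ws != nullptr);
    return idx;
  }
};

static_assert(std::is_same<Workspace<CPUBackend>, SampleWorkspace>::value,
              "CPU operators receive a SampleWorkspace");
static_assert(std::is_same<Workspace<SupportBackend>, SupportWorkspace>::value,
              "support operators receive a SupportWorkspace");
```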

jantonguirao on 26 Feb 2019

@grez72 I've taken a look at your original request and written a custom operator that does the random grayscale conversion. I hope it helps.

RandomGrayscale.h:

#ifndef RANDOM_GRAYSCALE_H_
#define RANDOM_GRAYSCALE_H_

#include <random>
#include "dali/pipeline/operators/operator.h"

namespace other_ns {
    using namespace ::dali;

    template <typename Backend>
        class RandomGrayscale : public Operator<Backend> {
    public:
        inline explicit RandomGrayscale(const OpSpec &spec)
            : Operator<Backend>(spec)
            , input_type_(spec.GetArgument<DALIImageType>("image_type"))
            , dis_(spec.GetArgument<float>("probability"))
            , rng_(spec.GetArgument<int64_t>("seed")) {
        }

    protected:
        void RunImpl(Workspace<Backend> *ws, const int idx) override;

        USE_OPERATOR_MEMBERS();

        const DALIImageType input_type_;
        std::bernoulli_distribution dis_;
        std::mt19937 rng_;
    };

}  // namespace other_ns

#endif  // RANDOM_GRAYSCALE_H_

RandomGrayscale.cc:

#include "RandomGrayscale.h"

namespace other_ns {

using namespace ::dali;

DALI_SCHEMA(RandomGrayscale)
  .DocStr(R"code(Converts between various image color models)code")
  .NumInput(1)
  .NumOutput(1)
  .AllowMultipleInputSets()
  .EnforceInputLayout(DALI_NHWC)
  .AddArg("image_type",
      R"code(The color space of the input image)code", DALI_IMAGE_TYPE)
  .AddOptionalArg("probability",
      R"code(Probability of returning 1.)code", 0.5f);

template <>
void RandomGrayscale<CPUBackend>::RunImpl(SampleWorkspace *ws, const int idx) {
    const auto &input = ws->Input<CPUBackend>(idx);
    auto &output = ws->Output<CPUBackend>(idx);
    output.Copy(input, 0);
    const auto &input_shape = input.shape();

    const auto H = input_shape[0];
    const auto W = input_shape[1];
    const auto C = input_shape[2];
    DALI_ENFORCE( C == 3 );

    const bool should_convert = dis_(rng_);
    if ( should_convert ) {
        uint8_t *output_ptr = output.template mutable_data<uint8>();
        for (int i = 0; i < H*W; i++) {
            uint8_t gray = static_cast<uint8_t>(
                0.257f * output_ptr[i*C]
              + 0.504f * output_ptr[i*C+1]
              + 0.098f * output_ptr[i*C+2] + 16.0f);
            output_ptr[i*C]   = gray;
            output_ptr[i*C+1] = gray;
            output_ptr[i*C+2] = gray;
        }
    }
}

DALI_REGISTER_OPERATOR(RandomGrayscale, RandomGrayscale<CPUBackend>, CPU);

}  // namespace other_ns

jantonguirao on 26 Feb 2019

Wow, many thanks for updating the CoinFlip example, and for producing this RandomGrayscale op!

Would you mind sharing your CMakeLists.txt files for both of these?

grez72 on 26 Feb 2019

@grez72 Here is the CMakeLists.txt I used for both

cmake_minimum_required(VERSION 3.5)
find_package(CUDA 8.0 REQUIRED)

execute_process(
        COMMAND python -c "import nvidia.dali as dali; print(dali.sysconfig.get_lib_dir())"
        OUTPUT_VARIABLE DALI_LIB_DIR)
string(STRIP ${DALI_LIB_DIR} DALI_LIB_DIR)

execute_process(
        COMMAND python -c "import nvidia.dali as dali; print(\" \".join(dali.sysconfig.get_compile_flags()))"
        OUTPUT_VARIABLE DALI_COMPILE_FLAGS)
string(STRIP ${DALI_COMPILE_FLAGS} DALI_COMPILE_FLAGS)

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11 ${DALI_COMPILE_FLAGS} ")
link_directories( "${DALI_LIB_DIR}" )

cuda_add_library(mycoinflip SHARED coinflip.cc)
target_link_libraries(mycoinflip dali)

cuda_add_library(myrandomgrayscale SHARED RandomGrayscale.cc)
target_link_libraries(myrandomgrayscale dali)

jantonguirao on 26 Feb 2019

Thanks for that. I'm getting an error when compiling and just wanted to make sure it wasn't my CMakeLists.txt file. I placed your files (CMakeLists.txt, RandomGrayscale.h, RandomGrayscale.cc) in a folder called custom_ops/myrandomgrayscale, and then ran the following lines to compile:

rm -rf custom_ops/myrandomgrayscale/build && \
mkdir -p custom_ops/myrandomgrayscale/build && \
cd custom_ops/myrandomgrayscale/build && \
  cmake .. && \
  make -j12

...but I get the error below. Is there something I should do differently to compile the custom operators?

Thanks again for all of your help so far!

-- The C compiler identification is GNU 5.4.0
-- The CXX compiler identification is GNU 5.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found CUDA: /usr/local/cuda (found suitable version "10.0", minimum required is "8.0")
-- Configuring done
-- Generating done
-- Build files have been written to: /home/jovyan/work/Projects/InstanceNetArtiPhys/dataloaders/custom_ops/myrandomgrayscale/build
-- Configuring done
-- Generating done
-- Build files have been written to: /home/jovyan/work/Projects/InstanceNetArtiPhys/dataloaders/custom_ops/myrandomgrayscale/build
Scanning dependencies of target myrandomgrayscale
[ 50%] Building CXX object CMakeFiles/myrandomgrayscale.dir/RandomGrayscale.cc.o
/home/jovyan/work/Projects/InstanceNetArtiPhys/dataloaders/custom_ops/myrandomgrayscale/RandomGrayscale.cc: In member function ‘void other_ns::RandomGrayscale<Backend>::RunImpl(dali::Workspace<Backend>*, int) [with Backend = dali::CPUBackend; dali::Workspace<Backend> = dali::SampleWorkspace]’:
/home/jovyan/work/Projects/InstanceNetArtiPhys/dataloaders/custom_ops/myrandomgrayscale/RandomGrayscale.cc:21:42: error: invalid initialization of non-const reference of type ‘dali::Tensor<dali::CPUBackend>*&’ from an rvalue of type ‘dali::Tensor<dali::CPUBackend>*’
     auto &output = ws->Output<CPUBackend>(idx);
                                          ^
/home/jovyan/work/Projects/InstanceNetArtiPhys/dataloaders/custom_ops/myrandomgrayscale/RandomGrayscale.cc:22:12: error: request for member ‘Copy’ in ‘output’, which is of pointer type ‘dali::Tensor<dali::CPUBackend>*’ (maybe you meant to use ‘->’ ?)
     output.Copy(input, 0);
            ^
/home/jovyan/work/Projects/InstanceNetArtiPhys/dataloaders/custom_ops/myrandomgrayscale/RandomGrayscale.cc:32:47: error: request for member ‘mutable_data’ in ‘output’, which is of pointer type ‘dali::Tensor<dali::CPUBackend>*’ (maybe you meant to use ‘->’ ?)
         uint8_t *output_ptr = output.template mutable_data<uint8>();
                                               ^
/home/jovyan/work/Projects/InstanceNetArtiPhys/dataloaders/custom_ops/myrandomgrayscale/RandomGrayscale.cc:32:65: error: expected primary-expression before ‘>’ token
         uint8_t *output_ptr = output.template mutable_data<uint8>();
                                                                 ^
/home/jovyan/work/Projects/InstanceNetArtiPhys/dataloaders/custom_ops/myrandomgrayscale/RandomGrayscale.cc:32:67: error: expected primary-expression before ‘)’ token
         uint8_t *output_ptr = output.template mutable_data<uint8>();
                                                                   ^
CMakeFiles/myrandomgrayscale.dir/build.make:62: recipe for target 'CMakeFiles/myrandomgrayscale.dir/RandomGrayscale.cc.o' failed
make[2]: *** [CMakeFiles/myrandomgrayscale.dir/RandomGrayscale.cc.o] Error 1
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/myrandomgrayscale.dir/all' failed
make[1]: *** [CMakeFiles/myrandomgrayscale.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

grez72 on 26 Feb 2019

@grez72 We recently changed the return type of ws->Output<...>(...) from pointer to reference.
The code I provided compiles with latest DALI (from master branch). It seems that in the DALI you have installed the old API is still there.
You have two options:

1. Build and use the latest DALI from the master branch
2. Change the code to work with pointers:

auto *output = ws->Output<CPUBackend>(idx);
// ...
output->Copy(input, 0);
// ...
output->template mutable_data<uint8>();
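
If the same operator source has to build against both API generations, one option is a small adapter that accepts either return style. This is a sketch with hypothetical names (`as_ref`, `ToyTensor`); it does not use DALI types and is not something the DALI headers are known to provide.

```cpp
#include <cassert>

// Hypothetical adapter: turn either a pointer or a reference into a reference,
// so calling code can always use '.' regardless of what Output() returns.
template <typename T>
T &as_ref(T &value) { return value; }

template <typename T>
T &as_ref(T *value) {
  assert(value != nullptr);
  return *value;
}

// Toy stand-in for a tensor type, used only to exercise the adapter.
struct ToyTensor {
  int size = 0;
  void Resize(int n) { size = n; }
};
```

With such a helper, a call site would read `as_ref(ws->Output<CPUBackend>(idx)).Copy(input, 0);` under either API, assuming the rest of the interface is unchanged.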

jantonguirao on 26 Feb 2019

👍1

Hi,

Many thanks for all of your help working this out for me and walking me through the compilation.

I plan to update DALI to the latest version soon, but for now, I followed your instructions for changing the code to work with pointers for DALI 0.6.1. Now everything compiles perfectly, and I'm able to use your RandomGrayscale op in my pipeline! Many thanks for all of your help.

I have one last question / favor to ask: would it be possible for you to provide a gpu version of the op? I tried digging through the source of various ops, and it was difficult to find something that parallels your RandomGrayscale.cc op to use as a guide. You've been so generous with your time thus far that I hesitate to ask, but at the same time you've gotten me so close to finalizing my pipeline (using nvJPEGDecoder), that I can't resist asking.

Again, thanks for all of the help you've already provided!

grez72 on 26 Feb 2019

Hi @grez72 ,
To have a GPU operator, you will have to add a RandomGrayscale.cu file, which is very similar to the .cc but in which you define RunImpl with DeviceWorkspace for the GPUBackend implementation.

The main difference is that Input and Output of DeviceWorkspace return a TensorList, which contains the whole batch of images instead of a single sample as in the CPUBackend case.

Here is an inefficient version (it launches one kernel per image). You can also implement something where you store the pointers and metadata of each image in an additional array and launch a single batched kernel,
similar to what we do here, for instance: https://github.com/NVIDIA/DALI/blob/20d2508095137cc3c1bdfa478795ba2a58c30a7d/dali/pipeline/operators/crop/crop.cu#L24
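
The batched alternative can be sketched on the host. The code below is plain C++ with hypothetical names (`SampleDesc`, `make_descs`, `find_sample`); it only shows the bookkeeping a single batched kernel would need: one descriptor per image, plus a lookup from a flat pixel index to its image. It is not taken from the linked crop implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical per-image metadata for a batch stored back to back.
struct SampleDesc {
  std::size_t first_pixel;  // index of the image's first pixel in the batch
  int H;
  int W;
};

// Build one descriptor per image from its (H, W) shape and report the total
// number of pixels in the batch through total_pixels.
std::vector<SampleDesc> make_descs(
    const std::vector<std::pair<int, int>> &shapes, std::size_t *total_pixels) {
  assert(total_pixels != nullptr);
  std::vector<SampleDesc> descs;
  std::size_t offset = 0;
  for (const auto &hw : shapes) {
    descs.push_back(SampleDesc{offset, hw.first, hw.second});
    offset += static_cast<std::size_t>(hw.first) *
              static_cast<std::size_t>(hw.second);
  }
  *total_pixels = offset;
  return descs;
}

// Map a flat pixel index (what one thread of a batched launch would see)
// back to the image that owns it. A linear scan keeps the sketch short.
std::size_t find_sample(const std::vector<SampleDesc> &descs,
                        std::size_t pixel) {
  assert(!descs.empty());
  std::size_t s = 0;
  while (s + 1 < descs.size() && descs[s + 1].first_pixel <= pixel) ++s;
  return s;
}
```

In a real kernel the descriptor array would live in device memory, and the per-image coin flip results would be stored alongside it.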

RandomGrayscale.cu :

#include "RandomGrayscale.h"

namespace other_ns {

using namespace ::dali;

template <int C = 3>
__global__ void ConditionalGrayscale(const uint8 *in,
   bool grayscale, int H, int W, uint8 *out) {
  // One thread per pixel: flatten the 32x32 block into a linear pixel index.
  const int pixel = blockIdx.x * (blockDim.x * blockDim.y)
                  + threadIdx.y * blockDim.x + threadIdx.x;
  if (pixel >= H * W)
    return;
  // Offset of this pixel's first channel (interleaved HWC layout).
  const int idx = pixel * C;
  if (grayscale) {
            uint8_t gray = static_cast<uint8_t>(
                0.257f * in[idx]
              + 0.504f * in[idx + 1]
              + 0.098f * in[idx + 2] + 16.0f);
            out[idx]   = gray;
            out[idx+1] = gray;
            out[idx+2] = gray;
  } else {
            out[idx]   = in[idx];
            out[idx+1] = in[idx+1];
            out[idx+2] = in[idx+2];
  }
}

template <>
void RandomGrayscale<GPUBackend>::RunImpl(DeviceWorkspace *ws, const int idx) {
    // input and output are TensorList and not Tensor like in CPU op
    const auto &input = ws->Input<GPUBackend>(idx);
    auto &output = ws->Output<GPUBackend>(idx);
    const auto &shape = input.shape();

    // preparing the output buffer
    output.Resize(shape);
    output.set_type(input.type());
    output.SetLayout(input.GetLayout());

    for (decltype(input.ntensor()) i = 0; i < input.ntensor(); i++) {
      const auto &input_shape = input.tensor_shape(i);
      const auto H = input_shape[0];
      const auto W = input_shape[1];
      DALI_ENFORCE(input_shape[2] == 3);

      const auto* in = input.tensor<uint8_t>(i);
      auto* out = output.mutable_tensor<uint8_t>(i);
      const bool should_convert = dis_(rng_);
      // Round the block count up so the last partial block of pixels is covered.
      const int num_blocks = static_cast<int>((H * W + 1023) / 1024);
      // The stream is the fourth launch parameter; the third is the shared memory size.
      ConditionalGrayscale<<<num_blocks, dim3(32, 32), 0, ws->stream()>>>(in,
                                   should_convert,
                                   H,
                                   W,
                                   out);
    }
}

DALI_REGISTER_OPERATOR(RandomGrayscale, RandomGrayscale<GPUBackend>, GPU);

}  // namespace other_ns
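
The index arithmetic for a one-thread-per-pixel launch with 32x32 blocks can be checked on the host. This sketch uses hypothetical helper names and plain C++ in place of the CUDA built-ins (`blockIdx`, `blockDim`, `threadIdx`); it illustrates why the block count should be rounded up and why threads past the last pixel need a bounds check.

```cpp
#include <cassert>

constexpr int kBlockDimX = 32;
constexpr int kBlockDimY = 32;
constexpr int kThreadsPerBlock = kBlockDimX * kBlockDimY;  // 1024

// Number of blocks needed so that every pixel gets a thread (ceil division).
int num_blocks(int H, int W) {
  assert(H >= 0 && W >= 0);
  return (H * W + kThreadsPerBlock - 1) / kThreadsPerBlock;
}

// Flat pixel index for a block and thread position, mirroring
// blockIdx.x * (blockDim.x * blockDim.y) + threadIdx.y * blockDim.x + threadIdx.x.
int pixel_index(int block, int thread_y, int thread_x) {
  return block * kThreadsPerBlock + thread_y * kBlockDimX + thread_x;
}

// Emulate the whole launch and count the threads that land on a valid pixel.
// Every thread has a distinct index, so the count equals H * W exactly when
// all pixels are covered.
int count_covered(int H, int W) {
  int covered = 0;
  for (int b = 0; b < num_blocks(H, W); ++b)
    for (int ty = 0; ty < kBlockDimY; ++ty)
      for (int tx = 0; tx < kBlockDimX; ++tx)
        if (pixel_index(b, ty, tx) < H * W) ++covered;  // bounds check
  return covered;
}
```

For a 33x40 image (1320 pixels), truncating division would give a single block and leave 296 pixels unprocessed, while the rounded-up count gives two blocks.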

Kh4L on 26 Feb 2019

+1 to have this baked into DALI. I tried the route of muxing, but that doesn't quite work because we end up with [1, H, W] for some images and [3, H, W] for others. Not sure if it is possible, but something that wraps any DALI op with an RNG (like in torchvision.transforms), e.g.:

transforms.RandomApply([transforms.ColorJitter(0.8, 0.8, 0.2)], p=0.8)

would be super useful. That way different operators don't need to directly have the RNG as input.
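
The wrapper idea can be sketched independently of any framework. The following is plain C++ with a hypothetical name (`random_apply`); it is not an existing DALI or torchvision API, only an illustration of keeping the RNG in the wrapper instead of in each operator.

```cpp
#include <cassert>
#include <functional>
#include <random>

// Hypothetical wrapper: returns a callable that applies `op` with probability
// p and otherwise returns its input unchanged. The RNG state lives inside the
// returned callable, so `op` itself stays deterministic.
template <typename T>
std::function<T(const T &)> random_apply(std::function<T(const T &)> op,
                                         double p, unsigned seed) {
  assert(p >= 0.0 && p <= 1.0);
  std::mt19937 rng(seed);
  std::bernoulli_distribution coin(p);
  // 'mutable' lets each call advance the captured RNG state.
  return [op, rng, coin](const T &x) mutable -> T {
    return coin(rng) ? op(x) : x;
  };
}
```

The element type has to be given explicitly when passing a lambda, for example `random_apply<int>(f, 0.8, 42u)`.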

jramapuram on 28 Apr 2020

We will add it to our ToDo list.

JanuszL on 29 Apr 2020

👍2 🎉1

Hi,
A RandomGrayscale that takes RGB inputs and outputs three-channel images, with all channels equal to the converted gray value, can be obtained with the Hsv and CoinFlip operators.

The CoinFlip drives the saturation parameter, causing the images to be either desaturated (for value 0) or passed through with full saturation kept (for value 1). We do need to use Cast to convert the CoinFlip output to floats.
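
How a 0/1 saturation factor yields either a gray or an unchanged pixel can be seen in a simplified model. The sketch below is plain C++ and not DALI's Hsv implementation: it assumes saturation scales each channel's distance from a luma value (ITU-R 601 weights, also an assumption), and `apply_saturation` and `flip_to_saturation` are hypothetical names. Note that the coin flip gives the probability of keeping color, so a grayscale probability p corresponds to a coin flip probability of 1 - p.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Pixel = std::array<float, 3>;  // RGB in [0, 255]

// Simplified saturation model (assumption, not DALI's code):
//   out = gray + saturation * (channel - gray)
// saturation == 0 collapses all channels to gray; saturation == 1 is identity.
Pixel apply_saturation(const Pixel &rgb, float saturation) {
  assert(saturation >= 0.0f);
  const float gray = 0.299f * rgb[0] + 0.587f * rgb[1] + 0.114f * rgb[2];
  Pixel out;
  for (int c = 0; c < 3; ++c) out[c] = gray + saturation * (rgb[c] - gray);
  return out;
}

// An integer coin flip (0 or 1) has to become a float before it can be used
// as the saturation factor.
float flip_to_saturation(int flip) {
  return static_cast<float>(flip);
}
```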

I extended Hsv notebook with such example in #1962.

@jramapuram, @grez72 the PyTorch operator also works for 1-channel images by passing them through. Do you have datasets or use cases where 3-channel RGB images and 1-channel gray images can appear together in one batch? I'm not sure whether any of the operators following such mixed output can cope with it.

klecki on 13 May 2020

@klecki : this is perfect for my use-case. I always want 3 channel data. Thanks!

Trying to replicate SimCLR ( https://arxiv.org/abs/2002.05709 ) using DALI here. Just need GaussianBlur and I can do it all with DALI instead of PIL :)

jramapuram on 13 May 2020

@jramapuram Happy to help :)

The GaussianBlur is also on our roadmap, we're going to keep you posted.

klecki on 13 May 2020

🎉1

@jramapuram GaussianBlur for CPU should be available in our nightly and weekly builds (merged in #2038).
I am now working on the GPU variant.

klecki on 30 Jun 2020

🎉1

@klecki : Awesome! Thanks a lot. Should help speed things up quite a bit for my self-supervised projects.

jramapuram on 30 Jun 2020

0.23 has been released with the relevant functionality.

JanuszL on 15 Jul 2020