One: [luci-interpreter] Support channel wise quantization

Created on 27 Oct 2020 · 9 Comments · Source: Samsung/ONE

The luci interpreter needs to support channel-wise 16-bit quantized operators, for testing purposes.

Operator | Status
--- | ---
depthwise_conv | done #4940
conv | done #4917
prelu | done #4948
add | not needed for now
transpose_conv | done #4941

+cc @jinevening

binarman

🎉1

All 9 comments

@jinevening
Hi!
I want to clarify the channel-wise quantization schema and some general quantization rules:
1) The quantizer processes weights, but not activations (except for add).
2) Bias should be quantized with scale == input_scale * weights_scale (I am asking because the interpreter does not check this for now, but the code assumes it is true).

Are these statements right?
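
To make statement 2 concrete, here is a minimal sketch in plain Python. It is not taken from luci-interpreter: the helper names `bias_scales` and `quantize_bias` and all numeric values are made up for illustration, and it assumes symmetric quantization with one weight scale per output channel. The reasoning is that the integer accumulator sums products of quantized inputs and quantized weights, so each accumulated unit is worth input_scale * weights_scale[c]; a bias added directly to that accumulator has to use the same scale.

```python
def bias_scales(input_scale, weight_scales):
    """Return one bias scale per output channel: input_scale * weight_scale[c]."""
    return [input_scale * w for w in weight_scales]


def quantize_bias(bias, input_scale, weight_scales):
    """Quantize a float bias vector so it can be added to the integer accumulator."""
    scales = bias_scales(input_scale, weight_scales)
    return [round(b / s) for b, s in zip(bias, scales)]


# Illustrative values only (powers of two keep the arithmetic exact).
input_scale = 0.5
weight_scales = [0.25, 0.125, 0.5]   # one scale per output channel
bias = [1.0, -0.5, 2.0]

scales = bias_scales(input_scale, weight_scales)
q_bias = quantize_bias(bias, input_scale, weight_scales)
print(scales)   # [0.125, 0.0625, 0.25]
print(q_bias)   # [8, -8, 8]
```

A check of this assumption in an interpreter would compare the stored bias scale for each channel against the product of the input scale and that channel's weight scale.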

binarman on 4 Nov 2020

I also tried to create a simple quantized model, but for some reason it didn't work.
Maybe you can tell me what I am doing wrong?

What I did:

Generate the tflite model (I used tf 2.3.1):

```python
import tensorflow as tf
import numpy as np

# 1x1 convolution kernel in HWIO layout: 2 input channels, 3 output channels
kernel = tf.constant(np.array([[[[1,2,3],[4,5,6]]]]), dtype=tf.float32)
func = tf.function(lambda in_: tf.nn.conv2d(in_, kernel, strides=(1,1,1,1), padding="VALID"))
# Sample NHWC input, used only to trace the concrete function
data = np.ones((1,1,1,2), dtype=np.float32)

# Convert the traced function to a TFLite flatbuffer
converter = tf.lite.TFLiteConverter.from_concrete_functions([func.get_concrete_function(data)])
converter.experimental_new_converter = True
model = converter.convert()

with open("conv.tflite", "wb") as f:
  f.write(model)
```

Generate the input dataset:

```python
import h5py
import numpy as np

# Two sample tensors matching the model input shape (1,1,1,2)
input1_data = np.array([[[[1,2]]]], dtype=np.float32)
input2_data = np.array([[[[3,4]]]], dtype=np.float32)

# Each group under "value" holds one dataset named "0"
with h5py.File("dataset.hdf5", "w") as f:
  top_group = f.create_group("value")
  input1_group = top_group.create_group("0")
  dset = input1_group.create_dataset("0", data=input1_data)
  input2_group = top_group.create_group("1")
  dset = input2_group.create_dataset("0", data=input2_data)
```

Commands I used in an attempt to quantize the model:

```shell
$ tflite2circle conv.tflite conv.circle
$ record-minmax --input_model conv.circle --input_data dataset.hdf5 --output_model analyzed.circle
$ circle-quantizer --quantize_dequantize_weights float32 int16 channel analyzed.circle quantized_weights.circle
$ circle-quantizer --quantize_with_minmax float32 int16 channel quantized_weights.circle quantized_activations.circle
```

This sequence causes a segfault at compiler/luci/pass/src/QuantizeWithMinMaxPass.cpp:483.
It tries to access the quantparam of a CircleInput node, but the node does not have one.

binarman on 4 Nov 2020

The sequence needs to be changed slightly. Please swap record-minmax and circle-quantizer --quantize_dequantize_weights, as below.

Note that --quantize_dequantize_weights is just fake quantization (it quantizes the weights and then dequantizes them back to fp32). Quantization of weights, activations, and bias is all done in --quantize_with_minmax.

```shell
$ tflite2circle conv.tflite conv.circle
$ circle-quantizer --quantize_dequantize_weights float32 int16 channel conv.circle conv.fake_quant.circle
$ record-minmax --input_model conv.fake_quant.circle --input_data dataset.hdf5 --output_model conv.minmax_recorded.circle
$ circle-quantizer --quantize_with_minmax float32 int16 channel conv.minmax_recorded.circle conv.quantized.circle
```
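
The fake quantization mentioned in the note above can be illustrated with a small, dependency-free Python sketch. This is not circle-quantizer's code: the function name `fake_quantize_channel` is hypothetical, and the scheme used here (symmetric int16, zero point 0, scale = max|w| / 32767 computed separately for each output channel) is an assumption chosen for illustration. The channel values are the three output channels of the kernel from the earlier TensorFlow snippet.

```python
INT16_MAX = 32767  # assumed symmetric range [-32767, 32767]


def fake_quantize_channel(weights):
    """Quantize one channel to int16 and dequantize back to float.

    Returns (fake_quantized_weights, scale).
    """
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        # All-zero channel: nothing to quantize, use a dummy scale.
        return list(weights), 1.0
    scale = max_abs / INT16_MAX
    result = []
    for w in weights:
        q = round(w / scale)                      # float -> integer grid
        q = max(-INT16_MAX, min(INT16_MAX, q))    # clamp to the int16 range
        result.append(q * scale)                  # integer grid -> float
    return result, scale


# Output channels of the 1x1 kernel [[1,2,3],[4,5,6]] (last axis = output channel).
channels = [[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]]
results = [fake_quantize_channel(c) for c in channels]

for original, (fq, scale) in zip(channels, results):
    print(original, "->", fq, "scale:", scale)
```

The weights stay in fp32 after this step but sit on the int16 grid, so the min/max values recorded afterwards reflect the rounding error that real quantization will introduce.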

jinevening on 4 Nov 2020

❤1

@binarman, FYI, you can see the flow in compiler/one-cmds/one-quantize

seanshpark on 4 Nov 2020

👍1

PRelu and Add behave the same way in LWQ (layer-wise quantization) and CWQ (channel-wise quantization), at least for now, so I think this job is done. 👍

jinevening on 9 Nov 2020

I've checked luci-interpreter can run the int16-quantized version of our target model after applying #5051.

jinevening on 13 Nov 2020

❤1 👍1

@jinevening
s16 prelu is merged, thank you!

Before closing this issue I have a question:
Do we have any plans related to uint8 CWQ operators in the near future?

binarman on 17 Dec 2020

> Do we have any plans related to uint8 CWQ operators in the near future?

We need uint8 CWQ operators to test a model quantized by circle-quantizer. There exist uint8 CWQ kernels for some operators (https://github.com/Samsung/ONE/issues/5049). PRelu was not in the list, so it will be the next target.

jinevening on 18 Dec 2020

Thank you!

Closing this issue.

binarman on 18 Dec 2020
