Glow: [Quantization] Leave the decision of forcing the output scale/offset to be the same as the input to backends.

Created on 28 Nov 2018 · 6 Comments · Source: pytorch/glow

In our previous quantization procedure, we forced the output of the following quantized node kinds to have the same quantization params as the input:

Kinded::Kind::LocalResponseNormalizationNodeKind                       
Kinded::Kind::SliceNodeKind                                      
Kinded::Kind::ReshapeNodeKind                                       
Kinded::Kind::TopKNodeKind                                        
Kinded::Kind::GatherNodeKind                                         
Kinded::Kind::MaxPoolNodeKind

However, as @tlepley-cadence suggested, this is generally an accuracy vs. speed tradeoff that should be left to the backend designer. We would like to evaluate those ops and leave some or all of the decision to backends.
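To make the proposal concrete, here is a minimal sketch of what "leaving the decision to backends" could look like. Everything in it is hypothetical and is not Glow's actual API: `NodeKind`, `QuantParams`, `Backend::forceSameInOutParams`, `AccuracyFirstBackend`, and `chooseOutputParams` are stand-in names invented for illustration. The default hook reproduces the list above, and a backend overrides it to trade speed for accuracy.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only; NOT Glow's real classes.
enum class NodeKind {
  LocalResponseNormalization, Slice, Reshape, TopK, Gather, MaxPool, AvgPool
};

struct QuantParams {
  float scale;
  int32_t offset;
};

struct Backend {
  virtual ~Backend() = default;

  // Hypothetical hook: should the output of `kind` reuse the input's
  // scale/offset? The default mirrors the node list in this issue.
  virtual bool forceSameInOutParams(NodeKind kind) const {
    switch (kind) {
    case NodeKind::LocalResponseNormalization:
    case NodeKind::Slice:
    case NodeKind::Reshape:
    case NodeKind::TopK:
    case NodeKind::Gather:
    case NodeKind::MaxPool:
      return true;
    default:
      return false;
    }
  }
};

// An assumed backend that only forces equal params for pure data movement.
struct AccuracyFirstBackend : Backend {
  bool forceSameInOutParams(NodeKind kind) const override {
    return kind == NodeKind::Reshape || kind == NodeKind::Slice;
  }
};

// Pick the output params: the input's if the backend forces equality,
// otherwise the ones derived from profiling the node's output.
QuantParams chooseOutputParams(const Backend &backend, NodeKind kind,
                               QuantParams input, QuantParams profiledOutput) {
  assert(input.scale > 0 && profiledOutput.scale > 0);
  return backend.forceSameInOutParams(kind) ? input : profiledOutput;
}
```

With this shape, the quantization pass would ask the target backend once per node instead of hard-coding the list, so the same graph could be quantized differently for different targets.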


beicy

👍1

Most helpful comment

Right now the constraints basically match whatever the interpreter does.
Going in the direction of more flexibility on that front probably means we need a legalization step to tweak these constraints to match the actual backend we are targeting.

qcolombet on 29 Nov 2018

👍3

All 6 comments

With the current approach of forcing the same scale/offset for both input and output at quantization time, we have sometimes seen this rule broken by later optimizations.

tlepley-cadence on 28 Nov 2018

@tlepley-cadence @beicy I think that it would be a good idea to write a short proposal for the desired semantics. We need to decide on a set of rules and make sure that the optimizer and backends conform to these rules. Thierry mentioned that it would be a good idea to allow nodes to have different input and output types in the early stages of the pipeline and let the backends force equal quantization parameters. There could be other options. Could you come up with a proposal for the preferred semantics in the compiler?

nadavrot on 28 Nov 2018

@nadavrot Agreed. I haven't yet checked all the nodes for which we currently force the input and output to have the same quantization type. But we discussed avgpool and maxpool before and think it is reasonable to let maxpool keep the same type, while avgpool may have a different type. I will investigate the rest of the nodes and come up with a proposal.

beicy on 28 Nov 2018

Checked the list of nodes. For the nodes in this list:

Kinded::Kind::LocalResponseNormalizationNodeKind                       
Kinded::Kind::SigmoidNodeKind                                          
Kinded::Kind::SliceNodeKind                                      
Kinded::Kind::ReshapeNodeKind                                       
Kinded::Kind::TanhNodeKind                                       
Kinded::Kind::TopKNodeKind                                        
Kinded::Kind::GatherNodeKind                                         
Kinded::Kind::MaxPoolNodeKind

The output range is included in the input range. Therefore, I don't think there is any accuracy loss if we force the output to have the same quantization params as the input. On the other hand, I don't think using different params would gain us any accuracy or performance.
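A small worked example may help illustrate the range argument for a pure selection op such as MaxPool. This is a sketch under assumptions, not Glow's implementation: `QuantParams`, `quantize`, `dequantize`, and `maxPoolQuantized` are hypothetical helpers, and the quantization scheme is assumed to be plain int8 affine (`q = round(x / scale) + offset`, clamped to [-128, 127]). Because that mapping is monotonically non-decreasing, taking the max of the quantized values gives the same result as quantizing the max of the float values with the input's params, so reusing the input's scale/offset on the output adds no extra error.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helpers for illustration; not Glow's real API.
struct QuantParams {
  float scale;
  int32_t offset;
};

// Assumed int8 affine quantization: q = round(x / scale) + offset, clamped.
int8_t quantize(float x, QuantParams p) {
  assert(p.scale > 0);
  long q = std::lround(x / p.scale) + p.offset;
  q = std::max<long>(-128, std::min<long>(127, q));
  return static_cast<int8_t>(q);
}

float dequantize(int8_t q, QuantParams p) {
  return p.scale * static_cast<float>(q - p.offset);
}

// Max pooling over one window, done directly on the quantized values.
// The result is one of the inputs, so it is already expressed in the
// input's scale/offset and needs no rescaling.
int8_t maxPoolQuantized(const std::vector<int8_t> &window) {
  assert(!window.empty());
  return *std::max_element(window.begin(), window.end());
}
```

For instance, with `scale = 0.25` and `offset = 0`, the window `{-1.0, 0.25, 0.75, 0.5}` quantizes to `{-4, 1, 3, 2}`; the quantized max is `3`, which is exactly what quantizing the float max `0.75` produces. An averaging op does not have this property, since its result is generally not one of its inputs, which is consistent with treating avgpool differently earlier in this thread.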

Therefore, for the nodes in the list, I think it is OK to leave them as they are. For the remaining nodes, which are not in the list, the backends can decide whether or not to force the same quantization params. But this won't affect our current design.

beicy on 4 Dec 2018

Based on my previous comment, I think it is OK to close this issue. Please let me know if you have any suggestions! Thanks! @tlepley-cadence @nadavrot @qcolombet @rdzhabarov

beicy on 5 Dec 2018
