If I understand correctly, TensorRT and TVM both aim to accelerate prediction.
TensorRT optimizes prediction on GPU, while TVM optimizes prediction on almost all platforms, including GPU, ARM, and mobile.
Is there a comparison between the two on GPU?
So far, TVM does not yet optimize for int8, which TensorRT is optimized for. There is some ongoing effort on this, though. So the answer is that TensorRT is currently faster, and we keep improving TVM to cover the optimizations used in TensorRT on all platforms.
Thanks for your answer.
So basically TVM will be a generic TensorRT. Looking forward to seeing the new version.
Hi @tqchen, I'm curious whether there is any updated perspective on TVM vs. TensorRT. Also, how does ONNX relate to this project? Does it replace the need for an open exchange format?
Some updates: https://tvm.apache.org/2019/04/29/opt-cuda-quantized
Please feel free to follow up on https://discuss.tvm.ai/
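
For readers unfamiliar with the int8 optimization mentioned in this thread, here is a minimal sketch of the general idea, assuming simple symmetric per-tensor quantization. It is plain Python for illustration only and is not how TVM or TensorRT implement it; the function names `quantize_int8`, `dequantize`, and `int8_dot` are hypothetical and do not come from either library.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale.

    Assumption: symmetric per-tensor quantization, chosen here only
    because it is the simplest scheme to illustrate.
    """
    max_abs = max(abs(v) for v in values)
    # Avoid division by zero for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the int8 representation."""
    return [q * scale for q in quantized]


def int8_dot(qa, scale_a, qb, scale_b):
    """Dot product computed on integers, rescaled to float once at the end."""
    acc = sum(x * y for x, y in zip(qa, qb))  # integer accumulation
    return acc * scale_a * scale_b


if __name__ == "__main__":
    a = [0.5, -1.0, 0.25, 0.75]
    b = [1.0, 0.5, -0.5, 0.25]
    qa, sa = quantize_int8(a)
    qb, sb = quantize_int8(b)
    print("float dot:", sum(x * y for x, y in zip(a, b)))
    print("int8 dot: ", int8_dot(qa, sa, qb, sb))
```

The speedup on GPU comes from doing the multiply-accumulate work on small integers instead of 32-bit floats, at the cost of a small rounding error per element (at most half the scale in this sketch).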