TVM: Adding performance regression test in CI

Created on 9 May 2019 · 12 Comments · Source: apache/tvm

Currently we don't have a perf regression test in CI, so merging a PR cannot guarantee that performance stays as fast as before. This situation has happened many times:

https://discuss.tvm.ai/t/solved-relay-x86-target-performance-regression/2266
https://github.com/dmlc/tvm/issues/3088

I suggest adding a performance regression test to CI; once performance changes significantly, the PR should be blocked.
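
As a rough illustration of such a gate, here is a minimal sketch. It assumes that latencies for a baseline and for the PR have already been measured; the function name `is_regression` and the 10% tolerance are hypothetical choices, not something specified in this thread.

```python
from statistics import median

def is_regression(baseline_ms, candidate_ms, tolerance=0.10):
    """Return True if the candidate's median latency exceeds the
    baseline's median latency by more than `tolerance` (a fraction)."""
    base = median(baseline_ms)
    cand = median(candidate_ms)
    return cand > base * (1.0 + tolerance)

# Made-up timings (milliseconds) for a baseline commit and a PR build.
baseline = [10.1, 10.0, 10.3, 9.9, 10.2]
candidate = [11.8, 12.0, 11.9, 12.1, 11.7]

if is_regression(baseline, candidate):
    print("performance regression detected: block the PR")
```

Using the median of several runs rather than a single measurement makes the check less sensitive to noise on a shared CI machine.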

antinucleon

👍10 ❤1

All 12 comments

+1
However, a performance regression test per PR might take too long, especially for end-to-end model tests on every target device. We need to figure out a balanced solution.

kevinthesun on 9 May 2019

+1. A nightly benchmark might be good enough for us to track any regressions/improvements. Apache Lucene has a pretty good example.

wweic on 9 May 2019

> +1
> However, performance regression test per PR might take too long, especially for model end to end test on every target device. We need to figure out a balanced solution.

I think we don't need to test all networks. A ResNet-18/MobileNet and maybe a simple RNN are able to expose most of the problems.

antinucleon on 9 May 2019

Or we can do nightly benchmarking on a series of commits. Authors of commits that cause a significant performance regression would then be notified so they can identify the issue.
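
A minimal sketch of that idea, assuming the nightly job has already produced one latency number per commit. The function name `find_regressions`, the commit hashes, the author names, and the 10% tolerance are all hypothetical.

```python
def find_regressions(history, tolerance=0.10):
    """history: list of (commit, author, latency_ms), oldest first.

    Returns (commit, author, slowdown) for every commit whose latency
    is more than `tolerance` higher than the previous commit's latency.
    """
    flagged = []
    for (_, _, prev_ms), (commit, author, cur_ms) in zip(history, history[1:]):
        slowdown = cur_ms / prev_ms - 1.0
        if slowdown > tolerance:
            flagged.append((commit, author, slowdown))
    return flagged

# Made-up nightly results for four consecutive commits.
history = [
    ("a1b2c3", "alice", 10.0),
    ("d4e5f6", "bob", 10.2),
    ("0718aa", "carol", 12.5),
    ("9c0ffe", "dave", 12.4),
]

for commit, author, slowdown in find_regressions(history):
    print(f"notify {author}: commit {commit} is {slowdown:.0%} slower")
```

Comparing each commit only with its predecessor keeps the sketch short; a real setup would probably compare against a longer running baseline to smooth out noise.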

zhiics on 9 May 2019

👍1

I think the Rust project has a decent approach to this: https://perf.rust-lang.org/

They visualize the performance of the compiler, the generated code, etc., nightly.

One worry I have is that CI is already very slow and often becomes congested when lots of people are working on open PRs.

jroesch on 9 May 2019

👍1

Some nightly infra is a better option so it won’t block the current CI

tqchen on 9 May 2019

👍3

+1 to doing performance testing nightly (separate from CI). Moreover, we can do more important/shorter perf tests nightly and a more comprehensive set of perf tests weekly.
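
One way to picture the nightly/weekly split is a small scheduling helper like the sketch below. The suite contents, the function name `select_suite`, and the choice of Sunday for the weekly run are hypothetical placeholders.

```python
import datetime

# Hypothetical suite contents; the model names are placeholders.
NIGHTLY_SUITE = ["resnet-18", "mobilenet"]
WEEKLY_EXTRA = ["simple-rnn", "full-model-zoo"]

def select_suite(day):
    """Return the list of benchmarks to run on `day` (a datetime.date)."""
    suite = list(NIGHTLY_SUITE)
    if day.weekday() == 6:  # Sunday: also run the comprehensive weekly set
        suite += WEEKLY_EXTRA
    return suite

print(select_suite(datetime.date(2019, 5, 10)))  # a Friday
print(select_suite(datetime.date(2019, 5, 12)))  # a Sunday
```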

derisavi on 10 May 2019

Some thoughts as I read through this thread. A big +1 for doing joined-up nightly performance testing on what we care about and figuring out ways of doing this.

Another option for visualizing performance, and something I've used in teams I've worked in on other projects, is LNT; two interesting links for performance monitoring are below. LNT is also interesting in that it can collect perf profiles (https://llvm.org/docs/lnt/profiles.html) and can help visualize differences in performance between perf profiles in terms of the actual assembly. I've found this pretty useful in terms of productivity.

http://lnt.llvm.org/ - monitoring llvm performance.
https://lnt.opensuse.org/ - monitoring GCC performance over time.

TBH, the Rust web interface looks good. What would be good to check is:
a. What are the dependencies on the target side?
b. Whether the visualization can be separated from the data collection and from the databases in which we store the performance data.

In terms of process, performance testing separate from CI is possibly a good way to bootstrap, but having a route to run performance tests after basic correctness checks sounds like a useful step, as it would help detect performance issues pre-commit. The question is also which TVM scenarios we are interested in monitoring regularly, because multiple use cases are likely to emerge.

My 2 cents on a Friday evening.

regards
Ramana

u99127 on 19 Jul 2019

@ZihengJiang has volunteered to explore a nightly pipeline infra that allows us to use Jenkins to commit the new logs, and we can figure out viz using JavaScript (perhaps something like vega-lite) and GitHub Pages.

I think we should separate the concerns of logging and viz, i.e. make it easy for anyone to create logs of interest to them in a unified format, and then have modular viz to visualize the data. Also, @slyubomirsky has already set up something for Relay nightly that perhaps can be shared.

The least uncertain part of the configuration was viz and the html side actually.

tqchen on 19 Jul 2019

Sounds good - thanks a lot for that update @tqchen - Agreed that we should separate logging from visualization, as I expect we will want to experiment with visualization techniques before getting to one that we like.

u99127 on 19 Jul 2019

Here is a possible pipeline that needs Jenkins and GitHub:

1. Pull the latest repo.
2. Run nightly CI on dedicated resources (so there won't be interference).
3. Provide an API to log the metrics to files (say tvm.testing.log_data("category", "accuracy", value)).
4. Use the Jenkins mechanism to push the generated log file to a GitHub repo, on the website branch.
5. The viz will be updated automatically on the GitHub Pages website.
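
The logging step could look roughly like the sketch below. It is a standalone, hypothetical `log_data` helper, not an existing TVM API: it takes an explicit file path in addition to the arguments proposed above, and the JSON-lines format, field names, and sample values are assumptions made for illustration.

```python
import json
import os
import tempfile
import time

def log_data(log_path, category, metric, value):
    """Append one metric record to a JSON-lines log file."""
    record = {
        "timestamp": time.time(),
        "category": category,
        "metric": metric,
        "value": value,
    }
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

def load_records(log_path):
    """Read all records back, e.g. on the visualization side."""
    with open(log_path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

# Write two made-up records to a temporary file and read them back.
log_path = os.path.join(tempfile.mkdtemp(), "nightly.jsonl")
log_data(log_path, "resnet-18", "latency_ms", 12.3)
log_data(log_path, "resnet-18", "accuracy", 0.69)
records = load_records(log_path)
print(len(records), "records logged")
```

Keeping the log as plain append-only text means the nightly job can push it to a repo as-is, and any visualization only needs to read the file, which keeps logging and viz separate.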

tqchen on 19 Jul 2019

Just chiming in to see if there is any progress on the perf regression test.

yongwww on 2 Oct 2019
