Glow: Debug mode interpreter is really slow

Created on 14 Nov 2018 · 6 Comments · Source: pytorch/glow

I don't know what we could do about this, but running resnet50 in Debug mode with the interpreter is just way too slow. We could just say "don't do that", but since Debug mode is extremely useful for development I keep wishing it were faster. I assume the problem is all the range checks on .at() during the convolutions but I don't know how to sidestep it while still providing some safety.

wishlist

All 6 comments

How do you feel about adding an unsafe_at() method and using it in the convolution unit tests?

Also, we could remove the second assert inside at() because the first one does this check internally.
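
To make the unsafe_at() idea concrete, here is a minimal sketch. It assumes a simplified 2-D handle: ToyHandle2D and sumAll are hypothetical names for illustration and are not Glow's actual Handle class or API. The checked at() keeps a single bounds assert, while unsafe_at() skips it for loops whose bounds already guarantee valid indices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2-D handle for illustration only, NOT Glow's actual Handle.
class ToyHandle2D {
public:
  ToyHandle2D(std::size_t rows, std::size_t cols)
      : rows_(rows), cols_(cols), data_(rows * cols, 0.0f) {}

  // Checked accessor: one bounds assert, compiled out when NDEBUG is defined.
  float &at(std::size_t r, std::size_t c) {
    assert(r < rows_ && c < cols_ && "index out of bounds");
    return data_[r * cols_ + c];
  }

  // Unchecked accessor: the caller must guarantee the indices are in range.
  float &unsafe_at(std::size_t r, std::size_t c) {
    return data_[r * cols_ + c];
  }

  std::size_t rows() const { return rows_; }
  std::size_t cols() const { return cols_; }

private:
  std::size_t rows_;
  std::size_t cols_;
  std::vector<float> data_;
};

// Hot loop: the loop bounds already keep the indices valid, so the
// per-element check in at() would add cost without adding safety here.
inline float sumAll(ToyHandle2D &h) {
  float total = 0.0f;
  for (std::size_t r = 0; r < h.rows(); ++r)
    for (std::size_t c = 0; c < h.cols(); ++c)
      total += h.unsafe_at(r, c);
  return total;
}
```

The trade-off is that any unsafe_at() call site has to justify its own bounds, so it probably belongs only in a few inner loops and in tests, with at() remaining the default.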

Both of those ideas sound good to me!

Assertions do slow you down, but even slower is not optimizing. The lack of inlining and register allocation at -O0 makes modern C++ seriously slow. When dealing with code that actually computes stuff, such as our interpreter, I will usually build with -O1 -g and rely on printf debugging as much as possible.

IIRC, LLVM's own "optimized with assertions" build mode is only ~20% slower than a real release build.

I had thought cmake's RelWithDebInfo would be this, but no, it just adds -g (and maybe drops from -O3 to -O2). I'm going to see about a custom mode that's Release without -DNDEBUG, and see how that fares.
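
When experimenting with a "Release without -DNDEBUG" configuration, it can help to confirm what a given build actually enabled. The sketch below is one way to check, using two hypothetical helpers (assertsEnabled and optimizerEnabled are illustrative names, not part of Glow). It relies on the standard NDEBUG macro and on __OPTIMIZE__, which GCC and Clang predefine when optimizing; other compilers may not define it.

```cpp
#include <cassert>

// Hypothetical helper: true when assert() is active in this translation unit,
// i.e. when NDEBUG was not defined on the command line.
constexpr bool assertsEnabled() {
#ifdef NDEBUG
  return false;
#else
  return true;
#endif
}

// Hypothetical helper: true when the compiler reports that it is optimizing.
// __OPTIMIZE__ is predefined by GCC and Clang at -O1 and above; this is an
// assumption about the toolchain and may not hold for other compilers.
constexpr bool optimizerEnabled() {
#ifdef __OPTIMIZE__
  return true;
#else
  return false;
#endif
}
```

In the "optimized with assertions" configuration discussed here, both helpers would be expected to return true.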

FWIW, I don't think there is much of a difference between -O1 and -O2 with clang. They might actually be identical. -O3 does more aggressive inlining.

I looked into this with #2057, but there was a fair amount of (reasonable) resistance to dirtying the code base to improve debug-mode interpreter performance. I still think it's pretty useful, but maybe not for master, so I've left the branch here: https://github.com/bertmaher/glow/tree/unsafe_at.
