Rust: Bad codegen with explicitly nonzero trailing/leading zeros

Created on 14 Oct 2018  路  13Comments  路  Source: rust-lang/rust

The compiler should be able to remove intrinsics::cttz's test when the input is explicitly nonzero.

https://play.rust-lang.org/?version=nightly&mode=release&edition=2015&gist=3d298b145a80d2a7584048de73ffc94c

if x == 0 { 0 } else { cttz(x) }

The cttz version produces three test instructions while cttz_nonzero produces one.

The LLVM IR show that cttz/trailing_zeros use hardcoded is_zero_undef values.
call i32 @llvm.cttz.i32(i32 %x, i1 false)

(The same is true of leading_zeros/trailing_zeros/ctlz/ctlz_nonzero)

A-LLVM I-slow

Most helpful comment

FWIW started working on this yesterday https://reviews.llvm.org/D60846

All 13 comments

LLVM patch to fix this: https://reviews.llvm.org/D55786

Verified fixed by LLVM upgrade.

This is not yet fixed (or regressed?) for u64: https://rust.godbolt.org/z/qZ9E0u

pub unsafe fn fn_cttz(x: u64) -> u32 {
    if x == 0 { 0 } else { cttz(x) as u32 }
}

Still leads to:

example::fn_cttz:
        test    rdi, rdi
        je      .LBB1_1
        je      .LBB1_3
        bsf     rax, rdi
        ret
.LBB1_1:
        xor     eax, eax
        ret
.LBB1_3:
        mov     eax, 64
        ret

Can we reopen this?

In your example the additional truncation prevents conversion into a select, leaving an explicit branch on zero. It looks like isKnownNonZero() only checks dominating conditions for pointer types, it does not recognize that the cttz argument cannot be zero.

For now, you can work around this by returning a u64 instead of a u32.

The problem is that u64::trailing_zeros returns a u32. And that's the only stable interface to this feature right now :/

@LukasKalbertodt Using x.trailing_zeros() as u64 and returning that should do the trick.

IR test case:

declare i64 @llvm.cttz.i64(i64, i1)

define i32 @test(i64 %x) {
start:
  %c = icmp eq i64 %x, 0
  br i1 %c, label %exit, label %non_zero

non_zero:
  %ctz = call i64 @llvm.cttz.i64(i64 %x, i1 false)
  %ctz32 = trunc i64 %ctz to i32
  br label %exit

exit:
  %res = phi i32 [ %ctz32, %non_zero ], [ 0, %start ]
  ret i32 %res
}

If someone is interested in tackling this issue:

The existing fold for handling a known non-zero argument to cttz is here: https://github.com/llvm-mirror/llvm/blob/581b638b1686c57586b44424d1a1fb0649315ce5/lib/Transforms/InstCombine/InstCombineCalls.cpp#L1329-L1339

The isKnownNonZero() function calls isKnownNullFromDominatingCondition() here: https://github.com/llvm-mirror/llvm/blob/581b638b1686c57586b44424d1a1fb0649315ce5/lib/Analysis/ValueTracking.cpp#L2069 This function is capable of determining that %x is non-zero in the above case. However, it is currently only used for pointer types.

Fixing this is in principle as easy as moving this check out of the isPointerTy() branch. However, there is a tricky compile-time vs optimization quality tradeoff here, because this is a potentially expensive operation. It might be that reviewers will prefer to handle this in some other way.

Edit: If this approach doesn't pan out, then an alternative would be to add this to the CorrelatedValuePropagation pass, by checking if the ConstantRange of the argument doesn't contain zero.

isKnownNonNullFromDominatingCondition does contain some code that checks that the argument is used with a nonnull attribute. I think we could split out the section that checks for icmp eq <ty> <val>, 0 which is also then used in isKnownNonNullFromDominatingCondition. Then we could use something like.

if (V->getType()->isPointerTy()) {
  ...
} else {
  newFunction(...)
}

FWIW started working on this yesterday https://reviews.llvm.org/D60846

This has now landed as https://reviews.llvm.org/D69571.

seems like this is not fixed.
see https://rust.godbolt.org/z/D_z3TX

@andjo403 It's fixed in LLVM 10, while Rust is still on LLVM 9.

think this can be closed now see https://rust.godbolt.org/z/D_z3TX where all versions generate the same instructions

Was this page helpful?
0 / 5 - 0 ratings