The compiler should be able to remove intrinsics::cttz's test when the input is explicitly nonzero.
if x == 0 { 0 } else { cttz(x) }
The cttz version produces three test instructions while cttz_nonzero produces one.
The LLVM IR show that cttz/trailing_zeros use hardcoded is_zero_undef values.
call i32 @llvm.cttz.i32(i32 %x, i1 false)
(The same is true of leading_zeros/trailing_zeros/ctlz/ctlz_nonzero)
LLVM patch to fix this: https://reviews.llvm.org/D55786
Verified fixed by LLVM upgrade.
This is not yet fixed (or regressed?) for u64: https://rust.godbolt.org/z/qZ9E0u
pub unsafe fn fn_cttz(x: u64) -> u32 {
if x == 0 { 0 } else { cttz(x) as u32 }
}
Still leads to:
example::fn_cttz:
test rdi, rdi
je .LBB1_1
je .LBB1_3
bsf rax, rdi
ret
.LBB1_1:
xor eax, eax
ret
.LBB1_3:
mov eax, 64
ret
Can we reopen this?
In your example the additional truncation prevents conversion into a select, leaving an explicit branch on zero. It looks like isKnownNonZero() only checks dominating conditions for pointer types, it does not recognize that the cttz argument cannot be zero.
For now, you can work around this by returning a u64 instead of a u32.
The problem is that u64::trailing_zeros returns a u32. And that's the only stable interface to this feature right now :/
@LukasKalbertodt Using x.trailing_zeros() as u64 and returning that should do the trick.
IR test case:
declare i64 @llvm.cttz.i64(i64, i1)
define i32 @test(i64 %x) {
start:
%c = icmp eq i64 %x, 0
br i1 %c, label %exit, label %non_zero
non_zero:
%ctz = call i64 @llvm.cttz.i64(i64 %x, i1 false)
%ctz32 = trunc i64 %ctz to i32
br label %exit
exit:
%res = phi i32 [ %ctz32, %non_zero ], [ 0, %start ]
ret i32 %res
}
If someone is interested in tackling this issue:
The existing fold for handling a known non-zero argument to cttz is here: https://github.com/llvm-mirror/llvm/blob/581b638b1686c57586b44424d1a1fb0649315ce5/lib/Transforms/InstCombine/InstCombineCalls.cpp#L1329-L1339
The isKnownNonZero() function calls isKnownNullFromDominatingCondition() here: https://github.com/llvm-mirror/llvm/blob/581b638b1686c57586b44424d1a1fb0649315ce5/lib/Analysis/ValueTracking.cpp#L2069 This function is capable of determining that %x is non-zero in the above case. However, it is currently only used for pointer types.
Fixing this is in principle as easy as moving this check out of the isPointerTy() branch. However, there is a tricky compile-time vs optimization quality tradeoff here, because this is a potentially expensive operation. It might be that reviewers will prefer to handle this in some other way.
Edit: If this approach doesn't pan out, then an alternative would be to add this to the CorrelatedValuePropagation pass, by checking if the ConstantRange of the argument doesn't contain zero.
isKnownNonNullFromDominatingCondition does contain some code that checks that the argument is used with a nonnull attribute. I think we could split out the section that checks for icmp eq <ty> <val>, 0 which is also then used in isKnownNonNullFromDominatingCondition. Then we could use something like.
if (V->getType()->isPointerTy()) {
...
} else {
newFunction(...)
}
FWIW started working on this yesterday https://reviews.llvm.org/D60846
This has now landed as https://reviews.llvm.org/D69571.
seems like this is not fixed.
see https://rust.godbolt.org/z/D_z3TX
@andjo403 It's fixed in LLVM 10, while Rust is still on LLVM 9.
think this can be closed now see https://rust.godbolt.org/z/D_z3TX where all versions generate the same instructions
Most helpful comment
FWIW started working on this yesterday https://reviews.llvm.org/D60846