(This is most probably an LLVM feature request, as clang has the same issue, but I couldn't figure out how to submit a bug report there, so here goes.)
The functions a and b below are equivalent, since arithmetic and logic right shift are the same for non-negative integers. The generated code for a is not as good.
pub fn a(i: i32) -> i32 {
if i >= 0 { ((i as u32) >> 8) as i32 } else { i >> 8 }
}
pub fn b(i: i32) -> i32 {
i >> 8
}
a:
mov ecx, edi
shr ecx, 8
mov eax, edi
sar eax, 8
test edi, edi
cmovns eax, ecx
ret
b:
sar edi, 8
mov eax, edi
ret
In what context did you originally run into this problem?
In the fixed crate for fixed-point numbers, I have conversions between different fixed-point types. For example a fixed-point number with inner i32 and 16 fractional bits can be converted to a fixed-point number with inner u32 and 8 fractional bits, which should be a simple sar for a wrapping conversion.
For this case (and for similar cases), when the i32 was positive, I was just calling the u32 → u32 function, but that uses shr while the other code path (negative i32) was using sar. I looked at the assembly to see that the function inlining and optimization were reducing the conversion to something reasonable, and found this, so I changed my code to use sar in both cases to produce better code.
Implemented in LLVM 9.
For reference: https://reviews.llvm.org/D64285 (This won't be part of https://github.com/rust-lang/rust/pull/62592 though. It will after all.)
Verified that both functions generate the same code on current nightly, so closing this.
Do we have regression tests for this?
@lzutao There are extensive regression tests for this in LLVM already. It doesn't make much sense to include tests of LLVM optimization in Rust, unless they are Rust-specific in some way.
Nice. Thanks for the info.
Most helpful comment
Implemented in LLVM 9.