While fixing various bindgen bugs related to long double, int128, (https://github.com/rust-lang-nursery/rust-bindgen/issues/1370, etc).
I realized that the following Rust program, in my x86_64 Linux machine:
#[repr(C)]
struct Foo {
f: u128,
}
fn main() {
println!("Align: {}", ::std::mem::align_of::<Foo>());
}
Prints 8.
While the following C program:
#include <stdio.h>
struct foo {
unsigned __int128 t;
};
int main() {
printf("Align: %ld\n", _Alignof(struct foo));
}
Prints 16 on the same system. This is pretty unexpected, and means that i128 / u128 are not really usable for FFI / alignment purposes.
I think the correct fix would be to add long_double type to libc with the correct alignment.
I guess that's probably not hard? I could have a go at that if it sounds like the right thing to do.
This is a bug in llvm. There was an attempt to fix this in the past, but it got reverted because of reasons.
The sysV __int128 should be 16-byte aligned and rust follows that ABI.
Rather, this is blocked on LLVM data-layouts getting fixed, because rustc expects them to be matched last time I checked.
Ugh, thanks @nagisa... I _think_ this is also a problem for long double, reading that LLVM patch they'd align it to 64 bits. Sigh.
#include <stdio.h>
struct foo {
long double bar;
};
int main() {
printf("%ld\n", _Alignof(struct foo)); // 16
}
Copying from the last thread, since this thread seems to mostly be about long double.
In C and C++, on x86_64 Linux, alignof(__int128) is equal to 16. However, in Rust, align_of::
C++: https://gcc.godbolt.org/z/YAq1XC
Rust: https://gcc.godbolt.org/z/QvZeqK
This will cause subtle UB if you ever try to use i128s across FFI boundaries; for example:
C++: https://gcc.godbolt.org/z/PrtHlp
Rust: https://gcc.godbolt.org/z/_SdVqD
The improper_ctypes warning diagnoses usage of these types in C FFI, so at least users are alerted that this does not work. Playground
@centril @eddyb this should be I-unsound.
Why that? Rust itself is sound. The types are not FFI-safe, but FFI needs unsafe code.
Using i128 and u128 on FFI is sound. It currently isn't due to a bug on our end, therefore, this is IMO a soundness bug.
Using i128 and u128 on FFI is sound.
This can be generalized for any T. The complications arise with getting layout right on both sides of the FFI (more specifically it is sound to use repr(Rust) type as long as you manage to get layout of T right on the other side). Since only a few targets have a defined layout for 128-bit integer types and many don鈥檛 even have an implementation defined behaviour, it makes perfect sense to delegate them to the same level of FFI support as we do for other repr(Rust) types.
Just to be entirely clear: even if we fixed this particular issue for x86_64 SysV targets, it will only make u128 matching with other x86_64 SysV-abiding artifacts. For most other targets the fact that i/u128 ABI is not specified or is implementation-specific will remain. And so this type can never get any more guarantees than what it currently gets.
Since only a few targets have a defined layout for 128-bit integer types
AFAIK all 64-bit targets except for MSVC have a defined __int128. That's a lot of targets.
Since only a few targets have a defined layout for 128-bit integer types and many don鈥檛 even have an implementation defined behaviour, it makes perfect sense to delegate them to the same level of FFI support as we do for other repr(Rust) types.
That means that the layout of these types is unspecified, which prevents users from using i128 to interface with C, requiring us to add a libc::__int128 on those platforms that's incompatible with the Rust type, and requiring users to use that in C FFI.
What's the rationale for not providing a layout that's compatible with __int128 on those platforms?
For all other integer types and bool, we match the C layout of the platform when the platform has the integer type - the stdint.h header is optional, so not all platforms are required to have it.
EDIT: NVM, for the integer types, stdint.h spec matches our, so if it exists, ours match. For bool we do specify it as being equal to C's _Bool, iff the platform has a _Bool. We could do the same for i128.
What's the rationale for not providing a layout that's compatible with __int128 on those platforms?
Nobody argued that we should have a different layout. I agree using a different layout is a bug. I just don't agree it's a soundness bug.
u128/i128 are not FFI-safe when they should be. That's a bug. That's why this issue exists and is still open. We also don't claim that they are FFI safe, so it's not a soundness bug.
Ultimately where to draw the line for soundness bugs can be arbitrary though.
Is anyone working on fixing this?
Trying to understand why does clang and rust have different u128 types https://godbolt.org/z/WyzeRz
Trying to understand why does clang and rust have different u128 types
The reason is that u128 is not FFI safe, so you can't use it on FFI, and therefore it doesn't matter whether it has the same layout as some other clang type.
@gnzlbg but why isn't it FFI safe?
I would've thought they would use the same llvm type
@gnzlbg but why isn't it FFI safe?
See: https://github.com/rust-lang/rust/issues/54341#issuecomment-506719475
So which targets do have a defined call ABI for passing 128-bit integers? You said everything except the MSVC targets, at least. (EDIT: corrected.)
It was my understanding that this is blocked on LLVM bugs. Not sure if there is an LLVM issue tracking this. But now it sounds more like this is blocked on Rust needing a target-dependent notion of "FFI safe"?
So which targets do have a defined call ABI for passing 128-bit integers?
I don't know of any target that doesn't. I also can't imagine a target for which not having this documented on its ABI wouldn't be considered an ABI spec bug.
You said all the MSVC targets, at least.
I said that the msvc compiler is the only widely-used C compiler I know that does not have an __int128 type, but that's not the only C compiler for this target, and other compilers for this target like clang do have a __int128 type and support using it on the windows msvc targets (https://gcc.godbolt.org/z/vivo9o). I'd suspect this agrees with what GCC does.
If all these other compilers for this target agree, then there is an ABI for __int128 that everybody agrees on for this target. The fact that MSVC doesn't support it just means that MSVC is not able to export symbols or call other symbols that use this type, but that's it.
There might be targets without an ABI spec and/or without C compilers, in which case, we can try to agree with whatever is used on the target, but if that doesn't support 128-bit integers, does it even matter what we do?
Most 32 bit C toolchains don't have __int128 AFAIK. Definitely not 32 bit x86 Linux.
@rkruppe I don't know of any 32-bit C toolchains with __int128, fwiw
@gnzlbg ah sorry, I read your prior post too hastily. Indeed you said everything except MSVC.
The LLVM bug @nagisa mentioned is about making i128 16-aligned everywhere, if I understand correctly. But Clang already does something to make its __int128 16-aligned, why can't rustc do the same?
It does seem rather odd though if on 32bit, u64 would be 4-aligned but u128 would be 16-aligned... actually doesn't that violate the "max alignment" thing C defines? I suppose that's related to C not having 128-bit integers on 32-bit platforms.
So also taking into account what @rkruppe and @ubsan just wrote, my understanding is there are actually two issues:
repr(Rust) types). But on 64bit we should do what the platform ABI says, so we need to bump the alignment to 16.Is that an accurate summary?
actually doesn't that violate the "max alignment" thing C defines?
No, most (all?) 32-bit targets support a type that sets the _Alignof(max_align_t) == 16, so __int128 doesn't break that.
@ubsan and @rkruppe are right. I just skimmed the x86 and riscv 32-bit linux ABIs and they don't mention the __int128 type in there, while they do so for the 64-bit types. I don't know why this is, but if the C compilers on those targets don't support it, then there is no need to document an ABI for it.
To elaborate on MSVC story: the Microsoft documentation for the x64 ABI does not explicitly specify the ABI for __int128 ^1. If one was to infer the ABI for u128 from the rest of the wording in Microsoft鈥檚 documentation then neither clang/llvm nor gcc correctly implement it (and both differ from each other).
This in my opinion makes the MSVC target (and also any other target that uses x64 ABI, such as UEFI) the prime example of a 64-bit target where the __int128 is unspecified and therefore not FFI-safe.
If one was to infer the ABI for u128 from the rest of the wording in Microsoft鈥檚 documentation then neither clang/llvm nor gcc correctly implement it (and both differ from each other).
Does the implementation of clang/llvm matches that of gcc for the target ? If so, it is possible to make i128 FFI safe by giving it the same ABI that all compilers of that platform _that support the type_ use. One can't use the type to interface with MSVC, so it doesn't matter that the MSVC ABI doesn't document the type.
There is precedent of doing this, for example, it is exactly how we argue that ZSTs are FFI safe, e.g., by arguing that all C compilers in a platform that support them as a C language extension treat them in the same way, independently of whether the ABI spec covers them or not.
We could provide a stronger _definition_ of what FFI safety means in Rust. For example, we could define that FFI safety means that (1) the platform has an ABI spec, (2) the platform ABI spec provides a precise ABI for the type, (3) and all C compilers of the platform stick to that ABI.
I'm quite certain that, under that definition, there aren't many Rust types that can be used in C-FFI, on any platform, and therefore, such a definition wouldn't be very useful. So the question is how much can we relax it to make it useful.
If MSVC ever provides __int128, and if when it does, it gives it an ABI different than what clang and gcc do, then we'd need to change the ABI on the target. I think that's fine. It means that all Rust libraries using u128 that were compiled before that change can't interface with MSVC compiled code, but that's something that they couldn't do before anyways because MSVC did not provide a __int128 type. Clang and GCC will need to do the same. This combined with the fact that MSVC is one of the platforms that breaks its ABI quite often (although the trend is currently for this to happen less often) makes me comfortable with such a situation. Clang and GCC do provide a compiler flag that allows selecting an older ABI version, e.g. -fclang-abi=7 (or similar). If the update of i128's ABI were to cause problems, we could provide such a flag or similar mechanism (e.g. a target) for backward compatibility as well. This wouldn't be any different that handling any other kind of platform ABI breakage.
Does the implementation of clang/llvm matches that of gcc for the target ?
I did say in the quoted sentence that they differ from each other; at least they did back when I was looking at it.
I did say in the quoted sentence that they differ from each other; at least they did back when I was looking at it.
I'm trying to reproduce without any luck. Do you remember what was different? Did you fill a clang bug report about this?
The only related issue filled by myself I can find is https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78799
The only related issue filled by myself I can find is https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78799
I've found a clang bug about this that was fixed on clang 6 (https://bugs.llvm.org/show_bug.cgi?id=31362), so maybe this is fixed now?
I'll be fine with having the FFI-safety warning be target dependent, iff we document for the targets for which we issue the warning the reason (e.g. on the msvc targets that Microsoft does not document an ABI for the type, and therefore our implementation could change in the future to match that), and also, if we implement the type on those targets in a "sane" way (e.g. on the msvc targets we implement it to do the same thing that gcc and clang do, until there is something better that we can do).
Most helpful comment
Nobody argued that we should have a different layout. I agree using a different layout is a bug. I just don't agree it's a soundness bug.
u128/i128are not FFI-safe when they should be. That's a bug. That's why this issue exists and is still open. We also don't claim that they are FFI safe, so it's not a soundness bug.Ultimately where to draw the line for soundness bugs can be arbitrary though.