Tracks stabilization for the linkage attribute.
cc @alexcrichton
Currently this translates to the various linkage models of LLVM, for example the values of the attribute can be:
match name {
"appending" => Some(llvm::AppendingLinkage),
"available_externally" => Some(llvm::AvailableExternallyLinkage),
"common" => Some(llvm::CommonLinkage),
"extern_weak" => Some(llvm::ExternalWeakLinkage),
"external" => Some(llvm::ExternalLinkage),
"internal" => Some(llvm::InternalLinkage),
"linkonce" => Some(llvm::LinkOnceAnyLinkage),
"linkonce_odr" => Some(llvm::LinkOnceODRLinkage),
"private" => Some(llvm::PrivateLinkage),
"weak" => Some(llvm::WeakAnyLinkage),
"weak_odr" => Some(llvm::WeakODRLinkage),
_ => None,
}
Some worries about this are:
external or weak, I've never seen a reason to use the other attributes.That being said, it's the only way to do weak symbols on Linux and it's also convenient for exporting a known symbol without worrying about privacy on the Rust side of things. I would personally want to reduce the set of accepted linkage types and then state "well of course linkage is platform-specific!"
cc #29629
This feature is fundamentally broken. Consider the following code:
static unsigned long B = 0;
unsigned long *A = &B;
void f(void) { }
#![feature(linkage)]
#[link_name = "c_part"]
extern {
#[linkage = "extern_weak"] static A: *const usize;
fn f();
}
fn main() {
unsafe {
f();
println!("{:x} @ {:x} @ {:p}", *(*A as *const usize), *A, A);
}
}
Which prints 0 @ 5617765adaa8 @ 0x5617765ad890 meaning that the A seen by the rust code contains the address of A and not A itself. It's also easy to get LLVM to abort with this.
This attribute is also used incorrectly here.
https://github.com/rust-lang/rust/issues/18804 needs to be resolved before stabilization.
I think we should at least work on stabilising the weak and extern_weak linkage. Both are very useful and a fairly widespread linkage options that seem to be supported (unlike many other options currently exposed) by at least all of the tier 1 targets.
To give an example of a use-case for weak linkage in Rust code (I need weak linkage for libloading), one would be for global statics unique even between multiple versions of the same crate. Consider that, currently, if a binary links log = ^0.3 and log = ^0.4 some way, or another, it will end up having two distinct global loggers. This could trivially be resolved with some use of the weak linkage option (as it ensures – at link time – only one instance of global with the same name).
That being said, #[linkage] should always infect whatever it applies to with unsafety. Consider for example these two crates:
// crate older version
#[linkage="weak"]
static mut FOO: u32 = !0;
// crate newer version
#[linkage="weak"]
static mut FOO: char = 'a';
when linked together, all uses of FOO as seen from the newer-version crate would be UB if the linker happened to choose to link-in the value from the older version. This is functionally a transmute without size checks.
extern_weak is broken in such a dubious way that even though I pointed out two years ago that it was being used incorrectly in the stdlib, the bug still hasn't been fixed.
~$ cat strong.rs
extern {
static __progname: *const u8;
}
fn main() {
unsafe {
println!(" __progname:\t\t{:?}", __progname);
println!(" &__progname:\t\t{:?}", &__progname as *const _);
if !__progname.is_null() {
println!(" *__progname:\t\t{:?}", *__progname as char);
}
}
}
~$ cat weak.rs
#![feature(linkage)]
extern {
#[linkage = "extern_weak"]
static __progname: *mut *const u8;
}
fn main() {
unsafe {
println!(" __progname:\t\t{:?}", __progname);
println!(" &__progname:\t\t{:?}", &__progname as *const _);
if !__progname.is_null() {
println!(" *__progname:\t\t{:?}", *__progname);
if !(*__progname).is_null() {
println!("**__progname:\t\t{:?}", **__progname as char);
}
}
}
}
~$ rustc strong.rs
~$ rustc weak.rs
~$ ./strong
__progname: 0x7ffdfc7438ac
&__progname: 0x7f10560a5360
*__progname: 's'
~$ ./weak
__progname: 0x7f730b21f360
&__progname: 0x5574a33bb008
*__progname: 0x7fffe4ca38b2
**__progname: 'w'
gcc and clang handle this attribute correctly.
PS: The address of __dso_handle isn't actually too significant in __cxa_thread_atexit_impl and &__dso_handle is only a few bytes off.
PSS: Wow, looks like I already explained this above (also two years ago). Maybe the NSA is trying to keep this potential remote code execution unfixed. 🤔
Ping @alexcrichton @nagisa what's the status here? Is there a bug here that can be solved / mentored?
@cramertj I personally consider this a perma-unstable issue for now. In general symbol visibility and ABIs are something that historically rustc hasn't done much to specify and has had a lot of freedom over. We relatively regularly tweak ABIs, symbol names, etc. There's a very thin layer at the end (like a C ABI) which is pretty stable but even that gets sketchy sometimes (#[no_mangle] deep in a module hierarchy?).
I think we've benefitted quite greatly from the symbol visibility flexibility we've had historically in terms of compiler refactorings and heading off regressions. It's hard to introduce a regression when you can't rely on the feature in the first place!
Along those lines I think there's definitely some select use cases where using something like #[linkage] is critical, but from what I've seen they tend to be few and far between and somewhat esoteric. A blanket and general #[linkage] attribute I think is way too powerful to solve this use case and it'd be better to poke around at various motivational use cases to see if there's a more narrow solution.
(plus that and the whole #[linkage] is incredibly platform/LLVM specific and I don't think we full understand all the linkage modes in LLVM and how they apply to all platforms as well)
Given that crate owners can't control how many instances of their crate will be included in a given binary, it seems that we really need a mechanism at least for _weak_ linkage in stable Rust.
I got bit by the fallout from https://github.com/rust-lang/rust/issues/29603#issuecomment-355775169, where rust-libloading gained a cc complation step to workaround this missing feature.
I would love some mechanism to merge statics and variables with the same value (and name, possibly).
Consider the following code:
// In my actual code, this is a more complicated proc macro.
macro_rules! special_number {
($value: expr) => {
{
// In my actual code, these static variables also have
// the special `export_name` of
// "\x01L_special_number_<unique_id>", where
// `<unique_id>` is a unique identifier to avoid
// symbol conflicts.
#[link_section = ".data,__custom_special_section"]
static SPECIAL_NUMBER: usize = $value;
&SPECIAL_NUMBER
}
};
}
extern {
fn consume_special_number(value: &usize);
}
pub fn main() {
unsafe {
consume_special_number(special_number!(42));
consume_special_number(special_number!(42));
consume_special_number(special_number!(42));
};
}
This will generate three different static variables. I would love it if I could get Rust to merge these static variables into one single static variable. Using a const doesn't work (it does merge the values, but you can't provide link attributes on a const).
There are two ways the merging could be performed:
export_name (that opt-in to merging) would be merged into a single static variable (and wouldn't result in duplicate symbol errors).I would be prefer option 2 (merging by name).
Perhaps this is what the linkonce_odr linkage type is for, but using the same export_name causes
The linkonce_odr and weak_odr linkage types are similar to this, I think, but don't work (in Rust) for merging globals/statics within a single translation unit. Rust could either extend them or introduce a new linkage type that does ODR merging within translation units.
Should we stabilize weak linkage first? :) It would be useful for building a runtime. We can write complex default function in Rust, export it in weak linkage; if another downstream provided their implementation, the complex default function won't be compiled. That could save code size which is importand in embedded environmant.
We could even implement another #[weak] lint instead if the original is too llvm-specific.
What work is needed to stabilize #[weak] in extern "C" blocks?
@nvzqz probably at least some resolution to #31508
Most helpful comment
I think we should at least work on stabilising the
weakandextern_weaklinkage. Both are very useful and a fairly widespread linkage options that seem to be supported (unlike many other options currently exposed) by at least all of the tier 1 targets.To give an example of a use-case for
weaklinkage in Rust code (I need weak linkage for libloading), one would be for global statics unique even between multiple versions of the same crate. Consider that, currently, if a binary linkslog = ^0.3andlog = ^0.4some way, or another, it will end up having two distinct global loggers. This could trivially be resolved with some use of theweaklinkage option (as it ensures – at link time – only one instance of global with the same name).That being said,
#[linkage]should always infect whatever it applies to with unsafety. Consider for example these two crates:when linked together, all uses of
FOOas seen from the newer-version crate would be UB if the linker happened to choose to link-in the value from the older version. This is functionally a transmute without size checks.