Rfcs: Define a Rust ABI

Created on 20 Jan 2015 · 86 Comments · Source: rust-lang/rfcs

Right now, Rust has no defined ABI. That may or may not be something we want eventually.

T-compiler T-lang T-libs

Most helpful comment

Another motivation is the ability to ship shared libraries which could be reused by multiple applications on disk (reducing bandwidth usage on update, reducing disk usage) and in RAM (through shared .text pages, reducing RAM usage).

All 86 comments

CC me

See #1675 for some motivation for this feature (implementing plugins, that is plugins for Rust programs, not for the compiler).

Another motivation is the ability to ship shared libraries which could be reused by multiple applications on disk (reducing bandwidth usage on update, reducing disk usage) and in RAM (through shared .text pages, reducing RAM usage).

It would also make Linux distributions happier, as it would allow the use of shared libraries. Apart from reducing memory usage in different ways, shared libraries also make handling security issues (or other important bugs) simpler: you only have to update the code in a single place and rebuild that, instead of fixing many copies of the code in different versions and rebuilding everything.

Shared libraries can be useful provided there is a way to work around similar libraries accomplishing the same thing in different ways, such as OpenSSL and LibreSSL. There would also be value in a way to switch between static and dynamic linking that can be set by the person compiling the crate (a flag, perhaps?).

In my opinion, it's very hard to take Rust seriously as a replacement systems programming language if it's not possible in any reasonably sane manner to build applications linking to libraries with a safe, stable ABI.

There are a lot of good reasons for supporting shared libraries, not the least of which is building fully-featured systems for more resource constrained devices (like your average SBC or mobile device). Not having shared libraries really blows up the disk utilization in places where it's not cheap.

@jpakkane describes this really well in his blog post where he conducts an experiment to prove this problem.

In my opinion, it's very hard to take Rust seriously as a replacement systems programming language if it's not possible in any reasonably sane manner to build applications linking to libraries with a safe, stable ABI.

Note that C++ still doesn't have a stable ABI, and it took C decades to get one.

It would also make Linux distributions more happy as it would allow usage of shared libraries.

There are a lot of good reasons for supporting shared libraries

(two quotes from two different people) Note that rust _does_ support shared libraries. What it doesn't support is mixing them from different compiler toolchains. Some linux distros already do this with Rust; since they have one global rustc version, it works just fine.

Rust also supports exporting functions with stable ABI just fine, as, for example, this shows.
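To make the point concrete, here is a minimal sketch of what "exporting functions with a stable ABI" looks like today: a function declared with the C calling convention and an unmangled symbol name. Built with `--crate-type cdylib` it would be callable from C; the `main` here just calls it directly (the function name is illustrative):

```rust
// Exporting a function with the stable C ABI from Rust: `extern "C"` fixes
// the calling convention, and `#[no_mangle]` keeps the symbol name `add`.
#[no_mangle]
pub extern "C" fn add(a: i32, b: i32) -> i32 {
    a + b
}

fn main() {
    // Called directly here; a C program could call the same symbol from a
    // cdylib build of this crate.
    assert_eq!(add(2, 3), 5);
    println!("{}", add(2, 3));
}
```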

Note that C++ still doesn't have a stable ABI, and it took C decades to get one.

While that is true, implementations (at least GCC, Clang, and MSVC) have a (somewhat) defined ABI, and it only changes every now and then. With Rust there is no defined ABI at all: things might break in incompatible ways with any change in the compiler, in a library you're using, or in your own code, and you can't know when this is the case because the ABI is in no way defined (sure, you could look at the compiler code, but that could also change at any moment).

What it doesn't support is mixing them from different compiler toolchains. Some linux distros already do this with Rust; since they have one global rustc version, it works just fine.

The problem is not only about compiler versions but, as written above, about knowing what you can change in your code without breaking your ABI, and about crates generally tracking their ABI in one way or another. Otherwise you can't use shared libraries created from crates in any reasonable way other than recompiling everything (and assuming the ABI changed) whenever anything changes.

At least on Linux, everyone has pretty much settled on the Itanium C++ ABI. But even with the compiler locked down, it still requires very careful maintenance by a library author who hopes to export a stable ABI. Check out these KDE policies, for instance.

Rust crates would have many of the same challenges in presenting a stable ABI. Plus I think this is actually compounded by not having the separation between headers and source, so it's harder to actually tell what is reachable code. It's much more than just pub -- any generic code that gets monomorphized in your consumers may have many layers of both public and private calls to make to the original crate's library. And all of that monomorphized code has to remain supported as-is when you're shipping updates to your crate.
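The point about private calls behind generics can be made concrete with a small sketch (the names are illustrative, not from the thread). The private helper is never `pub`, yet because the generic function's body is monomorphized into consumer crates, the helper's behavior becomes part of the ABI surface the crate author must keep supporting:

```rust
// Sketch: monomorphization pulls private code into consumers.
mod mylib {
    // Private to the module, but reached through the generic function below.
    fn helper(x: i32) -> i32 {
        x * 2
    }

    // Generic: the whole body, including the call to `helper`, is compiled
    // into each consumer crate that instantiates it.
    pub fn double_all<I: IntoIterator<Item = i32>>(items: I) -> Vec<i32> {
        items.into_iter().map(helper).collect()
    }
}

fn main() {
    assert_eq!(mylib::double_all(vec![1, 2, 3]), vec![2, 4, 6]);
}
```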

Had a long term crazy idea to solve this.

Define a stable format for MIR and distribute all Rust binaries/shared libraries in that format. Package managers have a post-install step to translate the MIR into executables or .so files. Only the version of the MIR->binary backend "linker" has to be the same, the version of the source->MIR frontend compiler can differ.

Because of monomorphization and unboxed types, you still need to relink all the MIR files when a dependency is updated. Similarly, updating the backend compiler requires all the MIR files to be recompiled.

However, assuming we can push more and more optimisation passes into MIR rather than llvm, the time spent in the backend should be reduced to something acceptable.

If you want to push it even further, keep everything in MIR form and use miri (or a JIT version of it) to run them. Frequently used files can be linked and persisted to disk. And we've just reinvented the JVM/CLR/Webassembly.

Rust also supports exporting functions with stable ABI just fine, as, for example, this shows.

For the purposes of this discussion that would require that every single Rust crate only exports a C ABI. Which is not going to happen.

While that is true, implementations (GCC, clang, MSVC at least) have a (somewhat) defined ABI and it only changes every now and then.

Try linking to any interface that uses std::string. You cannot mix different gcc versions and clang, because they use incompatible implementations of std::string.

While that is true, implementations (GCC, clang, MSVC at least) have a (somewhat) defined ABI and it only changes every now and then.

Try linking to any interface that uses std::string. You cannot mix different gcc versions and clang, because they use incompatible implementations of std::string.

Somewhat off-topic (we're not talking about C++ here), but as long as you stay in a compatible release series of those there is no problem. And with the correct compiler switches you can also e.g. also build your C++ code with gcc 7 against a library that was built with gcc 4.8 and uses/exposes std::string in its API.

The second part seems unnecessary but nice and useful (and a lot of work). If the first part alone were true for Rust, that would already be a big improvement: a defined ABI, which might change in a new release whenever necessary.

Quote from GCC about ABI compatibility (as a note to myself):

…Versioning gives subsequent releases of library binaries the ability to add new symbols and add functionality, all the while retaining compatibility with the previous releases in the series. Thus, program binaries linked with the initial release of a library binary will still run correctly if the library binary is replaced by carefully-managed subsequent library binaries. This is called forward compatibility. …

Relying on libraries to correctly maintain binary compatibility is just an easily avoidable safety hazard. What is wrong with just letting the package repository rebuild dependent code? (To be clear, in this scenario using shared libraries is a given. Shared libraries and ABI stability are independent issues.)

And before someone argues about update download size, I must note that differential updates are not that difficult (no pun intended). In fact, if reproducible builds are done well, the resulting binary should be identical unless the library ABI has actually changed.

Because that assumes rebuilding is cheap. While it might be true for small projects, it's a fairly expensive and crappy process when you have long chains of things in big projects.

And it also makes it impossible to rely on Rust for actual systems programming because pure Rust libraries cannot be relied on for any given period of time.

And it also makes it impossible to rely on Rust for actual systems programming because pure Rust libraries cannot be relied on for any given period of time.

What exactly can't be relied on? Systems programming is my area of interest and I don't see what you mean.

To get an understanding of how much rebuilding and interdependencies there actually are in a full blown Linux distro, please read this blog post. It talks about static linking so it is not directly related to this discussion but useful to get a sense of scale.

What is wrong with just letting the package repository rebuild dependent code?

Well, as an example, on Debian there are 2906 packages that depend on GLib 2.0. Many more depend on it indirectly.

Fair enough. But you don't actually need stable ABI for any of that. The ABI can change between rustc versions, but different builds on the same version are still compatible. So your distro only needs to rebuild stuff whenever they bump rustc version, which I assume is not gonna be often for a typical distro.

At least in Fedora and Mageia, you'd be wrong about that. We bump Rust almost right after the new version arrives.

So, once every six weeks?

Fair enough. But you don't actually need stable ABI for any of that. The ABI can change between rustc versions

There's also the ABI aspect of the actual crates/libraries, not just of the compiler itself. If you have e.g. 200 crates depending on the slog crate, and slog changes ABI, you'll also have to recompile all those, independent of any rustc changes. And you'd need a way to actually know that its ABI has changed.

Basically what you are proposing ("just rebuild everything") requires you to rebuild everything all the time. You update one crate, you need to rebuild all crates that depend on it. Because there is no defined ABI, you can only assume that every change also changes the ABI. (This has nothing to do with ABI stability FWIW)

And you'd need a way to actually know that its ABI has changed.

You absolutely need that in any scenario. Otherwise you reduce all safety guarantees to wishful thinking.

Also, I'm not proposing anything, I'm honestly just trying to understand the limitations of what can be done with current compiler.

And you'd need a way to actually know that its ABI has changed.

You absolutely need that in any scenario. Otherwise you reduce all safety guarantees to wishful thinking.

Well, that's all this issue is about. To actually define a Rust ABI (that might change every compiler release, or not) so you can know what actually is an ABI change and what not, and to be actually able to know when the ABI of some crate has changed. It's not about stabilizing any ABI.

What I'm getting at is that just telling programmers a list of stuff to avoid is not by itself sufficient. It needs to be verified automatically, otherwise you just have the C situation where human errors go undetected until it causes disasters. Isn't there some open issue about verifying semver contract automatically for cargo crates? Can't find it now, but I'm sure I read something about it. It sounds like verifying ABI compatibility would be a natural part of that, whether or not said ABI is stable and specified.

My reasoning is that ABI compatibility is not so much the domain of developers as it is of package maintainers. With all the guarantees Rust provides, it's easy to imagine an upstream commit that passes all reviews because it doesn't touch unsafe code and doesn't change public API as understood by semver, but introduces a breakage in ABI which nobody on the project noticed (especially since Rust developers are so used to ABI being off limits to casual reasoning).

Without automated checking, that change would make it to cargo, and if the package maintainer doesn't review each individual commit, they would package a version that introduces an invisible vulnerability in dependent packages, all the while being perfectly fine from semver perspective.
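An invisible ABI break of this kind is easy to construct (the type names below are hypothetical). Changing a *private* field touches no public API, so semver tooling sees nothing, yet the type's layout changes and binaries compiled against the old version would misread it:

```rust
use std::mem::size_of;

// "Version 1.0.0" of a hypothetical crate type.
#[allow(dead_code)]
pub struct HandleV1 {
    id: u32, // private field
}

// "Version 1.0.1": the private field is widened. No public API change at
// all, so semver is satisfied, but the layout (and thus the ABI) differs.
#[allow(dead_code)]
pub struct HandleV2 {
    id: u64,
}

fn main() {
    // The two "versions" are not layout-compatible.
    assert_ne!(size_of::<HandleV1>(), size_of::<HandleV2>());
}
```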

Yes, that completely makes sense as the next step after the ABI is defined, and would be extremely useful to have.

Eventually, Rust needs an ABI, because you want to reuse libraries between projects without statically linking them, because of the file size. Let's say I want to use libservo for rendering. Do I really want a 70MB library embedded statically into several applications, rather than a shared library installed on the system that other applications can depend upon?

What I'd need is:

  • Some way of telling cargo / rust that I want to use a external rust library
  • Rust / cargo should discover the types in the external library, in order to use them in documentation
  • The ABI should preserve generics, otherwise it'd be pretty useless. This is probably very hard to do.
  • Name mangling is a problem. There should be some way to turn off name mangling for a crate, not just on individual types.
  • Rust / cargo should learn to not to statically link this library
  • Integration with standard package managers like apt, rpm, etc.

Something like:

extern "Rust" crate servo;

What's the status on this? Has anyone made a proposal for what the ABI should look like?

@fschutt nothing has been done on this front. Things are still in flux and defining an ABI means we're stuck with it. I'm sure the discussion for this will be longer and more exhausting than the Epoch discussion that went on recently. Defining an ABI hasn't had much of a priority at all because there's still a lot of other things the compiler team is trying to implement and resources are limited. As far as I know no one has made a proposal at all.

@mgattozzi wrote:

@fschutt nothing has been done on this front. Things are still in flux and defining an ABI means we're stuck with it.

I disagree. First, defining an ABI does not mean we are stuck with it. Java does (mostly smaller) ABI breaks for every major release. Gtk+, Qt and the KDE project do ABI breaks on every major release. The C++ standard library does ABI breaks every now and then.
Second, we really do need ABI stability for shared libraries; otherwise Rust is not fit as a general-purpose programming language for wide use on regular operating systems such as Windows or *nix. As long as there is no ABI stability, and thus no libraries can be shared, most Rust applications bloat your disk and your RAM. And it will be a hell of a job for distribution maintainers to get CVEs fixed.

@fschutt You're right, that there are some hard problems.

Some way of telling cargo / rust that I want to use a external rust library

Cargo indeed doesn't have many smarts about using dylibs, although it does know how to build them and rustc does know how to link against them. You should see https://doc.rust-lang.org/reference/linkage.html about what the compiler supports today. Additionally, you can build with -C prefer-dynamic to get rustc to prefer linking against dylibs when possible. You can get cargo to use this with the RUSTFLAGS env var.

Rust / cargo should discover the types in the external library, in order to use them in documentation

Not really sure what you mean by this, but Rust library artifacts have "metadata" that includes all this information. It isn't used in documentation. That's a separate thing, that needs more information than is feasible to stash into metadata.

The ABI should preserve generics, otherwise it'd be pretty useless. This is probably very hard to do.

I'm not sure what you're getting at here, but this is more about code size and specialization/monomorphization and is extremely hard to tackle in the Rust of today.

Name mangling is a problem. There should be some way to turn off name mangling for a crate, not just on individual types.

It isn't a problem. Sorry, I just don't understand how you could see that it would be a problem. Name mangling is pretty much completely stable at this point.


One of the things people usually want when they come to this issue is the ability to re-compile some frequently-depended-on crate with some minor internal changes, and have those changes automatically used. Frankly, it's not a realistic or even desirable goal for many categories of Rust libraries. Rust does very aggressive monomorphization and inlining of generics, to boil away abstractions.


One misconception is that because Rust doesn't have a defined ABI, you can't do dylibs. This is factually incorrect. rustc is dynamically linked to many of its deps. You can dynamically link to _all_ your crate dependencies. This doesn't mean all code will be shared, or even most. Any generic types will get monomorphized on-demand and usually inlined into their callsites.

But since there _is_ no guarantee of ABI stability, if there is a rustc upgrade you need to recompile the world. If any of your "fundamental" deps change (futures, std, many others) you need to recompile the world.

Binary modularity is a hard problem that won't be solved just by defining an ABI. We could decide to freeze struct/enum/vtable/etc layout today and none of the problems you want to be solved would be impacted by this much.
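The layout part of that can be illustrated with a small sketch. `repr(C)` fixes field order, while the default `repr(Rust)` layout is unspecified and may reorder fields to pack tighter; the concrete `24` below assumes a typical 64-bit target:

```rust
use std::mem::size_of;

// repr(C) must keep declaration order: padding after `a` and after `c`
// brings the size to 24 on common 64-bit targets.
#[allow(dead_code)]
#[repr(C)]
struct CLayout {
    a: u8,
    b: u64,
    c: u8,
}

// Same fields, default repr(Rust): the compiler is free to reorder them,
// and freezing this choice is exactly the commitment being discussed.
#[allow(dead_code)]
struct RustLayout {
    a: u8,
    b: u64,
    c: u8,
}

fn main() {
    assert_eq!(size_of::<CLayout>(), 24);
    // The unspecified layout is allowed to pack at least as tightly.
    assert!(size_of::<RustLayout>() <= size_of::<CLayout>());
    println!(
        "repr(C): {}, repr(Rust): {}",
        size_of::<CLayout>(),
        size_of::<RustLayout>()
    );
}
```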

Defining an ABI will help answer the question, and define the scope of: "can artifacts compiled with some earlier version of rustc be used with artifacts compiled with some later version of rustc".

Code size issues are important but belong in some other issue. Integration with common Linux distribution patterns belongs in some other issue.

The ABI should preserve generics, otherwise it'd be pretty useless. This is probably very hard to do.

I'm not sure what you're getting at here, but this is more about code size and specialization/monomorphization and is extremely hard to tackle in the Rust of today.

I think he meant that it actually has to be supported via dylibs, the assumption being that this is not possible at all yet, which is wrong, as you say yourself. dylibs as they exist today already support generics just fine: all the required metadata for them is there, and monomorphization still happens afterwards.

Defining an ABI will help answer the question, and define the scope of: "can artifacts compiled with some earlier version of rustc be used with artifacts compiled with some later version of rustc".

It also defines "what changes can I, as a crate author, make without changing the ABI of my crate". That is generally also very useful, but without solving the social problem of getting all crate authors to care about it, and without a way of signaling ABI compatibility in crate versions (and a way to automatically check it!), it is only a minimal first step.

@genodeftest Second, we really do need ABI stability for shared libraries, because otherwise Rust is not fit as a general-purpose programming language for wide use on regular operating systems such as windows or *nix. As long as there is no ABI stability and thus no libraries can be shared, most rust applications bloat your disk and your ram. And it will be a hell of a job for distribution maintainers to get CVEs fixed.

Defining an ABI and actually guaranteeing any kind of ABI stability are completely different issues though. The latter requires the former, but one step at a time.

All problems you mention can already be solved nowadays without any of the two though, you "only" need to recompile the world whenever anything changes. Because you have no guarantees whether that change causes the ABI to be different, and if that change in generic code requires rebuilding users of that generic code because they got their own monomorphized copies of it. The latter is also a problem with C++ and various other languages though (or C inline functions or macros defined in headers), not sure how e.g. Linux distributions are handling issues related to that.

Adding something like "are these C-exposed types and functions compatible between my versions" to rust-semverver (i.e. treating them as exported) should be possible - it already handles exported Rust APIs (although not their ABIs - while it's doable, modulo generics, it's a bit pointless just by itself).

cc @ibabushkin

Additionally, you can build with -C prefer-dynamic to get rustc to prefer linking against dylibs when possible. You can get cargo to use this with the RUSTFLAGS env var.

This is not as easy as described. Setting RUSTFLAGS will affect cargo's building behavior. See this issue: https://github.com/rust-lang/cargo/issues/4538

Basically, you cannot easily build dependencies as dylib by setting RUSTFLAGS with -C prefer-dynamic. Cargo will compile dependencies as rlib by default (and is not configurable).

Therefore, I agree that the whole discussion is not just about ABI stability. To achieve dynamic linking, core tools like cargo need modifications.

One misconception is that because Rust doesn't have a defined ABI, you can't do dylibs. This is factually incorrect. rustc is dynamically linked to many of its deps. You can dynamically link to all your crate dependencies. This doesn't mean all code will be shared, or even most. Any generic types will get monomorphized on-demand and usually inlined into their callsites.

Yes. One can compile a dylib by passing rustc the --crate-type dylib flag. However, as discussed before, you cannot link a lib compiled with an old rustc into a binary built with a new rustc. All binaries and dynamic libraries need to be compiled with the same rustc. I believe this is more about ABI compatibility.

All problems you mention can already be solved nowadays without any of the two though, you "only" need to recompile the world whenever anything changes.

Yes. But when using Rust as a systems language, "recompile the world" is not a good idea. It means that when a library is updated, all libraries and binaries that depend on it need to be recompiled. Compilation will take a very long time when many Rust binaries are dynamically linked.

not sure how e.g. Linux distributions are handling issues related to that.

I think package maintainers have to resolve the linking issues. In addition, I guess most core C/C++ libraries in Linux distributions are more ABI-stable than Rust crates.

Because you have no guarantees whether that change causes the ABI to be different, and if that change in generic code requires rebuilding users of that generic code because they got their own monomorphized copies of it.

This means the following two things:

  1. The ABI boundary of a crate does not necessarily consist of public items only. So the crate developers would have much more than rust-lang/rfcs#1105 to consider if semver semantics are extended to the ABI. To mirror one particular conclusion in that RFC, backward compatibility across behavioral changes in generics would be entirely up to the developers to maintain.
  2. ABI stabilization shall include stabilization of the serialization format for generics baked into dylibs.

I think the linkability guarantee can be solved by an ABI check tool akin to rust-semverver, which should be able to check the serialized generics and the underlying non-generic items for semver-significant breaks.

Making the crate's ABI surface evident to the developers could perhaps also be solved by tools, by forcing developers to annotate every internal item exposed via generics with API stability attributes when commitment to an API version is made.

Note that reachability analysis for internal functions and methods should be performed by the crate linker already now, unless it emits all internal non-generic functions as dynamic symbols just in case they are used in some generic. That would be a bad thing.

There's really three different goals here:
1) Provide guarantees of what code changes a developer can make to their crate without breaking ABI compatibility for binaries compiled against the older version of that crate.
2) Allow recompiling a crate with a newer rustc version without breaking ABI compatibility.
3) Document the ABI used by rustc, so that other compilers that link against rust binaries can be written.

The Rust ABI Specification would solve all three goals. However the first goal is only solved indirectly -- an explicit list of allowed code changes is still desirable.
But there's nothing really stopping us from reaching goal 1 without defining a Rust ABI. Yes, you'll still need to recompile the world after upgrading rustc, but at least developers can upgrade their dynamic library if they keep using the same rustc. This kinda already works in practice (as long as you follow an undocumented set of rules), we just need some documentation, and ideally an automatic ABI compatibility check.

As an example for a starting guarantee that we could already give:

  • Changing the body of non-generic, non-inline functions does not affect the crate ABI.

Such a guarantee does restrict our ABI choice -- e.g. it prevents us from picking the best calling convention based on the call sites or the function body (at least for exported functions). But it places no restrictions on what the calling convention is, as long as it's only influenced by the function signature and rustc version.

Start with something like that, build an ABI compatibility check tool, then slowly add more guarantees (thus slowly restricting the set of possible ABIs). This way we get the benefits of goal 1 without the huge amount of work of fully defining the rust ABI (for each platform!), and without preventing all future optimizations to type layout / calling conventions.
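The proposed guarantee can be made concrete with function pointers: if the calling convention depends only on the signature (and the rustc version), then two different bodies with the same signature are interchangeable at the binary level, which is exactly what a caller through a function pointer relies on. A minimal sketch, with illustrative names:

```rust
// Two "versions" of a non-generic, non-inline function: same signature,
// different bodies. Under the proposed guarantee, swapping one for the
// other in a shared library would not break already-compiled callers.
fn version_a(x: u32) -> u32 {
    x + 1
}

fn version_b(x: u32) -> u32 {
    x.wrapping_add(1) // changed implementation, identical signature
}

fn main() {
    // The caller only ever sees the signature, never the body.
    let mut f: fn(u32) -> u32 = version_a;
    assert_eq!(f(41), 42);
    f = version_b; // swapping implementations: same ABI surface
    assert_eq!(f(41), 42);
}
```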

It could be a good idea to do exactly what GHC does - the binaries produced by 8.x are compatible with what 8.y produces. This means that compiler ABI change implies a major version bump.

@marmistrz I'd like to note that GHC has a version 8, so there was a version 7, but Rust will never have a 2, so in effect what you are saying is that we should define a stable ABI for ever.

Why not? Why not release a 2.0 even if there are no API changes but just ABI ones?

@marmistrz Sure, that is possible. However, moving to a version 2.0 has psychological effects beyond technical aspects. It could lead to the impression that we don't take backwards compatibility seriously even if there only were ABI changes. We would need a new merged RFC to support this change.

@Centril
Rust breaks backwards compat anyway, e.g. by https://github.com/rust-lang/rust/issues/34537

I sometimes wish C++ threw away some of its old C-compat features. It would be a much nicer language to write in. Rust may need to face the same in a 20-year timeframe.

When Python 3 was announced, Python 2 was given a dying period of at least 5 years to give people the time to move to Python 3.

The main point is that ABI compatibility may be handled by versioning conventions; the GHC scheme was just an example. Whether it's handled by minor versions which are multiples of prime numbers or any other scheme is just an implementation detail.

It could be a good idea to do exactly what GHC does - the binaries produced by 8.x are compatible with what 8.y produces. This means that compiler ABI change implies a major version bump.

This solves nothing from the issue here. The problem is defining an ABI so that also crate authors know which changes they can make without breaking ABI, and ideally defining even some way of versioning the ABI :)

While it would certainly be nice if Rust and the standard library itself would somehow signal ABI compatibility between versions, that's only a very small part of the whole thing (and to some degree we already have that: every stable release changes the ABI).

Let me see if I understand this right:

  • Rust will never have a version 2 because people will think Rust doesn't take backwards compatibility seriously, and that would be bad
  • Rust deliberately breaks ABI every release with no backward compatibility at all, and that's just fine for a compiled system programming language that made a big deal out of language stability starting with 1.0.

Rust has been committed to a stable language and API since 1.0, but ABI stability has never been claimed. Your code is compatible; your binaries are not.

For reference: on the ABI state in Swift: https://swift.org/abi-stability/#data-layout

Another benefit I don't think anyone has mentioned is that distributing closed source libraries is a huge pain without a stable, or at least versioned ABI.

C++ also has this problem - most people's solution is just to compile the library with a load of different compilers and settings and hope for the best, but that is pretty awful. Another awful solution is to provide a C wrapper of your C++ library, and then an open source C++ wrapper of that. It would be nice if Rust was better.

@Timmmm You can totally distribute a closed source library, but you have to pick a rustc version and by default we keep far too much information around to be too "closed source".

This isn't even about ABI compatibility - you can't compile against the Rust typesystem across compiler versions, and I don't recall proposals for how this could even be made to work.

Yeah, I don't really see this happening as it would effectively prohibit all future changes to the language. However, there's another option which might work:

  • Define a new ABI, which will essentially be a superset of the C ABI
  • This ABI will support C types, plus a limited set of rust types (Vec, slices, strings, structs annotated with repr(NewABI), etc.)

This would allow much more convenient interop, whilst only locking down those layouts which are already effectively stable.

For bonus points, the layout of types in this new ABI could be defined as mappings to equivalent C structures, allowing the use of the new ABI from other languages as well.
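A sketch of what such a mapping could look like, under the assumption that a slice is defined as an equivalent C structure of pointer plus length. `repr(NewABI)` itself does not exist; the `FfiSlice` type and function names here are hypothetical illustrations, written with today's `repr(C)`:

```rust
// Hypothetical "superset of C" mapping: a Rust byte slice exposed as a
// repr(C) struct any C-ABI language could construct and pass.
#[repr(C)]
pub struct FfiSlice {
    ptr: *const u8,
    len: usize,
}

impl FfiSlice {
    fn from_slice(s: &[u8]) -> Self {
        FfiSlice { ptr: s.as_ptr(), len: s.len() }
    }
}

#[no_mangle]
pub extern "C" fn sum_bytes(s: FfiSlice) -> u64 {
    // Safety: the caller promises ptr/len describe a valid, live byte slice.
    let bytes = unsafe { std::slice::from_raw_parts(s.ptr, s.len) };
    bytes.iter().map(|&b| u64::from(b)).sum()
}

fn main() {
    let data = [1u8, 2, 3];
    assert_eq!(sum_bytes(FfiSlice::from_slice(&data)), 6);
}
```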

@Diggsey So you would still be restricted to monomorphic declarations? What's the advantage over some sort of interop library using the C ABI that's provided as source?

Some previous discussion on a "safe superset of C" ABI: https://internals.rust-lang.org/t/cross-language-safer-abi-based-on-rust/4691

So if this task were completed, and you wanted to use Rust plugins or shared libraries you'd still have to compile them all with exactly the same compiler, settings, libc, etc? That doesn't seem particularly workable.

I like the idea of a better-than-C ABI though that supports a stable subset of Rust types.

I also think it would be valuable to have an officially supported ABI that supports, at version 1, a set of commonly used Rust "vocabulary" types: Option (with deterministic null-pointer optimization and tag layout), Result, Vec, strings, slices, references (with possibly limited semantics for lifetimes, like supporting non-static things only as parameters to higher-rank lifetimed functions), possibly dyn Traits. It would need to be opt-in for types used in FFI, so as not to limit the evolution of the language, but that doesn't differ from the current situation, does it?

I think it would be great to have an official solution because that would bring people together, towards a shared interface. We only have a single alternative – the C ABI – at the moment, and people are building their own abstractions on top of that because C ABI isn't expressive enough. I think Rust provides a great set of primitives, and I think providing an official ABI would bring value to other languages too. I could imagine some scripting language runtimes and some other emerging systems languages such as Zig saying "we now also support Rust ABI v1.0 for FFI!"

Some useful points from other languages.
Here is an Arch wiki article on Haskell: Arch dynamically links Haskell packages even though there is no stable ABI.
This should give some ideas on how to go about the dynamic linking aspect.
An important quote:

Dynamic linking is used for most Haskell modules packaged through pacman and some packages in the AUR. Since GHC provides no ABI compatibility between compiler releases, static linking is often the preferred option for local development outside of the package system.

Here is a Haskell package and one of its dependencies. It could also be a reference on how to go about it:
https://www.archlinux.org/packages/community/x86_64/shellcheck/
https://www.archlinux.org/packages/community/x86_64/haskell-aeson/
As I use shellcheck personally, I note that on every update of ghc and haskell-aeson, shellcheck gets rebuilt. At the time of writing shellcheck has 59 rebuilds without a version bump (C++ packages generally have fewer than 10). So a stable ABI is not needed, but it is very much preferred for dynamic linking.

One step is to list what parts need to be exported in binary form. Some ideas for an ABI could be taken from Vala and Zig since they have a stable ABI. Vala could be a reference for OO features, for example.

Vala can be a reference for OO feature for example.

As far as I know Vala translates into C-equivalent code + GLib/GObject/Gio library calls, so it does not need to define its own ABI. Its OO features are partially dynamic and are backed by the runtime library stack.

As far as I know Vala translates into C-equivalent code + GLib/GObject/Gio library calls

Yes, Vala just compiles to C and apart from that uses the GLib/GObject conventions and ABI for naming functions, types, etc. It does not really apply to Rust in any way.


https://internals.rust-lang.org/t/pre-rfc-a-new-symbol-mangling-scheme/8501 is a good starting point for defining the symbol name part of an ABI instead of the current ad-hoc (AFAIU?) scheme.

@sdroege Note that I really don't want to guarantee any symbol mangling scheme, and I'd prefer if, from the start, we had an option to generate short symbol names (even just a hash).
cc @michaelwoerister

@sdroege Note that I _really_ don't want to guarantee _any_ symbol mangling scheme, and I'd prefer if, from the start, we had an option to generate short symbol names (even just a hash).

Sure but having it documented and having it "guaranteed" for a single, specific compiler version is already a good improvement over the undocumented (AFAIK, apart from the rustc code of course) current mangling scheme.

@sdroege Sure, but it would depend on compiler flags, and not be part of the ABI itself.

That is, resolving cross-compilation-unit function/global references will still be done in an implementation-defined manner¹, and symbol names would only serve as a form of debuginfo.

¹ this is done through the identity of the item being referenced, even if that requires recording a mapping to the symbol name or enough information to deterministically recompute it (I wish binary formats were more identity-oriented instead of relying on strings everywhere...)

We noticed a flaw in the current undefined ABI that results in way too much stack usage, copying, and poor cache performance in https://github.com/rust-random/rand/issues/817

It appears fn(..) -> Large returns Large inside its own stack frame, when the correct behavior would be for the caller to supply a correctly sized buffer.

Issues like this could cause significant performance penalties, especially with const generics à la fn foo<const n: usize>(..) -> [T;n], so they should be fixed before even considering a defined ABI.
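The caller-supplied-buffer pattern described above can be sketched explicitly at the FFI level today by taking an out-pointer instead of returning by value. This is a minimal illustration, not the compiler's internal "sret" mechanism; `fill_large` is a hypothetical name:

```rust
use std::mem::MaybeUninit;

#[repr(C)]
pub struct Large {
    data: [u64; 64],
}

/// Writes the result directly into caller-owned storage, so no
/// intermediate copy of `Large` is made on the callee's stack frame.
pub extern "C" fn fill_large(out: &mut MaybeUninit<Large>) {
    out.write(Large { data: [7; 64] });
}

fn main() {
    let mut slot = MaybeUninit::<Large>::uninit();
    fill_large(&mut slot);
    // SAFETY: fill_large fully initialized the value above.
    let large = unsafe { slot.assume_init() };
    assert_eq!(large.data[0], 7);
}
```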

The new, well-defined, symbol mangling scheme is a big step towards stabilizing the ABI.

Other big items of work necessarily include:

  • The calling conventions (mostly punted to llvm targets).
  • Specification of the data structure layouts (at call boundary at least) for all supported reprs.
  • A stable metadata representation format for generics and inline functions.

The new, well-defined, symbol mangling scheme is a big step towards stabilizing the ABI.

It should not be interpreted this way. The language team has not been involved in the mangling scheme design and has no current plans to work towards a stable ABI.

@burdges I don't understand what you mean, we lower that as passing *mut Large.
You might be seeing two copies of the return value, but that's just for soundness and requires a MIR optimization that has been planned and worked on (and off) for the past couple years (i.e. "NRVO").

How should I, as someone who values tools that use Rust and does not enjoy recompiling over 400(!) libraries every time rustc is bumped, inform the language team that a stable ABI would be important to me?

How should we, a Linux distribution that would like to ship Rust packages, inform the language team that a stable ABI would be important to us? That six week release cycles for Rust mean we have multiple rustc bumps _between releases_ of our distro, which means we're probably going to be sitting on 1.36 for the next six months until we have the time to bump up and rebuild everything multiple times?

I want to like Rust, I want to start writing Rust, but I can't without _some_ form of stability.

There are no current plans to introduce anything resembling a wholesale stable ABI either for some new repr(v1) or repr(Rust) and there is active opposition to this within the language team. As such, I'm going to close this super broad wishlist issue.

Since a part of the community seems to want stable ABI very badly and the language team is opposed to the idea, maybe there's some way to have the cake and eat it?

If ABI compatibility can be efficiently checked via software, Rust could define _ABI revisions_. If an ABI change is needed, the revision would simply be bumped - I suppose this wouldn't happen too often because Rust is already bent on stable API compatibility.

While this indeed requires some work, it would mean that the language team can change the ABI whenever they want and the distro packagers can reuse the builds whenever they can, saving disk space, network traffic and electricity.

Is there any reason to expect any two rustc versions to have compatible ABIs?

I'd kinda assume niches get altered in every single recent rustc version. Also, the current ABI has serious problems like lacking NRVO. We should not impose any friction on these improvements.

In the longer run, there are afaik no great proposals for optimized dynamic linking in modern languages like Rust, OCaml, Haskell, or even C++, so expect 5+ years of radical ABI churn whenever people really take an interest in that problem.

In the longer run, there are afaik no great proposals for optimized dynamic linking in modern languages like Rust, OCaml, Haskell, or even C++, so expect 5+ years of radical ABI churn whenever people really take an interest in that problem.

OCaml allows dynamically loading and linking essentially arbitrary code, with the only restriction being that every module the plugin depends on has to be either (a) inside the plugin, or (b) inside the host application, and the hash of the module interface has to match. This preserves type safety.

However, OCaml's dynamic loading mechanism is in a position that's significantly easier to handle than Rust's, because OCaml doesn't have monomorphization and it has a uniform value representation. As a result, dynamic loading in OCaml is not mutually exclusive with full type erasure; the types are only (indirectly) present in the interface hash.

(Not speaking for the Rust team) I would suggest that instead of adding more comments to this issue here, it would seem more likely to lead to results to handle the different, orthogonal issues that were discussed here (that all in one way or another are related to a Rust ABI) into separate issues.

Many of them can be solved without first defining a stable Rust ABI, and can be tackled independently of that. And hopefully that would at the same time bring us closer to actually being able to work on the general "Rust ABI" problem in the future.

For clarity, is this:

and there is active opposition to this within the language team

a "Rust should never have a stable ABI" or a "we're still nowhere near ready to make a stable ABI" kind of opposition?

For clarity, is this:

and there is active opposition to this within the language team

a "Rust should never have a stable ABI" or a "we're still nowhere near ready to make a stable ABI" kind of opposition?

It is my opinion (as a language team member but not speaking for the team as a whole) that Rust should never have a stable ABI for the default representation repr(Rust). It may or may not be a good idea to add a repr(v1) that is strictly opt-in but that has serious technical challenges (e.g. generics) and is in any case not a realistic goal or priority for the next years. I would also not like to see repr(v1) used in the standard library if we would ever add it.

@burdges The current calling convention is RVO-oriented (except for things like returning the variants of Result separately, I guess).
When we say NRVO, we mean more like "removing copies within a function by writing to the destination directly", which is an optimization not requiring ABI changes (modulo the Result thing which is a rather advanced transformation).


Also, if anyone wants to see my informal take on this: https://twitter.com/eddyb_r/status/1166953126928277505

It's close to "we're still nowhere near ready to make a stable ABI" but also IMO, a lot of the stuff around the idea of a "stable ABI" is not thought through that well, or even outright misguided.

If you start from regarding C as an utter failure on pretty much all fronts other than its popularity, you might find better ways to do things.

But yeah it's far off, likely involving programming languages and tooling very different from what we're used to, especially in the systems programming / low-level areas.

@eddyb

So you would still be restricted to monomorphic declarations? What's the advantage over some sort of interop library using the C ABI that's provided as source?

With an interop library you always need a "serialization" and "deserialization" step to convert your types to some #[repr(C)] type, a fair bit of unsafe code, and you probably need some kind of procedural macro system to define your interface in a way that is not too painful.

If we had a #[repr(MoreThanC)] or something, that extends #[repr(C)] with support for more types (like Vecs, Strings) then you effectively solve the "plugin system" use-case: plugin interfaces typically involve only simple types anyway because you have to keep them stable, and this would make it faster, safer and simpler than if you had to use an interop library.
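The "serialization step" mentioned above looks roughly like this in practice: to pass a string across a C-ABI plugin boundary today, you hand-convert it to a `#[repr(C)]` (pointer, length) pair. A hedged sketch; the `FfiStr` name is illustrative, not an existing API:

```rust
#[repr(C)]
pub struct FfiStr {
    ptr: *const u8,
    len: usize,
}

impl FfiStr {
    /// "Serialize": borrow a &str as an FFI-safe (ptr, len) pair.
    fn from_str(s: &str) -> FfiStr {
        FfiStr { ptr: s.as_ptr(), len: s.len() }
    }

    /// "Deserialize": reconstruct the &str on the other side.
    /// SAFETY: caller must guarantee ptr/len describe valid UTF-8
    /// that outlives the returned reference.
    unsafe fn as_str(&self) -> &str {
        std::str::from_utf8_unchecked(std::slice::from_raw_parts(self.ptr, self.len))
    }
}

fn main() {
    let owned = String::from("plugin message");
    let ffi = FfiStr::from_str(&owned);
    assert_eq!(unsafe { ffi.as_str() }, "plugin message");
}
```

A `#[repr(MoreThanC)]` as proposed would make this boilerplate (and the unsafe code it entails) unnecessary for such types.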

@Diggsey You could probably prototype parts of the Swift approach (i.e. adding a bit of indirection to hide potential differences, while trying to minimize overhead) outside of the language.

You don't need full "serialization", just enough to provide access without relying on any assumptions.

For example, proc_macro::bridge::buffer contains minimal (e.g. T: Copy-only) versions of &[T] and Vec<T> that can be safely passed between two Rust "worlds" (potentially compiled by incompatible-ABI compilers and using incompatible global allocators), without significant overhead (e.g. when extending the Vec you only need to do a dynamic call once you run out of capacity).
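The idea behind that bridge buffer can be sketched in miniature: a growable byte buffer whose reallocation goes through a function pointer, so the side that owns the allocator is only invoked once capacity runs out. This is a greatly simplified, hedged approximation, not the real `proc_macro::bridge` API; all names here are illustrative:

```rust
#[repr(C)]
struct Buffer {
    data: *mut u8,
    len: usize,
    capacity: usize,
    /// Supplied by the allocating side; only called on overflow,
    /// so the common-case push needs no cross-boundary call.
    reserve: extern "C" fn(&mut Buffer, usize),
}

extern "C" fn reserve_via_vec(buf: &mut Buffer, additional: usize) {
    // Reconstitute the Vec that owns the storage, grow it, and
    // hand the (possibly relocated) raw parts back to the buffer.
    let mut v = unsafe { Vec::from_raw_parts(buf.data, buf.len, buf.capacity) };
    v.reserve(additional);
    buf.data = v.as_mut_ptr();
    buf.capacity = v.capacity();
    std::mem::forget(v);
}

impl Buffer {
    fn push(&mut self, byte: u8) {
        if self.len == self.capacity {
            (self.reserve)(self, 1); // dynamic call only when full
        }
        unsafe { *self.data.add(self.len) = byte };
        self.len += 1;
    }
}

fn main() {
    let mut v = Vec::<u8>::with_capacity(4);
    let mut buf = Buffer {
        data: v.as_mut_ptr(),
        len: 0,
        capacity: v.capacity(),
        reserve: reserve_via_vec,
    };
    std::mem::forget(v); // ownership transferred into `buf`
    for b in b"hello" {
        buf.push(*b); // triggers one `reserve` call past capacity 4
    }
    let out = unsafe { Vec::from_raw_parts(buf.data, buf.len, buf.capacity) };
    assert_eq!(&out, b"hello");
}
```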

We don't want to bake anything like stable ABIs into the standard library because that puts hard limits on what we can do with the implementation, whereas any experiments in the ecosystem could thrive and go through many iterations, with just proc macros and traits.


Since you mention serialization, here's an analogy: a stable ABI being used by the standard library is like a stable serialization framework that's baked into every single type supporting it and you can't version it.

There's only one way data is represented in memory, and Rust is already having trouble taking advantage of that representation, due to the compilation model.
So I'd rather move in the direction of delaying that choice of representation while also giving people who really need it more fine-grained control over custom representations (e.g. bit-packing).

a stable ABI being used by the standard library is like a stable serialization framework that's baked into every single type supporting it and you can't version it.

GCC had to deal with this for C++11, and they ended up forking stuff with ABI tags.
https://developers.redhat.com/blog/2015/02/05/gcc5-and-the-c11-abi/

I mention this as a cautionary tale.

Interesting, it sounds like "Define a Rust ABI" actually means roughly two-ish things:

If you want broader dynamic linking, then you pass some dyn(v1) Trait across the interface. It'd require stable trait-based interfaces for the data structures, but this brings other benefits too. I'd think this might facilitate doing some large GUI toolkit in Rust, for example.

If you want to "Rust OS", then you want some #[repr(redox-v1)] that constrains type parameters, probably disallows type generics, but lifetimes definitely work, and const generics might work, maybe at the cost of making them untestable in where clauses.

There is an extended version of this second form where you specify #[repr(wasm-v1)] to write directly into some form that ensures compatibility even with some virtual machine.

I'd think #[repr(redox-v1)] would reduce the depth of type erasure required to exploit dyn(v1) Trait, so maybe that's logically first.

Given that Rust already lets you switch between a bunch of different ABIs with repr and extern as it is, I don't see the harm in simply adding one that freezes the current unstable ABI as a stable one that you would have to explicitly ask for, while the unstable one would remain the default. In the future, if there's a better design, just add that one as a selectable ABI as well. I don't think it's the right course of action to wait until we've figured out the “ABI to end all ABIs” before we offer even a single Rust-specific ABI, especially considering that it looks like work on coming up with that perfect ABI isn't happening right now anyway.
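For context, the existing knobs the comment refers to: `repr` controls type layout and `extern` controls the calling convention, both opt-in per item. A trivial illustration of the current opt-in ABI surface:

```rust
// C-compatible layout: fields at the same offsets a C compiler would use.
#[repr(C)]
struct Point {
    x: i32,
    y: i32,
}

// C calling convention: callable from (and into) other languages.
extern "C" fn add(p: Point) -> i32 {
    p.x + p.y
}

fn main() {
    assert_eq!(add(Point { x: 2, y: 3 }), 5);
}
```

A hypothetical frozen-Rust ABI would slot in as one more selectable `repr`/`extern` option alongside these.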

Also consider that since most functions are only crate-internal, the issue of making sure the ABI is a good one isn't as pressing as C++, where everything is public by default.

@Serentty Besides other reasons, the current rules can't be frozen because they don't exist.
You can't version implementation-defined behavior without copying the entirety of the implementation, it's not even implementation-specified, and ideally it should be specified at least by an RFC.

If an RFC for a specific ABI is presented, with details for all supported targets, that might work out.
But why would you bother with that when repr(C) and extern "C" exist?

Even if you have such an ABI, it won't be used by libstd types, just like the C one isn't.
(that's one of other reasons I alluded to above, probably the main one)

(looks like eddyb and I were writing these replies at the same time)

I don't see the harm in simply adding one that freezes the current unstable ABI as a stable one that you would have to explicitly ask for, while the unstable one would remain the default

This expresses a common misunderstanding that "the current unstable Rust ABI" is something that is already fully implemented in a well-defined, well-understood way that we know how to support to everyone's satisfaction.

A large chunk of the work here is achieving consensus on what a "Rust ABI" is even supposed to be, and that whatever it's supposed to be would even be a desirable thing to stabilize. There are already loads of comments in this thread expressing disagreement over what it is or reasons why it's undesirable to ever stabilize any version of it.

Okay, so if there's no rigid documentation for how the current ABI works, then that is indeed a problem that prevents it from being frozen.

If an RFC for a specific ABI is presented, with details for all supported targets, that might work out.
But why would you bother with that when repr(C) and extern "C" exist?

Because the C ABI is still quite limited. Not supporting trait objects is a fairly big hurdle, for example.
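To illustrate the hurdle: without trait-object support in the stable ABI, a plugin interface has to emulate dynamic dispatch by hand with a data pointer plus an explicit function-pointer "vtable". A hedged sketch with illustrative names:

```rust
use std::ffi::c_void;

// A hand-rolled, C-ABI-safe stand-in for `&dyn Plugin`.
#[repr(C)]
pub struct PluginVTable {
    /// Called with the plugin's opaque data pointer; returns a status code.
    run: extern "C" fn(data: *mut c_void) -> i32,
}

extern "C" fn my_run(_data: *mut c_void) -> i32 {
    42
}

fn main() {
    let vtable = PluginVTable { run: my_run };
    // Dispatch through the explicit function pointer, as a dyn Trait
    // call would do implicitly through its compiler-generated vtable.
    let status = (vtable.run)(std::ptr::null_mut());
    assert_eq!(status, 42);
}
```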

I just heard from a friend that trait pointers are currently an unstable feature in the C ABI. That actually brings it close to what I would want in a Rust ABI anyway. The only thing really left bothering me is that the standard library can't easily be shipped separately from the executable, which would be a real bandwidth-saver over CDNs, but because of generics that's probably not entirely possible anyway.

For reference, a blog post by @Gankra: How Swift Achieved Dynamic Linking Where Rust Couldn't
