rust 🚀 - Tracking Issue for inline assembly (`asm!`)

Note: if you'd like to report an issue in inline assembly, please report it as a separate GitHub issue, and just link to this one. Please don't report issues in inline assembly as comments on this tracking issue.

joshtriplett on 26 May 2020

Should the asm! macro be available directly from the prelude as it is now, or should it have to be imported from std::arch::$ARCH::asm? The advantage of the latter is that it would make it explicit that the asm! macro is target-specific, but it would make cross-platform code slightly longer to write.

Definitely want the explicit import. This would also make it a bit clearer what's going on when somebody tries to compile an old-style asm! with a newer compiler, which is going to happen a lot: you'd get an unresolved symbol rather than mysterious syntax errors.
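For reference, a minimal sketch of what the explicit-import style looks like. It assumes a current stable toolchain, where the macro is imported from `std::arch::asm` (a shared path rather than the per-architecture `std::arch::$ARCH::asm` floated above); the helper name `five` is made up for illustration.

```rust
// Assumption: a stable toolchain where `asm!` must be imported explicitly.
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Hypothetical helper: loads the constant 5 into a compiler-chosen register.
#[cfg(target_arch = "x86_64")]
pub fn five() -> u64 {
    let x: u64;
    unsafe {
        // `{}` is replaced by the register allocated for `x`.
        asm!("mov {}, 5", out(reg) x, options(pure, nomem, nostack));
    }
    x
}

/// Portable fallback so the sketch also builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
pub fn five() -> u64 {
    5
}
```

Without the `use` line, the `asm!` invocation fails to resolve, which is the "unresolved symbol rather than mysterious syntax errors" behaviour described above.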

BartMassey on 26 May 2020

On Mon, May 25, 2020 at 03:56:53PM -0700, Bart Massey wrote:

Definitely want the explicit import. This would also make it a bit clearer what's going on when somebody tries to compile an old-style asm! with a newer compiler, which is going to happen a lot: you'd get an unresolved symbol rather than mysterious syntax errors.

Rust will actually catch many instances of this. Any use of the old
asm! syntax that has any operands at all will get caught in the parser
(because it uses a ':'), and result in a very clear error telling the
user about the syntax transition, and a hint suggesting llvm_asm!. (At
some point that hint should also start suggesting the use of the new
syntax.)

The only case that wouldn't be caught in the parser would be if you have
an asm! that has no inputs, no outputs, no clobbers, and no options,
so it just looks like asm!("...");. In that case, on x86, it'll give
you an assembly syntax error from the backend if you used AT&T syntax.

We might, theoretically, be able to do a little better than that, as
well; looking into that.

joshtriplett on 26 May 2020

Minor compatibility concern: asm! supports some LLVM-isms like C style comments /* comment */ (https://github.com/rust-lang/rust/pull/73056#issuecomment-640446400).

Other backends using external assemblers may need to do some pre-processing before passing the asm text to them.

petrochenkov on 8 Jun 2020

Minor compatibility concern: asm! supports some LLVM-isms like C style comments /* comment */ (#73056 (comment)).

GNU Assembler (IIRC defined as the official assembly dialect by the RFC), and by extension GCC, supports C-style comments too, so they are definitely not LLVM-specific.

programmerjake on 9 Jun 2020

Namespacing the asm! macro

I am also in favor of explicit imports from arch modules. I would suggest going even further and introducing separate macros for each supported target (e.g. asm_x86!, asm_arm!, etc.).

newpavlov on 9 Jun 2020

Some feedback from various places in response to the blog post. (I'm skipping general expressions of awesomeness, though there have been many; I'm quoting those that have specific feedback.)

@amluto (Andy Lutomirski, prominent Linux kernel developer who works on a lot of low-level x86) at https://news.ycombinator.com/item?id=23467108 :

Rust folks, thank you so much for making this far, far better than GCC’s asm syntax for C. Also, thank you for using Intel x86 syntax instead of AT&T.
It would be delightful if GCC were to adopt something similar after this stabilizes in Rust.

Several people on HN were a little confused about the backward-compatibility story here; when we talk about features that have only existed on nightly, we need to be clearer in our messaging around our stability policy.

Several people wondered why this used a string constant; it'd be good to explain that in documentation. And this doesn't mean the assembly isn't parsed, it means it isn't parsed by rustc (it's parsed by LLVM in the backend).
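A small sketch of what the string-constant design means in practice, assuming x86_64 and the syntax as it was later stabilized: rustc only substitutes the `{}` placeholders, much as `format!` would, and hands the resulting text to the backend assembler. The function name `add_five` is illustrative.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Hypothetical helper: returns `i + 5`, computed in inline assembly.
#[cfg(target_arch = "x86_64")]
pub fn add_five(i: u64) -> u64 {
    let o: u64;
    unsafe {
        asm!(
            // rustc fills in `{0}` and `{1}`; the mnemonics themselves are
            // only checked later, by the backend assembler.
            "mov {0}, {1}",
            "add {0}, 5",
            out(reg) o,
            in(reg) i,
        );
    }
    o
}

/// Portable fallback so the sketch also builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
pub fn add_five(i: u64) -> u64 {
    i.wrapping_add(5)
}
```

A misspelled mnemonic in the template would therefore surface as an assembler error from the backend, not as a Rust parse error.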

We really should have mentioned, in the blog post, that AT&T syntax is supported with options(att_syntax), so that people didn't think they had to translate all their assembly. I've prepared an update to the blog post mentioning that.
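To illustrate the option, here is the same addition written in both dialects. This is a sketch assuming x86_64; the function names are made up. Note the reversed operand order and the `q` size suffix in the AT&T version.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Intel syntax (the default): destination first.
#[cfg(target_arch = "x86_64")]
pub fn add_intel(a: u64, b: u64) -> u64 {
    let mut acc = a;
    unsafe {
        asm!("add {0}, {1}", inout(reg) acc, in(reg) b,
             options(pure, nomem, nostack));
    }
    acc
}

/// AT&T syntax via `options(att_syntax)`: source first, destination last.
#[cfg(target_arch = "x86_64")]
pub fn add_att(a: u64, b: u64) -> u64 {
    let mut acc = a;
    unsafe {
        asm!("addq {1}, {0}", inout(reg) acc, in(reg) b,
             options(att_syntax, pure, nomem, nostack));
    }
    acc
}

/// Portable fallbacks so the sketch also builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
pub fn add_intel(a: u64, b: u64) -> u64 { a.wrapping_add(b) }
#[cfg(not(target_arch = "x86_64"))]
pub fn add_att(a: u64, b: u64) -> u64 { a.wrapping_add(b) }
```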

Someone mentioned preferring an "explicit clobber" syntax (like clobber("ax")) rather than having to write out("ax") _. I can understand that.

Closely related, I think we should definitely have both "clobber all function-clobbered registers" and "clobber all general-purpose registers" options.
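For context, a sketch of both clobber styles as they can be written in the syntax that was later stabilized. It assumes x86_64 with the System V calling convention (so not Windows); `mul_lo`, `double` and `call_double` are hypothetical names. The first function marks a single register as clobbered with `out("rdx") _`; the second uses `clobber_abi("C")`, which covers the "all function-clobbered registers" case.

```rust
#[cfg(all(target_arch = "x86_64", not(windows)))]
use std::arch::asm;

/// Low 64 bits of `a * b`. `mul` also writes rdx, which is not needed here,
/// so it is declared as a discarded output (the spelling of a clobber).
#[cfg(all(target_arch = "x86_64", not(windows)))]
pub fn mul_lo(a: u64, b: u64) -> u64 {
    let lo: u64;
    unsafe {
        asm!(
            "mul {src}",
            src = in(reg) a,
            inlateout("rax") b => lo,
            out("rdx") _,
            options(pure, nomem, nostack),
        );
    }
    lo
}

#[cfg(all(target_arch = "x86_64", not(windows)))]
extern "C" fn double(arg: i32) -> i32 {
    arg * 2
}

/// Calls `double` from assembly; `clobber_abi("C")` marks every register
/// that the C calling convention does not preserve as clobbered.
#[cfg(all(target_arch = "x86_64", not(windows)))]
pub fn call_double(arg: i32) -> i32 {
    let result: i32;
    unsafe {
        asm!(
            "call {f}",
            f = in(reg) double as extern "C" fn(i32) -> i32,
            in("rdi") arg,      // first argument (System V)
            out("rax") result,  // return value
            clobber_abi("C"),
        );
    }
    result
}

/// Portable fallbacks so the sketch also builds on other targets.
#[cfg(not(all(target_arch = "x86_64", not(windows))))]
pub fn mul_lo(a: u64, b: u64) -> u64 { a.wrapping_mul(b) }
#[cfg(not(all(target_arch = "x86_64", not(windows))))]
pub fn call_double(arg: i32) -> i32 { arg * 2 }
```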

joshtriplett on 9 Jun 2020

I’ll add one more comment: please document what happens if you try to use not-quite-general-purpose registers as operands (clobber or otherwise). The most important case is the PIC register. GCC does not appreciate inline asm that clobbers the PIC register. IMO it should be allowed, especially for things like CPUID on x86_32.

This also includes things like EBP/RBP. If I’m building with frame pointers on or I do something that forces a frame pointer, is RBP available? RSP is another example; presumably it’s not available.

amluto on 9 Jun 2020

Several people wondered why this used a string constant; it'd be good to explain that in documentation. And this doesn't mean the assembly isn't parsed, it means it isn't parsed by rustc (it's parsed by LLVM in the backend).

Actually what I feel a lot of people are really asking is: "Why isn't this just like MSVC's __asm or D's inline assembly where I can just write the asm directly and the compiler will figure out the input/output/clobbers automagically?"

The issue with that approach is that in a lot of cases, we can't actually figure out the constraints just from the assembly code. Common examples are call and syscall instructions which can have an arbitrary ABI.
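A sketch of the syscall case, assuming x86_64 Linux (where `getpid` is syscall number 39 and the `syscall` instruction overwrites rcx and r11); `raw_getpid` is a made-up name. The template is a single mnemonic, so everything about which registers carry arguments, results, or garbage has to come from the operand list.

```rust
#[cfg(all(target_arch = "x86_64", target_os = "linux"))]
use std::arch::asm;

/// Hypothetical helper: issues the `getpid` system call directly.
#[cfg(all(target_arch = "x86_64", target_os = "linux"))]
pub fn raw_getpid() -> u32 {
    let ret: u64;
    unsafe {
        asm!(
            "syscall",
            // Assumption: 39 is SYS_getpid in the x86_64 Linux ABI.
            inlateout("rax") 39u64 => ret,
            // The instruction itself destroys rcx and r11; nothing in the
            // text "syscall" tells the compiler that.
            lateout("rcx") _,
            lateout("r11") _,
            options(nostack),
        );
    }
    ret as u32
}

/// Fallback for other targets so the sketch still builds.
#[cfg(not(all(target_arch = "x86_64", target_os = "linux")))]
pub fn raw_getpid() -> u32 {
    std::process::id()
}
```

A different kernel or a different syscall number would need a different operand list for the very same instruction text.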

Finally, if you really want it, you can parse the asm code in a proc macro and derive the necessary input/output/clobber operations from it. In fact, this is what Clang does to support the MS __asm: it parses the ASM in the front-end (with some support from LLVM's MC layer) and rewrites the asm to use standard LLVM constraint codes.

I’ll add one more comment: please document what happens if you try to use not-quite-general-purpose registers as operands (clobber or otherwise). The most important case is the PIC register. GCC does not appreciate inline asm that clobbers the PIC register. IMO it should be allowed, especially for things like CPUID on x86_32.

This is actually quite tricky and I believe it is a bug for the back-end (GCC/LLVM) to silently accept the use of these registers and then generate wrong code (for example by not preserving the PIC base around the asm). This should be fixed in the back-end or at the very least cause a compile-time error.

We currently always disallow the use of the stack pointer and frame pointer as operands for inline asm in the front-end. However the exact set of reserved registers depends not only on the current target but can also be different depending on the properties of the function:

The frame pointer (EBP) is only reserved if the function needs a frame pointer (e.g. dynamic alloca).
If the function requires stack realignment then a "base pointer" is also reserved (ESI on x86).

These properties can vary depending on the optimization level and inlining so there is no way we can selectively enforce this in the front-end. A blanket ban on the use of registers that may be reserved is also a non-starter since these registers are commonly used (e.g. for syscall arguments).

Amanieu on 10 Jun 2020

One more issue I've observed in several contexts: everyone formats asm! statements differently, and we should 1) come up with guidance on how to do so, and 2) implement that guidance in rustfmt.

Notably, this includes how to format both single-line and multi-line assembly statements.

EDIT: originally, in this comment, I proposed an initial set of requirements. To keep format bikeshedding out of this tracking issue, I've moved that to an issue on the fmt-rfcs repo.

joshtriplett on 10 Jun 2020

I'm trying to implement an optimization barrier like black_box for an f64 value with asm!.

The following code is working so far. However, it seems that it's relying on an undocumented behavior of the compiler. Is there any legit way to avoid the "argument never used" error?

#[inline]
fn secure(mut x: f64) -> f64 {
    unsafe {
        // Won't compile without `# {0}` due to an "argument never used" error.
        asm!("# {0}", inout(xmm_reg) x, options(nomem, nostack));
    }
    x
}

rustc --version --verbose:

rustc 1.46.0-nightly (feb3536eb 2020-06-09)
binary: rustc
commit-hash: feb3536eba10c2e4585d066629598f03d5ddc7c6
commit-date: 2020-06-09
host: x86_64-unknown-linux-gnu
release: 1.46.0-nightly

Full example code (should be built with --release)

```rust
#![feature(asm)]

use core::arch::x86_64::*;

#[inline]
fn secure(mut x: f64) -> f64 {
    unsafe {
        asm!("# {0}", inout(xmm_reg) x, options(nomem, nostack));
    }
    x
}

#[inline]
fn mul(x: f64, y: f64) -> f64 {
    secure(secure(x) * secure(y))
}

fn main() {
    unsafe {
        _MM_SET_ROUNDING_MODE(_MM_ROUND_UP);
    }

    assert_ne!(-mul(-1.1, 10.1), mul(1.1, 10.1));

    unsafe {
        _MM_SET_ROUNDING_MODE(_MM_ROUND_NEAREST);
    }
}
```

mizuno-gsinet on 10 Jun 2020

_MM_SET_ROUNDING_MODE(_MM_ROUND_UP);

I don't think a blackbox is enough for LLVM to always use the rounding mode you want. I think LLVM is allowed to reset the rounding mode at any point in your code. I think your code is only guaranteed to work fine when you pass the -fp-model=strict LLVM argument, in which case the blackbox is not necessary anyway.

https://reviews.llvm.org/D62731

precise
By default, the compiler uses /fp:precise behavior.
[...]
The compiler generates code intended to run in the default floating-point environment and assumes that the floating-point environment is not accessed or modified at runtime. That is, it assumes that the code does not unmask floating-point exceptions, read or write floating-point status registers, or change rounding modes.
[...]
strict
[...]
Under /fp:strict, the compiler generates code that allows the program to safely unmask floating-point exceptions, read or write floating-point status registers, or change rounding modes.

bjorn3 on 10 Jun 2020

See #72965 and #73056 for the issue with unused arguments.

Amanieu on 10 Jun 2020

I don't think a blackbox is enough for LLVM to always use the rounding mode you want. I think LLVM is allowed to reset the rounding mode at any point in your code. I think your code is only guaranteed to work fine when you pass the -fp-model=strict LLVM argument, in which case the blackbox is not necessary anyway.

Thank you for your reply. -fp-model (or any equivalent option) does not seem to be implemented in rustc yet. Is there a tracking issue for that? (Sorry for being off-topic.)

See #72965 and #73056 for the issue with unused arguments.

Thank you for letting me know about them!

mizuno-gsinet on 10 Jun 2020

Another comment that seems relevant to capture:

For ARM, can we have the "." separators? They are standard for AArch64 anyway, and they make the 32-bit code far easier to read. Like so:

add.s.ne
ldm.ia.cc

(in both orders, for those cases with two suffixes)

Are these supported by LLVM?

joshtriplett on 10 Jun 2020

Thank you for your reply. -fp-model (or any equivalent option) does not seem to be implemented in rustc yet. Is there a tracking issue for that? (Sorry for being off-topic.)

I thought -fp-model was an LLVM option. I just read the actual diff of the patch I linked, and it turns out to be a clang option that causes it to emit LLVM float instructions with certain flags.

bjorn3 on 10 Jun 2020

For ARM, can we have the "." separators? They are standard for AArch64 anyway, and they make the 32-bit code far easier to read.

Those aren't separators, they are part of the instruction name on AArch64. They do not exist on 32-bit ARM.

Amanieu on 10 Jun 2020

One of the things I still think might be missing in the current implementation is either noclobber outputs or a memory operand. These are extremely handy when implementing functions that are supersets of an existing ABI. For example, a function which has the x86_64 SYSV ABI but also passes in arguments in r10-r15.

I can do this today (this is a real function I'm writing):

struct Registers {
    rdi: usize,
    rsi: usize,
    rdx: usize,
    rcx: usize,
    r8: usize,
    r9: usize,
    r10: usize,
    r11: usize,
    r12: usize,
    r13: usize,
    r14: usize,
    r15: usize,
}

struct Context {
    // ... other fields
    registers: Registers,
    // ... other fields
}

unsafe extern "C" fn handler(
    rdi: usize,
    rsi: usize,
    rdx: usize,
    rcx: usize,
    r8: usize,
    r9: usize,
    ret: usize,
    ctx: &mut Context
) -> usize {
    let r10: usize;
    let r11: usize;
    let r12: usize;
    let r13: usize;
    let r14: usize;
    let r15: usize;

    asm!(
        "",
        lateout("r10") r10,
        lateout("r11") r11,
        lateout("r12") r12,
        lateout("r13") r13,
        lateout("r14") r14,
        lateout("r15") r15,
        options(pure, nomem, nostack)
    );

    ctx.registers.rdi = rdi;
    ctx.registers.rsi = rsi;
    ctx.registers.rdx = rdx;
    ctx.registers.rcx = rcx;
    ctx.registers.r8 = r8;
    ctx.registers.r9 = r9;
    ctx.registers.r10 = r10;
    ctx.registers.r11 = r11;
    ctx.registers.r12 = r12;
    ctx.registers.r13 = r13;
    ctx.registers.r14 = r14;
    ctx.registers.r15 = r15;

    ret
}

However, the emitted assembler now does four pushes and four pops because it thinks r12-r15 have been clobbered.

I could pass in references to each field and do a mov [{}], r12. But this causes a lea to be generated for each mov just to ensure the address is in a register. The only way I can see around this is to pass in the ctx reference directly and manage the offsets myself. But this is really fragile and error prone.

GCC/Clang offer me multiple ways to solve this problem. But I don't see any possible solution for this under the current Rust proposal. Under GCC/Clang I can:

exclude the register from the clobber list (noclobberout("r12")?)
use an explicit register variable (let r12: usize = register!("r12");?)
use an m constraint on the operand (in(mem)?)

Regardless of how it is solved, this does seem to me to be a common use case that the current RFC doesn't address.

npmccallum on 5 Aug 2020

I don't understand what you are trying to do. handler is extern "C" and therefore uses the SysV ABI which requires r12-r15 to be preserved since they are modified by the asm code.

Amanieu on 5 Aug 2020

@Amanieu

I don't understand what you are trying to do.

I'm trying to preserve a CPU state for later evaluation before it is erased. This handler is a callback from a vDSO function provided by the Linux kernel (in the current SGX patches). It is called immediately after an SGX enclave exit to provide a way to capture the CPU state from the exiting enclave.

This might seem like a niche use case, but reading a register into a struct field isn't niche. This is very common in kernel development.

Another example is in this CPU context switching code. Notice how the offsets into the struct are manually specified to avoid the problem I've described. This is precisely the fragile and error-prone workaround that people will resort to if Rust's inline assembly can't solve their problems. And this is precisely why the GCC/LLVM memory constraint (m) exists.

handler is extern "C" and therefore uses the SysV ABI which requires r12-r15 to be preserved

Correct.

since they are modified by the asm code.

We are not modifying them. We are reading them. This is precisely why they are noclobber.

This function is a superset of the System V ABI. Under the System V ABI, the registers r12-r15 must be preserved and their initial contents are undefined. It is perfectly legitimate to extend that ABI by defining the initial contents of r12-r15. Again, this is very common in kernel development.

Example

Here's a simple example of the pattern which uses a type parameter to highlight the problem:

#[repr(C)]
struct Foo<T> {
    bar: T,
    baz: usize,
}

/// # Safety
/// This function is unsafe because it extends the calling convention.
/// The caller of this function MUST put the value of `baz` in `r12` before calling it.
unsafe extern "C" fn qux<T>(foo: &mut Foo<T>) {
    asm!("mov {}, r12", in(mem) &foo.baz);
}

The author is hoping for this as output, where N is the offset of baz:

qux:
    mov [rdi + N], r12
    ret

Notice that the author of this code cannot know the offset of baz in the struct Foo. Therefore, the author cannot write mov [rdi + N], r12 since the value of N is not known.

The author can specify the output pointer as in(reg) &foo.baz. But this forces the compiler to generate an additional lea instruction which may not be acceptable in performance-critical code paths. It also consumes an additional register, which is not acceptable in code like context switching routines since all available registers will be consumed by other inputs/outputs.

The author could also specify asm!("", out("r12") foo.baz);. However, this generates additional push and pop instructions since the compiler thinks r12 was clobbered when it wasn't.

npmccallum on 5 Aug 2020

I'm trying to preserve a CPU state for later evaluation before it is erased. This handler is a callback from a vDSO function provided by the Linux kernel (in the current SGX patches). It is called immediately after an SGX enclave exit to provide a way to capture the CPU state from the exiting enclave.

You must write the whole function in assembly using a global_asm! block. If you use a function and then put an asm! block in there, the compiler is allowed to overwrite any register before entering the asm! block.

bjorn3 on 5 Aug 2020

@bjorn3 Yes, that works today. But I'm pointing out a legitimate usage pattern that people will work around in less than desirable ways if it is not solved. This problem will be greatly exacerbated when asm is stabilized but global_asm isn't.

npmccallum on 5 Aug 2020

@bjorn3 Also, global_asm is undesirable for other reasons. Things like mangling and documentation come to mind.

npmccallum on 5 Aug 2020

You could use a #[naked] function combined with an asm! block that contains mov instructions to move the registers to the destination you want. You can't write any Rust code in the function then, though.

The problem is that using asm! without #[naked] will never work for your use case. You need to implement a custom calling convention in the compiler to be able to write this without #[naked] or global_asm!. There is simply no way to tell LLVM that you want the values of specific registers at the start of the function without #[naked]. Even if there was a way to specify that you read the value of a register at the point of the asm! block without marking it as clobbering, without #[naked] LLVM is allowed to write any register it wants before reaching the asm! block.

bjorn3 on 5 Aug 2020

As @bjorn3 said, your code is incorrect and does not do what you think it does. The RFC clearly states that:

Any registers not specified as inputs will contain an undefined value on entry to the asm block.

This means that your asm! is simply writing undefined values to your struct. The fact that these undefined values happen to be the input values of your function is a coincidence: the compiler is free to place any value it wants in those registers prior to executing the asm!.

If you are writing a function with a custom calling convention then the whole function must be written in assembly. This can be done either through global_asm! or a #[naked] function.
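A minimal sketch of the global_asm! route, assuming x86_64 Linux and a toolchain on which `global_asm!` is stable and `unsafe extern` blocks are accepted; the symbol `asm_add` and the wrapper `add_via_asm` are made-up names. Because the whole body is assembly, no compiler-generated code runs between function entry and the first instruction, so the incoming registers still hold what the caller put there.

```rust
#[cfg(all(target_arch = "x86_64", target_os = "linux"))]
use std::arch::global_asm;

// The entire function is written in assembly (Intel syntax by default).
#[cfg(all(target_arch = "x86_64", target_os = "linux"))]
global_asm!(
    ".globl asm_add",
    "asm_add:",
    "lea rax, [rdi + rsi]", // System V: arguments in rdi and rsi, result in rax
    "ret",
);

// Declare the symbol so Rust code can call it with the C ABI.
#[cfg(all(target_arch = "x86_64", target_os = "linux"))]
unsafe extern "C" {
    fn asm_add(a: u64, b: u64) -> u64;
}

/// Safe wrapper around the assembly routine.
#[cfg(all(target_arch = "x86_64", target_os = "linux"))]
pub fn add_via_asm(a: u64, b: u64) -> u64 {
    unsafe { asm_add(a, b) }
}

/// Fallback for other targets so the sketch still builds.
#[cfg(not(all(target_arch = "x86_64", target_os = "linux")))]
pub fn add_via_asm(a: u64, b: u64) -> u64 {
    a.wrapping_add(b)
}
```

A routine with a custom calling convention would follow the same shape, reading its extra registers in the first instructions before anything else can overwrite them.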

Amanieu on 5 Aug 2020

Regarding your other question about writing to struct fields, this can be done using const operands and the offset_of! macro (make sure you enable the unstable_const feature).
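A sketch of the const-operand approach, under assumptions that did not hold when this thread was written: a toolchain where both `std::mem::offset_of!` and `const` operands in `asm!` are stable, and an x86_64 target. The struct `Foo` and the function `store_baz` are illustrative. The field offset is expanded into the template as an integer literal, so only the base pointer occupies a register and no separate lea is needed.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;
#[cfg(target_arch = "x86_64")]
use std::mem::offset_of;

/// Hypothetical struct; `repr(C)` fixes the field order.
#[repr(C)]
pub struct Foo {
    pub bar: u32,
    pub baz: u64,
}

/// Stores `value` into `foo.baz` using `[base + constant offset]` addressing.
#[cfg(target_arch = "x86_64")]
pub fn store_baz(foo: &mut Foo, value: u64) {
    unsafe {
        asm!(
            // `{off}` becomes a plain integer in the assembly text.
            "mov qword ptr [{base} + {off}], {val}",
            base = in(reg) foo as *mut Foo,
            val = in(reg) value,
            off = const offset_of!(Foo, baz),
            options(nostack, preserves_flags),
        );
    }
}

/// Portable fallback so the sketch also builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
pub fn store_baz(foo: &mut Foo, value: u64) {
    foo.baz = value;
}
```

This does not give the `m`-style memory operand asked for above; it only removes the need to hard-code offsets by hand.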

Amanieu on 5 Aug 2020

You could use a #[naked] function combined with an asm! block that contains mov instructions to move the registers to the destination you want. You can't write any Rust code in the function then, though.

Understood. In fact, I recently fixed rustc's incorrect code generation for #[naked] precisely because I needed this. I'm happy to use #[naked]. But we still need a way to get the offset into the assembly. And that has to come from outside and we shouldn't use a register for it.

npmccallum on 5 Aug 2020

As @bjorn3 said, your code is incorrect and does not do what you think it does.

I'm aware of those problems. I wrote this up quickly hoping you would get the gist. But it feels to me like you're picking apart ancillary issues rather than the core one: we need a usable way to get the offset of fields into the assembly from "outside".

Regarding your other question about writing to struct fields, this can be done using const operands and the offset_of! macro (make sure you enable the unstable_const feature).

I've already read through the entirety of the offset_of! crate. It is a very clever, but unsustainable hack for what is obviously a missing compiler feature. It also depends on a number of unstable features with no clear path to stabilization. So what do we do when asm is stable but offset_of! is not?

But even if we have offset_of!, there still remains a usability problem: something that should be easy (moving a register into a memory offset) is difficult for a capable developer to accomplish but reasonably easy for a compiler to accomplish (convert &my_struct.my_field to [reg + offset]).

npmccallum on 5 Aug 2020

You could use a #[naked] function combined with an asm! block that contains mov instructions to move the registers to the destination you want. You can't write any Rust code in the function then, though.

The problem is that using asm! without #[naked] will never work for your use case. You need to implement a custom calling convention in the compiler to be able to write this without #[naked] or global_asm!. There is simply no way to tell LLVM that you want the values of specific registers at the start of the function without #[naked]. Even if there was a way to specify that you read the value of a register at the point of the asm! block without marking it as clobbering, without #[naked] LLVM is allowed to write any register it wants before reaching the asm! block.

(vDSO maintainer here.)

Indeed. @npmccallum, your example doesn't work in GCC either unless you play horrible attribute games. If you're going to write a function with a custom calling convention, write the entire function in asm. There is a time and a place for inline asm, and this isn't it.

Even for context switching, trying to play these kinds of games with inline asm is not worth the pain. Linux used to try to do this, and we got rid of it. The part of the context switch logic in Linux that messes with GPRs is now a plain old asm function in a .S file. We call it from C. This way unwinding works as expected and we don't need to fight with the compiler.

amluto on 5 Aug 2020

@amluto It is my intent to do this with #[naked] as I've already outlined above. That still doesn't solve the problem of how to get field offsets into the assembly.

Ironically (since you're the vDSO maintainer), the thing that makes writing this whole function as assembly difficult is that I asked for a misc parameter on the SGX vDSO function and it doesn't appear that I'm going to get it. So now I have to write my CPU state struct as a sub-field of a wrapper to the exception info struct, which makes managing the offsets more painful because I now have to consider nested structs in my offset calculations.

npmccallum on 5 Aug 2020

@Amanieu FYI, the crate you recommended:

Doesn't compile in the released version.
No longer appears to produce const output with any combination of features.

Therefore, I don't see any way to get the offset of a field into assembly at all.

npmccallum on 5 Aug 2020

@amluto It is my intent to do this with #[naked] as I've already outlined above. That still doesn't solve the problem of how to get field offsets into the assembly.

Rust should IMO definitely gain a way to do memory access (along the lines of gcc's "m" and "rm").

But I really dislike naked. Many years ago I wrote some ISRs in naked C, and I felt very l33t. But even back then I didn't actually think that naked deserved to work, and I like it even less now. It seems very hard to make naked have actual semantics -- somehow it's supposed to replace the prologue and epilogue of a function, but with advanced compilers with features like shrink-wrapping, what does that even mean? Binding a variable to, say, register r12 means approximately nothing to me, and I hack on x86 assembly on a regular basis.

I would suggest that Rust consider addressing naked similarly to the way the new inline asm works. The brilliance of the new asm mechanism is that it more or less works like a function call. naked could be similar but in the opposite direction. Using gcc's C as an analogy, you can do this (not compile-tested, may contain any number of errors):

asm(
".globl foo\n\t"
"foo:\n\t"
"pushq %r12\n\t"
"movq %rsp, %rdi\n\t"
"callq c_foo\n\t"
"popq %r12\n\t"
"ret");

void c_foo(uint64_t *r12)
{
  /* Check it out, I have a pointer to r12, and the semantics are unambiguous. */
}

I have only two problems with this style of programming: the nasty multiline string and the fact that the final binary contains a pointless call, ret, and argument passing. But maybe it can be a model for something like:

pub extern "asm" fn foo() -> () {
  asm_prologue!("prologue here");
  asm_body!("C", (input args) -> (output type), body written in Rust);
  asm_epilogue!("epilogue here");
}

The idea is that this generates, literally, the prologue, then, inline, the asm body of the asm_body! part, which itself has the C ABI (or whatever ABI may be requested in the future), then the epilogue. So it's similar to:

fn:
  prologue_here
  callq body
  epilogue_here

and could, in fact, be correctly instantiated like that. But it runs a little faster because the call is omitted. The compiler helps by promising that the body will be compiled in such a way that control flow falls off the end instead of potentially having multiple ret instructions in the middle.

Sorry about the nonsense syntax. I'm not nearly enough of a Rust expert to have a good idea off the top of my head for how this should be spelled.

Ironically (since you're the vDSO maintainer), the thing that makes writing this whole function as assembly difficult is that I asked for a misc parameter on the SGX vDSO function and it doesn't appear that I'm going to get it. So now I have to write my CPU state struct as a sub-field of a wrapper to the exception info struct, which makes managing the offsets more painful because I now have to consider nested structs in my offset calculations.

Sorry about this. The ongoing ABI discussions have dragged on so long that I've mostly lost interest. If everyone involved reaches something like consensus about something that is functional, I'll give it one final review, but the amount of time this has all taken is absurd.

amluto on 5 Aug 2020

@amluto It is my intent to do this with #[naked] as I've already outlined above. That still doesn't solve the problem of how to get field offsets into the assembly.

If you use #[naked], you cannot use memory operands in any case. To quote the GCC manual (emphasis added):

The only statements that can be safely included in naked functions are asm statements that do not have operands. All other statements, including declarations of local variables, if statements, and so forth, should be avoided. Naked functions should be used to implement the body of an assembly function, while allowing the compiler to construct the requisite function declaration for the assembler.

#[naked] in Rust has not been stabilized or fully specified, but it's best to assume it has the same restrictions. It really ought to enforce them.

As an exception, it should be fine to use const operands in naked functions, since they are just expanded to a constant integer in the assembly string.

Thus, you can theoretically achieve what you want by passing the result of offset_of! as a const operand, but not in any other way.

I've already read through the entirety of the offset_of! crate. It is a very clever, but unsustainable hack for what is obviously a missing compiler feature. It also depends on a number of unstable features with no clear path to stabilization. So what do we do when asm is stable but offset_of! is not?

This is a real problem, but per above it's also an unavoidable problem for the specific use case you mentioned.

On the bright side, one part of the necessary functionality, a raw reference operator, will probably be stabilized (in temporary form) pretty soon.

But even if we have offset_of!, there still remains a usability problem: something that should be easy (moving a register into a memory offset) is difficult for a capable developer to accomplish but reasonably easy for a compiler to accomplish (convert &my_struct.my_field to [reg + offset]).

This may still be a valid point with respect to other use cases.

comex on 6 Aug 2020

@amluto It is my intent to do this with #[naked] as I've already outlined above. That still doesn't solve the problem of how to get field offsets into the assembly.

If you use #[naked], you cannot use memory operands in any case. To quote the GCC manual (emphasis added):

The only statements that can be safely included in naked functions are asm statements that do not have operands. All other statements, including declarations of local variables, if statements, and so forth, should be avoided. Naked functions should be used to implement the body of an assembly function, while allowing the compiler to construct the requisite function declaration for the assembler.

#[naked] in Rust has not been stabilized or fully specified, but it's best to assume it has the same restrictions. It really ought to enforce them.

I don't see why Rust should emulate GCC here. GCC's behavior is IMO utterly useless. If the only valid thing to put in a naked function is asm without operands, then I see no reason to use a naked function at all. Top-level asm or a real asm file seems just as powerful and less error-prone.

I'm suggesting that naked functions would be genuinely useful if they could contain a real body written in Rust (or C or C++ or whatever). If GCC had this capability, Linux might use it for kernel entry code.

As an exception, it should be fine to use const operands in naked functions, since they are just expanded to a constant integer in the assembly string.

I would argue that a high-quality naked function design (in Rust or otherwise) would allow any operand as long as it doesn't refer to anything in function scope. In Linux, the asm entry code references globals, and, in inline asm, referencing globals via asm operands is often nicer than spelling them out explicitly.

amluto on 6 Aug 2020

I don't see why Rust should emulate GCC here. GCC's behavior is IMO utterly useless.

Rust depends on LLVM, which is emulating GCC's inline asm functionality. This means that the inline asm implementation in Rust is unable to do anything that GCC doesn't allow.

bjorn3 on 6 Aug 2020

I don't see why Rust should emulate GCC here. GCC's behavior is IMO utterly useless. If the only valid thing to put in a naked function is asm without operands, then I see no reason to use a naked function at all. Top-level asm or a real asm file seems just as powerful and less error-prone.

It is definitely an awkward design inherited from GCC.

There are a few inherent advantages of naked functions over top-level asm:

They can be dead-code eliminated if unused.
They can be generic, and they can reference their generic parameters in const operands, which amounts to a rudimentary sort of templating system for asm functions. It might be possible to enhance this in the future.
The compiler will handle declaring the symbol with the right linkage type and section.

However, it's very strange that those features are available for functions but not data. (You could create a naked "function" that's actually data, but that's a total hack and you might not be able to get it in the right section.)

There is also an advantage of naked functions over top-level asm that's not inherent, but due to a limitation that exists in GCC and (more relevantly) in LLVM IR:

Top-level asm cannot have operands, not even immediate operands. For regular integers, that's not the end of the world; we could allow immediate operands on the frontend, and fully expand them in the asm string before passing to LLVM. In fact, the new function-level asm! already works that way for some reason.

But immediate operands also have the ability to reference other symbols in a way that prevents LLVM from dead-code-eliminating them, and there's no way to replicate that with top-level asm. Not sure how hard that would be to fix. (edit: wrong)

I'm suggesting that naked functions would be genuinely useful if they could contain a real body written in Rust (or C or C++ or whatever). If GCC had this capability, Linux might use it for kernel entry code.

How would this work, though? Would the compiler be forced to put everything in registers and not assume the existence of a stack? Even then, how would you tell it which registers are available for use?

Actually, that sounds interesting, but it's not something likely to be tackled by Rust, which mostly adopts LLVM's backend as-is.

I would argue that a high-quality naked function design (in Rust or otherwise) would allow any operand as long as it doesn't refer to anything in function scope. In Linux, the asm entry code references globals, and, in inline asm, referencing globals via asm operands is often nicer than spelling them out explicitly.

Hmm… in GCC and LLVM, i immediate operands can be symbol names which are kept as symbolic in the generated assembly, but it looks like the new asm! doesn't support this. @Amanieu Do you remember if this has been discussed before?

comex on 6 Aug 2020

Hmm… in GCC and LLVM, i immediate operands can be symbol names which are kept as symbolic in the generated assembly, but it looks like the new asm! doesn't support this. @Amanieu Do you remember if this has been discussed before?

We have a special sym operand type specifically for this.

Also note that it should be possible to add support for both sym and const operands to global_asm! (and in fact I've mentioned this before as an obvious next step for the inline asm project group).

Amanieu on 6 Aug 2020
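
To illustrate the sym operand mentioned above, here is a minimal sketch (not code from the thread). It assumes a toolchain where asm! is importable from std::arch and sym is accepted, and an x86-64 target that is not Windows; forty_two and call_via_sym are hypothetical names used only for this example. The symbol is kept symbolic in the emitted assembly, and the compiler knows the function is referenced.

```rust
#[cfg(all(target_arch = "x86_64", not(target_os = "windows")))]
use std::arch::asm;

// Hypothetical callee, defined only for this sketch.
extern "C" fn forty_two() -> u64 {
    42
}

#[cfg(all(target_arch = "x86_64", not(target_os = "windows")))]
fn call_via_sym() -> u64 {
    let ret: u64;
    unsafe {
        asm!(
            // `{callee}` is replaced by the (mangled) symbol name of forty_two.
            "call {callee}",
            callee = sym forty_two,
            // The C ABI on this target returns integers in rax.
            out("rax") ret,
            // Mark every caller-saved register of the C ABI as clobbered.
            clobber_abi("C"),
        );
    }
    ret
}

// Fallback for other targets so the sketch still builds: a plain call.
#[cfg(not(all(target_arch = "x86_64", not(target_os = "windows"))))]
fn call_via_sym() -> u64 {
    forty_two()
}
```

Because nostack is not specified, the compiler keeps the stack suitably aligned for the call.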

There are a few inherent advantages of naked functions over top-level asm:

An additional benefit: the compiler handles name mangling and doesn't require an additional extern function declaration to be able to call it.

programmerjake on 6 Aug 2020

For the next ISA that gets the new inline assembly, can that be PowerPC64? We (Libre-SOC) are building a Libre-licensed PowerPC64LE CPU/GPU that uses Rust for a lot of the software we're writing. One thing we're currently using llvm_asm! for is testing our processor's instructions against a POWER9 server using power-instruction-analyzer. We are also writing a Vulkan driver that will probably need inline assembly.

programmerjake on 6 Aug 2020

Libre-SOC can potentially help fund implementing PowerPC64 inline assembly.

programmerjake on 6 Aug 2020

Adding support for a new architecture is actually quite straightforward since most of the work is already done in LLVM. You only need to make 2 changes:

Add the register definitions to src/librustc_target/asm/.
Add lowering to LLVM asm to src/librustc_codegen_llvm/asm.rs.

Have a look at #73214 which added inline asm support for Hexagon.

Amanieu on 6 Aug 2020

Adding support for a new architecture is actually quite straightforward since most of the work is already done in LLVM. You only need to make 2 changes:

Add the register definitions to src/librustc_target/asm/.

Add lowering to LLVM asm to src/librustc_codegen_llvm/asm.rs.

Have a look at #73214 which added inline asm support for Hexagon.

Sounds good! PowerPC seems likely to be quite a bit more complex since it has lots of weird SIMD registers and condition registers, so it is probably closer to x86 in complexity.

programmerjake on 6 Aug 2020

Created a tracking bugreport on Libre-SOC's bugtracker: https://bugs.libre-soc.org/show_bug.cgi?id=451

programmerjake on 6 Aug 2020

#[naked] in Rust has not been stabilized or fully specified, but it's best to assume it has the same restrictions. It really ought to enforce them.

Agreed. I want to help here.

I took a stab at writing an RFC. It needs more work, but I'd like to at least shop it around a bit first. Please let me know where I'm suggesting stupid things.

But even if we have offset_of!, there still remains a usability problem which is that doing something that should be easy (moving a register into a memory offset) is difficult for a capable developer to accomplish but reasonably easy for a compiler to accomplish (convert &my_struct.my_field to [reg + offset]).

This may still be a valid point with respect to other use cases.

I think with the RFC above and a compiler-provided offset_of!(), we could be pretty close to something that's very usable.

npmccallum on 6 Aug 2020

Hmm… in GCC and LLVM, i immediate operands can be symbol names which are kept as symbolic in the generated assembly, but it looks like the new asm! doesn't support this. @Amanieu Do you remember if this has been discussed before?

We have a special sym operand type specifically for this.

Ah, I see.

Also note that it should be possible to add support for both sym and const operands to global_asm! (and in fact I've mentioned this before as an obvious next step for the inline asm project group).

Possible, yes, but only by changing LLVM, since LLVM IR module-level assembly doesn't support operands and the alternative of pasting symbol names into the asm string wouldn't be sufficient to tell LLVM the target symbols must not be dead-code-eliminated. (edit: wrong)

comex on 6 Aug 2020

Possible, yes, but only by changing LLVM, since LLVM IR module-level assembly doesn't support operands and the alternative of pasting symbol names into the asm string wouldn't be sufficient to tell LLVM the target symbols must not be dead-code-eliminated.

It's just a matter of marking any symbols referenced by global_asm! as #[used].

Amanieu on 6 Aug 2020

@npmccallum

I took a stab at writing an RFC. It needs more work, but I'd like to at least shop it around a bit first. Please let me know where I'm suggesting stupid things.

Nice! I think that's already in shape to submit as an RFC PR. (Edit: Maybe finish up the "TBD" sections first, though I don't think you need to say much.) I have some nits but will save them for that.

@Amanieu

It's just a matter of marking any symbols referenced by global_asm! as #[used].

Oh… you're right. I was thinking that that would prevent the target from being eliminated even if the user was itself eliminated, but module-level inline assembly can't be eliminated as unused in the first place. Welp, ignore me.

comex on 7 Aug 2020

Created a tracking bugreport on Libre-SOC's bugtracker: https://bugs.libre-soc.org/show_bug.cgi?id=451

€400 of funding available for adding PowerPC support to asm!!

programmerjake on 7 Aug 2020

@npmccallum

I took a stab at writing an RFC. It needs more work, but I'd like to at least shop it around a bit first. Please let me know where I'm suggesting stupid things.

Nice! I think that's already in shape to submit as an RFC PR. (Edit: Maybe finish up the "TBD" sections first, though I don't think you need to say much.) I have some nits but will save them for that.

I agree, nice RFC! It looks reasonable to me.

programmerjake on 7 Aug 2020

I took a stab at writing an RFC. It needs more work, but I'd like to at least shop it around a bit first. Please let me know where I'm suggesting stupid things.

That looks decent if quite limited, except for one giant nit. You say:

A naked function is a type of FFI function with a defined calling convention and a body which contains only assembly code which can rely upon the defined calling convention.

I suspect there are a few legitimate use cases for this, but I think you've ruled out almost all the useful cases. For example, I don't think one could validly write an x86 interrupt handler or an SGX vDSO handler while satisfying this requirement. I would suggest fixing it by adding a companion feature and changing the wording slightly. How about:

A naked function is a type of FFI function with a defined calling convention or the "asm" calling convention and a body which contains only assembly code which can rely upon the specified calling convention.

And adding the "asm" calling convention as a new companion feature. The "asm" calling convention is simple: it describes a calling convention that is unknown to the Rust compiler. It is an error to attempt to call an extern "asm" function from Rust. It is legal to create pointers to extern "asm" functions and to reference them from inline asm.

I'm not sure whether extern "asm" function should be allowed to take arguments or return anything.

Does that make sense?

amluto on 7 Aug 2020

And adding the "asm" calling convention as a new companion feature. The "asm" calling convention is simple: it describes a calling convention that is unknown to the Rust compiler. It is an error to attempt to call an extern "asm" function from Rust. It is legal to create pointers to extern "asm" functions and to reference them from inline asm.

I'm not sure whether extern "asm" function should be allowed to take arguments or return anything.

Does that make sense?

I like that idea. Bikeshed: how about extern "unknown"? With "asm" I'm worried people might think 'it has asm! in it so I need extern "asm"', or something like that.

comex on 7 Aug 2020

And adding the "asm" calling convention as a new companion feature. The "asm" calling convention is simple: it describes a calling convention that is unknown to the Rust compiler. It is an error to attempt to call an extern "asm" function from Rust. It is legal to create pointers to extern "asm" functions and to reference them from inline asm.
I'm not sure whether extern "asm" function should be allowed to take arguments or return anything.
Does that make sense?

I like that idea.

me too!

Bikeshed: how about extern "unknown"? With "asm" I'm worried people might think 'it has asm! in it so I need extern "asm"', or something like that.

I think unspecified is better than unknown, since unknown implies that no one knows what the calling convention is, whereas unspecified implies that you just aren't telling the compiler. Another probably even better option might be custom.

programmerjake on 7 Aug 2020

👍3

@npmccallum

I took a stab at writing an RFC. It needs more work, but I'd like to at least shop it around a bit first. Please let me know where I'm suggesting stupid things.

I think you should just open a draft pull request with what you have now, then, once you finish filling in all the TBD sections, change it to a non-draft. This will allow us to critique your RFC there instead of filling this issue with comments that are borderline offtopic. Additionally, when more people come to look at your RFC, they will be able to see our comments instead of the comments being in a totally different place and much harder to find.

programmerjake on 7 Aug 2020

👍2

Discussion on naked functions should probably continue on one of these:

#[naked] tracking issue #32408
https://github.com/rust-lang/rfcs/pull/2774 a PR to update the naked function RFC.

Amanieu on 7 Aug 2020

👍1

How do we rewrite a simple bittest intrinsic using this new syntax without the support of memory operands?

MSxDOS on 8 Aug 2020

By passing the address in as a register operand.

Amanieu on 8 Aug 2020

That doesn't work and I think it would be rather weird if it did - the memory address is not a register.

MSxDOS on 9 Aug 2020

It does work if you wrap it in [] to indicate a memory operand.

Amanieu on 9 Aug 2020
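
As an illustration of the suggestion above (not code from the thread), a bit test can pass the address as a register operand and wrap the placeholder in [] in the template. This is a sketch assuming x86-64 and a toolchain where asm! is importable from std::arch; bit_test is a hypothetical name, and the caller is assumed to pass bit < 64.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

#[cfg(target_arch = "x86_64")]
fn bit_test(word: &u64, bit: u64) -> bool {
    let carry: u8;
    unsafe {
        asm!(
            // The register holds the address; [] turns it into a memory operand.
            "bt qword ptr [{ptr}], {bit}",
            // bt copies the selected bit into the carry flag.
            "setc {out}",
            ptr = in(reg) word as *const u64,
            bit = in(reg) bit,
            out = out(reg_byte) carry,
            // The block reads memory but never writes it, and does not touch the stack.
            options(readonly, nostack),
        );
    }
    carry != 0
}

// Portable fallback so the sketch builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
fn bit_test(word: &u64, bit: u64) -> bool {
    (*word >> bit) & 1 != 0
}
```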

I see. I'd rather have in(mem) syntax here though. reg doesn't make much sense since there's no register involved.

MSxDOS on 9 Aug 2020

There is a register involved to hold the address you want to perform a bit test on.

Amanieu on 9 Aug 2020

Not necessarily true. For an operand on the stack, with in(mem) you can just access it with [rsp + offset], and for a read-only operand you can access it with [rip + offset]. I do see a use case for in(mem), which would allow addresses with offsets.

nbdd0121 on 9 Aug 2020

Is there an asm! equivalent for this llvm_asm!? Note the %gs:${1:c}

llvm_asm!("movq %gs:${1:c}, $0" : "=r"(*word) : "ri"(offset + i * 8) :: "volatile");

WildCryptoFox on 14 Aug 2020

asm!("mov {}, [gs:{}]", lateout(reg) *word, in(reg) offset + i * 8);

Amanieu on 14 Aug 2020

❤2

I have noticed that Call Frame Information directives like .cfi_* are ignored in the new syntax. Maybe it would be a good idea to have a warning instead of silently ignoring them.

Also it looks to me like they can be really useful in naked functions to give hints to the debugger/unwind. Could there be an option to not skip them?

bkolobara on 18 Aug 2020

CFI directives should work exactly the same way they did in llvm_asm!. There has been no change to the handling of assembler directives.

Amanieu on 18 Aug 2020

I just made a small test app:

#![feature(llvm_asm, asm, naked_functions)]

#[naked]
pub unsafe extern "C" fn llvm_asm() {
  llvm_asm!(
    r#"
      .cfi_def_cfa %rbp, 1111111111
    "#
    : : : : "volatile")
}

#[naked]
pub unsafe extern "C" fn new_asm() {
  asm!(
    ".cfi_def_cfa %rbp, 2222222222",
  )
}

If I dump the DWARF info with: objdump --dwarf=frames target/debug/libdump_tmp.rlib I get:

target/debug/libdump_tmp.rlib(dump_tmp-3cf5821319461f68.3fgjxdvrhrasgjaa.rcgu.o):       file format Mach-O 64-bit x86-64

.debug_frame contents:

.eh_frame contents:

00000000 00000014 ffffffff CIE
  Version:               1
  Augmentation:          "zR"
  Code alignment factor: 1
  Data alignment factor: -8
  Return address column: 16
  Augmentation data:     10

  DW_CFA_def_cfa: reg7 +8
  DW_CFA_offset: reg16 -8
  DW_CFA_nop:
  DW_CFA_nop:

00000018 0000001c 0000001c FDE cie=0000001c pc=fffffce8...fffffce9
  DW_CFA_def_cfa: reg6 +1111111111

00000038 0000001c 0000003c FDE cie=0000003c pc=fffffcd8...fffffcd9
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:

There is no DWARF info for the second function.

Can it be that the nounwind option dismisses the CFI information from the final result?

bkolobara on 18 Aug 2020

Since asm! uses intel syntax by default you need to use rbp instead of %rbp.

Amanieu on 18 Aug 2020

Thanks! I would have never figured this out. It's a bit unfortunate that the CFI directives just compile even with incorrect syntax.

bkolobara on 18 Aug 2020

I have run into another issue. I'm trying to mark the floating point registers as clobbered on AArch64 with out("v10") _, but this results in a non-trivial scalar-to-vector conversion, possible invalid constraint for vector type error. Is there another syntax for doing this, or are these registers perhaps not supported?

Another register I'm having trouble marking as clobbered is x30 (lr). I get a couldn't allocate output register for constraint '{x30}' error.

bkolobara on 20 Aug 2020

The first issue should have been fixed by #75014, try updating to the latest nightly.

Amanieu on 21 Aug 2020

Can you open an issue for the other one?

Amanieu on 21 Aug 2020

I have tested with the latest nightly, but the error still shows up. I have opened an issue: #75761.

bkolobara on 21 Aug 2020

One of the steps in the issue here:

LLVM version check (#69171 (comment))

Quoting the linked comment:

Let's add a new compile-time feature flag to rustc, something like llvm_inline_asm_ok, which indicates if we have an LLVM that should handle inline assembly without the known bugs we've encountered.

Before anyone could start working on this, they would need a list of these "known bugs". Do we have such a list?

bstrie on 12 Oct 2020

Basically intel syntax is broken on LLVM < 10.0.1. The workaround (which we use in libstd) is to use the att_syntax option and use AT&T syntax.

Amanieu on 12 Oct 2020
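
For readers unfamiliar with the att_syntax option mentioned above, the following sketch (not code from the thread) writes the same addition twice, once in the default Intel syntax and once in AT&T syntax. It assumes x86-64 and a toolchain where asm! is importable from std::arch; add_intel and add_att are hypothetical names.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

// Default Intel syntax: destination first, no register prefix.
#[cfg(target_arch = "x86_64")]
fn add_intel(a: u64, b: u64) -> u64 {
    let mut acc = a;
    unsafe {
        asm!(
            "add {0}, {1}",
            inout(reg) acc,
            in(reg) b,
            options(pure, nomem, nostack),
        );
    }
    acc
}

// AT&T syntax: source first; the % prefix is added when the placeholder is expanded.
#[cfg(target_arch = "x86_64")]
fn add_att(a: u64, b: u64) -> u64 {
    let mut acc = a;
    unsafe {
        asm!(
            "add {1}, {0}",
            inout(reg) acc,
            in(reg) b,
            options(att_syntax, pure, nomem, nostack),
        );
    }
    acc
}

// Portable fallbacks so the sketch builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
fn add_intel(a: u64, b: u64) -> u64 {
    a.wrapping_add(b)
}

#[cfg(not(target_arch = "x86_64"))]
fn add_att(a: u64, b: u64) -> u64 {
    a.wrapping_add(b)
}
```

Only the operand order in the template and the option differ; the operand list is identical.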

❤1

Is that the full list? Looks like there's an issue for that at https://github.com/rust-lang/rust/issues/76738 . In the meantime, what determines the minimum supported LLVM? If, for example, that were the only thing blocking stabilization of this feature, then that may be a persuasive argument to bump the min LLVM. Is that the only blocker? I don't see any others listed...

bstrie on 12 Oct 2020

@bstrie I believe asm! is still missing support for some tier-2 or tier-3 targets, while llvm_asm! supports them.
However, I am not sure this is a blocker.

lzutao on 12 Oct 2020

Should the asm! macro be available directly from the prelude as it is now, or should it have to be imported from std::arch::$ARCH::asm? The advantage of the latter is that it would make it explicit that the asm! macro is target-specific, but it would make cross-platform code slightly longer to write.

I guess both, and it should support no_std, so std::arch -> core::arch.
Also, core::arch::$ARCH::asm would automatically imply #[cfg(target_arch = "$arch")].

lygstate on 15 Oct 2020

Should not be available without an architecture qualifier, in my opinion.

As it stands right now, it's a code portability issue: the asm can compile but then fail to assemble because it is the wrong assembly for the target. In the worst case it does assemble and does something way different than intended, although this is unlikely.

BartMassey on 16 Oct 2020

It's a tradeoff. There are multiple cases where it's safe to use asm! on multiple architectures, and in those cases, having to qualify asm! would make it more annoying to use.

One of the most common: it's possible to write inline assembly that works on both x86-32 and x86-64, and it'd be annoying to have to qualify asm! differently for that.

Another case would be if you've already handled the portability via separate macros generating assembly; for instance, you could have a make_jump("1f") and a 1: label.

Another case would be if you're using asm! primarily with assembler directives.

In all of those cases, it'd be annoying to have a big block of code just going #[cfg(...)] use core::arch::$ARCH::asm; for different architectures.

No objections to it also being available arch-qualified, though.

joshtriplett on 18 Oct 2020

👍3

Interesting. Good points. I feel like there's a lot going on here.

Seems like "generic x86" or "generic arm" might want to be its own architecture qualifier for this — you really do not want your arm assembly compiled for x86 or vice-versa.
I feel like the macros could import and qualify the appropriate architecture module? Which leads to…
I'm not sure why it would be safe to use the same assembler directives across the full range of architectures — does every architecture we support right now have an AT&T style assembler? Or are you going to get corner-cased when you try to compile for Power or something?

This whole issue seems complex to me. I want to do better than gcc-style inlines in which if you try to compile on the wrong architecture you get mysterious errors at best and no errors at worst. Maybe there's some clever trick that I'm missing here?

BartMassey on 18 Oct 2020

Interesting. Good points. I feel like there's a lot going on here.

* Seems like "generic x86" or "generic arm" might want to be its own architecture qualifier for this — you really do not want your arm assembly compiled for x86 or vice-versa.

There's even more going on, I think. At least on gcc, one can use all manner of various directives to create sections, create data structures in sections, add aliases, etc, and some of this is ELF-specific but has little or nothing to do with the architecture. These can be combined: one can write x86 assembler that will probably only work on ELF systems or systems that are ELF-like enough to understand what's going on.

I think that trying to make all of this explicit and to check that all the right qualifiers are in place is admirable but possibly quite difficult.

amluto on 18 Oct 2020

👍1

@BartMassey wrote:

* I'm not sure why it would be safe to use the same assembler directives across the full range of architectures — does every architecture we support right now have an AT&T style assembler? Or are you going to get corner-cased when you try to compile for Power or something?

This was roughly specified in the asm! RFC, and we did specifically say that roughly the common set of directives supported by both LLVM and GNU must be supported. And we also don't want to do a massive assembler parsing change to implement a passlist of directives.

That said...

@amluto wrote:

At least on gcc, one can use all manner of various directives to create sections, create data structures in sections, add aliases, etc, and some of this is ELF-specific but has little or nothing to do with the architecture. These can be combined: one can write x86 assembler that will probably only work on ELF systems or systems that are ELF-like enough to understand what's going on.

Yeah, that's fair; if you get creative with directives, you may well become more target-specific than just architecture, and in particular you're likely to be OS-specific or format-specific or even any number of things we don't currently include in our definition of what a target is.

Handling portability for assembler is more than just making sure you don't accidentally use ARM asm! on x86 or vice versa. Having to write core::arch::aarch64::asm!(...) won't handle portability issues like "how do I make a syscall", or "does my target use ELF and support weak symbols", or "do I need to reference symbols via the PLT". I'm not sure how much it helps to arch-qualify the asm! macro when all of the other issues remain.

I don't think it's unreasonable that if you use asm!, any semblance of portability becomes your responsibility.

joshtriplett on 19 Oct 2020

@joshtriplett "I don't think it's unreasonable that if you use asm!, any semblance of portability becomes your responsibility." That's fair.

I'm worried though, especially in the context of packages normally being automatically compiled from source before use: seems like we've left lots of possibilities for accidents due to not detecting assembly code from wrong architectures and/or dependencies during package builds. Consider

#![feature(asm)]

pub fn add_asm(x: i32, y: i32) -> i32 {
    let mut result: i32;
    unsafe { asm!(
        "mov {0}, {2}\nadd {1}, {2}",
        in(reg) x,
        in(reg) y,
        lateout(reg) result,
    )};
    result
}

This compiles and assembles with both --target=i586-unknown-linux-gnu and --target=arm-unknown-linux-gnueabihf. Not sure if both will do the same thing or not; it will depend on the default direction of mov and add on the two architectures, among other things.

If you build this for --target=x86_64-unknown-linux-gnu you'll get some nasty-looking warnings about register widths (why? it's clear from the widths of the operands what register widths are desired I think?), but it will assemble anyhow. Someone trying to install this package is going to be confused about how to proceed I think. Heck, I'm confused about how they should proceed — depends on my intent when I wrote the assembly, which is nowhere reflected here. I'm not sure there is any way to reflect it as things stand.

BartMassey on 20 Oct 2020
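
One way the author's intent could be written down with what exists today is an explicit cfg gate per architecture. The sketch below is not from the thread and makes several assumptions: asm! is importable from std::arch, the function is meant to compute x + y with wrapping overflow, and the x86 family is the only architecture given an assembly body. The :e modifier selects the 32-bit register name so the template matches the i32 operands.

```rust
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
use std::arch::asm;

// x86 and x86-64 only: the template below is meaningless elsewhere.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn add_asm(x: i32, y: i32) -> i32 {
    let mut result = x;
    unsafe {
        asm!(
            // Intel syntax: result += y, using 32-bit register names.
            "add {0:e}, {1:e}",
            inout(reg) result,
            in(reg) y,
            options(pure, nomem, nostack),
        );
    }
    result
}

// Every other architecture gets plain Rust instead of silently
// assembling a template written for a different instruction set.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn add_asm(x: i32, y: i32) -> i32 {
    x.wrapping_add(y)
}
```

This does not address the OS- or object-format-specific issues raised earlier; it only keeps the template from being assembled for the wrong instruction set.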

When will this be merged into the stable branch? asm is the most important feature for production use, e.g. rust-libprobe uses USDT with embedded asm to dynamically trace a program.

houstar on 28 Oct 2020

😕1

On Tue, Oct 27, 2020 at 11:15:42PM -0700, Leno Hou wrote:

When will this be merged into the stable branch? asm is the most important feature for production use, e.g. rust-libprobe uses USDT with embedded asm to dynamically trace a program.

asm! is also a large complex feature with a substantial surface area,
and we're watching for further reports from the ecosystem about how well
the new syntax is working.

If you've had experience with the new asm!, writing up your
experiences in detail would help us stabilize it sooner.

joshtriplett on 28 Oct 2020

👍3

I just used the new asm! to port this macro from libunwind: https://github.com/libunwind/libunwind/blob/v1.4.0/include/libunwind-aarch64.h#L215-L236.

https://github.com/sfackler/rstack/blob/141fa714e69e05d396342ee056807245f30a9384/unwind-sys/src/aarch64.rs#L148-L170

It feels pretty good to me. The ability to specify explicit registers for arguments is way less weird than having to use a register variable as with gcc's asm.

One bit of feedback is that it would be nice to be able to still refer to arguments placed in specific registers by their name/position rather than just having to use the register itself. Even though it's equivalent, using the argument's name in the asm can make it a bit more clear what's going on, and in cases like this where an existing bit of code is being ported make the Rust version more closely mirror the C version for ease of review.

sfackler on 2 Nov 2020

👍2

@sfackler

One bit of feedback is that it would be nice to be able to still refer to arguments placed in specific registers by their name/position rather than just having to use the register itself. Even though it's equivalent, using the argument's name in the asm can make it a bit more clear what's going on, and in cases like this where an existing bit of code is being ported make the Rust version more closely mirror the C version for ease of review.

I would like that as well. It's especially useful with argument names in long assembly blocks, though once that's allowed positions should be allowed too. It also makes refactoring easier: since you don't repeat the register name, you can just change the register in one place.

joshtriplett on 5 Nov 2020

My main concern is the possibility of confusion when there is a mismatch between the register size, value size and sub-register name. In such situations how should the register name be rendered in the code?

// All examples here on x86_64, so full registers are 64-bit wide.

// 64-bit value but eax is a 32-bit subregister
asm!("mov {}, {}", in("eax") 123u64);

// 16-bit value but eax is a 32-bit subregister
asm!("mov {}, {}", in("eax") 123u16);

Modifiers add yet another layer of confusion since it isn't clear what the default should be when no modifier is specified: should it default to the full register name (rax) or should it use the same name that was used in the argument?

There's also the rule that all operands must be used in the template string otherwise the compiler emits an error. This simply doesn't make sense for explicit register operands since most of the time they don't appear in the asm text (e.g. syscall arguments).

Amanieu on 13 Nov 2020
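
The syscall case mentioned above can be sketched as follows (not code from the thread). The operands are pinned to explicit registers and never appear in the template string. It assumes Linux on x86-64, where getpid is syscall number 39, and a toolchain where asm! is importable from std::arch; getpid_raw is a hypothetical name.

```rust
#[cfg(all(target_arch = "x86_64", target_os = "linux"))]
use std::arch::asm;

#[cfg(all(target_arch = "x86_64", target_os = "linux"))]
fn getpid_raw() -> u32 {
    let ret: u64;
    unsafe {
        asm!(
            // The template mentions no operands at all.
            "syscall",
            // rax carries the syscall number in and the return value out.
            inlateout("rax") 39u64 => ret,
            // The syscall instruction overwrites rcx and r11.
            lateout("rcx") _,
            lateout("r11") _,
            options(nostack),
        );
    }
    ret as u32
}

// Fallback for other targets so the sketch still builds.
#[cfg(not(all(target_arch = "x86_64", target_os = "linux")))]
fn getpid_raw() -> u32 {
    std::process::id()
}
```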

It seems like asm could just require that the value assigned to a named register must match the register's size.

Suppressing that rule for arguments assigned to a named register that don't provide an explicit name seems reasonable.

sfackler on 13 Nov 2020

The sub-register issue gets more complicated if you consider other architectures:

RISC-V only has a single register name x5 which is used for all data sizes.
AArch64 has x0 (64-bit) and w0 (32-bit) but the latter is also used with 8-bit and 16-bit values.

Amanieu on 13 Nov 2020

On Fri, Nov 13, 2020 at 08:01:34AM -0800, Amanieu wrote:

My main concern is the possibility of confusion when there is a mismatch between the register size, value size and sub-register name. In such situations how should the register name be rendered in the code?
// All examples here on x86_64, so full registers are 64-bit wide.

// 64-bit value but eax is a 32-bit subregister
asm!("mov {}, {}", in("eax") 123u64);

// 16-bit value but eax is a 32-bit subregister
asm!("mov {}, {}", in("eax") 123u16);

I can think of a few reasonable possibilities here.

First, we could require that if you use a template parameter to refer
to an explicit register, you must provide a modifier. That would be a
usability issue, but unambiguous. This seems better than not allowing a
template parameter to refer to an explicit register at all, at least.
But I don't think this is the right solution.

We could require that types always match what the named register can
accept. Or, we could lint if they don't match. So, in("eax") 123u32
would work, but in("eax") 123u16 would error or lint. I think there's
value in doing this to catch potential errors. It would mean larger
register tables for each architecture, but that doesn't seem too
onerous.

Whether we have that error/lint or not, I think we should always match
the type, not the named register.

Modifiers add yet another layer of confusion since it isn't clear what the default should be when no modifier is specified: should it default to the full register name (rax) or should it use the same name that was used in the argument?

Without a modifier specified, I definitely think we should default to
the register size that matches the type provided, just as we do with
in(reg) today.

There's also the rule that all operands must be used in the template string otherwise the compiler emits an error. This simply doesn't make sense for explicit register operands since most of the time they don't appear in the asm text (e.g. syscall arguments).

I think for explicit registers we should enforce a modified version of
that rule:

If you don't name the argument (in("rax") 42u64), you don't have to
refer to it positionally, since it may be used implicitly.
If you name the argument (meaning = in("rax") 42u64), you must use
that name in the template string.

joshtriplett on 13 Nov 2020

Without a modifier specified, I definitely think we should default to
the register size that matches the type provided, just as we do with
in(reg) today.

The current behavior is that a placeholder without a modifier will always default to the full register width (rax on x86-64, eax on x86) regardless of the input type. However the compiler will warn if there is a mismatch with the input type.

Amanieu on 14 Nov 2020
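
A small sketch of how a template modifier avoids the width mismatch described above (not code from the thread). It assumes x86-64 and a toolchain where asm! is importable from std::arch; double_u16 is a hypothetical name. Without the :x modifier the placeholder would expand to the full 64-bit register name and the compiler would warn about the 16-bit operand.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

#[cfg(target_arch = "x86_64")]
fn double_u16(v: u16) -> u16 {
    let mut r = v;
    unsafe {
        asm!(
            // :x selects the 16-bit register name (for example ax instead of rax).
            "add {0:x}, {0:x}",
            inout(reg) r,
            options(pure, nomem, nostack),
        );
    }
    r
}

// Portable fallback so the sketch builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
fn double_u16(v: u16) -> u16 {
    v.wrapping_add(v)
}
```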
