Rust: vtable addresses differ based on the number of codegen units

Created on 21 Nov 2017 · 16Comments · Source: rust-lang/rust

I'm seeing the ptr::eq function from the standard library sometimes return false when it should return true. This is happening in debug mode in nightly and beta, but not stable, and not in any version in release mode.

Here's the simplest I could get a reproduction of the issue:

use std::ptr;
use container::Container;

trait Trait {}
struct Struct {}
impl Trait for Struct {}

mod container {
    use Trait;

    pub(crate) struct Container<'a> {
        pub(crate) item: &'a Trait,
    }
}

impl<'a> Container<'a> {
    fn new(item: &'a Struct) -> Container<'a> {
        Container { item }
    }
}

fn main() {
    let item = Struct {};
    let container = Container::new(&item);

    assert!(ptr::eq(container.item, &item));
}

Expected the assertion to pass, and it fails on beta and nightly, in debug mode only.

Most helpful comment

[T] is the other unsized case and comparison does behave reasonably there, so it's not a total lost cause.

sfackler on 21 Nov 2017

👍4

All 16 comments

The vtable pointers are different (try transmuting container.item and &item as &Trait to (usize, usize)). And the problem disappears with -C codegen-units=1 (the default is 16 in debug mode since beta). This points to the issue being something about deduplicating vtables between CGUs?

durka on 21 Nov 2017

👍1

I ran into this a while ago, and @eddyb said that we never make any guarantees around uniqueness of vtables: https://github.com/rust-lang/rust/pull/30485#issuecomment-166521737.

sfackler on 21 Nov 2017

👍3 😕1

@durka is there anything I can clarify?

sfackler on 21 Nov 2017

Something I'd like clarified - this is considered a bug right? It seems to me that If we never make any guarantees around uniqueness of vtables, then ptr::eq should not be comparing vtables. Is that how other people are seeing it?

PeterHatch on 21 Nov 2017

Not necessarily - imagine you had a

#[derive(Debug)]
#[repr(C)]
struct Foo {
    bar: u32,
}

let foo = Foo { bar: 0 };

let a: &Debug = &foo;
let b: &Debug = &foo.bar;

The data pointers of a and b are equal, but they probably shouldn't compare equal.

sfackler on 21 Nov 2017

👍1

It seems pretty weird that the same trait gets different vtables.

durka on 21 Nov 2017

Okay, but to clarify again - this is a discussion of how to fix the issue? Not whether there is an issue? Because I'm appreciating the discussion of what caused it, and potential issues with simple ways to fix it, but I'm still not completely sure if people agree that it is an issue that needs fixing?

To discuss the potential fix more - shouldn't they compare equal? From the documentation of ptr::eq ( https://doc.rust-lang.org/std/ptr/fn.eq.html) I expected them to compare equal. (Maybe that documentation should change?) I mean, it certainly could be surprising (especially when using the == operator directly, rather than the function), but that's kind of a weird situation, and I'm not even sure what you'd want in that case? It seems like either way could be more convenient some of the time.

Regardless, it seems important that, if there's going to be no guarantees about vtables, that doesn't also mean that there is no guarantee the comparing pointers to trait objects will ever return true. As far as I can tell, it's just not useful at that point?

PeterHatch on 21 Nov 2017

@durka Not all that weird - we could defer vtable instantiation until the final link, but then you'd have to do a bunch of rewriting of all of the rlibs you're linking to rather than just having each compilation unit create the vtables it needs.

@PeterHatch The documentation should definitely note the behavior around trait objects, yeah.

sfackler on 21 Nov 2017

Yeah, if the behavior of comparing pointers to trait objects depends on random details of the compilation environment like CGUs, LTO, etc, it seems like a footgun for the operator to have T: ?Sized at all. But since that ship has sailed, perhaps some loud documentation warnings and a clippy lint are in order.

durka on 21 Nov 2017

[T] is the other unsized case and comparison does behave reasonably there, so it's not a total lost cause.

sfackler on 21 Nov 2017

👍4

@arielb1 states:

If you have different codegen units (e.g. different modules) they can have different vtables. I think that's a bug.

shepmaster on 25 Nov 2017

Also note that you can throw away part of the fat pointer if all you care about is the data portion:

let a = container.item as *const _ as *const ();
let b = &item as *const _ as *const ();
assert!(ptr::eq(a, b));

shepmaster on 25 Nov 2017

Why would it be a bug? Maybe it's suboptimal? But we make no guarantees.

eddyb on 26 Nov 2017

It's worth giving a warning in ptr::eq about comparing *const Trait fat pointers.

I suppose Rc::ptr_eq or Arc::ptr_eq compares *mut RcBox<T>s or *mut ArcInner<T>s so the vtable pointer is there. Is this vtable pointer necessarily equal according to the semantics? I suppose no because it comes from outside so these need the same warning.

In other words, I can cast an Rc<Struct> to an Rc<Trait> which unsizes by attaching a vtable pointer outside the Rc<..>, right? If I do this in different compilation units, then this vtable pointer will be different, but the *mut RcBox<T> shall remain the same, right?

Should maybe Rc::ptr_eq and Arc::ptr_eq use this as *const () trick?

    pub fn ptr_eq(this: &Self, other: &Self) -> bool {
        this.ptr.as_ptr() as *const () == other.ptr.as_ptr() as *const ()
    }

burdges on 27 Aug 2018

Some points raised in duplicates:

Debug-printing the wide ptr ignores the vtable, so this leads to rather confusing assertion messages (the pointers compare unequal but print equal).
One proposal was to compared the vtable content instead of the vtable address.

Also @Diggsey raises this concern

fn identity(x: Box<dyn Trait>) -> Box<dyn Trait> { x }
If the compiler is able to prove that the function is only called with Box<Foo>, then the compiler would be free to return TraitRepr { ptr: x.ptr(), vtable: MyFooVtable }, even if it was passed a different vtable.

I don't think that is true. The behavior of the function is to return a copy of the argument, and I don't think we permit the mere copying of a wide ptr to change the vtable to a different but equivalent one. I don't see why the optimizer would do that, and I don't see why we would permit it in the spec.

RalfJung on 7 Mar 2020

I don't think that is true. The behavior of the function is to return a copy of the argument, and I don't think we permit the mere copying of a wide ptr to change the vtable to a different but equivalent one. I don't see why the optimizer would do that, and I don't see why we would permit it in the spec.

Devirtualization is the optimisation I was thinking of. I imagine we would eventually like to be able to compile code operating on a dyn Trait object to monomorphised code when the compiler can figure out the concrete type. At the boundary between the monomorphised code and the "trait object" code, we would have to reconstruct the vtable pointer, and this would come from the current codegen unit.