I'm seeing the ptr::eq function from the standard library sometimes return false when it should return true. This is happening in debug mode in nightly and beta, but not stable, and not in any version in release mode.
Here's the simplest I could get a reproduction of the issue:
use std::ptr;
use container::Container;
trait Trait {}
struct Struct {}
impl Trait for Struct {}
mod container {
use Trait;
pub(crate) struct Container<'a> {
pub(crate) item: &'a Trait,
}
}
impl<'a> Container<'a> {
fn new(item: &'a Struct) -> Container<'a> {
Container { item }
}
}
fn main() {
let item = Struct {};
let container = Container::new(&item);
assert!(ptr::eq(container.item, &item));
}
Expected the assertion to pass, and it fails on beta and nightly, in debug mode only.
rustc +beta --version --verbose
:
rustc 1.22.0-beta.3 (cc6ed0640 2017-11-13)
binary: rustc
commit-hash: cc6ed0640fbcd2dff95b4532fd12aa0d6c545f28
commit-date: 2017-11-13
host: x86_64-pc-windows-msvc
release: 1.22.0-beta.3
LLVM version: 4.0
rustc +nightly --version --verbose
:
rustc 1.23.0-nightly (5041b3bb3 2017-11-19)
binary: rustc
commit-hash: 5041b3bb3d953a14f32b15d1e41341c629acae12
commit-date: 2017-11-19
host: x86_64-pc-windows-msvc
release: 1.23.0-nightly
LLVM version: 4.0
The vtable pointers are different (try transmuting container.item
and &item as &Trait
to (usize, usize)
). And the problem disappears with -C codegen-units=1
(the default is 16 in debug mode since beta). This points to the issue being something about deduplicating vtables between CGUs?
I ran into this a while ago, and @eddyb said that we never make any guarantees around uniqueness of vtables: https://github.com/rust-lang/rust/pull/30485#issuecomment-166521737.
@durka is there anything I can clarify?
Something I'd like clarified - this is considered a bug right? It seems to me that If we never make any guarantees around uniqueness of vtables, then ptr::eq
should not be comparing vtables. Is that how other people are seeing it?
Not necessarily - imagine you had a
#[derive(Debug)]
#[repr(C)]
struct Foo {
bar: u32,
}
let foo = Foo { bar: 0 };
let a: &Debug = &foo;
let b: &Debug = &foo.bar;
The data pointers of a
and b
are equal, but they probably shouldn't compare equal.
It seems pretty weird that the same trait gets different vtables.
Okay, but to clarify again - this is a discussion of how to fix the issue? Not whether there is an issue? Because I'm appreciating the discussion of what caused it, and potential issues with simple ways to fix it, but I'm still not completely sure if people agree that it is an issue that needs fixing?
To discuss the potential fix more - shouldn't they compare equal? From the documentation of ptr::eq
( https://doc.rust-lang.org/std/ptr/fn.eq.html) I expected them to compare equal. (Maybe that documentation should change?) I mean, it certainly could be surprising (especially when using the ==
operator directly, rather than the function), but that's kind of a weird situation, and I'm not even sure what you'd want in that case? It seems like either way could be more convenient some of the time.
Regardless, it seems important that, if there's going to be no guarantees about vtables, that doesn't also mean that there is no guarantee the comparing pointers to trait objects will ever return true. As far as I can tell, it's just not useful at that point?
@durka Not all that weird - we could defer vtable instantiation until the final link, but then you'd have to do a bunch of rewriting of all of the rlibs you're linking to rather than just having each compilation unit create the vtables it needs.
@PeterHatch The documentation should definitely note the behavior around trait objects, yeah.
Yeah, if the behavior of comparing pointers to trait objects depends on random details of the compilation environment like CGUs, LTO, etc, it seems like a footgun for the operator to have T: ?Sized
at all. But since that ship has sailed, perhaps some loud documentation warnings and a clippy lint are in order.
[T]
is the other unsized case and comparison does behave reasonably there, so it's not a total lost cause.
@arielb1 states:
If you have different codegen units (e.g. different modules) they can have different vtables. I think that's a bug.
Also note that you can throw away part of the fat pointer if all you care about is the data portion:
let a = container.item as *const _ as *const ();
let b = &item as *const _ as *const ();
assert!(ptr::eq(a, b));
Why would it be a bug? Maybe it's suboptimal? But we make no guarantees.
It's worth giving a warning in ptr::eq
about comparing *const Trait
fat pointers.
I suppose Rc::ptr_eq
or Arc::ptr_eq
compares *mut RcBox<T>
s or *mut ArcInner<T>
s so the vtable pointer is there. Is this vtable pointer necessarily equal according to the semantics? I suppose no because it comes from outside so these need the same warning.
In other words, I can cast an Rc<Struct>
to an Rc<Trait>
which unsizes by attaching a vtable pointer outside the Rc<..>
, right? If I do this in different compilation units, then this vtable pointer will be different, but the *mut RcBox<T>
shall remain the same, right?
Should maybe Rc::ptr_eq
and Arc::ptr_eq
use this as *const ()
trick?
pub fn ptr_eq(this: &Self, other: &Self) -> bool {
this.ptr.as_ptr() as *const () == other.ptr.as_ptr() as *const ()
}
Some points raised in duplicates:
Also @Diggsey raises this concern
fn identity(x: Box<dyn Trait>) -> Box<dyn Trait> { x }
If the compiler is able to prove that the function is only called withBox<Foo>
, then the compiler would be free to returnTraitRepr { ptr: x.ptr(), vtable: MyFooVtable }
, even if it was passed a different vtable.
I don't think that is true. The behavior of the function is to return a copy of the argument, and I don't think we permit the mere copying of a wide ptr to change the vtable to a different but equivalent one. I don't see why the optimizer would do that, and I don't see why we would permit it in the spec.
I don't think that is true. The behavior of the function is to return a copy of the argument, and I don't think we permit the mere copying of a wide ptr to change the vtable to a different but equivalent one. I don't see why the optimizer would do that, and I don't see why we would permit it in the spec.
Devirtualization is the optimisation I was thinking of. I imagine we would eventually like to be able to compile code operating on a dyn Trait
object to monomorphised code when the compiler can figure out the concrete type. At the boundary between the monomorphised code and the "trait object" code, we would have to reconstruct the vtable pointer, and this would come from the current codegen unit.
Most helpful comment
[T]
is the other unsized case and comparison does behave reasonably there, so it's not a total lost cause.