Hi,
I just got a crash on Travis, on stable (1.35.0), when enabling edition="2018"
Commands ran by Travis:
git clone https://github.com/fflorent/nom_locate.git
cd nom_locate
git checkout -qf 77e08c3bfb1cadee8176f96ea78ba8c68308d9c8
cargo build --no-default-features --features=""
cargo test --no-default-features --features=""
which causes this output:
Updating crates.io index
Compiling nom_locate v0.3.1 (/home/travis/build/fflorent/nom_locate)
Finished dev [unoptimized + debuginfo] target(s) in 0.47s
Compiling nom_locate v0.3.1 (/home/travis/build/fflorent/nom_locate)
error: internal compiler error: src/librustc/dep_graph/graph.rs:688: DepNode Hir(nom_locate[4d08]::tests[0]::it_should_capture_position[1]::{{misc}}[4]::{{misc}}[0]) should have been pre-allocated but wasn't.
thread 'rustc' panicked at 'Box<Any>', src/librustc_errors/lib.rs:635:9
note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
error: aborting due to previous error
note: the compiler unexpectedly panicked. this is a bug.
note: we would appreciate a bug report: https://github.com/rust-lang/rust/blob/master/CONTRIBUTING.md#bug-reports
note: rustc 1.35.0 (3c235d560 2019-05-20) running on x86_64-unknown-linux-gnu
note: compiler flags: -C debuginfo=2 -C incremental
note: some of the compiler flags provided by cargo are hidden
But I can't reproduce this on my computer.
Link to the Travis log: https://travis-ci.org/fflorent/nom_locate/jobs/558169389
Cannot reproduce with local 1.35.0. Build passes with the exact same compiler, git commit and command line. Maybe something travis specific? Can you try to upgrade to the latest 1.36.0 and see if it still happens?
It works on 1.36.0-beta.7 on travis: https://travis-ci.org/fflorent/nom_locate/jobs/558169394
@ProgVal does this happen even after clearing the Travis cache?
That fixed it.
Should I close the issue?
No, that means it's an incremental compilation bug. Finding a way to reproduce this would be good.
I just got the same error on my project, and I am able to reproduce it first building on a first commit and then immediately building again on the next commit.
I am going to try to make an MCVE but if anyone wants to try this right now, here's the repo and the relevant commits (of course anyone is allowed to use this to create an MCVE or fix the bug in some other way):
Repo: https://github.com/nasso/screensnap
~~First commit (build from scratch from there): https://github.com/nasso/screensnap/commit/6e1e327e5f0d93c333af357114b07c10d999a768
~~Second commit (checkout that commit and trigger an incremental build from the previous one): https://github.com/nasso/screensnap/commit/56a104e2cbddb1479597a82be90ae9d787a75b6c~~
Here's what rustc
gives me:
error: internal compiler error: src\librustc\dep_graph\graph.rs:688: DepNode Hir(screensnap[29a7]::screengrab[0]::Screenshot[0]::data[0]) should have been pre-allocated but wasn't.
thread 'rustc' panicked at 'Box<Any>', src\librustc_errors\lib.rs:637:9
note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
error: aborting due to previous error
note: the compiler unexpectedly panicked. this is a bug.
note: we would appreciate a bug report: https://github.com/rust-lang/rust/blob/master/CONTRIBUTING.md#bug-reports
note: rustc 1.36.0 (a53f9df32 2019-07-03) running on x86_64-pc-windows-gnu
note: compiler flags: -C debuginfo=2 -C ar=D:\Programmes\msys64\mingw64\bin\ar.exe -C linker=D:\Programmes\msys64\mingw64\bin\gcc.exe -C incremental --crate-type bin
note: some of the compiler flags provided by cargo are hidden
error: Could not compile `screensnap`.
Now I'm going to try to find a simpler way to reproduce it, but I'm leaving this here in case I wasn't able to.
Edit: see my message below
Okay I got it:
cargo new --lib test
foo
:pub trait Something {
fn foo();
}
cargo build
pub struct Something {
foo: u8,
}
cargo build
again; here the error shows up for me.Can confirm the mcve above! 馃帀
// revisions: rpass1 rpass2
#[cfg(rpass1)]
pub trait Something {
fn foo();
}
#[cfg(rpass2)]
pub struct Something {
pub foo: u8,
}
fn main() {}
Regression in nightly-2019-05-06 (looking for the exact commit)
regression in d65e721ef824becd76773368718701cd0db83e59
CC https://github.com/rust-lang/rust/pull/60525 @eddyb
@rustbot modify labels: -E-needs-bisection
You have to be kidding me. Just from #60525 causing a regression, we can tell that:
trait Something
's method foo
struct Something
's field foo
used to have different DefPath
s but don't after that PR.
Now they're both {type:Something}::{value:foo}
. But there should exist examples of DefPath
collisions before #60525, I don't think I introduced the only ones that cause this bug.
Probably the simplest thing to try is changing a method to an associated constant.
Or even a free function and a const
. I'll report back if I find another testcase.
EDIT: I can't get those to trigger an ICE, traits might be special :(.
cc @Zoxc @michaelwoerister Any idea how matching DefPath
s can cause an ICE? AFAIK you would still have the different HIR, which would result in the old query results being discarded.
Compiler team triage:
Marking as P-high as this is some sort of regression. No assignee yet.
Hi, I'm affected by this too, I saw from comment it's related to incremental build. What command should I use to workaround this?
Use cargo clean
when you encounter it. It should not happen unless you change something between builds. Or you could disable incremental compilation
cc @Zoxc @michaelwoerister Any idea how matching
DefPath
s can cause an ICE? AFAIK you would still have the different HIR, which would result in the old query results being discarded.
DefPaths a probably expected to be unique in many places.
@michaelwoerister Not sure that makes sense to me, they're based on names, and names can get reused - sounds maybe like a small red-green oversight?
Or maybe just older quirks of the first incremental attempts, that stuck around?
they're based on names, and names can get reused
reused as in "the same name can refer to different things in subsequent compilation sessions" or reused as in "there can be two things with the same name in the same scope"?
The former should be handled just fine incremental compilation. That latter should lead to differing disambiguation indices in the given DefPath
components.
@michaelwoerister The former, it's what's causing this ICE.
Something::foo
changes from a trait
method to a struct
field and that breaks incremental.
Yes, that's a bug then. Just wanted to clarify.
Hey so @aturon and I were talking and he's going to take a crack at this -- @michaelwoerister and @eddyb, neither of you have any in-progress fixes, right?
No, I don't think anyone has gotten farther than the original testcases.
I tried to construct other examples that result in the same def paths (some of them have been like that forever), but they're not drastic enough to cause the bug to show up.
I haven't really been looking into this.
I encountered this but too today. rustc 1.38.0 (625451e37 2019-09-23)
Did exactly the same. Had a trait and made it a struct later on. Different name and field though, which remained the same.
Only occurs if cargo is in the game. not if compiled with rustc ..rs
@aspera-non-spernit You can reproduce with rustc
if you pass -C incremental=some-directory
(Cargo passes -C incremental=target/incremental
I think).
reassigning to self
Reproduced locally. Looking into it.
Potentially interesting observation: the bug only arises when replacing a trait with a struct. (Going in the other direction does not ICE.)
(I haven't looked at this since I last checked in, but I want to look more into it, because its one of the few cases of an incremental compilation ICE where we actually have enough information to reproduce the bug locally...)
The problem seems to be that as_local_hir_id
called from type_of
is untracked and reveals the HirId
without adding a dependency. When Something::foo
is changed from a method to a field, this HirId
will change without causing type_of
to be re-executed.
https://github.com/rust-lang/rust/pull/68289 just added a workaround for this, but doesn't really fix the issue.
I see 3 ways we can fix this:
as_local_hir_id
to track the dependencyHirId
into DefPathData
so type_of
receives this information as an argument instead of discovering it through as_local_hir_id
DefIndex
to HIR owners (HirId
's with 0
as the local item)local item
Nitpick: ItemLocalId
is meant to mean "item-local (node) ID", maybe something like IntraOwnerId
would be better at describing this but unsure.
DefId
and HIR owners matching is something I want in the medium-term but I'm unsure how long it would take to get there.
@Zoxc do you think it was a mistake to add the "workaround" in PR #68289 ?
My observations was that these bugs were so hard to track down in practice, they simply weren't getting fixed, at least not in any reasonable time frame. Bug reporters almost never provided enough context for us to reproduce them, which meant they were not actionable on our end.
Now, if the ICE here were preventing an even worse outcome, such as incorrect codegen, then I would want to keep the ICE in place.
But my understanding, at least for PR #68289, that the alternative to ICE'ing in that case was to just do some potentially unnecessary re-compilation work. Which is unfortunate and wasteful, but not the same sort of criticial misstep that could really have bad conseqences downstream, in my opinion.
if we had some infrastructure in place to more readily reproduce such bugs when they arise, then I would be more readily convinced that we could remove a workarounds like PR #68289.
Neither you nor @michaelwoerister found the root cause, so I don't think the workaround PR was a mistake, but I do think it should be reverted once the root cause is fixed (it seems quite specific anyway).
I think auditing information reachable from TyCtxt
is more effective than the incremental bug reports we get. We should also audit manual HashStable
implementations (we sort collections for some reason, which is incorrect and slow, and there's probably other bugs lurking). After https://github.com/rust-lang/rust/pull/68944 showed incremental regressions I looked over the HIR map and found some bugs. The Definitions
type might hold some other bugs too. Having query system from the start (before parsing) would help prevent bugs by ensuring things go through queries instead of through TyCtxt
.
I'd like to understand this in more detail. I think #68289 is not a workaround, it just removes false positives from the assertion.
@Zoxc, do you know where in type_of
the error occurs? Is there an untracked decision made based on the HirId
? E.g. is as_local_hir_id
used to determine if this is an item or just a field? That would indeed be problematic.
However, that also sounds like a different issue. The assertion triggered here checks if the DepNode for all current HIR items exist as expected. The problem you are describing is that we miss a change in a downstream query. I'd like to know how the two relate? Did you run into this assertion while working on https://github.com/rust-lang/rust/pull/68944?
(we sort collections for some reason, which is incorrect and slow, and there's probably other bugs lurking)
Only sets and maps should be sorted in order for generating a stable hash value. Otherwise they'd still be "sorted" just by whatever logic and random runtime values determine their position in the collection.
@Zoxc, do you know where in
type_of
the error occurs? Is there an untracked decision made based on theHirId
? E.g. isas_local_hir_id
used to determine if this is an item or just a field? That would indeed be problematic.
type_of
calls tcx.hir().get(hir_id)
which uses the hir_id
to select which dep node to read. If the HIR id refers to a HIR owner, like the trait method Something::foo
, we read from Hir(Something::foo)
, if it doesn't refer to a HIR owner, like the struct field Something::foo
, then we read from the parent HIR owner, which will be Hir(Something)
.
In rpass1
, type_of(Something::foo)
will have a dependency on Hir(Something::foo)
.
In rpass2
, type_of(Something::foo)
will have a dependency on Hir(Something)
.
If our incremental state correspond to rpass1
and we try to build rpass2
. We'll try to mark type_of(Something::foo)
as green. We assume that the first dependency of type_of(Something::foo)
must always be executed in rpass2
since type_of(Something::foo)
has no information available to it to conditionally execute the first dependency. Hence we know that type_of(Something::foo)
has a dependency on Hir(Something::foo)
so executing Hir(Something::foo)
is safe. However that is incorrect, since we know that type_of(Something::foo)
in rpass2
should have a dependency on Hir(Something)
not Hir(Something::foo)
. The compiler will (incorrectly) still go ahead and try to force Hir(Something::foo)
which leads to this ICE (or another ICE in https://github.com/rust-lang/rust/pull/68944).
Hm, I think I see what's going on now.
From that description it looks to me like #68289 does the correct thing here:
rpass1
we register a dependency on Hir(Something::foo)
.rpass2
, where Something::foo
is not a standalone item anymore, weDepNode
for Hir(Something::foo)
(because Something::foo
has been removed as far as dep-tracking is concerned), and thenSo far this is the behavior I would expect. However, at that point things can still go wrong:
try_mark_green(Hir(Something::foo))
came back as None
, i.e. Hir(Something::foo)
is considered Red
and the caller of try_mark_green(Hir(Something::foo))
, i.e. try_mark_green(type_of(Something::foo))
concludes that type_of(Something::foo)
in order to find out if it has changed.force_from_dep_node(tcx, Something::foo)
is called, which in turn will call tcx.type_of(Something::foo)
but, alas, Something::foo
is a field now, so calling type_of
for it will ICE. Is that correct? I.e. is the ICE happening when trying to force type_of(Something::foo)
(as opposed to when trying to force Hir(Something::foo)
)?
Is that correct? I.e. is the ICE happening when trying to force
type_of(Something::foo)
(as opposed to when trying to forceHir(Something::foo)
)?
No. Forcing/executing type_of(Something::foo)
works fine in both rpass1
and rpass2
. This ICE happens in try_mark_previous_green(Hir(Something::foo))
. When we call try_mark_previous_green(Hir(Something::foo))
in rpass2
the compiler is already in an invalid state. We're trying to mark a dep node which is not used or needed in rpass2
as green. That must never happen.
If instead type_of
used an as_local_hir_id
query. It would have an additional dependency AsLocalHirId(Something::foo)
before the Hir
/ HirBody
dependency. In that case in rpass2
when marking type_of(Something::foo)
, we'd see that AsLocalHirId(Something::foo)
is red and exit early before we even get a chance to look at the non-existing Hir(Something::foo)
.
When we call try_mark_previous_green(Hir(Something::foo)) in rpass2 the compiler is already in an invalid state. We're trying to mark a dep node which is not used or needed in rpass2 as green. That must never happen.
I wouldn't call that an invalid state. The previous dep-graph is expected to contain nodes that don't exist in the current context anymore; it's how we deal with removed items. Failing to mark such removed dependencies as green is a way of handling this appropriately -- except that that strategy fails when query key reconstruction accidentally succeeds because of DefPathHash
overlaps, as demonstrated in this bug.
I can see how an as_local_hir_id
query would solve this particular problem. But the underlying problem would persist. Including the component-kind in the DefPathHash
might be a more fundamental fix. I'm not sure how viable it is.
Switching to a setup where inputs cannot be removed (e.g. because there is always exactly one input like (src-code, commandline-args)
) would also solve the problem.
@Zoxc and @michaelwoerister : am I correct in my understanding that with the "workaround", for lack of a better term, that landed in PR #68289, we should no longer see any ICE's here, and the remaining work item(s) here are about fixing more fundamental issues in the compiler internals that might allow us to remove that workaround?
If this is the case, then I would probably prefer that we open a fresh issue for that follow-up work, and (re)close this issue as fixed, if for no other reason than to keep the comment thread clean.
@pnkfelix I'm not entirely sure. I think the changes in #68289 make sense, but I also see how they might just "shift" the problem in some cases (as described here).
Also @Zoxc and I might have been talking about two separate problems, both of which can exist at the same time.
I think this particular issue should be closed as it talks about a "dep-node pre-allocation" failure (a false positive which is fixed by #68289). I'd like to understand more about the problem @Zoxc is seeing in https://github.com/rust-lang/rust/pull/68944. A separate issue with a reproduction case is a better venue for that. I would have expected that https://github.com/rust-lang/rust/pull/68944 makes most DepNode pre-allocation go away, but due to illness I haven't actually had the chance to review that PR in any detail.
In my efforts to add things to the glacier, I couldn't get the MVCE to ICE on the latest nightly.
Just wanted to confirm it again by renaming of trait to struct with rustc 1.41.0 (5e1a79984 2020-01-27)
Most helpful comment
Okay I got it:
cargo new --lib test
foo
:cargo build
cargo build
again; here the error shows up for me.