Rust: ICE in incr comp after s/trait/struct/: src/librustc/dep_graph/graph.rs:688: DepNode Hir(...) should have been pre-allocated but wasn't.

Created on 13 Jul 2019  路  42Comments  路  Source: rust-lang/rust

Hi,

I just got a crash on Travis, on stable (1.35.0), when enabling edition="2018"

Commands ran by Travis:

git clone https://github.com/fflorent/nom_locate.git
cd nom_locate
git checkout -qf 77e08c3bfb1cadee8176f96ea78ba8c68308d9c8
cargo build --no-default-features --features=""
cargo test --no-default-features --features=""

which causes this output:

    Updating crates.io index
   Compiling nom_locate v0.3.1 (/home/travis/build/fflorent/nom_locate)
    Finished dev [unoptimized + debuginfo] target(s) in 0.47s
   Compiling nom_locate v0.3.1 (/home/travis/build/fflorent/nom_locate)
error: internal compiler error: src/librustc/dep_graph/graph.rs:688: DepNode Hir(nom_locate[4d08]::tests[0]::it_should_capture_position[1]::{{misc}}[4]::{{misc}}[0]) should have been pre-allocated but wasn't.
thread 'rustc' panicked at 'Box<Any>', src/librustc_errors/lib.rs:635:9
note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
error: aborting due to previous error
note: the compiler unexpectedly panicked. this is a bug.
note: we would appreciate a bug report: https://github.com/rust-lang/rust/blob/master/CONTRIBUTING.md#bug-reports
note: rustc 1.35.0 (3c235d560 2019-05-20) running on x86_64-unknown-linux-gnu
note: compiler flags: -C debuginfo=2 -C incremental
note: some of the compiler flags provided by cargo are hidden

But I can't reproduce this on my computer.

Link to the Travis log: https://travis-ci.org/fflorent/nom_locate/jobs/558169389

A-incr-comp C-bug I-ICE P-high T-compiler regression-from-stable-to-stable

Most helpful comment

Okay I got it:

  1. Create a new library crate (this way unused stuff doesn't get optimized out): cargo new --lib test
  2. Create a public trait with a function named foo:
pub trait Something {
    fn foo();
}
  1. cargo build
  2. Change the trait to a structure (keeping the same name), and the function to a field (with the same name too). I don't think the type matters but I haven't tried with different types:
pub struct Something {
    foo: u8,
}
  1. cargo build again; here the error shows up for me.

All 42 comments

Cannot reproduce with local 1.35.0. Build passes with the exact same compiler, git commit and command line. Maybe something travis specific? Can you try to upgrade to the latest 1.36.0 and see if it still happens?

It works on 1.36.0-beta.7 on travis: https://travis-ci.org/fflorent/nom_locate/jobs/558169394

@ProgVal does this happen even after clearing the Travis cache?

That fixed it.

Should I close the issue?

No, that means it's an incremental compilation bug. Finding a way to reproduce this would be good.

I just got the same error on my project, and I am able to reproduce it first building on a first commit and then immediately building again on the next commit.

I am going to try to make an MCVE but if anyone wants to try this right now, here's the repo and the relevant commits (of course anyone is allowed to use this to create an MCVE or fix the bug in some other way):
Repo: https://github.com/nasso/screensnap
~~First commit (build from scratch from there): https://github.com/nasso/screensnap/commit/6e1e327e5f0d93c333af357114b07c10d999a768
~~Second commit (checkout that commit and trigger an incremental build from the previous one): https://github.com/nasso/screensnap/commit/56a104e2cbddb1479597a82be90ae9d787a75b6c~~

Here's what rustc gives me:

error: internal compiler error: src\librustc\dep_graph\graph.rs:688: DepNode Hir(screensnap[29a7]::screengrab[0]::Screenshot[0]::data[0]) should have been pre-allocated but wasn't.

thread 'rustc' panicked at 'Box<Any>', src\librustc_errors\lib.rs:637:9
note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
error: aborting due to previous error


note: the compiler unexpectedly panicked. this is a bug.

note: we would appreciate a bug report: https://github.com/rust-lang/rust/blob/master/CONTRIBUTING.md#bug-reports

note: rustc 1.36.0 (a53f9df32 2019-07-03) running on x86_64-pc-windows-gnu

note: compiler flags: -C debuginfo=2 -C ar=D:\Programmes\msys64\mingw64\bin\ar.exe -C linker=D:\Programmes\msys64\mingw64\bin\gcc.exe -C incremental --crate-type bin

note: some of the compiler flags provided by cargo are hidden

error: Could not compile `screensnap`.

Now I'm going to try to find a simpler way to reproduce it, but I'm leaving this here in case I wasn't able to.

Edit: see my message below

Okay I got it:

  1. Create a new library crate (this way unused stuff doesn't get optimized out): cargo new --lib test
  2. Create a public trait with a function named foo:
pub trait Something {
    fn foo();
}
  1. cargo build
  2. Change the trait to a structure (keeping the same name), and the function to a field (with the same name too). I don't think the type matters but I haven't tried with different types:
pub struct Something {
    foo: u8,
}
  1. cargo build again; here the error shows up for me.

Can confirm the mcve above! 馃帀

// revisions: rpass1 rpass2

#[cfg(rpass1)]
pub trait Something {
    fn foo();
}

#[cfg(rpass2)]
pub struct Something {
    pub foo: u8, 
}

fn main() {}

Regression in nightly-2019-05-06 (looking for the exact commit)
regression in d65e721ef824becd76773368718701cd0db83e59

CC https://github.com/rust-lang/rust/pull/60525 @eddyb

@rustbot modify labels: -E-needs-bisection

You have to be kidding me. Just from #60525 causing a regression, we can tell that:

  • trait Something's method foo
  • struct Something's field foo

used to have different DefPaths but don't after that PR.

Now they're both {type:Something}::{value:foo}. But there should exist examples of DefPath collisions before #60525, I don't think I introduced the only ones that cause this bug.

Probably the simplest thing to try is changing a method to an associated constant.
Or even a free function and a const. I'll report back if I find another testcase.

EDIT: I can't get those to trigger an ICE, traits might be special :(.

cc @Zoxc @michaelwoerister Any idea how matching DefPaths can cause an ICE? AFAIK you would still have the different HIR, which would result in the old query results being discarded.

Compiler team triage:

Marking as P-high as this is some sort of regression. No assignee yet.

Hi, I'm affected by this too, I saw from comment it's related to incremental build. What command should I use to workaround this?

Use cargo clean when you encounter it. It should not happen unless you change something between builds. Or you could disable incremental compilation

cc @Zoxc @michaelwoerister Any idea how matching DefPaths can cause an ICE? AFAIK you would still have the different HIR, which would result in the old query results being discarded.

DefPaths a probably expected to be unique in many places.

@michaelwoerister Not sure that makes sense to me, they're based on names, and names can get reused - sounds maybe like a small red-green oversight?
Or maybe just older quirks of the first incremental attempts, that stuck around?

they're based on names, and names can get reused

reused as in "the same name can refer to different things in subsequent compilation sessions" or reused as in "there can be two things with the same name in the same scope"?

The former should be handled just fine incremental compilation. That latter should lead to differing disambiguation indices in the given DefPath components.

@michaelwoerister The former, it's what's causing this ICE.
Something::foo changes from a trait method to a struct field and that breaks incremental.

Yes, that's a bug then. Just wanted to clarify.

Hey so @aturon and I were talking and he's going to take a crack at this -- @michaelwoerister and @eddyb, neither of you have any in-progress fixes, right?

No, I don't think anyone has gotten farther than the original testcases.
I tried to construct other examples that result in the same def paths (some of them have been like that forever), but they're not drastic enough to cause the bug to show up.

I haven't really been looking into this.

63150 looks like it probably is a duplicate of this.

I encountered this but too today. rustc 1.38.0 (625451e37 2019-09-23)

Did exactly the same. Had a trait and made it a struct later on. Different name and field though, which remained the same.

Only occurs if cargo is in the game. not if compiled with rustc ..rs

@aspera-non-spernit You can reproduce with rustc if you pass -C incremental=some-directory (Cargo passes -C incremental=target/incremental I think).

reassigning to self

Reproduced locally. Looking into it.

Potentially interesting observation: the bug only arises when replacing a trait with a struct. (Going in the other direction does not ICE.)

(I haven't looked at this since I last checked in, but I want to look more into it, because its one of the few cases of an incremental compilation ICE where we actually have enough information to reproduce the bug locally...)

The problem seems to be that as_local_hir_id called from type_of is untracked and reveals the HirId without adding a dependency. When Something::foo is changed from a method to a field, this HirId will change without causing type_of to be re-executed.

https://github.com/rust-lang/rust/pull/68289 just added a workaround for this, but doesn't really fix the issue.

I see 3 ways we can fix this:

  • use a query for as_local_hir_id to track the dependency
  • encode the HirId into DefPathData so type_of receives this information as an argument instead of discovering it through as_local_hir_id
  • make sure there's a 1:1 correspondence from DefIndex to HIR owners (HirId's with 0 as the local item)

local item

Nitpick: ItemLocalId is meant to mean "item-local (node) ID", maybe something like IntraOwnerId would be better at describing this but unsure.

DefId and HIR owners matching is something I want in the medium-term but I'm unsure how long it would take to get there.

@Zoxc do you think it was a mistake to add the "workaround" in PR #68289 ?

My observations was that these bugs were so hard to track down in practice, they simply weren't getting fixed, at least not in any reasonable time frame. Bug reporters almost never provided enough context for us to reproduce them, which meant they were not actionable on our end.

Now, if the ICE here were preventing an even worse outcome, such as incorrect codegen, then I would want to keep the ICE in place.

But my understanding, at least for PR #68289, that the alternative to ICE'ing in that case was to just do some potentially unnecessary re-compilation work. Which is unfortunate and wasteful, but not the same sort of criticial misstep that could really have bad conseqences downstream, in my opinion.


if we had some infrastructure in place to more readily reproduce such bugs when they arise, then I would be more readily convinced that we could remove a workarounds like PR #68289.

  • An example of such an infrastructure I have mused about: Snapshot the source for a crate on each successful compile (i.e. each compile where we emit build products that we will reuse as part of a subsequent incremental build). Then, when a compile fails due to an ICE originating from the incremental dep-graph infrastructure, request that user include the original source as well as the their changed version.

Neither you nor @michaelwoerister found the root cause, so I don't think the workaround PR was a mistake, but I do think it should be reverted once the root cause is fixed (it seems quite specific anyway).

I think auditing information reachable from TyCtxt is more effective than the incremental bug reports we get. We should also audit manual HashStable implementations (we sort collections for some reason, which is incorrect and slow, and there's probably other bugs lurking). After https://github.com/rust-lang/rust/pull/68944 showed incremental regressions I looked over the HIR map and found some bugs. The Definitions type might hold some other bugs too. Having query system from the start (before parsing) would help prevent bugs by ensuring things go through queries instead of through TyCtxt.

I'd like to understand this in more detail. I think #68289 is not a workaround, it just removes false positives from the assertion.

@Zoxc, do you know where in type_of the error occurs? Is there an untracked decision made based on the HirId? E.g. is as_local_hir_id used to determine if this is an item or just a field? That would indeed be problematic.

However, that also sounds like a different issue. The assertion triggered here checks if the DepNode for all current HIR items exist as expected. The problem you are describing is that we miss a change in a downstream query. I'd like to know how the two relate? Did you run into this assertion while working on https://github.com/rust-lang/rust/pull/68944?

(we sort collections for some reason, which is incorrect and slow, and there's probably other bugs lurking)

Only sets and maps should be sorted in order for generating a stable hash value. Otherwise they'd still be "sorted" just by whatever logic and random runtime values determine their position in the collection.

@Zoxc, do you know where in type_of the error occurs? Is there an untracked decision made based on the HirId? E.g. is as_local_hir_id used to determine if this is an item or just a field? That would indeed be problematic.

type_of calls tcx.hir().get(hir_id) which uses the hir_id to select which dep node to read. If the HIR id refers to a HIR owner, like the trait method Something::foo, we read from Hir(Something::foo), if it doesn't refer to a HIR owner, like the struct field Something::foo, then we read from the parent HIR owner, which will be Hir(Something).

In rpass1, type_of(Something::foo) will have a dependency on Hir(Something::foo).
In rpass2, type_of(Something::foo) will have a dependency on Hir(Something).

If our incremental state correspond to rpass1 and we try to build rpass2. We'll try to mark type_of(Something::foo) as green. We assume that the first dependency of type_of(Something::foo) must always be executed in rpass2 since type_of(Something::foo) has no information available to it to conditionally execute the first dependency. Hence we know that type_of(Something::foo) has a dependency on Hir(Something::foo) so executing Hir(Something::foo) is safe. However that is incorrect, since we know that type_of(Something::foo) in rpass2 should have a dependency on Hir(Something) not Hir(Something::foo). The compiler will (incorrectly) still go ahead and try to force Hir(Something::foo) which leads to this ICE (or another ICE in https://github.com/rust-lang/rust/pull/68944).

Hm, I think I see what's going on now.

From that description it looks to me like #68289 does the correct thing here:

  • In rpass1 we register a dependency on Hir(Something::foo).
  • In rpass2, where Something::foo is not a standalone item anymore, we

    • don't allocate a DepNode for Hir(Something::foo) (because Something::foo has been removed as far as dep-tracking is concerned), and then

    • during try-mark-green we reach one of the branches that appropriately deals with removed inputs by treating them as changed.

So far this is the behavior I would expect. However, at that point things can still go wrong:

  • try_mark_green(Hir(Something::foo)) came back as None, i.e. Hir(Something::foo) is considered Red and the caller of try_mark_green(Hir(Something::foo)), i.e. try_mark_green(type_of(Something::foo)) concludes that type_of(Something::foo) in order to find out if it has changed.
  • As a consequence force_from_dep_node(tcx, Something::foo) is called, which in turn will call tcx.type_of(Something::foo) but, alas, Something::foo is a field now, so calling type_of for it will ICE.

Is that correct? I.e. is the ICE happening when trying to force type_of(Something::foo) (as opposed to when trying to force Hir(Something::foo))?

Is that correct? I.e. is the ICE happening when trying to force type_of(Something::foo) (as opposed to when trying to force Hir(Something::foo))?

No. Forcing/executing type_of(Something::foo) works fine in both rpass1 and rpass2. This ICE happens in try_mark_previous_green(Hir(Something::foo)). When we call try_mark_previous_green(Hir(Something::foo)) in rpass2 the compiler is already in an invalid state. We're trying to mark a dep node which is not used or needed in rpass2 as green. That must never happen.

If instead type_of used an as_local_hir_id query. It would have an additional dependency AsLocalHirId(Something::foo) before the Hir / HirBody dependency. In that case in rpass2 when marking type_of(Something::foo), we'd see that AsLocalHirId(Something::foo) is red and exit early before we even get a chance to look at the non-existing Hir(Something::foo).

When we call try_mark_previous_green(Hir(Something::foo)) in rpass2 the compiler is already in an invalid state. We're trying to mark a dep node which is not used or needed in rpass2 as green. That must never happen.

I wouldn't call that an invalid state. The previous dep-graph is expected to contain nodes that don't exist in the current context anymore; it's how we deal with removed items. Failing to mark such removed dependencies as green is a way of handling this appropriately -- except that that strategy fails when query key reconstruction accidentally succeeds because of DefPathHash overlaps, as demonstrated in this bug.

I can see how an as_local_hir_id query would solve this particular problem. But the underlying problem would persist. Including the component-kind in the DefPathHash might be a more fundamental fix. I'm not sure how viable it is.

Switching to a setup where inputs cannot be removed (e.g. because there is always exactly one input like (src-code, commandline-args)) would also solve the problem.

@Zoxc and @michaelwoerister : am I correct in my understanding that with the "workaround", for lack of a better term, that landed in PR #68289, we should no longer see any ICE's here, and the remaining work item(s) here are about fixing more fundamental issues in the compiler internals that might allow us to remove that workaround?

If this is the case, then I would probably prefer that we open a fresh issue for that follow-up work, and (re)close this issue as fixed, if for no other reason than to keep the comment thread clean.

@pnkfelix I'm not entirely sure. I think the changes in #68289 make sense, but I also see how they might just "shift" the problem in some cases (as described here).

Also @Zoxc and I might have been talking about two separate problems, both of which can exist at the same time.

I think this particular issue should be closed as it talks about a "dep-node pre-allocation" failure (a false positive which is fixed by #68289). I'd like to understand more about the problem @Zoxc is seeing in https://github.com/rust-lang/rust/pull/68944. A separate issue with a reproduction case is a better venue for that. I would have expected that https://github.com/rust-lang/rust/pull/68944 makes most DepNode pre-allocation go away, but due to illness I haven't actually had the chance to review that PR in any detail.

In my efforts to add things to the glacier, I couldn't get the MVCE to ICE on the latest nightly.

Just wanted to confirm it again by renaming of trait to struct with rustc 1.41.0 (5e1a79984 2020-01-27)

Was this page helpful?
0 / 5 - 0 ratings