Rust: `HirId`-ification initiative

Created on 21 May 2018  路  30Comments  路  Source: rust-lang/rust

HirIds were introduced in bc259ee8 (March 2017) as a replacement for ast::NodeIds after HIR lowering; this is said to be good for incremental compilation. The commentary for pull request #43740 (August 2017) says:

In the future the HirId should completely replace the NodeId in HIR nodes. [...] Ideally we convert more and more code from NodeId to HirId in the future so that there are no more NodeIds after HIR lowering anywhere.

The HIR chapter of the rustc development guide similarly says "while [NodeIds] are still in common use, they are being slowly phased out" (emphasis in original).

However, a drawback of the "gradually migrate over to the new way of doing things in the course of addressing other issues" strategy is that it's all too easy for the codebase to remain in a transitional state indefinitely. Thus, perhaps we should have this dedicated issue to track progress towards not using NodeIds after HIR lowering.

A-hir C-cleanup T-compiler

Most helpful comment

Closing as discussed in https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/.60HirId.60-ification.20initiative/near/201835747

All 30 comments

Thanks, @zackmdavis!

However, a drawback of the "gradually migrate over to the new way of doing things in the course of addressing other issues" strategy is that it's all too easy for the codebase to remain in a transitional state indefinitely.

Yes, that's indeed what has been happening over the last year. We've done enough to make incremental compilation work and then pretty much stopped.

Btw, it might be a good idea to add a custom implementation of Hash for HirId, something like:

impl Hash for HirId {
    fn hash<H: Hasher>(&self, state: &mut H) {
        let data: u64 = unsafe { mem::transmute(*self) };
        state.write_u64(data);
    }
}

It would have to be tested if that actually speeds things up.

Perhaps add tracking-issue label?

I'm currently working on this.

@ljedrz I would like to help with replacing the NodeId variants of some of the Hir-Map functions with the *_hir_id-Versions.

@fabric-and-ink Actually some of that is already in progress: https://github.com/rust-lang/rust/pull/59042. Please feel free to build on that.

@ljedrz Can you perhaps post a status update here as to what's left? If there's nothing left then we can probably close this.

NodeId is hardly used after the HIR lowering anymore; as far as I'm aware the only big thing left besides https://github.com/rust-lang/rust/pull/63849 is to skip the generation of the HirId to NodeId map in normal compilation (and only compute it for RLS and rustdoc); that, however, requires additional 3 changes I listed in the PR message of https://github.com/rust-lang/rust/pull/62975. I'd be happy to help with those, but I could use some guidance.

@ljedrz is this still ongoing

@tshepang I haven't looked into it in a while, but I'll be happy to check the state of this initiative tomorrow.

It seems that not much has progressed since my last comment and there are still some uses of NodeIds that could be removed. When I have a little bit of time I'm going to attempt another pass; I'll open individual issues if I encounter trouble and require guidance so this can move forward.

@ljedrz ok, do say if you want some help. I myself do not even know how to find such usages... I guess I can start by looking at your past PRs?

We are still using the node_id_to_hir_id in middle and ty. If we were able to remove those uses, I think the only ones left would be in save_analysis and docs which, if I recall correctly, would mean we could not create the NodeId to HirId map when not building the docs or RLS.

It appears that some of the remaining progress is blocked on https://github.com/rust-lang/rust/issues/71104.

I think @marmeladema and @eddyb were working on stuff with LocalDefIds (#70909) and I have a WIP PR that adds a few more hir::DefKinds (#70043)

Once https://github.com/rust-lang/rust/pull/72275 lands, the only caller of opt_node_id_to_hir_id will be in save_analysis but some calls to node_id_to_hir_id remain in middle. What would be the next steps here?

Seems like ResolverOutputs could be updated to use some combination of DefId and HirId instead of NodeId.

https://github.com/rust-lang/rust/blob/89988fe727a5f055da78a4278448cfbca0c49e19/src/librustc_middle/ty/mod.rs#L120-L132

It seems to be the only thing requiring calls to node_id_to_hir_id in rustc_middle::ty. Any thoughts @ljedrz? I see you touched this code once upon a time.

Yes, if we can change that one to use a different id then I think we could start looking at conditional creation of the NodeId > HirId map (so that it's only created for save_analysis and docs).

Seems like ResolverOutputs could be updated to use DefIds or HirIds instead of NodeIds.

So i've tried that without much success for now. What i tried is to change ResolverOutputs to store LocalDefId instead of NodeId and update Resolver::into_outputs and Resolver::clone_outputs to convert the node ids stored internally in the Resolver by calling local_def_id.

That currently does not work, and i think (but haven't had time to investigate further) that its because at the time when those methods are called, lowering is not properly done/finished yet.

I suspect we'll need a "pre-HIR" and "post-HIR" version of that data structure, converting between them during lowering. This is easier said than done obviously.

Actually storing directly HirId might work. At least, i've managed to pass all ui tests locally.

Once #72402 has landed, what would be the next steps here?

I think we could start looking at conditional creation of the NodeId > HirId map

The way i understand the code is that there are two maps currently created:

  1. NodeId <-> LocalDefId: populated with calls to create_root_def / create_def_with_parent during the different name resolution passes

  2. NodeId <-> HirId: populated during AST lowering to HIR with calls to lower_node_id_generic

Is the idea to introduce another map LocalDefId <-> HirId and get rid of the two others?

The LocalDefId <-> HirId map is trivial, since a HirId is just the combination of a LocalDefId and a ItemlLocalId (an index inside the LocalDefId).
The maps in the forward direction NodeId -> LocalDefId, HirId are used by HIR lowering a lot. Those allow to link together the HIR nodes.
I do not know where the backwards map HirId -> NodeId are used.

I do not know where the backwards map HirId -> NodeId are used.

They are used mainly in save_analysis and in rustdoc but internally conversion between LocalDefId to HirId still currently uses the the NodeId tables/maps:

    #[inline]
    pub fn local_def_id_to_hir_id(&self, id: LocalDefId) -> hir::HirId {
        let node_id = self.def_id_to_node_id[id];
        self.node_id_to_hir_id[node_id].unwrap()
    }

    #[inline]
    pub fn opt_local_def_id_to_hir_id(&self, id: LocalDefId) -> Option<hir::HirId> {
        let node_id = self.def_id_to_node_id[id];
        self.node_id_to_hir_id[node_id]
    }

That's why i was wondering if the plan is to get rid of that indirection level. There is also that comment from @eddyb :

    // FIXME(eddyb) don't go through `ast::NodeId` to convert between `HirId`
    // and `LocalDefId` - ideally all `LocalDefId`s would be HIR owners.

It seems to me that post HIR-lowering, NodeId tables/maps are only used for save_analysis and rustdoc and could probably being kept around conditionally for those two uses cases only?

Ideally, those conversions would be reduced to

#[inline]
pub fn local_def_id_to_hir_id(&self, id: LocalDefId) -> hir::HirId {
    hir::HirId { owner: id, local_id: ItemLocalId::new(0) }
}

The issue is that some LocalDefId do not correspond to HIR nodes. The two maps are not equivalent. I got bit recently by this one. This happens when a DefId is created, but the HIR node is associated another unrelated HirId, no none at all. Fixing this requires to map the creation of LocalDefIds to HIR owners when lowering.

I don't know if that goes in the right direction but I did some testing and it appears that a def_id_to_hir_id table can be built in init_node_id_to_hir_id_mapping and be used later on for LocalDefId -> HirId conversion. One side effect is that the Definitions::def_id_to_node_id table becomes useless after this step and can be dropped / shrunk. It might be possible to remove it entirely after converting most calls to as_local_node_id which i mostly did locally.

Also, I haven't tested it yet, i am confident that the hir_id_to_node_id map can be lazily initialized so that its only constructed for rustdoc and save_analysis.

So to understand better what the issue with DefId/HirId is, i used the test file src/test/rustdoc/macro-in-closure.rs and printed the different tables:

def_id_to_hir_id: [
    DefId(0:0): Some(HirId { owner: DefId(0:0), local_id: 0 }),
    DefId(0:1): Some(HirId { owner: DefId(0:1), local_id: 0 }),
    DefId(0:2): Some(HirId { owner: DefId(0:2), local_id: 0 }),
    DefId(0:3): Some(HirId { owner: DefId(0:3), local_id: 0 }),
    DefId(0:4): Some(HirId { owner: DefId(0:3), local_id: 3 }),
    DefId(0:5): Some(HirId { owner: DefId(0:5), local_id: 0 })
]

def_id_to_span: [
    DefId(0:0): Span { lo: BytePos(0), hi: BytePos(0), ctxt: #0 },
    DefId(0:1): Span { lo: BytePos(0), hi: BytePos(0), ctxt: #1 },
    DefId(0:2): Span { lo: BytePos(0), hi: BytePos(0), ctxt: #1 },
    DefId(0:3): Span { lo: BytePos(88), hi: BytePos(138), ctxt: #0 },
    DefId(0:4): Span { lo: BytePos(104), hi: BytePos(135), ctxt: #0 },
    DefId(0:5): Span { lo: BytePos(117), hi: BytePos(129), ctxt: #0 }
]

As you can see, DefId(0:4) which is the closure, is associated with an HirId whose owner is in fact DefId(0:3) which is the main function.

Taking a closer look at the code, DefId(0:4) is created here: https://github.com/rust-lang/rust/blob/master/src/librustc_resolve/def_collector.rs#L218 but later on during lowering at https://github.com/rust-lang/rust/blob/master/src/librustc_ast_lowering/expr.rs#L114 the expr.id is not assigned a fresh new HirId bur rather "attached" to the current parent one which in our case is the main function.

If I understand correctly, the end goal is to have:

DefId(0:4): Some(HirId { owner: DefId(0:4), local_id: 0 }),

What i am not exactly sure is if all "closure children nodes" should have as parent the closure, or the main function directly. I tried both, with different level of un-successfulness.

I _think_ i understand the problem, but i don't exactly know what would the right solution here. I could definitely use some help / mentoring.

Once #72777 and #72781 have landed, rustdoc should be free of NodeId.

In order to avoid re-introducing NodeId by mistake, would it be valuable to remove all NodeId related methods in hir map? The only remaining user of NodeId seems to be save_analysis and it can technically use the underlying Definitions object to access NodeIds.

Small update:

  • save_analysis has been converted to work fully on hir tree instead of mix of ast/hir in #72882
  • as a consequence, all NodeId related APIs have been removed from the hir map in #72996
  • post-lowering uses of NodeIds have been shrunk down further in #73090

I believe the solely remaining use of NodeIds post hir lowering is the conversion of the trait map:

I see two approaches to solve this:

  1. Either keep the trait_map in the Resolver but move the conversion as part of the lowering and store the resulting hir trait_map into the current crate. That is not great because the mapping NodeId/HirId would still be required.

  2. Or build that hir trait_map directly as part of the lowering. I don't know if its easily doable or not yet.

What do you people think about all of this?

Now that #73291 has landed, the table/map to convert NodeId to/from HirId are gone. They have been replaced by direct LocalDefId to/from HirId table/map.

Next step (and possibly final step?) is to move the remaining API's involving NodeId to the Resolver which means that there will be no way of dealing with NodeId once the lowering is done.

Closing as discussed in https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/.60HirId.60-ification.20initiative/near/201835747

Was this page helpful?
0 / 5 - 0 ratings