Rust: x.py's naming of stages is confusing

Created on 11 Apr 2019 · 22 Comments · Source: rust-lang/rust

Rust has the standard multi-stage structure that all bootstrapping compilers have.

A newcomer who knows about compiler stages will be confident with this, until they run a command like ./x.py build --stage 1 and get output like this:

Building stage0 std artifacts 
    ...
Copying stage0 std from stage0
Building stage0 test artifacts 
    ...
Copying stage0 test from stage0
Building stage0 compiler artifacts 
    ...
Copying stage0 rustc from stage0
Building stage0 codegen artifacts 
    ...
Assembling stage1 compiler 
Building stage1 std artifacts 
    ...
Copying stage1 std from stage1
Building stage1 test artifacts 
    ...
Copying stage1 test from stage1
Building stage1 compiler artifacts 
    ...
Copying stage1 rustc from stage1
Building stage1 codegen artifacts 
    ...
Building rustdoc for stage1 
    ...

For a newcomer, this is completely bizarre.

  • Why is it building stage 0? Isn't stage 0 the compiler you download?
  • How does it assemble the stage 1 compiler before building the stage 1 artifacts?
  • Why is it building two stages? Shouldn't it only build one stage?

The key to understanding this is that x.py uses a surprising naming convention:

  • A "stage N artifact" is an artifact that is produced by the stage N compiler.
  • The "stage (N+1) compiler" is assembled from "stage N artifacts".
  • A --stage N flag means build with stage N.
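The three rules above can be sketched as a small recursive model. This is only a toy illustration of the naming convention, not how rustbuild is implemented: the function names and the `COMPONENTS` tuple are hypothetical, and the "Copying" and rustdoc lines from the real output are left out.

```python
# Toy model of the current x.py naming convention (not rustbuild's real logic).
# All names here are hypothetical and chosen only for illustration.

COMPONENTS = ("std", "test", "compiler", "codegen")


def steps_to_get_compiler(n):
    """Steps needed before the stage n compiler exists."""
    if n == 0:
        # The stage 0 compiler is downloaded, so nothing has to be built.
        return []
    # The stage n compiler is assembled from stage (n-1) artifacts.
    return steps_to_build_artifacts(n - 1) + [f"Assembling stage{n} compiler"]


def steps_to_build_artifacts(n):
    """Stage n artifacts are the ones produced BY the stage n compiler."""
    building = [f"Building stage{n} {c} artifacts" for c in COMPONENTS]
    return steps_to_get_compiler(n) + building


def build(stage):
    """Model of `--stage N`: build WITH the stage N compiler."""
    return steps_to_build_artifacts(stage)


if __name__ == "__main__":
    for step in build(1):
        print(step)
```

Under this model, `build(1)` first lists the stage0 artifacts, then assembles the stage 1 compiler, and only then lists the stage1 artifacts. That is the same ordering as the "Building" and "Assembling" lines in the output above, which is why two stages appear for a single `--stage 1`.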

Somebody had to explain this to me when I started working on Rust. I have since had to explain it to multiple newcomers. Even though I understand it now, I still find it very confusing, and I have to think carefully about it all.

Here's a naming convention that makes more sense to me:

  • A "stage N artifact" is an artifact that is produced by the stage (N-1) compiler.
  • The "stage N compiler" is assembled from "stage N artifacts".
  • A --stage N flag means build stage N, using stage (N-1).

That way, a command like ./x.py build --stage 1 would produce output like this:

Building stage1 std artifacts 
    ...
Copying stage1 std from stage1
Building stage1 test artifacts 
    ...
Copying stage1 test from stage1
Building stage1 compiler artifacts 
    ...
Copying stage1 rustc from stage1
Building stage1 codegen artifacts 
    ...
Assembling stage1 compiler 
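
Put differently, the proposal is a pure renaming: artifact stage numbers go up by one, while compiler stage numbers stay the same. Here is a minimal sketch of that mapping, applied to the first half of the current output. The helper name `rename_to_proposed` is hypothetical, and the rule is inferred only from the two logs shown in this issue.

```python
import re

# First half of the current `./x.py build --stage 1` output, with "..." lines dropped.
CURRENT_LOG = [
    "Building stage0 std artifacts",
    "Copying stage0 std from stage0",
    "Building stage0 test artifacts",
    "Copying stage0 test from stage0",
    "Building stage0 compiler artifacts",
    "Copying stage0 rustc from stage0",
    "Building stage0 codegen artifacts",
    "Assembling stage1 compiler",
]


def rename_to_proposed(line):
    """Translate one log line from the current naming to the proposed naming."""
    if line.startswith("Assembling"):
        # Compiler stage numbers are the same in both conventions.
        return line
    # Artifact stage numbers shift up by one: stage N artifacts become stage N+1.
    return re.sub(r"stage(\d+)", lambda m: f"stage{int(m.group(1)) + 1}", line)


PROPOSED_LOG = [rename_to_proposed(line) for line in CURRENT_LOG]

if __name__ == "__main__":
    for line in PROPOSED_LOG:
        print(line)
```

With this renaming, every line of the translated log mentions stage1, so the stage in the output matches the stage passed on the command line.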

Is there any appetite for this change? I realize it would be invasive and people would have to update their aliases, build scripts, etc. But it might be worthwhile to avoid the ongoing confusion to both newcomers and experienced contributors. Deprecating the word "stage" in favour of "phase" could help with the transition.

A-rustbuild

All 22 comments

FWIW I also found this confusing for a while, and would support this change.

I think this is essentially #57963. I agree it is confusing (I even drew a little picture to help me understand it).

@ehuss that diagram is excellent!

cc @Mark-Simulacrum @alexcrichton -- related to the discussion we had at all-hands.

My plan is to put in some time this weekend to figure out and propose PRs for the changes discussed at the all-hands (which should, to an extent, remove this issue by making dealing with --stage largely unnecessary). We may separately want to land a patch that changes stage "naming" though.

I basically have to re-figure this out any time I'm doing anything complex with rustc, and yeah, we should change the naming here.

A key issue here is that the Rust compiler is actually a thin wrapper around underlying libraries, which are shipped with Rust like the stdlib is. So which stage is which differs based on whether you care about the stdlib or the compiler. Kind of. I've complained about this before and generally seen agreement that this needs fixing, but not as many good solutions.

@nnethercote I think your explanation and @ehuss's diagram are perfect for rustc-guides :smile:. I would also like to see this change happen; meanwhile, can we document this? Are you comfortable if your explanation and diagrams are reused for rustc-guides?
@igaray was interested in getting started with rustc-guides documentation, so maybe this is something he can use as a way to start.

Friday was the first time I compiled rust and it was pretty confusing for me too. I really like the diagram.

At minimum, we should move that content to the rustc-guide, yes. I am agnostic about the names. I'm wary that all names will be confusing unless there is a picture, and if there is a picture, any name may be fine.

The current plan (discussed and mentioned at all hands) is to work towards not needing to say "--stage" unless you're hacking on rustbuild itself or doing something odd; I'm not sure how quickly we can make that happen, but hopefully quite soon.

@Mark-Simulacrum Does that apply to --keep-stage as well? That's the one I use most, since it's currently essential if you're doing anything in core or std...

are you comfortable if your explanation and diagrams are reused for rustc-guides?

Absolutely, please use the above words any way you like.

FWIW I suffered the exact same confusion until @eddyb explained the rustc stage naming to me.

Same for me. Does this mean that if I only need to build the sources, I only need to pass --stage 0 and not --stage 1? I am currently hacking on rustdoc.

@robinmoussu The best way to get the correct behavior is to run tests, IME.
./x.py test --stage 1 src/test/rustdoc{,-ui} will build the compiler and rustdoc exactly once and run its tests. If you want to hack on rustdoc without the compiler being built, but instead relying on the rustc-dev component from a nightly/master build, then you need @Mark-Simulacrum's plan which should allow that (last mentioned on this internals post).

I think overall the worst UX aspect is that ./x.py build --stage 1 doesn't do the same thing as ./x.py build --stage 1 src/libstd but rather it does almost everything that ./x.py build --stage 2 src/libstd does.

And in general ./x.py build --stage N is useless, since it's missing host libraries (I'm not even sure you can use Xargo or Cargo's -Z build-std without them).

So making src/libstd the default, and having to write ./x.py build --stage N src/rustc to get the (less useful) compiler binary with no libraries, seems adequate to me.

Oh and we need to get rid of fulldeps tests and similar things: ./x.py test --stage 1 should not build the compiler twice, but it does (for fulldeps tests and a few other things).
Or we could keep them and run them at stage 2 (ideally without building the compiler thrice).

Right now I'm forced to use this command:
./x.py test --no-fail-fast --stage 1 src/test/{rustdoc,rustdoc-ui,mir-opt,codegen,codegen-units,incremental,pretty,debuginfo,compile-fail,run-make,run-fail,ui}
in order to make sure I'm not missing out on any important tests, without doing unnecessary builds.
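
For anyone who wants to script that invocation rather than rely on shell brace expansion, here is a minimal sketch that builds the same argument list. The helper name `stage1_test_command` is hypothetical, and the flags and suite names are copied from the command above without checking them against any particular x.py version.

```python
import shlex

# Suite names copied verbatim from the brace-expanded command in this comment.
SUITES = [
    "rustdoc", "rustdoc-ui", "mir-opt", "codegen", "codegen-units",
    "incremental", "pretty", "debuginfo", "compile-fail", "run-make",
    "run-fail", "ui",
]


def stage1_test_command(suites=SUITES):
    """Build the argv list for the stage 1 test run (hypothetical helper)."""
    base = ["./x.py", "test", "--no-fail-fast", "--stage", "1"]
    # Each suite lives under src/test/, which is what the braces expand to.
    return base + [f"src/test/{suite}" for suite in suites]


if __name__ == "__main__":
    # Print the command for inspection; this sketch does not execute it.
    print(shlex.join(stage1_test_command()))
```

The list form can be passed to `subprocess.run` from the repository root, and trimming `SUITES` gives a shorter run while iterating.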

Thanks.

And I agree this mess needs to be cleaned up. I hope someone will do the work someday.

Would it also help if the sysroot directories were named stage{1,2}-sysroot instead of just stage{1,2}? It would also match stage0-sysroot nicely.

Then some intuition there is that stage1-sysroot has executables (and their runtime dependencies) from stage0-rustc but libraries from stage1-std (built by stage1-sysroot/bin/rustc aka stage0-rustc).

A --stage N flag means build with stage N.

To make things even more confusing, this is not actually always true... ./x.py build src/tools/miri --stage 1 builds with the stage 1 compiler artifacts, i.e., it uses the stage 2 compiler. How that fits into the rest of the scheme is beyond me.

@oli-obk pointed out that stage 0 Miri is being built by the bootstrap compiler, but against the stage 0 compiler artifacts (i.e., it is linked with those). :exploding_head:

I have a proposal for explaining this better in https://github.com/rust-lang/rustc-dev-guide/pull/843. Please let me know if that helps at all and if not, what I could do to make it better :)
