cc @rust-lang/clippy @RalfJung
cc @rust-lang/compiler @rust-lang/infra we gate on clippy soon, so if you break clippy you need to fix it immediately in the same PR. This is trivial now though, since we are basically in a monorepo state, so you just edit stuff in src/tools/clippy/
until everything works again.
Synchronizations of the remote repo and the local subtree is explained in CONTRIBUTING.md
The first port (clippy) is being done in https://github.com/rust-lang/rust/pull/70655
Cc @rust-lang/miri because we have that now :D
Hmm, I was led to believe that git subrepo
was nesting git
repositories, in a way which git
natively handles. But it turns out it's just a monorepo sync tool, same as manually making large commits. I don't think losing commit history is acceptable.
At a glance, git subtree
looks like it might do the right thing, i.e. keep the repository nesting of a submodule, but I haven't tried it yet to see how it actually behaves in practice.
EDIT: ah, no, git subtree
just seems to create synthetic commit histories. Still, could be better than a single sync mega-commit.
EDIT2: somehow it still managed to preserve commit history?! Compare:
https://github.com/rust-lang/rust-clippy/commits/master
https://github.com/rust-lang/rust/commits/c211cea3e99e04c2980a853b6637de22881b72eb
I moved to git subtree.
In case it gets lost on the PR, I said: https://github.com/rust-lang/rust/pull/70655#issuecomment-607240782
Whoa, clippy commit hashes were kept intact, this makes me really happy!11
I moved to git subtree.
Would be good to get the pro's and con's of that decision documented somewhere. git submodule
advertises itself as a better alternative to git subtree
; why is subtree
still the better choice for us?
(Also I have no idea in which thread to post now, there's at least this issue and 2 PRs. It's very confusing.^^)
I don't know why git subrepo
is better than git subtree
other than what the documentation of git subrepo
states at https://github.com/ingydotnet/git-subrepo#benefits
Fixes known rebase failures with git-subtree.
git subtree
is better than git subrepo
because it preserves all commits in history instead of squishing them in the name of the person who synchronizes it.
git subtree
is upstream, which is nice, and it also supports preserving the commits better.
I believe the "rebase failures" that git subrepo
talks about are actually an anti-pattern: rebasing a merge commit, it's not meant to work at all (the rebase will include both branches and lose the merge commit - this can already happens if you accidentally rebase the wrong thing in rust-lang/rust
).
git subrepo
seems to want to just throw all the changes into single large commits so you "don't have to deal with it", whereas git subtree
is more history-preserving-oriented, which I prefer.
git subtree is better than git subrepo because it preserves all commits in history instead of squishing them in the name of the person who synchronizes it.
Is this true in both directions (subtree pull/push)?
Does git subtree
also have the problem that a "subtree push" needs a commit on the rustc side?
no, git subtree push
does not need a commit on the rustc side.
The only downside of git subtree
is that you need to specify the folder name and target repo on every synchronization (because there's no synchronization metadata file)
I would prefer the history preserving option. I was under the impression that subrepo had a no squash mode but that could be subtree.
The only downside of git subtree is that you need to specify the folder name and target repo on every synchronization (because there's no synchronization metadata file)
This still needs to be reflected in https://github.com/rust-lang/rust/pull/70654 then.
I presume we could have scripts for that in the rustc repo?
This still needs to be reflected in #70654 then.
it already is:
You always need to specifiy the
-P
prefix to the subtree directory. If you specify the wrong directory
you'll get very fun merges that try to push the wrong directory to the remote repository. Luckily you
can just abort this without any consequences.
I was under the impression that subrepo had a no squash mode but that could be subtree.
subtree doesn't squash by default and has a sqash option.
@oli-obk IIUC, Clippy can then fix a nightly version and doesn't have to use the rustc master anymore (correct?). If that's the case, I'd go ahead and make these changes to the Clippy CI/infra, so it will be ready once the subtree
PRs are ready to get merged.
it already is:
It doesn't say that you need to give the target repo (and the example commands do not mention a target repo, just a folder name).
Clippy can then fix a nightly version and doesn't have to use the rustc master anymore (correct?). If that's the case, I'd go ahead and make these changes to the Clippy CI/infra, so it will be ready once the subtree PRs are ready to get merged.
well, only if we make the synchronization at specific commits that work on the current nightly. Between two nightlies there can be many states where the clippy subtree will only compile with a master-rustc.
IMO we want to do this, since otherwise, we would have to do rustup commits in Clippy just as often as before. It will just be easier, since it will only be a git subtree push
.
Well, we can do it at random points in time, but yes, we can target those points in time to the commit that caused the most recent nightly to be created.
Right, we can have a nightly clippy CI job that notices when clippy stops building with master, and sync then.
at that point we can just have a nightly job that synchronizes once a night automatically to ensure that even nonbreaking changes get synchronized as often as possible.
Clippy's CI could perform pull every time it tries to merge PR.
That would avoid most of conflicts (from maintainer PoV) when new lint using the "old code" is being added but Clippy in Rust repo was updated for the "new code" already.
Edit: There would be still Rustups necessary for lints added to Clippy shortly before breaking changes since Clippy in Rust repo won't have them yet.
@oli-obk @RalfJung What do you think of this logic? Just thought of it, not sure where to leave it.
git subtree pull
is an injection, git subtree push
is a projection, hence why pulling preserves commits verbatim (hash included) while pushing rewrites them to keep only the relevant parts.
They're both information preserving (wrt file diffs and commit messages) for the subtree directory, and only for that directory.
git subtree pull is an injection, git subtree push is a projection
I have no idea what this means in this context.^^
I mean I know waht both do as you explained it elsewhere, but this terminology here is not useful at all I feel.
I'm really glad this effort is underway. On Fuchsia we try to roll to a new tip-of-tree rust compiler every 3-6 weeks, which helps with catching bugs. If we could we'd roll more often. But the main thing slowing us down is the constant state of toolstate breakage.
Some of this (like rustfmt) we can work around in practice by distributing separate versions, but it's a pain to set up and maintain. Some tools (like clippy and rls) really have to match the version of the compiler.
This seems like it would make toolstate breakage a thing of the past, which is why I'm very supportive of this effort and would be glad to help however I can.
We've protyped this with clippy but are waiting for an upstream fix to git subtree
before it becomes feasible for the rustc repo without a patched git
Yeah, I'm hoping for that fix to land before I start seriously pushing it for others. I wanna start having the conversations soon though
Hi! I contribute to Git in my spare-time and I'm really interested in improving the user experience around submodules. I would really be interested to know the reasons for why this change of approach is being made for your project :) After all, git subtree
is in contrib/
, so it's not an "official" Git command and it's not being actively developed...
I summarized the reasons in https://github.com/rust-lang/compiler-team/issues/266
TLDR: keeping submodules in sync is a lot of effort, we need to be able to easily make changes to the "sub-whatever" to keep it from breaking due to other changes in the main repo, but at the same time not have a pure monorepo, because the "sub-whatever" is also being developed out of tree.
@phil-blain The links Oli posted should give some context, but in short:
While I do think the UX around submodules could use a lot of love (and I'm glad someone is looking into it!), in this case I think it's more of them not being the right tool for the job.
TLDR: we could create a git-subtree
command installable via cargo install rustc-git-subtree
to make the use of git subtree
on the rustc repo easy.
Some links to start out with:
doc
directory to a subtreeEven if the git subtree patch is merged and distributed, it may take years to appear in everyone's distro.
@Mark-Simulacrum suggested that we could remove the blocker on the git subtree patch by distributing it ourselves. All we need is a version that works for us after all. So basically we create a new rust project in the rust-lang org and all it does when installed is invoke the scripts (that we'll shamelessly-but-with-the-right-license copy from the git-PR). This way anyone who wants do subtree synchronizations can "overwrite" their git subtree
command, by installing ours via cargo install rustc-git-subtree
. This will put the git-subtree
binary into the PATH
like it does with cargo-*
subcommands. This will take precedence over the git-subtree
of your OS, and thus just work out of the box.
This will put the git-subtree binary into the PATH like it does with cargo-* subcommands. This will take precedence over the git-subtree of your OS, and thus just work out of the box.
Careful though, people running git-subtree
will get your version, but people using git subtree
will get the original script from contrib/
in git.git, which gets installed to git --exec-path
by most distros' git package, since they install subtree by default.
Probably we want to call it rustc-subtree then, or something like that. Doesn't seem like a huge concern -- people interacting with it are a relatively small set, hopefully.
This is a good idea, IMO. Some things the rustc-subtree
command should take care of:
ulimit -s 60000
) for large subrepos on first use.It would be funny to put rustc-subtree
in its own repo and include it here as a subtree repo. But sadly I can't think of a reason, why it has to be in the Rust repo 馃槃
I was assuming this would be another rustup component
Though I guess this doesn't actually need to be versioned with Rust so a cargo install makes more sense
I marked this as "looking for participation". Working on this git subcommand that can be installed via cargo install
requires zero work on rustc. Basically what you need to do is to grab the latest version of the subcommand and copy the file with the appropriate license mentioned into a new cargo project. The main function can then include_str!
the script and invoke sh
to execute the text (maybe piped in via stdin
?) by forwarding all command line args to the script. You'll need to add rust-lang-owner
to the owners of your git repository and the crates.io
owners of your crate.
Ideally you test the crate by talking with the clippy team on zulip and doing a clippy sync with the "new" tool. If all that works, we'll update the documentation on our git subtree
usage to reflect the fact that ppl need to install the tool and invoke a slightly different subcommand name.
I'd like to work on this issue!
great! If you have any questions, feel free to ping me on zulip or just ask here. There's also a bot, so if you write @rustbot claim
it should assign this issue to you.
@rustbot claim
I'm wondering if we need this to actually be on crates.io. We could vendor the script in the repository and call ./x.py subtree
.
I like having it as an independent thing. Others may want to use it, too. It's not rustc specific
Here's the gitree
crate which wraps the git subtree
script. I have included the script form the git PR mentioned here. I am not well versed in handling git, so can anyone test it or give me the instructions for doing it? Here's the Zulip topic about this crate.
The crate metadata doesn't refer to the repo, so posting it here for others trying to find it: https://github.com/mominul/gitree
Most helpful comment
I'm really glad this effort is underway. On Fuchsia we try to roll to a new tip-of-tree rust compiler every 3-6 weeks, which helps with catching bugs. If we could we'd roll more often. But the main thing slowing us down is the constant state of toolstate breakage.
Some of this (like rustfmt) we can work around in practice by distributing separate versions, but it's a pain to set up and maintain. Some tools (like clippy and rls) really have to match the version of the compiler.
This seems like it would make toolstate breakage a thing of the past, which is why I'm very supportive of this effort and would be glad to help however I can.