When two devs install dependencies on separate branches, it is very easy to end up with a merge conflict in the lockfile; in particular, the metadata.content-hash key often changes. It is very unclear how to resolve this manually, so I often delete the lockfile (or perhaps just that key), rebuild it and, basically, hope that it comes out the same.
It seems like in some scenarios merge conflicts could be resolved automatically based on pyproject.toml. Yarn does this, for instance.
+1 Resolving the content-hash and trivial conflicts in the individual hashes section would be very welcome.
Workaround:
poetry add pathlib2
poetry remove pathlib2
or some other similar innocuous package.
Will this issue be resolved soon?
Does a missing content-hash hurt security or not?
Yarn's implementation of this is way less insane than I was expecting:
Extract the two versions of the conflicted file.
https://github.com/yarnpkg/yarn/blob/a7334da31bf783af7a3efab451589fe7ac34c748/src/lockfile/parse.js#L397
Blindly try to parse the files, and if that's successful, shallowly merge them.
https://github.com/yarnpkg/yarn/blob/a7334da31bf783af7a3efab451589fe7ac34c748/src/lockfile/parse.js#L399
If the merge conflict resulted in a syntax error, it fails. Yarn's lockfile structure is designed to make this easier: it's flat, it's sorted alphabetically and node is okay with duplicated dependencies in the tree.
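The steps above can be sketched roughly as follows. This is only a Python illustration of the idea, not Yarn's actual code: split_conflict and shallow_merge are hypothetical names, and JSON stands in for the lockfile format purely so the example is self-contained.

```python
import json


def split_conflict(text):
    """Rebuild the two sides ("ours", "theirs") of a file with git conflict markers."""
    ours, theirs = [], []
    state = "both"
    for line in text.splitlines():
        if line.startswith("<<<<<<<"):
            state = "ours"
        elif line.startswith("=======") and state == "ours":
            state = "theirs"
        elif line.startswith(">>>>>>>") and state == "theirs":
            state = "both"
        elif state == "ours":
            ours.append(line)
        elif state == "theirs":
            theirs.append(line)
        else:
            # Lines outside a conflict region belong to both versions.
            ours.append(line)
            theirs.append(line)
    return "\n".join(ours), "\n".join(theirs)


def shallow_merge(text, parse=json.loads):
    """Parse both sides blindly; a parse error propagates, otherwise merge top-level keys."""
    ours, theirs = split_conflict(text)
    merged = dict(parse(ours))
    merged.update(parse(theirs))  # on a key clash, "theirs" wins in this sketch
    return merged


if __name__ == "__main__":
    conflicted = "\n".join([
        "{",
        "<<<<<<< HEAD",
        '  "a": "1.0",',
        '  "b": "2.0"',
        "=======",
        '  "a": "1.0",',
        '  "c": "3.0"',
        ">>>>>>> feature",
        "}",
    ])
    print(shallow_merge(conflicted))
```

A shallow merge like this only works when the top-level entries are independent of each other, which is the property the flat, sorted Yarn lockfile is described as having.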
is content-hash mandatory though? the main problem I am encountering is that dependabot will have to ALWAYS rebase all the PRs, which could be avoided if content-hash was not in poetry.lock. in theory shouldn't be necessary, as it can always be computed by the other hashes, right?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I think this is still a valid issue and should not be closed.
I have written a stand-alone script as a stopgap. It only goes half the way, resolving the merge conflict for metadata.content-hash. The remaining conflicts, if any, should be quite trivial to merge manually. Use at your own risk.
https://github.com/cjolowicz/scripts/blob/master/python/poetry-merge-lock.py
Update:
Please use poetry-merge-lock from PyPI instead (see comment below). This should allow you to merge without manual conflict resolution, in most cases.
Here is a stand-alone tool which should handle most merge conflicts in the lock file:
pip install --user --upgrade poetry-merge-lock
This tool is in early development. If you're interested in trying it out, please let me know on its issue tracker if you encounter any problems.
@sdispater, @finswimmer : Would you be interested in a PR to add this as a Poetry command?
Hello @cjolowicz ,
your contribution is very welcome. I find this feature very useful. :+1:
However, I cannot promise if and when this can be included. Including new features is decided by @sdispater .
But please go on!
fin swimmer
as long as there is a line content-hash we will never be able to conveniently use dependabot. When dependabot runs, it creates 1 branch and merge request per update, w/ the idea that you let your CI run, and then auto-merge.
However, because of content-hash, with poetry, every single one of these branches has a merge conflict and must be manually dealt with, increasing the human time from a few seconds to ~10-minutes per, and, causing another run of the CI to be required.
On the repository I just tried this on, 5 MRs were created. If it uses tox, it takes no human time and 5 CI runs.
If I use poetry it takes 10 CI runs, and a manual clone/checkout/rebase/poetry update hash somehow/push, which took me more than an hour.
Multiply this by my 30 repos, and make it a daily thing to have to deal with, and it becomes intractable.
The solution cannot be in a new poetry command to resolve the lock, that would still require all the manual work.
Instead, we need to be able to merge poetry.lock files that are not otherwise conflicting.
Because all these MRs are created at once, in parallel, we cannot teach dependabot to do it either: at the time of each MR they all come from the same spot on master, so there is no conflict. It's only when the first is accepted that we need to rebase the 2nd, and when it is accepted we need to rebase (twice) the 3rd, and so on.
Can we not make the issuance of content-hash optional?
even without dependabot it's troublesome.
Ideas:
so perhaps if there was a --no-output-hash option to the poetry update --lock ?
@donbowman Doesn't Dependabot rebase your PRs automatically if they conflict? This is what happens for me. I only need to resolve conflicts when rebasing or cherry-picking my own commits, in which case poetry-merge-lock mostly works fine. I sometimes follow it by poetry add insecure-package && poetry remove insecure-package to ensure that the lock file is up-to-date. (insecure-package is just an empty dummy package that's never part of my dependency tree.)
this is on gitlab.
it creates e.g. 5 merge requests at the same instant.
as soon as i accept the first one, the other 4 are all in merge conflict.
How would it magically wake up and rebase these 4? You mean the next time i run it it would see the conflict, rebase, then i would accept 1 more, then it would rebase the next 3, and so on?
also, it would have the same merge conflict, how would it know how to resolve it?
I added https://github.com/python-poetry/poetry/pull/2654 to poetry as a suggested solution.
@donbowman I assumed you were referring to GitHub Dependabot. The bot recreates the PR (not a rebase, sorry for the sloppy terminology) when the changes conflict with the target branch.
i see.
but it still leaves the original issue, that if i issue many updates, each as a single MR, after the first is accepted, the rest all conflict.
If the dependabot script is taught to then delete and recreate, ... it's very slow.
if I were to run it daily, and had 10 updates, it would take 10 days to get them all in w/ this technique.
i guess, inefficiently, i could create some webhook that when the first mr is accepted, it could somehow run the dependabot again to recreate all the remaining updates, and then repeat every hour or so until all are done.
It looks like Dependabot actually _does_ automatically rebase (well, rewrite-force-pushes) the PR by default, according to the documentation. Judging by the docs and that issue, it sounds like it does it with webhooks or another timely mechanism rather than waiting for the next run, though neither one is explicit on how quick the turnaround is.
Maybe you're seeing merge conflicts triggered by content-hash because you've disabled this behavior?
content-hash conflicts even w/o dependabot. if 2 people change 2 branches, it's in conflict.
the hosted dependabot of github may in fact have some faster trigger.
but the dependabot-core on a private gitlab does not. its a cron.
My 2 cents: if the content-hash format could be structured as a list of sorted main dependency names and hashes calculated from resolved sub-dependencies, it would reduce the chance of merge conflicts a lot. For example:
[metadata]
content-hash = [
"astroid:03472c30eb2c53",
"flask:bb564576db6a918",
#...
]
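To make the idea above concrete, here is a minimal sketch of how such a list could be produced. Everything in it is an assumption for illustration: per_dependency_hashes is a hypothetical helper, the input shape (main dependency name mapped to its resolved pins) is made up, and this is not how Poetry computes content-hash today.

```python
import hashlib


def per_dependency_hashes(resolved):
    """Return sorted "name:digest" entries, one per main dependency.

    `resolved` maps a main dependency name to the list of pins resolved
    for it (hypothetical input shape, chosen for this sketch).
    """
    entries = []
    for name in sorted(resolved):
        # Sort the pins so the digest does not depend on resolution order.
        pins = "\n".join(sorted(resolved[name]))
        digest = hashlib.sha256(pins.encode("utf-8")).hexdigest()[:14]
        entries.append(f"{name}:{digest}")
    return entries


if __name__ == "__main__":
    before = {
        "flask": ["werkzeug==1.0.1", "jinja2==2.11.2"],
        "astroid": ["wrapt==1.12.1"],
    }
    after = dict(before, flask=["werkzeug==1.0.1", "jinja2==2.11.3"])
    for line in per_dependency_hashes(before):
        print(line)
    # Only the flask entry differs between the two lists.
    print(set(per_dependency_hashes(before)) ^ set(per_dependency_hashes(after)))
```

Because each entry sits on its own line and the list is sorted, two branches that touch different dependencies would change different lines, which git can usually merge without a conflict.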
Here's an approach that has worked well for me and only uses git and (recent) Poetry:
git restore --staged --worktree poetry.lock
poetry lock --no-update
When rebasing a feature branch on main, this preserves pins from the main branch, and recomputes pins for your feature branch. You would then follow up with these commands to continue the rebase:
git add poetry.lock
git rebase --continue
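If you hit this often, the four commands above can be wrapped in a small script. This is just a sketch: resolve_lock_conflict and RESOLVE_STEPS are names made up here, it assumes git and poetry are on your PATH, and it assumes poetry.lock is the only conflicted file at this step of the rebase.

```python
import subprocess

# The same commands as above, in the same order.
RESOLVE_STEPS = [
    ["git", "restore", "--staged", "--worktree", "poetry.lock"],
    ["poetry", "lock", "--no-update"],
    ["git", "add", "poetry.lock"],
    ["git", "rebase", "--continue"],
]


def resolve_lock_conflict(run=subprocess.run):
    """Run each step in order, stopping at the first command that fails."""
    for cmd in RESOLVE_STEPS:
        # check=True raises CalledProcessError on a non-zero exit code.
        run(cmd, check=True)


if __name__ == "__main__":
    resolve_lock_conflict()
```

The run parameter is only there so the sequence can be dry-run or tested without touching a real repository.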