As per this discussion thread:
https://mail.python.org/pipermail/distutils-sig/2015-October/thread.html#27234
it would be very nice if there were better ergonomics around package uploads -- in particular some way to upload a new release, and then take a look over it to double-check that everything is correct before you -- as a second step -- hit the button to make it "go live". Glyph suggests that in particular he'd like to be able to actually run a test install against the uploaded data as an end-to-end test:
https://mail.python.org/pipermail/distutils-sig/2015-October/027259.html
which indeed sounds glorious, and I think the super-slick way to do this would be to provide a unique private index URL which gives a view of what pypi would look like after your release goes live, and could be used like
pip install --index-url https://pypi.python.org/tmp/SOMEPKG/acd1538afe267/ SOMEPKG
(https://mail.python.org/pipermail/distutils-sig/2015-October/027263.html)
The idea would be basically that any request to /tmp/SOMEPKG/acd1538afe267/WHATEVER would return the same thing as a request to /WHATEVER, except that requests which would be affected by the addition of the new release would act as if the release had already been made.
The use of a unique URL for each trial upload means that this still plays nicely with caching/CDNs. The inclusion of the package name in the tmp URL allows people to double-check that if they see a URL like this, then they know that the files there were actually uploaded by someone who is trusted to upload that package. You'd want to expire them after some short-but-reasonable time (a few days?) to prevent them being abused as private indices by unscrupulous people, and also just for general hygiene, but that's fine.
Obviously this is very much a post-"become PyPI" wishlist priority request.
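A minimal sketch of the URL mapping described above, assuming the /tmp/PACKAGE/TOKEN/ layout from the example command. The function name and return shape are hypothetical and are not part of any PyPI or Warehouse API; it only shows how a staging request could be separated into "which trial upload" and "which ordinary path".

```python
from typing import Optional, Tuple

STAGING_PREFIX = "/tmp/"  # assumed prefix, taken from the example URL above


def split_staging_path(path: str) -> Tuple[Optional[Tuple[str, str]], str]:
    """Split a request path into (staging context, ordinary path).

    Returns ((package, token), "/WHATEVER") for staging URLs and
    (None, path) for every other URL.
    """
    if not path.startswith(STAGING_PREFIX):
        return None, path
    parts = path[len(STAGING_PREFIX):].split("/", 2)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        # Not enough components to name a package and a token.
        return None, path
    package, token = parts[0], parts[1]
    rest = "/" + (parts[2] if len(parts) == 3 else "")
    return (package, token), rest
```

A server could then answer the ordinary path as usual and overlay the staged files only when the staging context names the package being requested.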
Another advantage of this feature would be that it would provide a way for distributed teams to coordinate releases. E.g. one could use a workflow like:
1. main release manager uploads (hopefully) final source release to one of these staging areas
2. the volunteer who provides osx builds fetches the source from this staging area, builds wheels, and then uploads the wheels to the staging area. (This could even be done via an automated script, or even a fully automated build bot).
3. once all the wheels have been uploaded, the main release manager double-checks things, and then hits the switch to make the release live
4. the source and binary releases appear on pypi in a single atomic transaction
We've had problems in the past with numpy where we used a similar workflow but directly on the real pypi, so that there was a period between steps one and two where the source release was live and the binary release was not. Since pip prefers a 1.10.3 sdist to a 1.10.2 wheel, any users who try to install during this gap suddenly revert from the wheel experience to the far far inferior sdist experience. The solution of course is to coordinate everything offline and like email wheels back and forth to collect them all on a single person's machine before doing the upload in one go. But this is difficult and unpleasant.
You don't need to collect onto a single user's machine. You just need to upload all of the wheels first, but that's still obviously not as nice of an experience. I'm not opposed to this feature though. It's just not a priority at the moment.
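The gap between the sdist and the wheels going live can be illustrated with a deliberately simplified model of the selection rule mentioned above (newest version first, wheel preferred only within the same version). This is a sketch for illustration, not pip's actual code; the Candidate type and pick function are made up here.

```python
from typing import List, NamedTuple, Optional, Tuple


class Candidate(NamedTuple):
    version: Tuple[int, ...]   # e.g. (1, 10, 3)
    is_wheel: bool             # False means sdist
    compatible: bool = True    # a wheel may not match the current platform


def pick(candidates: List[Candidate]) -> Optional[Candidate]:
    """Choose what would be installed under the simplified rule."""
    usable = [c for c in candidates if c.compatible]
    if not usable:
        return None
    # Highest version wins; a wheel only beats an sdist of the same version.
    return max(usable, key=lambda c: (c.version, c.is_wheel))


old_wheel = Candidate((1, 10, 2), is_wheel=True)
new_sdist = Candidate((1, 10, 3), is_wheel=False)
new_wheel = Candidate((1, 10, 3), is_wheel=True)

during_gap = pick([old_wheel, new_sdist])               # the new sdist
after_wheels = pick([old_wheel, new_sdist, new_wheel])  # the new wheel
```

With a staging area, the state represented by during_gap would never be visible to ordinary users, because the sdist and the wheels would go live together.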
It also means there's an extra step on top of twine upload
We hesitated on just uploading wheels before the sdist first because we had no idea what pip would do if it saw that a 1.10.3 release was available, but only in formats that can't be used by the current machine. I'm actually not sure whether in this case I'd prefer pip to error out or to silently fall back to an earlier version, though obviously the fallback is better for handling this particular situation. And you also still have the problem of how to distribute the sdist to the build volunteers if not via pypi.
But yeah, it's manageable. The main advantage of this approach is the testing use case described in the OP; I just wanted to make a note that it also has other advantages. (And of course these are complementary -- you can use the same staging area to first assemble the wheels and then to test the whole assemblage.) Numpy's had to skip release numbers twice in the last few months due to issues interacting with pypi. I'm not saying this is pypi's fault -- I think one was user error and one was a network error -- but rather just, there are a lot of advantages to reducing the cost of errors.
That's easy enough to manage. Just have it default to immediate publish with a flag to enable the two step upload.
It'll install the older sdist.
@dstufft, from a twine UX perspective, I take it you mean something like twine upload --no-publish <artifacts>? I think that would work well, especially if it wrote out the private URL so you could subsequently do twine publish <unpublished release URL>
@ncoghlan Yea something like that. We'll want to consider what it means if/when we move twine into pip too. But either way I think it's reasonably doable in a way that doesn't change the default behavior but makes it easy to opt in.
I saw a link to this in the context of replacing TestPyPI. It's an awesome idea which I'd love to see implemented, but I'd like to note that it's not the only use case for TestPyPI. It's also useful for:
So I hope that there will continue to be a test server after this happens. :-)
Those are interesting use cases, which I feel like the existing TestPyPI does a pretty poor job of handling right now as well. The key difference I think is that both of those things would be best served by something with ephemeral names that automatically expired after some period of time. So I think they're still going to require some broad changes in how TestPyPI functions, and it may make sense to roll those into PyPI at large, but I can see a use for them.
I filed https://github.com/pypa/warehouse/issues/2286 as a separate issue to cover the learning & interoperability testing use cases for Test PyPI.
There's a fair bit of overlap between this issue and #720. See https://github.com/pypa/warehouse/issues/720#issuecomment-347750123 for a more specific design proposal I put together that would give us 3 potential states for a release:
The choice between which kind of release process to use (immediate publication with partially mutable releases vs staged publication with immutable releases) would be made on a per-project basis.
It sounds like it'd be simpler to just allow uploading multiple artifacts simultaneously. That's somewhat more work when cross-compiling, but it reduces extra state tracking. At some level perhaps it's just a tooling thing. For the average package that's just running something like:
$ twine upload dist/*
It seems like it should be possible to instrument the client to upload things in a different way.
We already support uploading multiple artifacts simultaneously. There are a few reasons it doesn't fully solve the problem:
There probably would be some value in making multi-file upload an atomic transaction, though, so e.g. wheels and sdists appear simultaneously and if one upload fails the whole thing is rolled back to try again. I don't think that's what twine upload dist/* currently does. It's definitely not a replacement for real two-phase upload, but it has the nice property that it could be implemented without changing people's workflows at all; twine would just become a tiny bit more robust. I don't know that it's a high priority though, unless it's really easy to implement.
@taion It's not an either/or situation: we can do both, and several of the technical building blocks will be shared. I do agree that if running twine upload for multiple artifacts isn't already an atomic operation, enabling that would be the place to start.
The main concrete benefit I personally see to going further and actually exposing a separate staging index is that it would offer us a path towards truly immutable releases that doesn't require unilaterally changing the rules for existing package publishers.
That makes sense.
I will also note, though, that as a user, I don't really care that much about immediate, hard immutability.
To offer a strawman, if, instead of "immutable", you had "immutable after 24 hours", your flows (2) and (3) above for publishers would still generally work. And while it's certainly possible that, as a user, I'll upgrade immediately to a new release of a dependency and then get bitten by mutability, in practice I'm much more likely to get bitten by bugs in the release itself.
The really frustrating thing is instead if I have some set of dependencies that I haven't touched in a while that suddenly starts breaking because something changed.
To be clear, we don't really support multiple files being uploaded at once -- it just appears that we do because most of the tooling supports passing in multiple files and uploads them one after another. From the point of view of Warehouse though these are completely independent events (and even if you try to upload them at the same time instead of serially, Warehouse will serialize the database transactions). So the only difference between twine upload dist/* on one computer, and multiple computers is the amount of time between the distinct events.
I don't think it's possible to get a "glitch in the artifact transmission" anymore, the upload API has several hashes it computes on the uploaded file and then Warehouse will compute those same hashes and verify them. The errors that typically happen are ones where people don't test their artifacts beforehand or their testing didn't discover some error and they want to undo it. This can be made more likely to happen by the fact that the tooling doesn't exactly make it easy to pre-test your artifacts without setting up a bunch of your own infrastructure.
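The hash check described above can be sketched roughly as follows. The comment only says "several hashes"; the specific algorithms and dictionary keys used here (MD5, SHA-256, 256-bit BLAKE2b) are assumptions for illustration, and the function names are hypothetical rather than Warehouse's real code.

```python
import hashlib
import hmac
from typing import Dict


def compute_digests(data: bytes) -> Dict[str, str]:
    """Digests a client might send alongside the file (assumed set)."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
        "blake2_256": hashlib.blake2b(data, digest_size=32).hexdigest(),
    }


def verify_upload(data: bytes, claimed: Dict[str, str]) -> bool:
    """Recompute the digests server-side and compare with the claimed ones."""
    actual = compute_digests(data)
    known = [name for name in claimed if name in actual]
    # Require at least one recognised digest, and every one must match.
    return bool(known) and all(
        hmac.compare_digest(actual[name], claimed[name].lower())
        for name in known
    )
```

A file corrupted in transit would produce different digests on the receiving side, so the upload would be rejected rather than stored.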
I don't think allowing multiple artifacts in a single HTTP request solves the problem, because often times you have many different machines involved so you want some idea of an atomic unit of work that transcends a single HTTP request.
I still think that the idea of a staged upload (what this ticket calls two stage) is the right way to enable everything but the "test mode" API, which is really best served by a sandbox like environment where everything is transient and automatically expires and is cleaned out automatically.
A while ago, due to a glitch on npm's end, I published a bad release to npm for a package that now gets ~3mm downloads a month. It wasn't the end of the world, though my Twitter mentions weren't the prettiest thing in the world that afternoon.
I've since accidentally published broken packages to both npm and PyPI due to bugs in my own build tooling more than once. For my users and for me, though, it just hasn't been that big a deal. The staging area handles this in principle, but so does the publisher bumping the version again, re-publishing, and moving on.
For the multiple-machine-upload case – of course I'd be pretty unhappy if I unintentionally ended up building numpy from source with a bad BLAS, but this still seems like something that can be solved well outside of PyPI in user space. Per @njsmith, emailing around wheels sounds painful, but doesn't something like a Dropbox shared folder take care of this pretty well?
Not sure what argument you're trying to make here :-). No-one's saying that we need this feature to prevent the end of the world -- obviously we've survived this long without it. But that doesn't mean it wouldn't be nice to have? Wouldn't you rather not have frustrated those people in your twitter mentions, if you had the option?
Another way of looking at this idea is as offering a "Preview" option for releases, the same way GitHub offers that for comments, most blogging engines do for posts, and pull requests do for source code comments.
Folks can definitely live without those previews, but it's really nice to have the option of checking things over before you actually hit the "Ship It" button :)
@njsmith
Hah, fair enough. I don't know though. I've never published or maintained anything as involved from a compilation perspective as numpy, so I don't have your perspective there, but overall my experience with Python packaging is that it's been too hard to publish releases rather than too easy. Maybe I'm just lazy, but I expect that most people are no more likely to actually preview their releases than they are to, say, install and test the actual wheels from setup.py bdist_wheel before uploading them.
I guess what I'm really advocating for is that "hard immutability" via a staging area is a somewhat niche-y feature that likely won't be used by most publishers. By contrast, some sort of pragmatic limitations on mutability to prevent old packages or releases from getting deleted can prevent a lot of pain for users. That was how I stumbled into this discussion in the first place, and I suppose my preference is that the former not block the latter.
I don't believe the two are strictly related, I think the main thing blocking an immutability on releases older than N hours/days/months/whatever is just that I'm not entirely convinced it's a good idea to prevent people from being able to upload wheels for old releases.
I think I may have muddled things up a bit. There's really four factors here around the staging area and immutability:
For the folks in this thread who don't already know the context: The folks working on Warehouse have gotten funding to concentrate on improving and deploying Warehouse, and have kicked off work towards our development roadmap -- the most urgent task is to improve Warehouse to the point where we can redirect pypi.python.org to pypi.org so the site is more sustainable and reliable.
Since this feature isn't something that the legacy site has, I've moved it to a future milestone. But I have opened #2891 for a related feature that we might be able to do in the next few months.
Thanks and sorry again for the wait.
Legacy PyPI used to have "hidden releases", and Warehouse does not. I didn't realize that we'd removed that feature till after writing up the list of deprecations in the PSF blog post. Given that, I'd like for the Warehouse community to prioritize this feature, but am open to pushback of course.
Hidden releases didn't do anything useful, it just hid them in the API, they were still completely there for pip install and such. Though I'm not against prioritizing this feature.
I think @dstufft meant "web UI" rather than API. Still, it wouldn't hurt to have it back (with a UX that makes it clearer that it's just a cosmetic change, and doesn't prevent installation of the nominally hidden version)
Er yea, I meant the web UI. I also think it would hurt, I think it is legitimately confusing to have the concept of a hidden release both for the project maintainers and for the end users installing the software.
For project maintainers, no matter how well we document it, there's always going to be confusion about whether or not it hides it from pip or if it's just in the UI. You can't document your way out of a UX problem, and I think that's exactly what it would be. The closest thing I could imagine doing along these lines is allowing the maintainer to pick the "promoted" release, with the default being "latest non-prerelease, or latest pre-release if none are non-prerelease". That would just make it the default version they land on when you go to the un-versioned URL. However, even that feels iffy to me, because I think it breaks the consistency of Warehouse, and I think that offering end users consistent behavior between projects is important here.
For end users, I think it's confusing as all hell if pip install ... can give them a version that doesn't show up at all in the release history for the project. The promoted idea above solves this, but I think it introduces a new confusion where pip install foo and /p/foo/ give you different results. This isn't an entirely new confusion though, since that's already the case if python-requires is in play and your version of Python doesn't match the latest, but it certainly makes it more likely.
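The default described above ("latest non-prerelease, or latest pre-release if none are non-prerelease") can be written down as a small function. This is only a sketch: the Release type is invented for illustration, and versions are modelled as pre-parsed tuples rather than real version strings.

```python
from typing import List, NamedTuple, Optional, Tuple


class Release(NamedTuple):
    version: Tuple[int, ...]  # simplified sort key, e.g. (1, 2, 2)
    is_prerelease: bool
    label: str                # display string, e.g. "1.3a1"


def default_release(releases: List[Release]) -> Optional[Release]:
    """Release shown at the un-versioned project URL under the stated rule."""
    if not releases:
        return None
    final = [r for r in releases if not r.is_prerelease]
    pool = final or releases  # fall back to pre-releases only if needed
    return max(pool, key=lambda r: r.version)
```

A maintainer-chosen "promoted" release would replace the return value of this function with an explicit setting, which is where the inconsistency with pip install discussed above comes from.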
As a data point, issue #3699 is an example of how this marks a change from legacy pypi and specifically the way gevent uses it---specifically, we pretty much took the "promoted" release approach. gevent has a stable 1.2.2 release and has been making alpha 1.3 releases for a few months now, and recently produced a beta 1.3. During the alpha period, I used legacy pypi to keep the 1.3aX releases hidden, showing 1.2.2 by default. During the beta period, I usually have both 1.2.2 and 1.3bX listed so visiting pypi.python.org/pypi/gevent/ shows them both as I try to move testing over to 1.3. If there's a release candidate phase, I would have that 1.3rcX release as the only non-hidden release and thus shown by default, and of course once 1.3.0 came out that would be the only non-hidden release.
I use a similar approach for RelStorage, where I have an old version still listed because of its legacy compatibility constraints.
Of course this is a manual process, and it's sometimes error prone---In fact, #3699 reminded me that I'd forgotten to switch the hidden releases on pypi.python.org. Whoops!
I understand the arguments for consistency with what pip is going to do by default. I only mention this to provide an example of projects that did use that feature (basically for advertising purposes). Since I'm not offering to write a PR that implements this for warehouse, I won't be at all upset if it is a rejected feature 😄
To be clear: this issue is not about adding back legacy-style hidden releases, it's about adding a "staging" phase for new uploads before they are published. If folks really want legacy-style hidden releases, it probably should be a separate issue.
I created https://github.com/pypa/warehouse/issues/3709 to separate out the "legitimate use cases of a hidden-releases-like feature" discussion.
I arrived at an approach similar to the one @ncoghlan linked to in https://github.com/pypa/warehouse/issues/726#issuecomment-348042448.
At the PyPA minisummit in early May (at PyCon North America), we discussed the desire to increase pip's and PyPI's strictness with regard to metadata. Part of that progress will require staged releases -- a temporary state that a release can be in where PyPI has it but hasn't published it yet. This will let project owners/maintainers review where their fresh releases are out of compliance, and help us provide soft warnings during the intermediate period where we're warning maintainers/owners about failing strictness checks but not yet blocking releases on those new stricter checks. I think of this state/feature as "package preview".
We'll need database support for understanding the release state ("is this published or not").
As I understand it, this is a feature we want and are soliciting help to implement.
Yep, "package preview" would be a good name for it.
I would definitely use this (and I thought test.pypi.org let us do this, but it does not).
Since we want this feature, and since so many other features depend on it, I've added it to the Fundable Packaging Improvements wiki page; organizations that are interested in donating a directed gift to implement this, please contact the Packaging Working Group and we can start scoping it out further.
This would be great. I also incorrectly assumed Test PyPI did exactly this, like many before me, and got the dreaded File already exists while trying to test upload metadata
Filed issue #6843 after seeing the debacle that three hours with only a source package can cause. Sorry, might be a duplicate of this.
Per request, repeating here my thoughts from #6843. I see three possible solutions, each of which would fix the underlying problem (not suggesting that I'm the first one to think about them, but just trying to consolidate them here; having a "hidden" release-in-progress like proposed above would work too):
be able to mark a specific version as "default". It would be this version that is used by a plain "pip install xxx". With this, we would be able to complete and test the release before switching the "default" to it. It should be an opt-in behavior, of course; if the "default" version is not explicitly set then it's the most recent version like now.
or, pip could prefer a binary wheel from a previous release over the source code from the current release (again, only if the project is configured in that way). This may require changes inside pip, though.
or, we could upload and test the new version on TestPyPI. This solution would only require a single-click button to atomically transfer a complete release from TestPyPI to PyPI. It might be the simplest to implement.
I would like to add a late "+1" to the idea of separating the 'upload' and 'publish' steps. I'm working on a simple automated release pipeline which:
If there's some problem with the package on PyPI-Test, then I have to bump the version stored in the commit, or PyPI-Test will tell me that the file already exists (well, it's right - it does).
However, this now either ties the version on PyPI to the number of failures getting to PyPI-Test (making the first release on PyPI be version 1.0.2 would certainly look a bit odd), or I have to keep a 'prerelease' part of the version in place until everything passes through the PyPI-Test stages, and then do another commit with the 'real' version and run the pipeline again. Commits and version numbers are fairly cheap, but still have costs.
Currently, I'm working around this by making my __init__.py aware (via an environment variable) of whether it's for PyPI-Test or PyPI, and having it tweak the version reported. However, that's quite clumsy. Having the separate upload and publish steps (with the package fully mutable - including delete - until publish), along with pip install --unpublished mypkg==1.0.0 would be a much nicer solution.
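For readers wondering what such a workaround might look like, here is a rough sketch. The environment variable name and the choice of a ".devN" suffix are assumptions made for this example; the comment above does not say how the version is actually tweaked.

```python
import os

BASE_VERSION = "1.0.0"


def reported_version(environ=os.environ) -> str:
    """Return the real version, or a throwaway one for test uploads.

    MYPKG_TEST_UPLOAD_ATTEMPT is a hypothetical variable that the pipeline
    would set only when uploading to the test index.
    """
    attempt = environ.get("MYPKG_TEST_UPLOAD_ATTEMPT")
    if attempt is None:
        return BASE_VERSION
    # Each retry gets a distinct version, so "file already exists" is avoided.
    return f"{BASE_VERSION}.dev{int(attempt)}"


__version__ = reported_version()
```

With separate upload and publish steps, none of this would be needed, since a failed draft could simply be replaced under the same version number.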
@di My recollection is that you mentioned that you're working on this. How is that going?
I'd love to have something like this but for security reasons.
I love automation, and want to be able to use something like Github Actions/Travis/Gitlab to automatically build packages and upload artifacts to PyPI... especially since it allows me to easily build packages across multiple platforms/versions (Linux/macOS/Windows)
BUT I don't want anything to be automatically published (and not merely behind some flag on twine: my upload token should only be usable to upload new packages/artifacts, with no way to use said token to publish).
Then the maintainers for a package (hopefully with 2FA enabled) would have to log in to PyPI and then clickety clack buttons to publish the package to the wider web.
Preferably with some way of validating that the shasum on pypi matches what is available in the CI logs so that I can validate that the packages have not been tampered with after being built.
I would also want to have some way to remove an unpublished artifact, in case the upload token was stolen, and was abused, so that an attacker can't force version numbers to be bumped unnecessarily.
As I said in https://github.com/pypa/warehouse/issues/726#issuecomment-500162652 , having this feature will help us roll out various "hey maintainer! watch out!" notifications. This will help, among other things, with smoothing out the rollout of the new and improved pip resolver, so we can tell maintainers "hey! the new resolver is having a hard time with your package! help us out?" Or similar.
We'd love help with this feature, if anyone wants to step up. (Dustin hasn't started working on it.)
Chatting with @alanbato about this today, here are our thoughts on how to do this:
Release model: published - a timestamp, added from created for all existing releases
Project model: auto_publish_releases - boolean, default to true
published is set for public UIs, simple UI
a1b2c3 is a hash of the release id
published timestamp is set
Long term, we'll also want the following:
Any feedback here welcome, especially thoughts about:
how installing an unpublished release should work with pip (obfuscated simple index and --index-url?)
Sounds great @di
And just to be clear, an unpublished release can be deleted entirely, right?
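The two fields named in the proposal above (a published timestamp on the release, an auto_publish_releases boolean on the project) could fit together roughly like this. It is a plain-Python sketch under those assumptions, not Warehouse's actual models; everything other than those two field names is invented for illustration, and it takes no position on the deletion question.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Release:
    version: str
    created: datetime
    published: Optional[datetime] = None  # None means still unpublished


@dataclass
class Project:
    name: str
    auto_publish_releases: bool = True  # default keeps today's behaviour
    releases: List[Release] = field(default_factory=list)

    def add_release(self, version: str) -> Release:
        now = datetime.now(timezone.utc)
        release = Release(
            version,
            created=now,
            published=now if self.auto_publish_releases else None,
        )
        self.releases.append(release)
        return release

    def public_releases(self) -> List[Release]:
        # Public pages and the simple index would only list published ones.
        return [r for r in self.releases if r.published is not None]


def publish(release: Release) -> None:
    """Second step of a staged upload: make the release visible."""
    if release.published is None:
        release.published = datetime.now(timezone.utc)
```

Because auto_publish_releases defaults to true, projects that never opt in would see no change in behaviour.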
For everyone who wanders here and would like to chip in, an implementation proposal is now being discussed at:
https://discuss.python.org/t/feature-proposal-for-pypi-draft-releases/3903
@alanbato would like comments on this proposal by 30 April (10 days from now). I posted some background context on distutils-sig.
@alanbato listed some still-outstanding questions on Discourse.
@alanbato How's this going?
I haven't had the chance to work on the proof of concept, as I'm going through the last weeks of getting my degree.
Once that's done I'll spin up a PR with a PoC so people can comment on the implementation rather than the idea (since nobody had anything else to say).
For the unanswered questions, I'll make my own decisions and assumptions, and once I have something to show I'm sure it'll be easier to make a decision.
@alanbato Congrats on the final weeks of getting your degree! Should we expect to hear from you again around, say, 7 July?
@brainwane Yes, I'll have something to show by then :)
Hi, @alanbato! How is the feature coming along?
how installing an unpublished release should work with pip (obfuscated simple index and --index-url?)
After experimenting with @alanbato on this today, we determined this should be --extra-index-url instead, so the "draft"/obfuscated index only needs to include the draft release of the draft package, and doesn't need to include every other package available.
(@ewdurbin originally said this here)
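As a rough illustration of how a draft index and --extra-index-url might combine, the sketch below derives a short token from a release id and builds the corresponding pip command line. The host, the URL layout, the use of SHA-256 and the six-character truncation are all assumptions for this example; none of them is a real PyPI endpoint or the scheme that was actually implemented.

```python
import hashlib
from typing import List

# Placeholder host and path, chosen for illustration only.
DRAFT_INDEX_BASE = "https://pypi.example/draft"


def draft_token(release_id: str, length: int = 6) -> str:
    """Short hash of the release id, in the spirit of the 'a1b2c3' example."""
    return hashlib.sha256(release_id.encode("utf-8")).hexdigest()[:length]


def pip_install_args(project: str, version: str, release_id: str) -> List[str]:
    """Command line that adds the draft index on top of the normal one."""
    index = f"{DRAFT_INDEX_BASE}/{project}/{draft_token(release_id)}/"
    return ["pip", "install", "--extra-index-url", index, f"{project}=={version}"]
```

Because the draft index is only an extra index, all other dependencies would still resolve from the regular index, which is the point made above.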
I think that PyPI is getting closer to this feature nowadays with the ability to "yank" and "un-yank" releases. You can make your release but "yank" it as quickly as possible, so that a default pip install myproject does not use it. It will still be used by an explicit pip install myproject==1.3. When everything seems to work, "un-yank" it. There is still a window at the start during which a high-volume project is likely to get a few downloads from unsuspecting users---who will then be stuck with your release candidate even if you yank it, as long as they are using the same virtualenv, as far as I could test---but that's still progress.
If only we could create a new release initially in the "yanked" state, we would actually have a way to do atomic PyPI releases.
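The yank behaviour described above can be modelled in a few lines. This is a simplified sketch of the rule (yanked releases are skipped unless pinned exactly), not pip's implementation; the Release type and select function are made up for illustration.

```python
from typing import List, NamedTuple, Optional, Tuple


class Release(NamedTuple):
    version: Tuple[int, ...]
    yanked: bool = False


def select(
    releases: List[Release], pinned: Optional[Tuple[int, ...]] = None
) -> Optional[Release]:
    """Pick a release for 'pip install pkg' or 'pip install pkg==X'."""
    if pinned is not None:
        # An exact pin may still resolve to a yanked release.
        matches = [r for r in releases if r.version == pinned]
        return matches[0] if matches else None
    visible = [r for r in releases if not r.yanked]
    return max(visible, key=lambda r: r.version) if visible else None
```

In this model, a release that was created already yanked would never be returned by an unpinned select call, which is the atomic behaviour wished for above.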
@alanbato How's this going?
Hey @brainwane! So, the current status is this:
Once @di has time to review, I'll make the necessary changes and get it ready for a formal PR to this repo :)