Pip: Installation candidate suggestion

Created on 4 Apr 2018 · 12 Comments · Source: pypa/pip

  • Pip version: All
  • Python version: All
  • Operating system: All

Description:

Let's say I run

pip install package

and package is not available, so we get the following message: "Could not find a version that satisfies the requirement package (from versions: ) No matching distribution found for package"

What I would like to have is a list of possible installation candidates that the user might have meant.

pip install package

Looks like package isn't available. Did you mean one of these?

  • Candidate-1
  • Candidate-2
  • Candidate-3
    and so on...

The use case is to offer the user alternatives if they have misspelt the package name or are not sure of it. We can use spell-checking or approximate string matching to build the list of installation candidates.
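A minimal sketch of the approximate string matching idea, assuming the list of known project names is already available somehow. The sample names and the helper `suggest_candidates` are hypothetical and not part of pip; the matching itself uses `difflib.get_close_matches` from the standard library.

```python
import difflib

# Hypothetical sample of known project names; a real list would come from the index.
KNOWN_PROJECTS = ["requests", "numpy", "pandas", "Flask", "Django", "scipy"]


def suggest_candidates(name, known=KNOWN_PROJECTS, limit=3, cutoff=0.6):
    """Return up to `limit` known project names that look similar to `name`."""
    # Compare in lowercase so that "django" can match "Django".
    by_lower = {project.lower(): project for project in known}
    matches = difflib.get_close_matches(
        name.lower(), list(by_lower), n=limit, cutoff=cutoff
    )
    # Map the lowercase matches back to their original spelling.
    return [by_lower[match] for match in matches]


for typo in ("reqeusts", "djnago"):
    print(typo, "->", suggest_candidates(typo))
```

The `cutoff` value trades recall for noise: a lower value suggests more names, but also more unrelated ones.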

I would like to hear everyone's opinion on this. I would like to work on it :)

needs discussion feature request

All 12 comments

Hi @saikatkumardey! This sounds like a nice idea.

We're currently in the beta period for pip 10.0, which means we won't be merging any new features or enhancements until pip 10.0.0 is released.

Sorry to keep you waiting but I hope you'd understand. :)

I would like to work on it :)

Awesome. If you want to get started, you can work on one of the issues with a "good first issue" label on them.

Thanks @pradyunsg . :)

Ping @saikatkumardey! :)

Pong @pradyunsg! :)

Are you still interested in working on this (or something else related to pip)?

I am interested in working on this.

I think a good first step would be to investigate how to get a list of similarly-named PyPI projects, when an installation command fails. I'm not sure if we have such an endpoint on PyPI; this could even likely work with an endpoint that gives us all registered project names.

If the above aren't available, you'd need to make changes to PyPI to make them available for pip to consume and provide these suggestions.

If they are, then the next step would be figuring out "similar names" and then modifying the installation command to provide the suggestions.

Here's what I think:

An endpoint to get all package names exists at https://pypi.org/simple/ . We could periodically call this endpoint and cache the project names on disk. The only problem here is that we need to scrape the HTML, and the number of items is huge.
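To give an idea of what the scraping would involve, here is a rough sketch that pulls project names out of a page shaped like the PEP 503 root index, where each project is one anchor element whose text is the project name. `SimpleIndexParser`, `parse_project_names` and the sample HTML are made up for illustration; the sketch parses a small inline string instead of downloading the real (very large) page.

```python
from html.parser import HTMLParser


class SimpleIndexParser(HTMLParser):
    """Collect the text of every <a> element on a simple index root page."""

    def __init__(self):
        super().__init__()
        self.project_names = []
        self._in_anchor = False
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_anchor = True
            self._text = []

    def handle_data(self, data):
        if self._in_anchor:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_anchor:
            name = "".join(self._text).strip()
            if name:
                self.project_names.append(name)
            self._in_anchor = False


def parse_project_names(html_text):
    """Return the project names listed on a simple index root page."""
    parser = SimpleIndexParser()
    parser.feed(html_text)
    parser.close()
    return parser.project_names


# Made-up sample in the shape of a simple index root page.
SAMPLE_INDEX = """<!DOCTYPE html>
<html>
  <body>
    <a href="/simple/flask/">Flask</a>
    <a href="/simple/numpy/">numpy</a>
    <a href="/simple/requests/">requests</a>
  </body>
</html>"""

print(parse_project_names(SAMPLE_INDEX))
```

The parsing itself is simple; the open question is the cost of fetching and walking a page with that many anchors.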

An endpoint could be added to PyPI to allow partial matches. In that case, we could simply run pip search when installation fails.

I'll investigate more and update here.

As mentioned in PEP 503, I believe https://pypi.org/simple/ could be used to build out the first version of this feature without having to make any change to PyPI.

My approach:

  1. Store the list of packages on client's disk.
  2. Allow the client to update the list using pip update. This idea is inspired by the apt update command.
  3. Use this list to provide "similar" package-names to the user when the installation fails. The "similarity" could be based on the approach by Peter Norvig https://norvig.com/spell-correct.html .
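Step 3 could look roughly like the sketch below, which adapts the edit-distance idea from the Norvig article to project names. It assumes names are already lowercase and use only letters, digits and hyphens. The article ranks candidates by word frequency; no such data is assumed here, so results are sorted alphabetically instead. The function `similar_projects` is hypothetical.

```python
import string

# Assumption: names are normalized to lowercase letters, digits and hyphens.
ALPHABET = string.ascii_lowercase + string.digits + "-"


def edits1(name):
    """All strings one delete, transpose, replace or insert away from `name`."""
    splits = [(name[:i], name[i:]) for i in range(len(name) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [
        left + right[1] + right[0] + right[2:]
        for left, right in splits
        if len(right) > 1
    ]
    replaces = [
        left + ch + right[1:] for left, right in splits if right for ch in ALPHABET
    ]
    inserts = [left + ch + right for left, right in splits for ch in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def edits2(name):
    """All strings two edits away from `name`."""
    return {second for first in edits1(name) for second in edits1(first)}


def similar_projects(name, known_names):
    """Known names within one edit of `name`, else those within two edits."""
    known = set(known_names)
    near = sorted(edits1(name) & known)
    if near:
        return near
    return sorted(edits2(name) & known)


known = ["requests", "numpy", "pandas"]
print(similar_projects("reqeusts", known))  # one transposition away
print(similar_projects("rqests", known))    # two insertions away
```

The two-edit set grows quickly with the length of the name, so a real implementation would probably want a cheaper second pass.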

@pradyunsg Let me know your thoughts.

Having "pip update" exist only to fetch a listing of PyPI project names, which goes out of date quickly, plus the potential for confusion with a pip upgrade command that's on the horizon, makes it a little difficult for me to just say, yeah, let's do that. :/

The alternative, a multi-megabyte request (even with some caching) every time an installation fails because no matching distribution is found, is also not suitable IMO. :/

I agree with @pradyunsg here, none of the alternatives sound particularly attractive.

The simple index is the correct way to get the list of all projects on an index server (based on the defined capabilities of an index server in PEP 503). Storing a local copy shouldn't be necessary - that's why we cache HTTP requests after all. I'm a pretty strong -1 on any local copy that needs user intervention to keep up to date - if we need that, then our HTTP caching is failing to do its job. (FWIW, I don't like the apt approach of expecting the user to do apt update either - like @pradyunsg says, it's too easy to confuse "update the cache" with "update all my packages", as well as being an annoying manual step for the user).

However, grabbing, parsing, and extracting possible alternative spellings of a requirement from that (large) list is potentially costly, and the benefit is small. So while we should get it working first, then worry about optimising, I do think that a proof of concept that focuses on confirming up front that the cost is not too high, would be worthwhile.

If we were to add a new PyPI endpoint, it should be defined as an (optional) capability for all index servers and documented in PEP 503. The code in pip would need to cater for the possibility that the user's index didn't support that API and fail gracefully in that case (not everyone uses PyPI). Also, how will the code handle --find-links and similar? We need to consider how to ensure that users don't end up raising issues like "when I typed pip install foo by mistake, why didn't pip suggest fooo which I have in my local index/wheelhouse?" This doesn't have to be perfect in the initial implementation, but we should be considering it now, so that the design is flexible enough to deal with it properly in the longer term.
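One possible shape for that flexibility, purely as a sketch: treat every place names can come from (an index, a --find-links directory, a wheelhouse) as a separate source, and let a source that cannot provide a listing contribute nothing instead of failing the command. All names below (`collect_known_names`, `suggest`, and the two sample sources) are hypothetical and do not reflect pip's internals.

```python
import difflib


def collect_known_names(sources):
    """Merge project names from several sources, skipping any that fail.

    Each source is a zero-argument callable returning an iterable of names.
    """
    names = set()
    for source in sources:
        try:
            names.update(source())
        except (OSError, ValueError):
            # Assumed policy: suggestions are best-effort, so an index
            # without a usable listing is ignored, not reported as an error.
            continue
    return names


def suggest(name, sources, limit=3):
    """Suggest similar names drawn from every source that responded."""
    known = sorted(collect_known_names(sources))
    return difflib.get_close_matches(name, known, n=limit)


def names_from_index():
    # Stand-in for an index server that does not expose a project listing.
    raise OSError("index does not expose a project listing")


def names_from_wheelhouse():
    # Stand-in for names discovered in a local wheelhouse.
    return ["fooo", "bar"]


print(suggest("foo", [names_from_index, names_from_wheelhouse]))
```

With this structure, the "pip install foo" case above would still find fooo in the local wheelhouse even when the index cannot list its projects.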

Overall, this proposal feels to me like it's a lot of complexity and effort for minimal benefit to the user. It's not that suggesting what the user might have meant isn't nice to have - but I'm not sure it justifies the cost in this case.

Thanks for your input, @pradyunsg and @pfmoore . Totally makes sense. I'll investigate further & build a POC in the process and update here. :)
