Pipenv: Running `pipenv --completion` is very slow

Created on 23 Dec 2017  Â·  31Comments  Â·  Source: pypa/pipenv

Running pipenv --completion is consistently slow (it takes about 1 second to run), even after the command has been run multiple times and everything has been loaded into memory.

This means including eval $(pipenv --completion) in a .bashrc file makes the user wait for a second until the shell prompt comes up.

Describe your environment
  1. OS: Kubuntu 17.04
  2. Python version: 2.7.13
  3. Pipenv version: 9.0.1

My shell is bash.

Expected result

The command shouldn't take so long to run.

Actual result

When running in bash, the command takes a whole second to run.

$ time pipenv --completion

_pipenv_completion() {
    local IFS=$'\t'
    COMPREPLY=( $( env COMP_WORDS="${COMP_WORDS[*]}" \
                   COMP_CWORD=$COMP_CWORD \
                   _PIPENV_COMPLETE=complete-bash $1 ) )
    return 0
}

complete -F _pipenv_completion -o default pipenv


real    0m0.987s
user    0m0.888s
sys     0m0.080s
Steps to replicate

Run time pipenv --completion in a bash shell.

P.S. This is a great project.

Type help wanted

Most helpful comment

@eli-schwartz click handles much of the overhead so it saves us from having to be experts on this specific thing most of the time

Yes, but I sort of wish click generated proper completions. :smiley: There are tools which generate "proper" completions thereby saving every application developer the effort of knowing how it works, while still not recursively calling the tool itself for every little thing.

Anyway, not understanding how to write completions is one thing, not knowing how to install them is another.

@eli-schwartz though this doesn’t stay up-to-date when you upgrade. i have far too many things in my completions to remember to re-export their completion script every time up upgrade.

But it does. The current completion simply executes pipenv every time you try to complete anything; this doesn't need to be checked for updates to the completion definition...

Either way it does not stop the completion from being delivered in data_files to share/bash-completion/completions, which should work, and be upgraded, whether installed via sudo pip (:cry:), pip install --user (since https://github.com/scop/bash-completion/commit/1d25d72ca8633c19cb089dff447d08c531379c59 it will be loaded from ~/.local), or python setup.py install --root="$package_manager_install_root".
(It will not work for wheels, in which case oh well, you're certainly not worse off than if you didn't use data_files at all. Also IMHO wheels suck.)

This would save on one initial call to pipenv on every single new interactive shell, which while not quite as significant as one call on every single pipenv <TAB> is still not insignificant. It's why bash-completion created the idea of lazy-loaded completions and tries to encourage people to use that over the "COMPAT_DIR" in /etc/bash_completion.d.

you should not assume that people don’t know much at all about thing x. it’s better to assume that their workflow, use-case, or tolerance is different to yours.

In the ideal case, users do not know, because it is automagically handled by developers or distro packagers. For example, https://git.archlinux.org/svntogit/community.git/commit/trunk?h=packages/python-pipenv&id=484eec9eabad806654fea29801577dc2cbe4188a

...

At the very least, put your eval thing into the lazy-load directory, so it doesn't execute on every single new shell init but only gets executed the first time you try to tab-complete this specific tool.

All 31 comments

Same here using zsh.

Describe your environment

  1. OS: Arch Linux (4.14.8-1-ARCH)
  2. Python version: 3.6.3
  3. Pipenv version: 9.0.1

Result

$ time pipenv --completion 

#compdef pipenv
_pipenv() {
  eval $(env COMMANDLINE="${words[1,$CURRENT]}" _PIPENV_COMPLETE=complete-zsh  pipenv)
}
if [[ "$(basename ${(%):-%x})" != "_pipenv" ]]; then
  autoload -U compinit && compinit
  compdef _pipenv pipenv
fi

pipenv --completion  1.20s user 0.09s system 99% cpu 1.301 total

much more slow ..

pipenv --completion  2.64s user 0.29s system 99% cpu 2.954 total

I got a same issue

  1. OSX sierra
  2. Python 2.7.14
  3. pipenv 9.0.1
pipenv --completion  1.15s user 0.20s system 98% cpu 1.374 total 

Same with zsh

  1. elementary OS Loki (based on Ubuntu 16.04)
  2. Python 3.5.2
  3. pipenv 9.0.1

pipenv --completion 1,11s user 0,08s system 100% cpu 1,186 total

fish on Xubuntu 17.10 (virtualbox), python 3.6.3, pipenv 9.0.1

1.28user 0.17system 0:02.09elapsed 69%CPU

Um, this seems to be turning into a report zone. I think by this point everyone reading this thread realises pipenv --completion is very slow, and would like to ask people from here on to stop putting up your time results. Thanks in advance.

Okay, so I just had this issue so decided to go looking at the code... The reason would seem (from a quick skim) that everything is in the https://github.com/pypa/pipenv/blob/master/pipenv/cli.py file, which has ~58 lines of imports. Of course, 99% of these probably aren't necessary to generate the ~8 fairly static lines for completion!

I'd say the solution would be to split the CLI commands out into at least something like cli for parsing, and cli_heavy where commands are actually run (and do all the importing there, _after_ the arg parser has done it's thing).

I might get around to trying something like this in a bit. If not, it's documented here from others :)

@RickyCook there actually was an implementation that used lazy imports to approach this problem and was largely functional but had to be reverted due to corruption of master. I am definitely not the click expert on the maintainer team but if you know how to speed up imports I'm sure we would love to see an implementation and would be willing to offer some suggestions

Actually when I first investigated the issue myself I believe I discovered that the reason is that pipenv always does some network communication to determine if it needs to update itself, even if it's just printing the output for pipenv --completion. Is that not the case? It would make sense to disable auto-update for this specific command, especially since it is meant to be run only once at the beginning of a shell session. Theoretically pipenv could go out of date during the session anyway, resulting in incorrect completion behavior.

@techalchemy I have a basic version that's doing a kind of lazy imports (but it's pretty gross; could be done much more intuitively with a bit more time and effort). at the moment, it doesn't speed up much ~because of some "delegate" thing, which is a subprocess wrapper of some sort? seems very strange; about to look into it~ (I take that all back; delegate effects time very little, as shown with the timings below. The lazy imports cut out almost all the 0.38s spent on importing, as expected)

@bdusell update is off by default, but if it's on then that will definitely effect performance

updates are at
https://github.com/pypa/pipenv/blob/master/pipenv/cli.py#L1625

completion is at
https://github.com/pypa/pipenv/blob/master/pipenv/cli.py#L1638

after some more simplistic investigation, I'm not really sure where the time is going :\
after smattering some timing statements from before the first import through to the sys.exit:

out of 1.47 seconds,

  • 0.38 spent on imports
  • 0.001 spent on executing cli.py (from after the last import to the last line in the file)
  • 0.001 spent on getting from cli.py executed to the start of the def cli(...)
  • 0.000006 spent getting from the start of the cli func to just before the if completion block
  • 0.003 spent on executing the completion block

clearly there's a huge chunk of... something missing here

methodology for this was:

import time

times = []
times.append(time.clock())

# ... do things ...

times.append(time.clock())
print("something happened", times[-1] - times[-2])

# ... do more things ...

times.append(time.clock())
print("more things happened", times[-1] - times[-2])

this should add up pretty well to the full execution length of the program if continued right to the end, so there's clearly something happening before the first import, and/or after the os.exit

for reference, with the lazy load hack the same timings were:

total time is 1.05s

  • 0.02 until end of imports
  • 0.001 until end of cli.py
  • 0.0005 until start of def cli(...)
  • start of the cli func is also start of if completion block
  • 0.002 until os.exit

here's the super hacky POC with debug timing: https://github.com/rickycook/pipenv/blob/cli-speed/pipenv/cli.py

@RickyCook The general idea is not hacky at all; I use this kind of technique a lot to gain speed around seldom used imports. The load_heavy_command() decorator is a bit too magical to me though; I would prefer just explicitly import corresponding implementation in each command instead.

@click...
@a lot of click things...
def install(*args, **kwargs):
    from .cli_heavy import install
    install(*args, **kwargs)

yeah, the decorator was the hacky weird bit that I just used to avoid duplicating a bunch with copy/paste while testing. the part I specifically didn't like was the disconnect between the function body, and the actual executed code :P

... but this doesn't really fix the loading speeds anyway, so it's probably not worth changing unless we can figure out what the cause of the other 60% of the lost time is

Updated to remove the delegator, and replace with call directly to click_completion and we're down to 0.4s: https://github.com/rickycook/pipenv/blob/cli-speed/pipenv/cli.py#L245

It's ~0.9 for me:

pipenv --completion 0.92s user 0.13s system 99% cpu 1.062 total

Also, even 0.4s is so slow.

you get what you pay for :)

tldr: PR incoming to ~halve the time pipenv --completion takes, at least for me.


(click to expand, if somehow you're interested in what are basically random notes I was making while doing this :P)

Even with a hot cache, it's still over a second here. It seems to take an inordinate amount of processing power to essentially just print some text (yes, that's briefly 60% on one core):

image

The actual completion itself is ok, it's just outputting the setup that takes way longer than it should.

Let's try to work out why!

Here's some data from my (admittedly somewhat old) machine, gathered via sprinkling print(..., time.time()) around in interesting places:

__init__: startup 1520456183.8500586
__init__: importing .cli: 1520456183.8503625
cli: startup 1520456183.8590631
cli: stdlib imports done 1520456183.8704586
cli: 3rd-party imports done 1520456183.9269598
cli: 1st-party imports done 1520456183.9281874
cli: end of module reached 1520456183.9306123
__init__: imported .cli: 1520456183.9306629
__main__: importing cli 1520456183.9308915
__main__: imported cli 1520456183.930915
__main__: running cli() 1520456183.9309256
cli: start of cli() 1520456183.9327562
cli: imported .core 1520456184.3628652
cli: not doing update check 1520456184.3629308
cli: started doing completion 1520456184.3629456
cli: beginning process 1520456184.3629816
cli: started process 1520456184.8666577

_pipenv_completion() {
    local IFS=$'\t'
    COMPREPLY=( $( env COMP_WORDS="${COMP_WORDS[*]}" \
                   COMP_CWORD=$COMP_CWORD \
                   _PIPENV_COMPLETE=complete-bash $1 ) )
    return 0
}

complete -F _pipenv_completion -o default pipenv

cli: printed process output 1520456184.8668966

Conclusions:

  • it takes about 50ms to import all the 3rd-party libraries used in pipenv.cli
  • importing pipenv.cli itself takes about 80ms (~10ms to start actually executing cli.py, 50ms of 3rd-party imports, a little more here and there)
  • importing pipenv.core takes about 420ms
  • running pipenv again via delegator.py takes about 480ms (in comparison, running true by the same method takes only about 4ms)
  • printing the actual output from the child pipenv takes about 1ms

Pain points:

  • from . import core
  • delegator.run('pipenv')

Profiling pipenv.core

core: startup 1520456888.883466
core: done import section 1 1520456888.8928788
core: done import section 2 1520456889.2973905
core: done import section 3 1520456889.2974393
core: done all imports 1520456889.317209
core: end of module reached 1520456889.3178222

Looks like the second group of imports is taking 400ms (of 420ms for the whole core module to run)! Drilling down further:

core: done import section 1 1520457284.6354196
core: importing requests 1520457284.6517467
core: importing pipfile 1520457284.7360592
core: importing pipdeptree 1520457284.7379525
core: importing semver 1520457285.025176
core: imported those 1520457285.0279028
core: done import section 2 1520457285.0321164

Requests is taking ~100ms (it's a big comprehensive library, not that surprising), but pipdeptree takes nearly 300ms all by itself!

I could probably try and work out why importing that takes so long, but I have to wonder – why is pipenv.core being imported before completion runs? If I move the if completion: block right up to the start of pipenv.cli.cli, that halves the amount of time pipenv --completion takes to run!

(I went and looked at why pipdeptree takes so long to import – it's because pip takes a relatively long time to import (import pipdeptree + python overhead took about 500ms in testing, with import pip being 350ms of that).)

i wouldn’t have called 0.4s a satisfying completion to this issue; it’s still a LOT slower than it should be, and still fairly unusable to add to shell rc scripts

I'm guessing you're getting that super, super fast 0.4s time because you're not running on a Raspberry Pi 3? I'm currently getting 6 seconds on v11.8.0.

[me@piserver ~]$ time pipenv --completion

_pipenv_completion() {
    local IFS=$'\t'
    COMPREPLY=( $( env COMP_WORDS="${COMP_WORDS[*]}" \
                   COMP_CWORD=$COMP_CWORD \
                   _PIPENV_COMPLETE=complete-bash $1 ) )
    return 0
}

complete -F _pipenv_completion -o default pipenv

real    0m6.283s
user    0m6.199s
sys 0m0.080s

This might be a naive question, but is there any reason we can't cache the output of the command for the current Pipenv version?

the solution is to just pipe the output of --completion and put the output into your .profile

this is pretty obvious :)

What a great time to be alive. Output some static text - 0.4 seconds

@Djaler you're welcome to fix it and pr your fix if you have one

the solution is to just pipe the output of --completion and put the output into your .profile

The solution is to pipe the output to /usr/share/bash-completion/completions/pipenv or ~/.local/share/bash/completion/completions/pipenv, but surprisingly few people seem understand much at all about completion.

Of course, this does nothing to speed up the actual tab-completion part, despite the fact that most completion definitions (i.e. the non python-autogenerated ones) are written to avoid shelling out to any external commands more than necessary, using shell builtins wherever possible and generating then hardcoding (via shell arrays) anything application-specific in the 99% of cases where it won't change without source code changes.

@eli-schwartz click handles much of the overhead so it saves us from having to be experts on this specific thing most of the time

@eli-schwartz though this doesn’t stay up-to-date when you upgrade. i have far too many things in my completions to remember to re-export their completion script every time up upgrade.

so far, the best solution for my workflow has been to remove the completion, and stop using pipenv until it’s fixed.

you should not assume that people don’t know much at all about thing x. it’s better to assume that their workflow, use-case, or tolerance is different to yours.

As a small FYI pipenv load time should be substantially faster in the next release. I don’t really know what would be acceptable if 0.4s made you uninstall and stop using pipenv entirely though (what is the implication here? You can’t use pipenv unless it has completion? I rarely use completion so I’m not even sure why this would make you abandon a tool entirely

@eli-schwartz click handles much of the overhead so it saves us from having to be experts on this specific thing most of the time

Yes, but I sort of wish click generated proper completions. :smiley: There are tools which generate "proper" completions thereby saving every application developer the effort of knowing how it works, while still not recursively calling the tool itself for every little thing.

Anyway, not understanding how to write completions is one thing, not knowing how to install them is another.

@eli-schwartz though this doesn’t stay up-to-date when you upgrade. i have far too many things in my completions to remember to re-export their completion script every time up upgrade.

But it does. The current completion simply executes pipenv every time you try to complete anything; this doesn't need to be checked for updates to the completion definition...

Either way it does not stop the completion from being delivered in data_files to share/bash-completion/completions, which should work, and be upgraded, whether installed via sudo pip (:cry:), pip install --user (since https://github.com/scop/bash-completion/commit/1d25d72ca8633c19cb089dff447d08c531379c59 it will be loaded from ~/.local), or python setup.py install --root="$package_manager_install_root".
(It will not work for wheels, in which case oh well, you're certainly not worse off than if you didn't use data_files at all. Also IMHO wheels suck.)

This would save on one initial call to pipenv on every single new interactive shell, which while not quite as significant as one call on every single pipenv <TAB> is still not insignificant. It's why bash-completion created the idea of lazy-loaded completions and tries to encourage people to use that over the "COMPAT_DIR" in /etc/bash_completion.d.

you should not assume that people don’t know much at all about thing x. it’s better to assume that their workflow, use-case, or tolerance is different to yours.

In the ideal case, users do not know, because it is automagically handled by developers or distro packagers. For example, https://git.archlinux.org/svntogit/community.git/commit/trunk?h=packages/python-pipenv&id=484eec9eabad806654fea29801577dc2cbe4188a

...

At the very least, put your eval thing into the lazy-load directory, so it doesn't execute on every single new shell init but only gets executed the first time you try to tab-complete this specific tool.

Was this page helpful?
0 / 5 - 0 ratings