Plotly.py: Importing plotly takes a lot of time

Created on 25 Apr 2017  ·  13 Comments  ·  Source: plotly/plotly.py

Hello,
I hope this is the right place to ask: is it normal that importing plotly takes such a long time? I have two PCs, both with SSDs and Anaconda installed. I added plotly to the default Anaconda installation by running

 conda install plotly

On both PCs, importing plotly takes far longer than importing other libraries. On my desktop PC running Ubuntu 17.04 64-bit:

time python -c "import matplotlib"
real    0m0.150s
user    0m0.140s
sys 0m0.008s


time python -c "import pandas"

real    0m0.384s
user    0m0.340s
sys 0m0.040s


time python -c "import plotly"

real    0m2.184s
user    0m1.156s
sys 0m0.072s

And on my laptop running Windows 7 64-bit:

time python -c "import matplotlib"
real    0m0.442s
user    0m0.015s
sys 0m0.000s


time python -c "import pandas"

real    0m0.933s
user    0m0.016s
sys 0m0.015s


time python -c "import plotly"

real    0m4.881s
user    0m0.015s
sys 0m0.016s

Is this in the normal range?

EDIT: I upgraded plotly from 1.12.9 to 2.0.7 and the import times dropped significantly, although they are still high. Nice to see the progress being made. Now it's:

Ubuntu

real    0m0.834s
user    0m0.788s
sys 0m0.044s

Windows

real    0m2.816s
user    0m0.000s
sys 0m0.015s
performance

All 13 comments

I'm currently looking at porting a project to run on AWS Lambda, and importing plotly accounts for almost 90% of the program's startup time.

A benchmark of a python file which just imports plotly:

import plotly

takes almost 3 seconds.

The vast majority of this time (which will vary depending on your connection) is spent at import, when plotly connects to api.plot.ly and downloads a graph reference.
Disabling this by blocking requests to api.plot.ly brought the load time down to 770ms, which is still very slow.

The next biggest bottleneck is ensure_local_plotly_files, which is slow and whose result isn't cached, so it ends up being called multiple times. Of the 770ms, this one function was responsible for 550ms. It performs huge JSON encodes, which take up nearly all of that run time.
Disabling this function brought the time down to 220ms, but I have no idea whether that breaks anything; I haven't tested it thoroughly enough.
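For illustration, the "result isn't cached, so it ends up being called multiple times" problem is the kind of cost that memoization removes. The sketch below is not plotly's code: `ensure_local_files` is a hypothetical stand-in for an expensive, idempotent setup step, and the JSON payload is made up to simulate the large encodes mentioned above. It only shows that wrapping such a function in `functools.lru_cache` makes the body run once per process.

```python
import functools
import json

# Instrumentation only: counts how often the expensive body actually runs.
CALLS = {"count": 0}


@functools.lru_cache(maxsize=None)
def ensure_local_files():
    """Hypothetical stand-in for an expensive, idempotent setup step."""
    CALLS["count"] += 1
    # Simulate a large JSON encode (made-up payload, not plotly data).
    payload = {"key%d" % i: list(range(50)) for i in range(1000)}
    return len(json.dumps(payload))


# The first call does the work; the second is served from the cache.
first = ensure_local_files()
second = ensure_local_files()
```

Caching like this is only safe if the function really is idempotent for the lifetime of the process, for example if nothing deletes or rewrites the files it is meant to ensure.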

Edit: Just noticed your edit. The above numbers are on 1.12.9. If we upgrade to 2.x maybe some of this is fixed and we might see some performance improvements.

As of version 3, importing plotly no longer performs any network connectivity, but the import time is still 1-2 seconds due to the number of code-generated files that are now present in the plotly.graph_objs hierarchy. I'm not sure how much we'll be able to optimize this, but I'll leave this issue open with the performance label to remind us to give it another pass at some point.

Ouch! Win10, i7, SSD.

$ time python -c "import plotly"

real    0m13.315s
user    0m0.047s
sys     0m0.000s

Ouch! On my web server:

timeit.timeit('import plotly', number=1)
7.154063940048218

The mean over 4 repeats is 8.6 seconds. It is really slowing down my app. I had to move the plotly import inside a function to keep it from slowing down my index page.
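The workaround described above (importing inside a function) can be sketched as follows. The handler names `index_page` and `chart_page` are hypothetical, and the module name is a parameter only so that the sketch can be run without plotly installed; in a real app the function body would simply contain `import plotly`.

```python
import importlib
import sys


def load_plotting_module(name="plotly"):
    """Import `name` on first use.

    Later calls are cheap because Python returns the module
    already cached in sys.modules.
    """
    return importlib.import_module(name)


def index_page():
    # Hypothetical lightweight handler: never pays the plotting import cost.
    return "index"


def chart_page(values, module_name="plotly"):
    # The import cost is paid here, on the first chart request only.
    lib = load_plotting_module(module_name)
    return "chart built with %s for %d values" % (lib.__name__, len(values))
```

This moves the cost rather than removing it: the first request that needs a chart still waits for the full import.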

@liuyigh can you give some details? Python version, plotly version, OS, etc.

Running Python 3.6.7, Plotly 3.6.0 I get

>>> timeit.timeit('import plotly', number=1)
0.26097527099955187
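A note on comparing these numbers: `timeit.timeit('import plotly', number=1)` only measures a real import if plotly has not already been imported in that session, because later imports are served from `sys.modules`. One way to get comparable figures across machines is to time the import in a fresh interpreter and subtract the interpreter's own startup time. The helper names below are made up for this sketch.

```python
import subprocess
import sys
import time


def time_fresh_process(code, repeats=3):
    """Best-of-`repeats` wall-clock seconds to run `code` in a new interpreter."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", code], check=True)
        timings.append(time.perf_counter() - start)
    return min(timings)


def import_cost(module_name, repeats=3):
    """Approximate import cost: fresh-process import time minus bare startup time."""
    baseline = time_fresh_process("pass", repeats)
    with_import = time_fresh_process("import " + module_name, repeats)
    return with_import - baseline


if __name__ == "__main__":
    # Replace "json" with "plotly" on a machine where plotly is installed.
    print("approx. import cost: %.3f s" % import_cost("json"))
```

Taking the minimum of several runs reduces noise from other processes; the very first run on a machine may also be slower because the files are not yet in the operating system's disk cache.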

My server is a cheap f1-micro Google Cloud VM instance with 600MB RAM and 0.2 virtual CPU. I ran Python 2.7.15 and plotly 3.5.0 on Ubuntu 18.04. Importing plotly can take 16-17 seconds; the average is 8.6 seconds.

In comparison, pandas takes only 0.5 seconds on average.

My pyenv environment is configured under root/system as detailed by this Japanese website:
https://qiita.com/u_kan/items/d7e602bf1cf52f6b0935

Does plotly import attempt to login with retries?

On my local MacBook (macOS 10.14.2, Python 3.7.2, plotly 3.4.2), plotly takes 1-3 seconds to load.

Thanks for the details @liuyigh.

> Does plotly import attempt to login with retries?

No, plotly doesn't perform any network / login logic on import. I believe the import time is primarily due to the number of classes/files that are code generated into the plotly.graph_objs hierarchy.
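To check where the import time goes, Python 3.7 and later provide the `-X importtime` option, which writes one line per imported module to stderr with its self and cumulative time in microseconds. The helper below is a sketch (the function name is made up) that runs it in a subprocess and returns the modules with the largest self time.

```python
import subprocess
import sys


def slowest_imports(module_name, top=10):
    """Run `python -X importtime -c "import <module_name>"` (Python 3.7+).

    Returns up to `top` tuples (self_us, cumulative_us, name),
    sorted by self time, largest first.
    """
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", "import " + module_name],
        stderr=subprocess.PIPE,
        text=True,
        check=True,
    )
    prefix = "import time:"
    rows = []
    for line in result.stderr.splitlines():
        if not line.startswith(prefix):
            continue
        parts = line[len(prefix):].split("|")
        if len(parts) != 3:
            continue
        try:
            self_us = int(parts[0])
            cumulative_us = int(parts[1])
        except ValueError:
            continue  # the header line has no numbers
        rows.append((self_us, cumulative_us, parts[2].strip()))
    rows.sort(reverse=True)
    return rows[:top]


if __name__ == "__main__":
    # Replace "json" with "plotly" on a machine where plotly is installed.
    for self_us, cumulative_us, name in slowest_imports("json"):
        print("%8d us self  %8d us cumulative  %s" % (self_us, cumulative_us, name))
```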

One experiment I'd like to try at some point is to see if the import time improves if we reduce the number of files involved. Right now each graph_objs class is code generated into a separate .py file, and then all graph_obj classes in a single level of the hierarchy are imported by the __init__.py file of that hierarchy level. I'm wondering if there would be a noticeable improvement if we updated the code generation logic to inline all of the classes into the associated __init__.py file.

Please chime in if anyone has experience with this or is interested in trying this experiment.
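A toy version of this experiment can be run without touching plotly's code generator. The sketch below builds two throwaway packages in a temporary directory, one with a separate file per class re-exported from `__init__.py` and one with every class inlined into `__init__.py`, and times a first import of each. The package names `split_pkg` and `inline_pkg`, the class template, and the class count are all made up, and the classes are far smaller than real graph_objs classes, so the numbers only hint at the per-file overhead.

```python
import importlib
import os
import shutil
import sys
import tempfile
import time

CLASS_TEMPLATE = "class C{i}(object):\n    value = {i}\n"


def build_packages(root, n_classes):
    """Create two hypothetical packages under `root` that expose the same classes."""
    # Layout 1: one file per class, re-exported from __init__.py.
    split_dir = os.path.join(root, "split_pkg")
    os.makedirs(split_dir)
    init_lines = []
    for i in range(n_classes):
        with open(os.path.join(split_dir, "_c%d.py" % i), "w") as f:
            f.write(CLASS_TEMPLATE.format(i=i))
        init_lines.append("from ._c%d import C%d\n" % (i, i))
    with open(os.path.join(split_dir, "__init__.py"), "w") as f:
        f.writelines(init_lines)

    # Layout 2: every class defined directly in __init__.py.
    inline_dir = os.path.join(root, "inline_pkg")
    os.makedirs(inline_dir)
    with open(os.path.join(inline_dir, "__init__.py"), "w") as f:
        for i in range(n_classes):
            f.write(CLASS_TEMPLATE.format(i=i))


def timed_import(name):
    """Import `name` and return (module, elapsed_seconds)."""
    start = time.perf_counter()
    module = importlib.import_module(name)
    return module, time.perf_counter() - start


def run_experiment(n_classes=200):
    """Time a first import of each layout. Call once per process:
    later calls would hit sys.modules and measure nothing."""
    root = tempfile.mkdtemp()
    try:
        build_packages(root, n_classes)
        sys.path.insert(0, root)
        importlib.invalidate_caches()
        split_mod, split_s = timed_import("split_pkg")
        inline_mod, inline_s = timed_import("inline_pkg")
    finally:
        if root in sys.path:
            sys.path.remove(root)
        shutil.rmtree(root, ignore_errors=True)
    return split_mod, split_s, inline_mod, inline_s


if __name__ == "__main__":
    _, split_s, _, inline_s = run_experiment()
    print("one file per class: %.4f s" % split_s)
    print("all in __init__.py: %.4f s" % inline_s)
```

Because the packages are freshly written, both timings include compiling the source to bytecode, so this measures a first-ever import rather than a warm one with cached .pyc files.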

I had the same issue and was digging into it. https://github.com/plotly/plotly.py/commit/3678aa925489b9ed429dc28863040dbb391dadb1#diff-9d80df10529ec57d3bd692811816978d does some refactoring here.

Quoting the commit message:

## Import optimization
This PR also makes a relatively minor change to the code generation logic for
`graph_objs` and `validator` that yields an import time reduction of ~10-20%.
Rather than creating a single file for each datatype and validator class, all of
the classes in a `graph_obj` or `validator` module are specified directly in
the `__init__.py` file. This reduces the number of files significantly, which
seems to yield a modest but consistent speedup while being 100% backward
compatible.

For what it's worth, my stats on Macbook Pro python 3.6 are:
0m2.005s - plotly 3.7.1
0m0.733s - plotly 3.8.1

Thanks for sharing your timing results @AbdealiJK, glad you're seeing an improvement. Could you try timing this version as well

from _plotly_future_ import remove_deprecations
import plotly

This removes automatic importing of modules that will be moved/deprecated in version 4 (see https://github.com/plotly/plotly.py/pull/1476) and should be a little faster still. This is roughly what I expect the version 4 import time to be.

@liuyigh, @marvingee, @gdw2 if you're still interested in this, would also appreciate your timing tests for the import above with the remove_deprecations future option enabled.

Sorry @jonmmease, I no longer have a win10 machine!

Seems like I missed the ping.
Adding to my previous benchmark with time python -c "import plotly":

1.35 s - plotly 3.7.1
903 ms - plotly 3.8.1
390 ms - plotly 3.8.1 with `from _plotly_future_ import remove_deprecations`

very nice!

Import time and initialization time should be much improved on Python 3.7 with PR https://github.com/plotly/plotly.py/pull/2368.

Now that the improvements from #2368 have been released as part of version 4.7.1, I'll go ahead and close this issue 🎉
