Hello,
I hope this is the right place to ask, but is it normal that importing plotly takes a huge amount of time? I have two PCs, both with SSDs and Anaconda installed. To the default Anaconda installation I added plotly by running
conda install plotly
On both PCs the time to import plotly is just enormous. On my desktop PC running Ubuntu 17.04 64-Bit
time python -c "import matplotlib"
real 0m0.150s
user 0m0.140s
sys 0m0.008s
time python -c "import pandas"
real 0m0.384s
user 0m0.340s
sys 0m0.040s
time python -c "import plotly"
real 0m2.184s
user 0m1.156s
sys 0m0.072s
and on my Laptop running Windows 7 64-Bit
time python -c "import matplotlib"
real 0m0.442s
user 0m0.015s
sys 0m0.000s
time python -c "import pandas"
real 0m0.933s
user 0m0.016s
sys 0m0.015s
time python -c "import plotly"
real 0m4.881s
user 0m0.015s
sys 0m0.016s
Is this in the normal range?
EDIT: I upgraded my plotly version from 1.12.9 to 2.0.7 and the times dropped significantly, albeit still high. Nice to see the progress being made. Now it's:
Ubuntu
real 0m0.834s
user 0m0.788s
sys 0m0.044s
Windows
real 0m2.816s
user 0m0.000s
sys 0m0.015s
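For comparing machines, timing the import from Python itself avoids depending on the shell's `time` builtin, whose user/sys columns look unreliable in the Windows runs above. The following is a minimal sketch, not part of plotly: `time_import` is a hypothetical helper name, and it assumes that subtracting the cost of starting an empty interpreter gives a fair estimate of the import alone.

```python
import subprocess
import sys
import time


def time_import(module, repeats=3):
    """Return the best wall-clock time (seconds) to import `module`.

    Each measurement starts a fresh interpreter so that nothing is
    already cached in sys.modules. The cost of starting an empty
    interpreter is measured the same way and subtracted.
    """
    def run(code):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", code], check=True)
        return time.perf_counter() - start

    baseline = min(run("pass") for _ in range(repeats))
    total = min(run("import " + module) for _ in range(repeats))
    # Clamp at zero: for tiny modules, noise can exceed the difference.
    return max(total - baseline, 0.0)
```

Usage would look like `print(time_import("plotly"))`; taking the minimum over a few repeats reduces the influence of disk caching and background load.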
I'm currently looking at porting a project to run on AWS lambda and the startup time of plotly is almost 90% of the startup time of the program.
A benchmark of a python file which just imports plotly:
import plotly
takes almost 3 seconds.
The vast majority of this time (which will vary depending on your connection) is spent on a network request: on import, plotly connects to api.plot.ly and downloads a graph reference.
Disabling this by blocking requests to api.plot.ly took the load time down to 770ms which is still very slow.
The next biggest bottleneck is ensure_local_plotly_files which is slow and its result isn't cached, so it ends up being called multiple times. Of the 770ms, this one function was responsible for 550ms. It performs huge json encodes which take up pretty much all of that run time.
Disabling this function took it down to 220ms, but I have no idea whether that broke anything; I haven't tested thoroughly enough.
Edit: Just noticed your edit. The above numbers are on 1.12.9. If we upgrade to 2.x maybe some of this is fixed and we might see some performance improvements.
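To illustrate the caching point in general terms: when a slow, deterministic setup step is called several times, memoizing it means the cost is paid once. This is a minimal sketch using the standard library's `functools.lru_cache`; `load_reference` is a hypothetical stand-in, not plotly's `ensure_local_plotly_files`, and whether that function could safely be cached (it touches the file system) is not established here.

```python
import functools
import json

CALLS = 0  # counts how often the expensive body actually runs


@functools.lru_cache(maxsize=None)
def load_reference():
    """Hypothetical stand-in for an expensive, deterministic setup step."""
    global CALLS
    CALLS += 1
    payload = {"traces": list(range(1000))}
    return json.dumps(payload)  # the costly part in this toy example


first = load_reference()   # runs the body
second = load_reference()  # served from the cache, body not re-run
```

After both calls `CALLS` is 1, and `second` is the very same object as `first`.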
As of version 3, importing plotly no longer performs any network connectivity, but the import time is still 1-2 seconds due to the number of code-generated files that are now present in the plotly.graph_objs hierarchy. I'm not sure how much we'll be able to optimize this, but I'll leave this issue open with the performance label to remind us to give it another pass at some point.
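To see which submodules dominate the import time, CPython 3.7 and later provide the `-X importtime` option, which writes a per-module timing line to stderr. Below is a rough sketch that runs it in a subprocess and returns the slowest entries; `slowest_imports` is a hypothetical helper name, and the parsing assumes the `import time: self | cumulative | name` line layout that CPython prints.

```python
import subprocess
import sys


def slowest_imports(module, top=10):
    """Return [(cumulative_microseconds, module_name), ...], slowest first."""
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", "import " + module],
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    prefix = "import time:"
    rows = []
    for line in result.stderr.splitlines():
        if not line.startswith(prefix):
            continue
        parts = line[len(prefix):].split("|")
        if len(parts) != 3:
            continue
        try:
            cumulative_us = int(parts[1])
        except ValueError:
            continue  # the header line has text instead of numbers
        rows.append((cumulative_us, parts[2].strip()))
    return sorted(rows, reverse=True)[:top]
```

Calling `slowest_imports("plotly")` would list the heaviest modules by cumulative time, which includes the time spent in everything each of them imports in turn.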
Ouch! Win10, i7, SSD.
$ time python -c "import plotly"
real 0m13.315s
user 0m0.047s
sys 0m0.000s
Ouch! On my web server:
timeit.timeit('import plotly', number=1)
7.154063940048218
The mean is 8.6 sec over 4 repeats. It is really slowing down my app. I had to put the plotly import inside a function to avoid it slowing down my index page.
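The workaround of importing inside a function can be sketched as follows. This is only an illustration: `index_page` and `render_chart` are hypothetical handler names, and `json` stands in for plotly so that the snippet runs without it installed.

```python
def index_page():
    # No heavy import here, so this handler stays fast.
    return "welcome"


def render_chart(values):
    # Deferred import: the cost is paid on the first call only.
    # Later calls find the module in sys.modules, which is cheap.
    import json  # stand-in for "import plotly"
    return json.dumps({"y": list(values)})
```

The trade-off is that the first request that needs a chart absorbs the full import time instead of application startup.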
@liuyigh can you give some details? Python version, plotly version, OS, etc.
Running Python 3.6.7, Plotly 3.6.0 I get
>>> timeit.timeit('import plotly', number=1)
0.26097527099955187
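One caveat when timing with `timeit` in an interactive session: if the module has already been imported in that process, `import` only looks it up in `sys.modules`, so the measured time says little about a cold start. It is not clear from the thread whether this affects any of the numbers here. A small sketch of the effect, using the standard library module `colorsys` as a stand-in:

```python
import sys
import timeit

# Make sure the stand-in module is not cached before the first timing.
sys.modules.pop("colorsys", None)

cold = timeit.timeit("import colorsys", number=1)  # loads and executes the module
warm = timeit.timeit("import colorsys", number=1)  # only a sys.modules lookup

print("cold: {:.6f}s, warm: {:.6f}s".format(cold, warm))
```

For numbers that are comparable across machines, running each measurement in a fresh interpreter is the safer approach.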
My server is a cheap f1-micro Google Cloud VM instance with 600MB RAM and 0.2 virtual CPU. Running Python 2.7.15 and plotly 3.5.0 on Ubuntu 18.04, importing plotly can take 16-17 seconds, with an average of 8.6 seconds.
In comparison, pandas takes only 0.5 sec on average.
My pyenv environment is configured under root/system as detailed by this Japanese website:
https://qiita.com/u_kan/items/d7e602bf1cf52f6b0935
Does plotly import attempt to login with retries?
On my local MacBook (macOS 10.14.2, Python 3.7.2, plotly 3.4.2), plotly takes 1-3 seconds to load.
Thanks for the details @liuyigh.
> Does plotly import attempt to login with retries?
No, plotly doesn't perform any network / login logic on import. I believe the import time is primarily due to the number of classes/files that are code generated into the plotly.graph_objs hierarchy.
One experiment I'd like to try at some point is to see if the import time improves if we reduce the number of files involved. Right now each graph_objs class is code generated into a separate .py file, and then all graph_obj classes in a single level of the hierarchy are imported by the __init__.py file of that hierarchy level. I'm wondering if there would be a noticeable improvement if we updated the code generation logic to inline all of the classes into the associated __init__.py file.
Please chime in if anyone has experience with this or is interested in trying this experiment.
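The experiment described above could be prototyped without touching plotly's code generator. The sketch below builds two throwaway packages with identical empty classes, one with a file per class and one with everything inlined in `__init__.py`, and times importing each in a fresh interpreter. All names (`split_pkg`, `inline_pkg`, `build_packages`, `run_experiment`) are hypothetical, and the toy classes are far smaller than the real generated `graph_objs` classes, so the results are only indicative.

```python
import os
import subprocess
import sys
import tempfile
import time


def build_packages(root, n_classes=200):
    """Create two equivalent packages: one file per class vs. all inline."""
    split = os.path.join(root, "split_pkg")
    inline = os.path.join(root, "inline_pkg")
    os.makedirs(split)
    os.makedirs(inline)
    names = ["Cls{}".format(i) for i in range(n_classes)]
    for name in names:
        with open(os.path.join(split, "_{}.py".format(name.lower())), "w") as f:
            f.write("class {}:\n    pass\n".format(name))
    with open(os.path.join(split, "__init__.py"), "w") as f:
        f.writelines("from ._{} import {}\n".format(n.lower(), n) for n in names)
    with open(os.path.join(inline, "__init__.py"), "w") as f:
        f.writelines("class {}:\n    pass\n".format(n) for n in names)


def time_import(root, package, repeats=3):
    """Best wall-clock time (seconds) to import `package` in a fresh interpreter.

    The figure includes interpreter startup, which is the same for both
    packages. The first run also writes the .pyc cache, so the minimum
    reflects a warm bytecode cache.
    """
    env = dict(os.environ, PYTHONPATH=root)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", "import " + package],
                       env=env, check=True)
        best = min(best, time.perf_counter() - start)
    return best


def run_experiment(n_classes=200):
    with tempfile.TemporaryDirectory() as root:
        build_packages(root, n_classes)
        return {pkg: time_import(root, pkg)
                for pkg in ("split_pkg", "inline_pkg")}
```

Running `print(run_experiment(2000))` gives a first impression of how much the per-file overhead matters on a given machine and file system.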
I had the same issue and was digging into it. https://github.com/plotly/plotly.py/commit/3678aa925489b9ed429dc28863040dbb391dadb1#diff-9d80df10529ec57d3bd692811816978d does some refactoring here.
Quoting the commit message:
## Import optimization
This PR also makes a relatively minor change to the code generation logic for
`graph_objs` and `validator` that yields an import time reduction of ~10-20% .
Rather than creating a single file for each datatype and validator class, all of
the classes in a `graph_obj` or `validator` module are specified directly in
the `__init__.py` file. This reduces the number of files significantly, which
seems to yield a modest but consistent speedup while being 100% backward
compatible.
For what it's worth, my stats on Macbook Pro python 3.6 are:
0m2.005s - plotly 3.7.1
0m0.733s - plotly 3.8.1
Thanks for sharing your timing results @AbdealiJK, glad you're seeing an improvement. Could you try timing this version as well
from _plotly_future_ import remove_deprecations
import plotly
This removes automatic importing of modules that will be moved/deprecated in version 4 (see https://github.com/plotly/plotly.py/pull/1476) and should be a little faster still. This is roughly what I expect the version 4 import time to be.
@liuyigh, @marvingee, @gdw2 if you're still interested in this, would also appreciate your timing tests for the import above with the remove_deprecations future option enabled.
Sorry @jonmmease, I no longer have a win10 machine!
Seems like I missed the ping.
Adding to my previous benchmark with time python -c "import plotly":
1.35 s - plotly 3.7.1
903 ms - plotly 3.8.1
390 ms - plotly 3.8.1 with `from _plotly_future_ import remove_deprecations`
very nice!
Import time and initialization time should be much improved on Python 3.7 with PR https://github.com/plotly/plotly.py/pull/2368.
Now that the improvements from #2368 have been released as part of version 4.7.1, I'll go ahead and close this issue 🎉