Spyder: Variable Explorer very slow with large dataframe

Created on 2 May 2020 · 6 comments · Source: spyder-ide/spyder

Issue Report Checklist

  • [X] Searched the issues page for similar reports
  • [X] Read the relevant sections of the Spyder Troubleshooting Guide and followed its advice
  • [X] Reproduced the issue after updating with conda update spyder (or pip, if not using Anaconda)
  • [ ] Could not reproduce inside jupyter qtconsole (if console-related)
  • [ ] Tried basic troubleshooting (if a bug/error)

    • [X] Restarted Spyder

    • [X] Reset preferences with spyder --reset

    • [X] Reinstalled the latest version of Anaconda

    • [ ] Tried the other applicable steps from the Troubleshooting Guide

  • [X] Completed the Problem Description, Steps to Reproduce and Version sections below

Problem Description

The Variable Explorer is extremely slow when displaying a dataframe containing 30k+ rows.
It opens nearly instantaneously in Spyder 3.3.3.
Scrolling is extremely laggy, and the window freezes frequently.

What steps reproduce the problem?

Run the following code:

import pandas as pd
import numpy as np
from datetime import datetime

# Business-day index spanning 1900 to today (~31k rows)
bdates = pd.bdate_range(start=datetime(1900, 1, 1), end=datetime.today())
vect_random = np.random.normal(size=len(bdates))
df = pd.DataFrame(vect_random, index=bdates)

Then open the df variable in the Variable Explorer and scroll.

What is the expected output? What do you see instead?

Opening and scrolling should be as fast as in Spyder 3.3.3.

Versions

  • Spyder version: 4.1.2
  • Python version: 3.7.7
  • Qt version: 5.9.6
  • PyQt version: 5.9.2
  • Operating System name/version: Windows 10
Variable Explorer Bug

All 6 comments

Hi @Skullnick, thanks for the report.

I can confirm that this is indeed an issue with the provided example. The scrolling is really slow.

Hi @dalthviz, it seems we have some performance degradation. Please take a look at this one to see what might be causing the slowdowns.

Note: the problem seems to be the handling of the datetime index. With a numeric index, no performance degradation is experienced (even when testing with more than 30k rows):

No degradation case:

import pandas as pd
import numpy as np

# A plain integer index of comparable size shows no slowdown
bdates = range(50000)
vect_random = np.random.normal(size=len(bdates))
df = pd.DataFrame(vect_random, index=bdates)
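The difference the comments point to can be illustrated outside Spyder with a small micro-benchmark: converting every entry of a `DatetimeIndex` to its display string, the way a naive viewer might, is noticeably more expensive than doing the same for an integer index. This is only a sketch of the suspected cost, not Spyder's actual rendering code; the function name `format_labels` and the per-cell `str()` conversion are assumptions for illustration.

```python
import time
from datetime import datetime

import numpy as np
import pandas as pd

n = 50_000
numeric_index = pd.RangeIndex(n)
datetime_index = pd.bdate_range(start=datetime(1900, 1, 1), periods=n)

def format_labels(index):
    """Convert every index entry to its display string, timing the loop.

    This per-cell str() conversion stands in for whatever formatting a
    table viewer performs; it is NOT Spyder's implementation.
    """
    start = time.perf_counter()
    labels = [str(value) for value in index]
    elapsed = time.perf_counter() - start
    return elapsed, labels

numeric_time, numeric_labels = format_labels(numeric_index)
datetime_time, datetime_labels = format_labels(datetime_index)
print(f"numeric index:  {numeric_time:.3f}s")
print(f"datetime index: {datetime_time:.3f}s")
```

On a typical machine the datetime conversion takes several times longer per cell, which would compound quickly if the viewer reformats cells on every scroll event rather than caching or lazily formatting only the visible rows.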

Indeed @dalthviz, it is the conversion we are doing to the datetime index.

Note: the problem seems to be the handling of the datetime index

How is this conversion performed?
