Rubberduck: code explorer very slow

Created on 21 May 2017 · 6 Comments · Source: rubberduck-vba/Rubberduck

The code explorer is slow to respond after it has been populated. Fixing a single issue triggers another check of the full project, which takes minutes to run.

feature-code-explorer performance support

Talorthain

All 6 comments

Hi there.

Which version of Rubberduck are you running?
Do note that the code explorer relies on a correct parser state, which means that changing the code (like fixing an inspection) requires reparsing the invalidated components and refreshing all UI elements that access the Parser state.

Also note that the next release will include the possibility of fixing multiple issues in one pass and a considerably more intelligent parser, which should reduce wait times significantly.

Greets

Vogel612 on 21 May 2017

Hi, it is the newest version available for download, 2.0.13.

The new version does seem like a good next step. Looking forward to trying it.

Talorthain on 21 May 2017

"Fixing multiple issues in one pass" is a bit misleading though: the TokenStreamRewriter cannot process the same lexer token twice. However using the TokenStreamRewriter (instead of rewriting directly in the code pane) is a major step forward, since we no longer need to deal with offset token positions resulting from a code rewrite. The rewriter replaces all module content at once, but we still need to consider it invalidated after applying a quickfix/refactoring, otherwise we would end up with conflicting rewrite operations.
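To illustrate the constraint described above, here is a minimal sketch of a rewriter that queues replacements against the original token indices, refuses a second edit touching an already-claimed token, and emits the whole module text at once. `SketchRewriter`, `replace`, and `get_text` are hypothetical names chosen for this example; this is not the ANTLR `TokenStreamRewriter` API, nor Rubberduck's code.

```python
class SketchRewriter:
    """Hypothetical stand-in for a token stream rewriter (illustration only)."""

    def __init__(self, tokens):
        self.tokens = list(tokens)  # original tokens, never mutated
        self.edits = []             # queued (start, stop, text) replacements

    def replace(self, start, stop, text):
        """Queue a replacement of tokens[start..stop], inclusive."""
        if not 0 <= start <= stop < len(self.tokens):
            raise IndexError("token range out of bounds")
        for s, e, _ in self.edits:
            # Two inclusive ranges overlap when each starts before the other ends.
            if start <= e and s <= stop:
                raise ValueError("token already rewritten by another operation")
        self.edits.append((start, stop, text))

    def get_text(self):
        """Build the full module text in one pass over the original tokens."""
        out = []
        pos = 0
        for start, stop, text in sorted(self.edits):
            out.extend(self.tokens[pos:start])  # untouched tokens before the edit
            out.append(text)                    # replacement text
            pos = stop + 1                      # skip the replaced tokens
        out.extend(self.tokens[pos:])
        return "".join(out)


tokens = ["Dim", " ", "x", " ", "As", " ", "Integer"]
rewriter = SketchRewriter(tokens)
rewriter.replace(2, 2, "count")   # rename the variable
rewriter.replace(6, 6, "Long")    # change the declared type
new_text = rewriter.get_text()
```

Because every edit is expressed against the original indices, no edit has to account for the offsets introduced by another one. The same property is why the result must be treated as invalidated afterwards: the indices no longer describe the rewritten text.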

The VBIDE API doesn't tell us when a module is modified, and because we need the token positions in the lexer stream to match exactly with their position in the code pane, we must parse an entire module every time we parse. And then since we don't know what was added/removed when a module was changed, we must re-resolve identifier references for all declarations, every time we parse. Not doing that would be harmful, since we would end up missing identifier references, or worse, we would have stale references that no longer exist. Accurate parser state is fundamental for everything else to work, so correctness was favored over performance.

I'm all ears for ideas.

retailcoder on 21 May 2017

Just one comment on re-resolving declarations. We actually only re-resolve the modules we re-parse and those referencing these modules. This gives a significant boost in performance on a re-parse.
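As a rough sketch of that selection rule, assuming the reference information is available as a simple mapping from each module to the modules it references: the set to re-resolve is the re-parsed modules plus every module that references one of them. `modules_to_resolve` and the `references` mapping are hypothetical names for this illustration, not Rubberduck's actual data structures.

```python
def modules_to_resolve(reparsed, references):
    """Return the re-parsed modules plus the modules that reference them.

    reparsed:   iterable of module names that were just re-parsed
    references: dict mapping a module name to the set of modules it references
    """
    reparsed = set(reparsed)
    # A module needs re-resolving if any of its targets was re-parsed.
    referencing = {
        module
        for module, targets in references.items()
        if targets & reparsed
    }
    return reparsed | referencing


references = {
    "Module1": {"Module2"},
    "Module2": set(),
    "Module3": {"Module1"},
    "Module4": {"Module2"},
}
result = modules_to_resolve({"Module1"}, references)
```

In this example only `Module1` and `Module3` are selected; `Module2` and `Module4` keep their existing resolution results. The sketch only follows direct references, which is how the comment above reads; whether anything further is needed is not stated in the thread.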

MDoerner on 22 May 2017

👍3

Test for keyboard input; if no key was pressed, don’t reparse!

Talorthain on 22 May 2017

@Talorthain we only parse modified modules; we also have mouse and keyboard hooks that help us track the current selection (we use that to enable/disable commands, e.g. you can only encapsulate field when a field is selected); we could trigger a parse task on every keypress, and even indent the code as you hit enter (just need the time to do it). But until we can somehow bind the code in the code panes to parse tree nodes, process fragments of code (the line or instruction you just modified), and somehow magically update / shift token positions accordingly for the rest of the module, we'll have to parse the entire module at once.

My comment about the VBIDE API being underfeatured made it sound like we were facing issues we actually aren't: the hard part isn't knowing whether a module was modified (comparing content hashes works nicely) - it's maintaining the semantic understanding of the code as efficiently as possible that's challenging. It's more about resolving and cache invalidation than it is about parsing: when Module1.DoSomething is modified and now increments a variable that's declared in Module2, we're not re-parsing Module2, but we need to find that variable in our semantic model and update its references. Parsing is just the tokenization part; the real fun begins when we have the updated tokens and their positions.
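For illustration, detecting modified modules by comparing content hashes could look roughly like the sketch below. The choice of SHA-256 and the names `content_hash` and `modified_modules` are assumptions made for this example; the thread does not say which hash Rubberduck uses or how it stores the previous values.

```python
import hashlib


def content_hash(code):
    """Hash a module's source text (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(code.encode("utf-8")).hexdigest()


def modified_modules(previous_hashes, current_code):
    """Return names of modules whose content differs from the last parse.

    previous_hashes: dict of module name -> hash recorded at the last parse
    current_code:    dict of module name -> current source text
    """
    return {
        name
        for name, code in current_code.items()
        # New modules have no previous hash, so they count as modified too.
        if previous_hashes.get(name) != content_hash(code)
    }


old_code = {
    "Module1": "Sub DoSomething()\nEnd Sub\n",
    "Module2": "Public counter As Long\n",
}
previous = {name: content_hash(code) for name, code in old_code.items()}

new_code = dict(old_code)
new_code["Module1"] = "Sub DoSomething()\n    counter = counter + 1\nEnd Sub\n"
changed = modified_modules(previous, new_code)
```

Here only `Module1` is reported as changed, which matches the scenario in the comment: `Module2` is not re-parsed even though one of its declarations gained a new reference, so the reference update has to happen in the semantic model instead.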