Ubuntu 12.04, vscode 0.10.1
I have found Go to file's indexing against a full Chromium workspace to be very slow. It took ~40 seconds to find "Tab.java", whereas a simple find command took less than a second:
$ time find . -name "Tab.java"
./chrome/android/java/src/org/chromium/chrome/browser/tab/Tab.java
real 0m0.559s
user 0m0.268s
sys 0m0.284s
Note that the workspace is on an SSD.
I can confirm that this issue also exists on Windows in the newest February 2016 release. This is a usability break for large projects.
@bpasero could a find for (term)* be run in parallel, and those result(s) shown while the fuzzy search is happening, to speed up an ideal search?
@Tyriar in my experiments, using find or any other external process did not yield a significant speed improvement over what we do now. The reason is that we do quite a bit of heavy pattern matching and other processing on the result before it even reaches the user. The better fix is to keep the list of paths in memory after the first search and reuse that information.
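A minimal sketch of the "keep the list of paths in memory" idea, assuming a plain directory walk and a substring match on the file name. The `PathCache` class and its method names are hypothetical and are not VS Code's implementation:

```python
import os


class PathCache:
    """Walks a workspace once and answers later queries from memory."""

    def __init__(self, root):
        self.root = root
        self._paths = None  # filled lazily on the first search
        self.walks = 0      # how many times the disk was actually traversed

    def _load(self):
        if self._paths is None:
            self.walks += 1
            self._paths = [
                os.path.relpath(os.path.join(dirpath, name), self.root)
                for dirpath, _dirnames, filenames in os.walk(self.root)
                for name in filenames
            ]
        return self._paths

    def search(self, term):
        """Return cached paths whose file name contains `term`, ignoring case."""
        term = term.lower()
        return sorted(
            p for p in self._load() if term in os.path.basename(p).lower()
        )

    def invalidate(self):
        """Drop the cache, for example after a file-system change notification."""
        self._paths = None
```

Only the first `search()` call touches the disk. A real implementation would also need an invalidation strategy, which `invalidate()` only hints at.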
Check this post on reddit by author of sublime text on how they totally nail it https://www.reddit.com/r/programming/comments/4cfz8r/reverse_engineering_sublime_texts_fuzzy_match/
@haisum thanks so much for sharing, what a coincidence that I am currently looking into improving our scoring algorithm.
Note however that this is not really talking about how to make find in files fast, rather how to score the results for the quick open box.
@bpasero yup I checked, it's not in much depth but I thought it may help in getting some perspective.
Let me know if I am link spamming here, but I think this also seems like a relevant example: https://github.com/wincent/command-t/blob/master/doc/command-t.txt. It's an open source and super fast Vim plugin for file searching.
No worries, keep 'em coming :+1:
Have you guys thought about using something like Lucene to do that sort of pattern matching? Just index all the source in the background. Pretty quick and portable (but we don't use it for source indexing).
The fact that we need to search source code might make Lucene a less ideal candidate. We also need to support regular expression searches.
true.
Please also look into how fzf is implemented. It has a simpler scoring system (just plain subsequence search I believe) but it's extremely fast. It doesn't even build an index AFAIK.
FWIW this would be the killer feature that would most certainly get me to switch to VS Code. But in any case thanks for your hard work.
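A minimal sketch of plain subsequence matching, the simple scheme the comment above attributes to fzf. This is not fzf's actual algorithm, and ordering results by path length is only an assumed stand-in for a real scoring function:

```python
def is_subsequence(query, candidate):
    """True if every character of `query` appears in `candidate`, in order."""
    chars = iter(candidate.lower())
    # `ch in chars` advances the iterator, so later characters must come later.
    return all(ch in chars for ch in query.lower())


def filter_paths(query, paths):
    """Keep matching paths, shortest first as a crude stand-in for scoring."""
    matches = (p for p in paths if is_subsequence(query, p))
    return sorted(matches, key=lambda p: (len(p), p))
```

Because each candidate is scanned once per query, no index is needed. The cost is linear in the total length of all paths.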
@bpasero would it be interesting to have something like: https://github.com/ggreer/the_silver_searcher as a dependency and use it? Seems powerful for file searches.
If fuzzy search can be brought to parity with other editors it would make VS Code the hands-down choice for me. I still prefer it for smaller projects, but for larger ones I really have no choice but to use Sublime or Atom.
Not sure what the timeline looks like on this.
The fact that we need to search source code might make Lucene a less ideal candidate. We also need to support regular expression searches.
Would it be reasonable to layer in different kinds of results at different times (with some sort of indicator that a search is still progressing)? Even fuzzy matching file names has a fair bit of delay for medium sized projects (with, say, node_modules/ _excluded_)
I have added a unit test for measuring file search performance with a large workspace, instructions and results from the optimization with #9380 are in #9545.
I'm using the "insider build" and using vscode to browse the Linux kernel over sshfs.
Even when entering a full path like 'mm/slab.c', file search (using Ctrl + P) takes a long time. Also, it seems there is no caching of file paths, so repeated searches in the same sub-paths remain slow.
In comparison, Sublime Text over sshfs is able to fuzzy find files almost instantaneously. It must be caching the FS tree. Hitting sshfs (or any network mount) for every find request is not feasible.
+1
Please don't comment "+1". Use the reaction button on the comment to show your approval/appreciation.
Moving to Christoph who is making great progress on tuning this (already for last milestone, continuing in this milestone) 👍
For the fuzzy finding, over in Nuclide, we use fuzzy-native, which are Node bindings for wincent/command-t with multithreading support. It's crazy fast.
Thanks for sharing @zertosh!
Our fuzzy sorting is reasonably fast at this point. It might still be an overall improvement if we can make it faster (time permitting).
Most of the time is currently spent in the file traversal. I've just switched over to use native commands (find/dir) to do that, yet it remains the main part where time is spent.
My measurements using Python's os.walk() indicate that it would be faster to use a native module that uses POSIX readdir (and a similar API on Windows, like Python does). Implementing this ourselves is out of scope at this time, although it would likely benefit the wider Node community.
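A minimal sketch of how such a measurement could be taken with os.walk(), assuming only a file count and wall-clock time are of interest. The helper names are hypothetical, and the original measurement script is not shown in this thread:

```python
import os
import time


def count_files(root):
    """Count the non-directory entries that os.walk() reports under `root`."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


def timed_count(root):
    """Return (file_count, elapsed_seconds) for one traversal of `root`."""
    start = time.perf_counter()
    total = count_files(root)
    return total, time.perf_counter() - start
```

Running `timed_count(".")` twice in a workspace gives a rough cold versus warm comparison, since the second run usually benefits from the operating system's file cache.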
I find this post very interesting: https://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html
There is an npm package implementing a simplified version of Boyer-Moore mentioned in the above article: https://www.npmjs.com/package/streamsearch
Our full text search matching actually uses RegExp matching on a full line (see https://github.com/Microsoft/vscode/blob/master/src/vs/workbench/services/search/node/textSearch.ts#L121) and part of the reason is that we allow for searches with regular expressions.
I wonder how much faster BM would be in this case given that RegEx matching is highly optimized. We should just start an experiment and measure this 👍
One possible optimization is to not split the file into lines but only do it if a match is found.
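A minimal sketch of that optimization, assuming the whole file is available as one string: the regular expression runs over the full text, and the enclosing line is located only when a match is found. The function name is hypothetical. re.MULTILINE is an assumption made so that ^ and $ keep their per-line meaning, and patterns that can match across newlines would need extra handling:

```python
import re


def find_matches(text, pattern):
    """Yield (line_number, line_text) per regex match without pre-splitting."""
    regex = re.compile(pattern, re.MULTILINE)
    line_number = 1
    last_pos = 0
    for match in regex.finditer(text):
        # Count only the newlines between the previous match and this one.
        line_number += text.count("\n", last_pos, match.start())
        last_pos = match.start()
        line_start = text.rfind("\n", 0, match.start()) + 1
        line_end = text.find("\n", match.start())
        if line_end == -1:
            line_end = len(text)
        yield line_number, text[line_start:line_end]
```

A line with several matches is yielded once per match. Files with no match never pay for line splitting at all.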
Late in the game but how about https://github.com/monochromegane/the_platinum_searcher ?
Is a project's directory tree currently being cached in memory? I suspect that a lot of the slowness I'm experiencing with fuzzy search when I use sshfs is that VSCode is remaking requests to build the directory structure unnecessarily. I could be wrong about that though.
It is cached for quick open (= the file picker with fuzzy matching) but currently not when doing full text searches (planned next).
I'm seeing the same problem, (Ubuntu 16.04, VSCode 1.5.2, a project with ~1500 non-ignored files). As others have reported, this drastically slows down development time, and so I'm regrettably switching back to ST3 until this is fixed. This is a real shame because I'm really liking the other features of vscode - particularly the node debugger.
Is there any way to see where the fuzzy matcher is taking the most time? In ST there is a console you can open which outputs useful debug information.
@fiznool That's unexpected at this point, could you open a bug report with the output from time (find -L . -type f | wc -l); echo $? when run in your project folder?
@chrmarti this initially takes a very long time, but this is because I have (amongst other things) a node_modules folder with tens of thousands of files. I expected VSCode not to recurse into it since I have excluded this folder from my project. Am I right in thinking that the Quick Find should not search in folders which are excluded in the settings.json configuration?
@fiznool A change I included was to use 'find' to retrieve all files and then apply the exclusion filter on top. 'find' seemed to be fast enough to make up for the additional load, but that's not always the case, as we now discover. I'm tracking this as #11874.
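A minimal sketch contrasting the two strategies, assuming exclusions are plain directory names. It uses os.walk() rather than the external 'find' command mentioned above, and the function names are hypothetical; it is not the fix tracked in #11874:

```python
import os


def list_then_filter(root, excluded):
    """Enumerate every directory, then drop files under excluded names."""
    kept = []
    visited = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        visited += 1
        parts = os.path.relpath(dirpath, root).split(os.sep)
        if excluded.isdisjoint(parts):
            kept.extend(os.path.join(dirpath, f) for f in filenames)
    return sorted(kept), visited


def prune_while_walking(root, excluded):
    """Skip excluded directories so their contents are never read."""
    kept = []
    visited = 0
    for dirpath, dirnames, filenames in os.walk(root):
        visited += 1
        # Editing dirnames in place stops os.walk (top-down) from descending.
        dirnames[:] = [d for d in dirnames if d not in excluded]
        kept.extend(os.path.join(dirpath, f) for f in filenames)
    return sorted(kept), visited
```

Both functions return the same files, but the second visits far fewer directories when an excluded folder such as node_modules holds most of the tree.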
@chrmarti as per #11874 has this now been fixed?
I updated vscode this morning to v1.5.3 and the problem still appears to persist.
@fiznool That fix will be in 1.6.
I have a relatively large project, that is on a remote drive (mounted in windows) and the file search (ctrl+e) does not find many of the files. Is there maybe a way to increase some timeouts on indexing or something like that, that might be causing this? Or is there a limit on number of indexed files?
@PunchyRascal Opened #14913 to track your issue.
Though everything has gotten significantly more snappy in recent releases (great work guys!), my experience with VS Code is that it still doesn't:
1) Crawl the directory tree of a project in the background to cache for fuzzy search
2) Cache files as they're revealed in the Explorer pane tree
I think adding one or both of these would greatly improve the overall fuzzy search experience. :)
I mount the filesystem via FUSE, and retrieving all file names over the network is slow. I love Sublime Text because it caches file names and builds a search index in the background, which lets me jump to any file blazing fast. Can you please do the same in VS Code? It would help me switch to VS Code, because I like its autocompletion and other features. But the super slow Jump To File (2+ minutes of waiting) is what stops me from using VS Code right now in our big project.
Do you have plans to deliver this improvement by caching the file tree?
P.S. I use the Python extension. Maybe it somehow slows down the Jump To File functionality?
P.P.S. I found that after opening a workspace and waiting a few minutes, Jump To File works pretty well. Why not save the cache and reuse it the next time I open this workspace? I see that there is no native Projects support (which could potentially provide a separate folder per project for saving the cache and settings). Why not adopt a Project Manager extension as a native solution for all VS Code users?
Can't believe something as important as this is still an issue.
I'm using Windows 64-bit insider builds, working on a project mounted on a network share.
Sublime text finds files instantly, project consists of only a few thousand files.
VSCode takes minutes :(
node_modules typically inflates the number of files in a project. It's a sane optimization to give an option to opt out of searching it, or to opt in only if the file path entered has node_modules in it.
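A minimal sketch of that opt-in rule, assuming excludes are plain folder names and the query is the text typed into the file picker. The function name is hypothetical and this is not how VS Code resolves its exclude settings:

```python
def effective_excludes(query, default_excludes):
    """Drop an exclude when the query text explicitly names that folder."""
    lowered = query.lower()
    return {name for name in default_excludes if name.lower() not in lowered}
```

With this rule, a query such as node_modules/lodash/index.js would search inside node_modules, while a query such as Tab.java would keep skipping it.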
I observed a significant improvement in the speed of search and quick open when we switched from a Samba share to an NFS share. Now it takes about 5 to 7 seconds, as opposed to ~30 with Samba.
Could be my imagination, but did this receive some TLC in the November VS Code update? Doesn't seem to be anything about fuzzy search in the release notes, but my network-mounted drive seems to be indexed in the background and actually fuzzy-searchable now.
I thought it was my new machine, but I too have experienced lightning fast searching in the last weeks, even on samba share.
I can reproduce the performance improvement too. This is on the chromium "src" repository with third_party folder excluded. Search across files seems faster too.
I have opposite anecdotal results. The search index for me is now terrible, and only seems to include files I've already opened. (Version 1.19.1, problem since 1.19)
@jamie-pate Please open an issue (Help > Report Issues to include your setup info) for us to investigate.
Actually I think it's the workbench.action.quickOpen which was broken/slow, not sure how it's related to search. I left it open overnight and it seems to be working now :flushed:
For me it was also very slow, but when going through my settings I found I had set "search.quickOpen.includeSymbols" to true. After setting it to false it became as fast as I was used to. Performance became slow when opening a project of ~18k files with that setting set to true.
"Go to file" is ridiculously slow for me too.
I have a Windows 7 machine with SSD and my workspace has 38,426 Files, 11,228 Folders. Searching in VSCode regularly takes 10+ seconds to find a file where Eclipse search is instant! The only time VSCode shows the result quickly is if it's still in the recently opened list.
I was able to alleviate the problem just a little bit by excluding some of my workspace folders by adding them to the search.exclude list in Settings.
Please fix this, it's one of my most frequently used functions!!!
Closing this because quite a lot has happened since 2015 (search has been rewritten entirely twice since then).
Anything else that's slower than it should be, please file new issues.
@roblourens The "Go to file" function still has terrible performance for large projects with 30,000 files. I don't understand how that is possible, because even "Find in Files" is a lot faster, and that one has to check the entire file contents, not only the file name.
I feel like the computer hardware performance is important here, too. Since I started using a new notebook, I think the search has sped up significantly. The CPU load is indeed very high during search operations. So maybe bear that in mind.
I just published a wiki page for troubleshooting search issues which might also be helpful: https://github.com/Microsoft/vscode/wiki/Search-Issues
For anything else, please open a new issue.