Nugetgallery: Improve search on the gallery

Created on 15 Apr 2015 · 11 Comments · Source: NuGet/NuGetGallery

We want to tweak search to provide better results overall. This is a placeholder bug; please add samples here so that when we do the work we can consider many cases at once.

  1. https://github.com/NuGet/NuGetGallery/issues/7486 - Satellite package should stay hidden
  2. https://github.com/NuGet/Home/issues/257 - Download count should count a bit more.
  3. Can we consider page hits as well? Perhaps using Bing as input

Labels: Gallery UI, Search, V2 Feed, V3 Feed, Priority - 2

All 11 comments

Also follow suggestions from: NuGet/NuGetGallery#2399

Just another example: do a search for JSON, and the results seem to have no order at all. Newtonsoft's library is at the top as expected, but beyond that, there is no rhyme or reason to which items are sorted higher than others. Based on total downloads, Manatee.Json (mine) should easily be in the top ten. As it is, it's on page 3, behind numerous libraries that have a tenth of the downloads.

To me, the largest problem with NuGet search is that entering multiple terms creates an OR condition rather than an AND condition. Example: if I search for _Azure PCL_, I expect to see packages with both _Azure_ AND _PCL_, not either _Azure_ OR _PCL_. One simple change to the search functionality - changing the default combination clause from OR to AND - would make the whole thing far more useful, both on the web site and in the clients. This would make the problem of ordering the search results less important, as we'll be able to narrow the results we're looking at.
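
To make the difference concrete, here is a minimal sketch of OR versus AND term combination over a toy in-memory index. The package ids, their text, and the `search` helper are all hypothetical and only illustrate the behaviour described above; this is not how the gallery's search service is implemented.

```python
# Toy package index: id -> searchable text (hypothetical data, for illustration only).
PACKAGES = {
    "Azure.Storage.PCL": "azure storage client pcl portable",
    "Azure.Core": "azure core shared primitives",
    "PCLStorage": "pcl portable file storage",
    "Json.Lite": "small json parser",
}


def search(query, mode="OR"):
    """Return ids whose text contains any (OR) or all (AND) of the query terms."""
    terms = query.lower().split()
    # any() gives OR semantics, all() gives AND semantics.
    combine = all if mode == "AND" else any
    return sorted(
        pkg_id
        for pkg_id, text in PACKAGES.items()
        if combine(term in text.split() for term in terms)
    )


or_hits = search("Azure PCL", mode="OR")
and_hits = search("Azure PCL", mode="AND")
print(or_hits)   # every package mentioning either term
print(and_hits)  # only packages mentioning both terms
```

With AND as the default, adding a term can only narrow the result set, which is the "drill down" behaviour being asked for.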

"OR" search by default is _always_ wrong. You can always perform "OR" manually by just doing multiple searches, but there is no workaround to get the equivalent of an "AND" search.

The search is essentially an OR plus an AND, where the AND, if it happens to be true, gets a boost. For example, a query for [Azure Cryptography] should match all packages that use either word anywhere in the id, title, description, tags, etc. However, if both words occur in the Id or the Title, then that package is supposed to get a good boost. We put this logic in place 18 months ago and it seemed to work reasonably, but no doubt we could improve it; what we have is very simple. Walking through the "drill down" scenario seems like a great idea. Thanks.
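
A rough sketch of the "OR match, boosted when the AND holds in the Id or Title" idea described above. The `score` function, the substring matching, the sample packages, and the boost factor of 2.0 are assumptions made for illustration; the real query logic and weights are not shown in this thread.

```python
def score(package, query, field_boost=2.0):
    """OR match across all fields; boost when every term is in the id or the title.

    field_boost is an arbitrary illustrative value, not the gallery's real weight.
    """
    terms = query.lower().split()
    fields = ("id", "title", "description", "tags")
    all_text = " ".join(package[f] for f in fields).lower()
    # OR part: count how many query terms appear anywhere (simple substring match).
    base = sum(1 for t in terms if t in all_text)
    if base == 0:
        return 0.0
    # AND part: all terms occur together in the id, or together in the title.
    for f in ("id", "title"):
        text = package[f].lower()
        if all(t in text for t in terms):
            return base * field_boost
    return float(base)


# Hypothetical packages for illustration only.
pkg_both = {"id": "Azure.Cryptography", "title": "Azure Cryptography",
            "description": "helpers", "tags": "security"}
pkg_one = {"id": "Contoso.Crypto", "title": "Contoso crypto",
           "description": "Cryptography helpers for desktop apps", "tags": "security"}
pkg_none = {"id": "Json.Lite", "title": "Json Lite",
            "description": "small json parser", "tags": "json"}

query = "Azure Cryptography"
for pkg in (pkg_both, pkg_one, pkg_none):
    print(pkg["id"], score(pkg, query))
```

Under this scheme a package matching only one term still appears in the results, just with a lower score, which is why the result list feels like an OR to users.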

Issue #1984 covers improved search results. We could perhaps consider indexing typos for the id column, though that can get difficult for computer ids because often they are not proper English anyhow. Perhaps we could look at some edit-distance algorithm.
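
As one possible reading of "some edit-distance algorithm", here is a small Levenshtein distance sketch used to suggest package ids for a mistyped query. The `suggest` helper, the threshold of 2, and the candidate list are hypothetical; a production index would more likely rely on the search engine's built-in fuzzy matching than on a linear scan like this.

```python
def edit_distance(a, b):
    """Levenshtein distance, keeping only one previous row of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or keep when equal)
            ))
        prev = curr
    return prev[-1]


def suggest(query, package_ids, max_distance=2):
    """Return ids within max_distance edits of the query, closest first."""
    q = query.lower()
    scored = [(edit_distance(q, pid.lower()), pid) for pid in package_ids]
    return [pid for d, pid in sorted(scored) if d <= max_distance]


# Hypothetical candidate list for illustration.
candidates = ["Newtonsoft.Json", "Manatee.Json", "jQuery"]
print(suggest("newtonsft.json", candidates))
```

Because ids are compared character by character, this does not depend on the id being a proper English word, which sidesteps the dictionary problem mentioned above.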

@gregsdennis here is an update: the package now shows up a bit more consistently (this is experimental, in a PR that adds mild boosting by download count). In essence, it changes the order of packages that have similar match scores but doesn't affect packages that have disparate initial scores. Not calling it done yet, just WIP.

https://github.com/NuGet/NuGet.Services.Metadata/pull/69
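
To illustrate what "mild boosting by download count" could look like, here is a sketch that multiplies the text-match score by a small log-scaled factor. The formula, the `weight` of 0.05, and the sample results are assumptions for illustration and are not taken from the linked PR.

```python
import math


def boosted(score, downloads, weight=0.05):
    """Scale the text-match score by a small, log-damped download factor.

    The weight is an arbitrary illustrative value chosen to keep the boost mild.
    """
    return score * (1.0 + weight * math.log10(downloads + 1))


# (package id, text-match score, total downloads): hypothetical values.
results = [
    ("Low.Downloads.Json", 1.00, 5_000),
    ("Popular.Json", 0.98, 2_000_000),
    ("Weak.Match", 0.40, 50_000_000),
]

ranked = sorted(results, key=lambda r: boosted(r[1], r[2]), reverse=True)
ranked_ids = [r[0] for r in ranked]
print(ranked_ids)
```

In this example the popular package overtakes a nearly equal text match, while the weak match stays last despite having far more downloads, which mirrors the behaviour described in the comment above.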

For "JSON", the result is now on the first page. Still some optimization to do but this is definitely better :-)

(https://www.nuget.org/packages?q=JSON - refresh a couple of times if the package does not show up)

We will keep tweaking this further.

Closing this issue since it is very similar to https://github.com/NuGet/NuGetGallery/issues/4124. If you are having problems with a specific search query, please report it there.

@joelverhagen this is #2405. Did you mean to link to another issue?

Oops, thanks for the catch. Meant to link to:
https://github.com/NuGet/NuGetGallery/issues/4124
