When using the V2 search service, I have noticed that some packages do not have the SupportedFrameworks field populated.
Example: packageid:Newtonsoft.Json
{
...
"SupportedFrameworks":[],
...
}
Others seem fine (validity of results should be examined): packageid:WindowsAzure.Storage
{
...
"SupportedFrameworks": [
"net40",
"net40-client",
"netstandard1.3",
"aspnet50",
"win8",
"wp8",
"wpa"
],
...
}
As a side note, the targetFramework field sent in by the gallery and by the NuGet client seems to be ignored by the search service.
/cc @yishaigalatzer
Long answer coming :smile:
Why is data not in search output?
The short answer to why the data is not in the search output: because the data is not in the catalog as such.
https://api.nuget.org/v3/catalog0/data/2016.06.27.12.35.49/newtonsoft.json.9.0.1.json
https://api.nuget.org/v3/catalog0/data/2016.08.11.01.57.01/windowsazure.storage.7.2.0.json
Why not? Because it's not added in the SPARQL query we run.
But that's okay! Really, it is! Because we switched Catalog2Lucene to make use of a newly implemented CatalogPackageArchiveReader and fetch the supported frameworks from the dependencies and file list, which is more reliable anyway. So no worries, supportedFrameworks does not need a place in the catalog.
If you run this repro code (in the NuGet.Services.Metadata project), you will see that supportedFrameworks are all parsed okay and added to the search index the way we want and expect.
// Fetch the catalog leaf for Newtonsoft.Json 9.0.1.
var httpClient = new HttpClient();
var catalogJson = await httpClient.GetStringAsync(new Uri("https://api.nuget.org/v3/catalog0/data/2016.06.27.12.35.49/newtonsoft.json.9.0.1.json"));
// Parse the JSON, keeping dates as DateTimeOffset.
var catalogJObject = JsonConvert.DeserializeObject<JObject>(catalogJson, new JsonSerializerSettings
{
    DateParseHandling = DateParseHandling.DateTimeOffset
});
// Extract the package metadata and print the supported frameworks.
var md = CatalogPackageMetadataExtraction.MakePackageMetadata(catalogJObject);
Console.WriteLine(md["supportedFrameworks"]);
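To illustrate the idea of deriving supported frameworks from a package's file list rather than from the catalog, here is a minimal sketch. It is not the CatalogPackageArchiveReader implementation: the class `FrameworkSketch` and the method `FromFileList` are hypothetical names, and the sketch assumes that frameworks can be read off the folder names directly under `lib/`, ignoring dependency groups entirely.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FrameworkSketch
{
    // Hypothetical helper (not part of NuGet): collects the folder names
    // found directly under "lib/" in a package's file list.
    public static IReadOnlyList<string> FromFileList(IEnumerable<string> files)
    {
        return files
            // Normalize separators, then split each path into segments.
            .Select(f => f.Replace('\\', '/').Split('/'))
            // Keep only paths shaped like lib/<framework>/<file>.
            .Where(parts => parts.Length >= 3
                && string.Equals(parts[0], "lib", StringComparison.OrdinalIgnoreCase))
            // The second segment is treated as the framework folder name.
            .Select(parts => parts[1].ToLowerInvariant())
            .Distinct()
            .OrderBy(name => name, StringComparer.Ordinal)
            .ToList();
    }
}
```

For a file list such as `lib/net40/A.dll`, `lib/net40/A.xml`, `lib/netstandard1.3/A.dll` and `A.nuspec`, this sketch yields `net40` and `netstandard1.3`; the nuspec is skipped because it does not sit under `lib/`.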
Or, in easy bullet points:
So why is data not in search output? And why is supportedFrameworks in search for one package but not the other?
Sql2Lucene does add supportedFrameworks when we regenerate the index from the database, but only for packages where this is actually stored. The last re-index was done from the DB, and the two example packages were added from the DB.
- WindowsAzure.Storage: this was done correctly.
- Newtonsoft.Json: this was not done correctly, because of this line of code, which seems to ignore all frameworks if one of them is null instead of just skipping the null one and adding all the others. That means this data is not in the database. I checked Newtonsoft.Json and found that the first framework is indeed null, so no supported frameworks are stored.
(screenshot from a repro, but the logic is the same)
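The difference between the behavior described above and the suggested fix can be sketched as follows. This is not the actual gallery code: `NullFrameworkSketch`, `AllOrNothing` and `SkipNulls` are hypothetical names, and the first method only assumes that a single null entry causes the whole list to be dropped, which is how the line in question appears to behave.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NullFrameworkSketch
{
    // Assumed current behavior: one null entry discards every framework.
    public static List<string> AllOrNothing(IEnumerable<string> frameworks)
    {
        var list = frameworks.ToList();
        return list.Any(f => f == null) ? new List<string>() : list;
    }

    // Suggested behavior: drop only the null entries and keep the rest.
    public static List<string> SkipNulls(IEnumerable<string> frameworks)
    {
        return frameworks.Where(f => f != null).ToList();
    }
}
```

With an input whose first entry is null, as found for Newtonsoft.Json, `AllOrNothing` returns an empty list while `SkipNulls` keeps every non-null framework.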
So let's expand our easy bullet points:
Why is data not used for search filtering?
Filtering seems to have been disabled for V2 search at some point and was not ported into consolidated search.
It can of course be added again at some point, though it is quite tricky to do right.
This field is now hard-coded to be empty, since it is never used by the gallery.
https://github.com/NuGet/NuGet.Jobs/blob/cb8e8996f0d85272bcaa812f14a61c1bd81bb495/src/NuGet.Services.AzureSearch/SearchService/SearchResponseBuilder.cs#L485