NEST/Elasticsearch.Net version:
7.9.0
Elasticsearch version:
7.6.1 (Docker)
Description of the problem including expected versus actual behavior:
Hi, I need to query text fields with a case-insensitive Prefix query to get all records whose productSearchName field "starts with" a given value.
Example of data :
productSearchName= "12 tons of ..."
productSearchName= "12 Tons of ..."
productSearchName= "12 TONS of ..."
I would like to get all these three records using a Prefix query, like
```json
GET catalog/_search
{
  "_source": ["id", "productSearchName"],
  "query": {
    "prefix": {
      "productSearchName.Keyword": {
        "value": "12 tons"
      }
    }
  }
}
```
When I define a normalizer (as I understand it, that is the correct approach; is it?) and tell my field to use it, the field is not mapped as expected. It ends up as a "basic" text field with a keyword sub-field:
```json
"productSearchName": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
}
```
As a result, my query against the field does not retrieve all the records.
I did reindex all my documents into a dedicated index with the provided mapping.
Steps to reproduce:
Here is my IndexDescriptor
```c#
c => c
    .Settings(s => s.Analysis(a => a
        .Normalizers(n => n.Custom("case_insensitive", c => c.Filters("lowercase")))
    ))
    .Map<Catalog>(m => m
        .AutoMap()
        .Properties(p => p
            .Keyword(st => st
                .Name(n => n.ProductSearchName)
                .Normalizer("case_insensitive")
            )))
```
Expected behavior
I expect the ProductSearchName field to use the provided normalizer, so that I can query it case-insensitively with a Prefix query and get all 3 of my example records.
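
For reference, a minimal sketch of the query that this expectation implies. It assumes `productSearchName` is itself mapped as a `keyword` with a lowercase normalizer (as the index descriptor above intends) and that the index is named `catalog` as in the earlier query. With such a mapping the indexed terms are stored lowercased, so a lowercase prefix value can match all three sample records:

```json
GET catalog/_search
{
  "_source": ["id", "productSearchName"],
  "query": {
    "prefix": {
      "productSearchName": {
        "value": "12 tons"
      }
    }
  }
}
```

The field is queried directly here rather than through a `.keyword` sub-field, because the normalizer is declared on the field itself. Field names are also case-sensitive, so `productSearchName.Keyword` and `productSearchName.keyword` would refer to different fields.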
Hi @NicolasReyDotNet,
Creating an index with a normalizer specified in analysis and applying it to a field works as expected. Here's an example that'll also log out the request and response
```c#
using System;
using System.Text;
using Elasticsearch.Net;
using Nest;
using Newtonsoft.Json.Linq;

private static void Main()
{
var defaultIndex = "documents";
var pool = new SingleNodeConnectionPool(new Uri("https://localhost:9200"));
var settings = new ConnectionSettings(pool)
.DefaultIndex(defaultIndex)
.DisableDirectStreaming()
.PrettyJson()
.OnRequestCompleted(callDetails =>
{
if (callDetails.RequestBodyInBytes != null)
{
var json = JObject.Parse(Encoding.UTF8.GetString(callDetails.RequestBodyInBytes));
Console.WriteLine(
$"{callDetails.HttpMethod} {callDetails.Uri} \n" +
$"{json.ToString(Newtonsoft.Json.Formatting.Indented)}");
}
else
{
Console.WriteLine($"{callDetails.HttpMethod} {callDetails.Uri}");
}
Console.WriteLine();
if (callDetails.ResponseBodyInBytes != null)
{
Console.WriteLine($"Status: {callDetails.HttpStatusCode}\n" +
$"{Encoding.UTF8.GetString(callDetails.ResponseBodyInBytes)}\n" +
$"{new string('-', 30)}\n");
}
else
{
Console.WriteLine($"Status: {callDetails.HttpStatusCode}\n" +
$"{new string('-', 30)}\n");
}
});
var client = new ElasticClient(settings);
var createIndexResponse = client.Indices.Create("foo", c => c
.Settings(s => s.Analysis(a => a
.Normalizers(n => n.Custom("case_insensitive", c => c.Filters("lowercase")))
))
.Map<Catalog>(m => m
.AutoMap()
.Properties(p => p
.Keyword(st => st
.Name(n => n.ProductSearchName)
.Normalizer("case_insensitive")
)))
);
}
public class Catalog
{
public string ProductSearchName {get;set;}
}
```

which yields

```json
PUT https://localhost:9200/foo?pretty=true
{
  "mappings": {
    "properties": {
      "productSearchName": {
        "normalizer": "case_insensitive",
        "type": "keyword"
      }
    }
  },
  "settings": {
    "analysis": {
      "normalizer": {
        "case_insensitive": {
          "filter": [
            "lowercase"
          ],
          "type": "custom"
        }
      }
    }
  }
}
```
It _looks_ like a document may have been indexed into the index before the explicit index creation and mapping were applied; the productSearchName field is mapped with the default mapping that would be inferred for a string.
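
One way to confirm which mapping an index actually ended up with is to read it back and to run a sample value through the normalizer with the analyze API. A minimal sketch, assuming the index is named `foo` as in the example above and that the sample text is only illustrative:

```json
GET foo/_mapping

GET foo/_analyze
{
  "normalizer": "case_insensitive",
  "text": "12 TONS of"
}
```

If the explicit mapping was applied, the first request should show `productSearchName` as a `keyword` with the `case_insensitive` normalizer, and the second should return a single token, `12 tons of`.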
My bad, and many thanks for the tailored example. Here was my problem:
I didn't check the response returned by the elasticClient.Indices.Create() method, which was
```
Invalid NEST response built from a unsuccessful (400) low level call on PUT: /low6?pretty=true&error_trace=true
# Audit trail of this API call:
- [1] BadResponse: Node: http://elasticsearch:9200/ Took: 00:00:00.0670044
# OriginalException: Elasticsearch.Net.ElasticsearchClientException: Request failed to execute. Call: Status code 400 from: PUT /low6?pretty=true&error_trace=true. ServerError: Type: mapper_parsing_exception Reason: "Failed to parse mapping [_doc]: Field name [size.keyword] which is a multi field of [size] cannot contain '.'" CausedBy: "Type: mapper_parsing_exception Reason: "Field name [size.keyword] which is a multi field of [size] cannot contain '.'""
```
The IndexDescriptor I posted was simplified for the issue; the complete one was
```c#
c => c
    .Settings(s => s.Analysis(a => a
        .Normalizers(n => n.Custom("case_insensitive", c => c.Filters("lowercase")))
    ))
    .Map<Catalog>(m => m
        .AutoMap()
        // this useless part was causing the invalid mapping !
        .Properties(p => p
            .Nested<Variant>(n => n
                .Name(n => n.Variants)
                .Properties(eps => eps
                    .Text(s => s
                        .Name(e => e.Size)
                        .Fields(ff => ff
                            .Keyword(ss => ss
                                .Name("size.keyword")))))
            ))
        // this part was ignored then
        .Properties(p => p
            .Keyword(st => st
                .Name(n => n.ProductSearchName)
                .Normalizer("case_insensitive")
            ))
```
Then, exactly as you said, the mapping was the default one inferred from the POCO definition when the first document was indexed.
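
For context on the error message: a multi-field is declared under its parent with only its own short name, and the dotted path (`size.keyword`) is only how it is addressed at query time. A sketch of the mapping fragment that the rejected descriptor was presumably aiming for, assuming the default camel-cased property names `variants` and `size`:

```json
"variants": {
  "type": "nested",
  "properties": {
    "size": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword"
        }
      }
    }
  }
}
```

In the descriptor this would likely correspond to naming the sub-field `keyword` instead of `size.keyword`; the sub-field would then be reachable as `variants.size.keyword` in queries.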
I'll throw an exception to prevent this now, thank you very much!
No worries, @NicolasReyDotNet. You can also check whether the index exists before attempting to create it, with client.Indices.Exists()
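
At the REST level, an existence check is a `HEAD` request on the index name. A minimal sketch, assuming the index is named `foo` as in the earlier example:

```json
HEAD foo
```

A `200` status means the index already exists (possibly with an auto-generated mapping from an earlier indexing call), while a `404` means it can still be created with the explicit mapping.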
Got it thx
@russcam
One more thing, please: how can I apply the normalizer to fields of a child class held in a list?
Example:
```c#
public class Catalog
{
    public string ProductSearchName { get; set; }
    public List<Variant> Variants { get; set; }
}

public class Variant
{
    public string Description { get; set; } // <= property where to apply the normalizer
}
```
Currently I use
```c#
.Properties(p => p
    .Nested<Variant>(n => n
        .Name(n => n.Variants)
        .Properties(eps => eps
            .Keyword(st => st
                .Name(n => n.Description)
                .Normalizer("case_insensitive")))))
```
but it makes my object Nested, which is not my goal.
Thank you for your help
Nevermind, using Object<> instead of Nested<> does the trick.
```c#
.Properties(p => p
    .Object<Variant>(o => o
        .Name(n => n.Variants)
        .Properties(eps => eps
            .Keyword(st => st
                .Name(n => n.Description)
                .Normalizer("case_insensitive")))))
```
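
As a rough illustration, the `Object<>` descriptor above would be expected to produce a mapping fragment along these lines, assuming the default camel-cased property names `variants` and `description`:

```json
"variants": {
  "type": "object",
  "properties": {
    "description": {
      "type": "keyword",
      "normalizer": "case_insensitive"
    }
  }
}
```

With an object mapping the inner field is queried by its dotted path, `variants.description`, without a nested query wrapper.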
Your lib is awesome !