Elasticsearch-net: Nested attribute on List<string> generating unwanted properties

Created on 1 Sep 2016 · 9 Comments · Source: elastic/elasticsearch-net

NEST/Elasticsearch.Net version: 2.3.3

Elasticsearch version: 2.3.2

Description of the problem including expected versus actual behavior:

I have a mapping for a list set with the Nested attribute and it's creating unwanted properties on the ES mapping.

[Nested(IncludeInParent=true)]
public List<String> raw_amenities { get; set; }

yields

"raw_amenities": {
    "type": "nested",
    "include_in_parent": true,
    "properties": {
        "chars": {
            "type": "string"
        },
        "length": {
            "type": "integer"
        }
    }
}

Most helpful comment

Just to follow up on what @gmarz has said, you would want to map it as a string type

``` c#
[String]
public List<string> raw_amenities { get; set; }
```

In fact, if you don't need to set any other properties for the type (e.g. a different analyzer, not analyzed, etc.), then you don't even need the attribute: just calling `AutoMap()` will infer a `string` type mapping for `List<string>`. With Elasticsearch mappings, there is no distinction between a single property type and a collection property type, i.e.

``` c#
public class Household
{
    public string Amenity { get; set; }
    public List<string> Amenities { get; set; }
}
```

Both `Amenity` and `Amenities` would be mapped as the `string` data type in Elasticsearch.

Now, if you were dealing with, say, a `List<Amenity>` where the `Amenity` type was

``` c#
public class Amenity
{
    public string Name { get; set; }
    public DateTimeOffset Added { get; set; }
}
```

then the default inferred mapping for this would be `object`. This is fine for some scenarios, but with `object` mapping, the association between `Name` and `Added` for a given `Amenity` instance is not stored in the inverted index. For example, imagine the following amenities are indexed for a `Household` document

``` c#
// Assumes Household has an Id property and that Amenities is a List<Amenity>
var household = new Household 
{
    Id = 1,
    Amenities = new List<Amenity>
    {
        new Amenity { Name = "electricity", Added = new DateTime(2016, 1, 1) },
        new Amenity { Name = "gas", Added = new DateTime(2016, 1, 8) },
    }
};

client.Index(household);
```

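If `Amenities` is mapped as an `object` type, a search for `Name = "electricity"` and `Added = new DateTime(2016, 1, 8)` would return the document as a match, because each condition is satisfied by some amenity even though no single amenity satisfies both. If `Amenities` is mapped as a `nested` type, however, the document would not be a match.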

I hope that helps explain the difference.
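
To make the object versus nested distinction concrete, here is a minimal in-memory sketch. It does not call Elasticsearch or NEST at all; it only imitates the matching behaviour described above using LINQ. The names `ObjectVersusNestedDemo`, `MatchesAsObject` and `MatchesAsNested` are hypothetical and exist only for this illustration.

``` c#
using System;
using System.Collections.Generic;
using System.Linq;

public class Amenity
{
    public string Name { get; set; }
    public DateTimeOffset Added { get; set; }
}

public static class ObjectVersusNestedDemo
{
    // Imitates an "object" mapping: every field is flattened into its own
    // bag of values, so the pairing between Name and Added is lost.
    public static bool MatchesAsObject(IEnumerable<Amenity> amenities, string name, DateTimeOffset added)
    {
        var list = amenities.ToList();
        return list.Any(a => a.Name == name) && list.Any(a => a.Added == added);
    }

    // Imitates a "nested" mapping: each Amenity is matched on its own,
    // so both conditions must hold for the same instance.
    public static bool MatchesAsNested(IEnumerable<Amenity> amenities, string name, DateTimeOffset added)
    {
        return amenities.Any(a => a.Name == name && a.Added == added);
    }

    public static void Main()
    {
        var amenities = new List<Amenity>
        {
            new Amenity { Name = "electricity", Added = new DateTime(2016, 1, 1) },
            new Amenity { Name = "gas", Added = new DateTime(2016, 1, 8) },
        };
        DateTimeOffset added = new DateTime(2016, 1, 8);

        Console.WriteLine(MatchesAsObject(amenities, "electricity", added)); // True
        Console.WriteLine(MatchesAsNested(amenities, "electricity", added)); // False
    }
}
```

The "object" variant matches because "electricity" appears somewhere in the names and 8 January appears somewhere in the dates; the "nested" variant does not, because no single amenity carries both values.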

All 9 comments

Hey @shakdoesgithub the nested type is for objects and doesn't make any sense for strings. What's happening here is a side effect of the client expecting an object when the Nested attribute is applied, treating string as an object and mapping its properties.

@russcam @Mpdreamz Perhaps we should throw an exception in this case instead of naively inferring the CLR properties?
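
The `chars` and `length` entries in the reported mapping line up with the public properties of `System.String`. The sketch below is not NEST's actual inference code; it only uses plain reflection to list what a naive walk over a type's public instance properties would find. `StringPropertiesDemo` and `PublicPropertyNames` are hypothetical names.

``` c#
using System;
using System.Linq;
using System.Reflection;

public static class StringPropertiesDemo
{
    // Returns the names of the public instance properties of a CLR type,
    // sorted so the output order is deterministic.
    public static string[] PublicPropertyNames(Type type)
    {
        return type.GetProperties(BindingFlags.Public | BindingFlags.Instance)
                   .Select(p => p.Name)
                   .OrderBy(n => n, StringComparer.Ordinal)
                   .ToArray();
    }

    public static void Main()
    {
        // System.String exposes its indexer (named "Chars") and "Length".
        foreach (var name in PublicPropertyNames(typeof(string)))
            Console.WriteLine(name);
    }
}
```

Run against `typeof(string)`, this prints `Chars` and `Length`, which is consistent with the `chars` (string) and `length` (integer) properties that showed up under `raw_amenities`.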

hey @gmarz,

So I'm a little bit confused, since https://www.elastic.co/guide/en/elasticsearch/reference/2.3/nested.html says that

"The nested type is a specialised version of the object datatype that allows arrays of objects to be indexed and queried independently of each other."

My use case is to store the list of strings into this mapping. How would I go on about doing that, if not using a nested type?

> arrays of objects

:) Under the hood, everything is flattened in Lucene. There is no actual array type. So an array of any core type (string, integer, etc...) is really just its underlying type. Check out the docs on array datatypes.

In this case you would just map raw_amenities as a string.

Hope that makes sense?

@gmarz I'm in two minds about whether we should throw an exception here; the example mapping may be perfectly valid for some use case.

Were you thinking throw for certain types and not for others?

@russcam thanks for that explanation. I know I've been down that path before, getting mixed up between nested and object types, but your example helped me understand it.

Reopening because of ongoing discussion between @russcam and @gmarz

> Were you thinking throw for certain types and not for others?

@russcam I'm thinking that we shouldn't allow Nested or Object on types that aren't actual objects (i.e. non-reference types).

I can't think of any use cases where this would be valid otherwise...?

Closing this before we attempt something too clever :smile: We'd have to unpack `PropertyType`, see if it's `IEnumerable<T>`, then check if `T` is a reference type, all for a small gain.
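
For readers curious what that check might involve, here is a rough sketch. It is not something NEST does (the issue was closed without implementing it), and `ElementTypeCheck`, `UnwrapElementType` and `LooksLikeObject` are hypothetical names. One wrinkle worth noting: `string` is itself a reference type in .NET and also implements `IEnumerable<char>`, so a plain reference-type test would still let `List<string>` through unless `string` is special-cased, which this sketch assumes.

``` c#
using System;
using System.Collections.Generic;
using System.Linq;

public static class ElementTypeCheck
{
    // Hypothetical helper: returns T when the type is or implements
    // IEnumerable<T>, otherwise the type itself. string implements
    // IEnumerable<char>, so it is special-cased and treated as a scalar.
    public static Type UnwrapElementType(Type type)
    {
        if (type == typeof(string)) return type;

        IEnumerable<Type> candidates = type.GetInterfaces();
        if (type.IsInterface) candidates = new[] { type }.Concat(candidates);

        var enumerable = candidates.FirstOrDefault(i =>
            i.IsGenericType && i.GetGenericTypeDefinition() == typeof(IEnumerable<>));

        return enumerable == null ? type : enumerable.GetGenericArguments()[0];
    }

    // Hypothetical rule: Nested/Object is only considered sensible when the
    // element type is a reference type other than string.
    public static bool LooksLikeObject(Type propertyType)
    {
        var element = UnwrapElementType(propertyType);
        return !element.IsValueType && element != typeof(string);
    }

    public static void Main()
    {
        Console.WriteLine(LooksLikeObject(typeof(List<string>)));  // False
        Console.WriteLine(LooksLikeObject(typeof(List<int>)));     // False
        Console.WriteLine(LooksLikeObject(typeof(List<Version>))); // True
    }
}
```

Even this small version has to decide how to treat `string`, interfaces, and types that implement more than one `IEnumerable<T>` (it simply takes the first one found), which gives a feel for why the maintainers judged the gain too small.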
