For instance, indexing the following document fails if the `foo.bar` field is not mapped explicitly:
```json
{
  "foo": [
    {
      "bar": "baz"
    },
    {
      "bar": 42
    }
  ]
}
```
The reason is that the mapping updates for the two sub-documents are generated independently (one triggers the creation of a string field, the other of a long field), but they cannot be reconciled into the single mapping update that is sent to the master node.
Bug report courtesy of Benjamin Gathmann at https://discuss.elastic.co/t/coerce-long-to-double-not-working/36882.
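One workaround is to map the conflicting field explicitly before indexing. A minimal sketch, assuming a `keyword` type is acceptable for both values; the index name `my-index` is illustrative, and on pre-7.x versions the `properties` would need to be nested under a mapping type name:

```json
PUT /my-index
{
  "mappings": {
    "properties": {
      "foo": {
        "properties": {
          "bar": { "type": "keyword" }
        }
      }
    }
  }
}
```

With `foo.bar` mapped up front, no dynamic mapping update is generated for either sub-document, so there is nothing to reconcile.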
Right! So what's the intended behavior?
@davidvgalbraith Good question. I'm leaning towards documenting this as a limitation of dynamic mappings. I think it's fair to say that dynamic mappings don't work when a single document does not use consistent JSON types for fields that share the same path?
We could potentially avoid conflicts by using the most generic type (i.e. string > double, double > long, etc.), but I'm afraid this would hide issues more than it would solve problems. I'd rather rely on something more explicit, like dynamic templates.
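As an illustration of the dynamic-template approach (a sketch, not a confirmed fix for the string-vs-long case above), a template like the following maps every dynamically detected integer to `double`, so mixed integer/float values for the same path no longer conflict. The index and template names are made up, and pre-7.x versions would need the mapping nested under a type name:

```json
PUT /my-index
{
  "mappings": {
    "dynamic_templates": [
      {
        "integers_as_doubles": {
          "match_mapping_type": "long",
          "mapping": { "type": "double" }
        }
      }
    ]
  }
}
```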
Thank you for mentioning dynamic templates; I will try them out. The point is that, as in my case, you simply have to deal with the data you have, and throwing that data into Elasticsearch is not quite as easy as I expected.
Here's another example where the fact that it doesn't work is more unexpected:
```
PUT /speed/record/1
{
  "speed_rec": [
    {
      "speed": 61.23
    },
    {
      "speed": 61
    }
  ]
}
```
True, but when not all numbers have a decimal point, you are going to have problems anyway. For instance, if the first document only contains integers, the field will be mapped as a long and the decimal part of documents that come afterwards will be truncated. (Related to #16018)
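One way around that truncation is to declare the field as `double` up front rather than letting dynamic mapping pick `long` from the first document. A sketch using the typed mapping API matching the `PUT /speed/record/1` example above (pre-7.x syntax):

```json
PUT /speed
{
  "mappings": {
    "record": {
      "properties": {
        "speed_rec": {
          "properties": {
            "speed": { "type": "double" }
          }
        }
      }
    }
  }
}
```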
Hello. I would like to add that using the global coerce setting to disable this kind of validation does not work either. I have tried it with the latest official Elasticsearch Docker image.
I recognize that this is not necessarily a "solvable" problem for the int vs. float case, but isn't it reasonable to index all the other fields anyway and make a note of the error? The current behavior I'm experiencing is the same as described by @clintongormley, and it results in the log message being dropped from the index entirely. Is there some way I can at least let the message be parsed and indexed while keeping the conflicting fields out of the index?
@elastic/es-search-aggs
I've been pulling my hair out because of an ill-formatted API returning documents to index with content similar to:
```json
"parents": [
  {
    "itemId": 18124
  },
  {
    "itemId": ""
  }
]
```
And dynamic templates couldn't do anything about it, even with:
```json
"mapping": {
  "coerce": "true",
  "ignore_malformed": "true"
}
```
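For what it's worth, a dynamic template that pins the field to a single type, rather than relying on `coerce`/`ignore_malformed`, might sidestep the conflict, since both sub-documents would then resolve to the same mapping. A sketch, untested against this exact case (the template name is made up):

```json
"dynamic_templates": [
  {
    "item_ids_as_keywords": {
      "match": "itemId",
      "mapping": { "type": "keyword" }
    }
  }
]
```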
I think an option to default to the most generic type would be useful.