Elasticsearch: Different behaviors but both look buggy for docs.deleted in 6.8 and 7.6

Created on 11 May 2020 · 11 Comments · Source: elastic/elasticsearch

I found that ES 6.7/6.8 and ES 7.6 behave differently for the docs.deleted counter.
Both look incorrect, or at least the logic is not easy to understand.

ES 6.7 & 6.8

Expected result

When performing DELETE <index>/_doc/<doc_id>, docs.deleted should increase by 1.

Symptom (Actual result)

docs.deleted shows 0 after deleting the document. (As if the deleted document were cleaned up immediately.)

Repro Steps

# create index and put docs
DELETE my_test
PUT my_test
{"settings":{"number_of_replicas":0,"number_of_shards":1}}

PUT my_test/_doc/1
{"title":"aaa"}

PUT my_test/_doc/2
{"title":"bbb"}

# delete
DELETE my_test/_doc/1

# check docs.deleted
GET _cat/indices?v&index=my_test&h=index,health,status,docs.deleted

# response
index   health status docs.deleted
my_test green  open              0

ES 7.6

Expected result

When performing DELETE <index>/_doc/<doc_id>, docs.deleted should increase by 1.

Symptom (Actual result)

docs.deleted shows 2 after deleting a single document. (It increases by too much.)

Repro Steps

# create index and put docs
DELETE my_test
PUT my_test
{"settings":{"number_of_replicas":0,"number_of_shards":1}}

PUT my_test/_doc/1
{"title":"aaa"}

PUT my_test/_doc/2
{"title":"bbb"}

# delete
DELETE my_test/_doc/1

# check docs.deleted
GET _cat/indices?v&index=my_test&h=index,health,status,docs.deleted

# response
index   health status docs.deleted
my_test green  open              2

Side notes

In 7.6, a doc update makes docs.deleted increase by 1, which is the correct and expected behavior (the document is deleted and indexed again internally, even for a partial update).
But in 6.7/6.8, a doc update doesn't trigger a docs.deleted increment, which appears to be another bug...

:Distributed/CRUD >docs Distributed Docs

Source

kunisen

Most helpful comment

Like Nhat I'm also in favour of documenting the internal nature of docs.deleted rather than changing it or adding a more detailed breakdown which may constrain future work in this area. Lucene's tracking of deleted docs should IMO be considered a deep implementation detail. Users ought to rely on Elasticsearch keeping them under control in the background rather than trying to actively manage them, especially since the only tools to do so are rather blunt things like a force-merge, and I think that adding more detailed stats will encourage the opposite behaviour.

DaveCTurner on 13 May 2020


All 11 comments

Pinging @elastic/es-distributed (:Distributed/CRUD)

elasticmachine on 11 May 2020

Pinging @elastic/es-core-features (:Core/Features/CAT APIs)

elasticmachine on 11 May 2020

The docs.deleted stats report on the segments in the index, and your test case does not do any refreshing so does not create any segments. Adding some appropriate refreshes recovers the expected behaviour in 6.x:

DELETE /my_test

# {
#   "acknowledged": true
# }

PUT /my_test
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  }
}

# {
#   "shards_acknowledged": true,
#   "acknowledged": true,
#   "index": "my_test"
# }

PUT /my_test/_doc/1
{
  "title": "aaa"
}

# {
#   "_type": "_doc",
#   "_primary_term": 1,
#   "_id": "1",
#   "_shards": {
#     "successful": 1,
#     "total": 1,
#     "failed": 0
#   },
#   "_index": "my_test",
#   "result": "created",
#   "_version": 1,
#   "_seq_no": 0
# }

PUT /my_test/_doc/2
{
  "title": "bbb"
}

# {
#   "_type": "_doc",
#   "_primary_term": 1,
#   "_id": "2",
#   "_shards": {
#     "successful": 1,
#     "total": 1,
#     "failed": 0
#   },
#   "_index": "my_test",
#   "result": "created",
#   "_version": 1,
#   "_seq_no": 1
# }

POST /my_test/_refresh

# {
#   "_shards": {
#     "successful": 1,
#     "total": 1,
#     "failed": 0
#   }
# }

DELETE /my_test/_doc/1

# {
#   "_type": "_doc",
#   "_primary_term": 1,
#   "_id": "1",
#   "_shards": {
#     "successful": 1,
#     "total": 1,
#     "failed": 0
#   },
#   "_index": "my_test",
#   "result": "deleted",
#   "_version": 2,
#   "_seq_no": 2
# }

POST /my_test/_refresh

# {
#   "_shards": {
#     "successful": 1,
#     "total": 1,
#     "failed": 0
#   }
# }

GET /_cat/indices?v&index=my_test&h=index,health,status,docs.deleted

# index   health status docs.deleted
# my_test green  open              1
#

In 7.x it's more complicated since we delete the document and then add a tombstone to record the deletion, and I think we count both of these as deleted docs. I think that makes sense: the tombstone is a genuine doc in the index that should be cleaned up later on once it's no longer needed for peer recovery. Indeed, after thirty seconds (the peer recovery lease resync interval) and a flush, I see that happen automatically:

DELETE /my_test
# at 2020-05-11T11:22:27.304Z

# {
#   "acknowledged": true
# }
# at 2020-05-11T11:22:27.381Z
# (0.077211s elapsed)

PUT /my_test
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  }
}
# at 2020-05-11T11:22:27.381Z

# {
#   "shards_acknowledged": true,
#   "acknowledged": true,
#   "index": "my_test"
# }
# at 2020-05-11T11:22:27.632Z
# (0.250563s elapsed)

PUT /my_test/_doc/1
{
  "title": "aaa"
}
# at 2020-05-11T11:22:27.632Z

# {
#   "_type": "_doc",
#   "_primary_term": 1,
#   "_id": "1",
#   "_shards": {
#     "successful": 1,
#     "total": 1,
#     "failed": 0
#   },
#   "_index": "my_test",
#   "result": "created",
#   "_version": 1,
#   "_seq_no": 0
# }
# at 2020-05-11T11:22:27.699Z
# (0.066634s elapsed)

PUT /my_test/_doc/2
{
  "title": "bbb"
}
# at 2020-05-11T11:22:27.699Z

# {
#   "_type": "_doc",
#   "_primary_term": 1,
#   "_id": "2",
#   "_shards": {
#     "successful": 1,
#     "total": 1,
#     "failed": 0
#   },
#   "_index": "my_test",
#   "result": "created",
#   "_version": 1,
#   "_seq_no": 1
# }
# at 2020-05-11T11:22:27.718Z
# (0.018532s elapsed)

POST /my_test/_refresh
# at 2020-05-11T11:22:27.718Z

# {
#   "_shards": {
#     "successful": 1,
#     "total": 1,
#     "failed": 0
#   }
# }
# at 2020-05-11T11:22:27.742Z
# (0.023456s elapsed)

DELETE /my_test/_doc/1
# at 2020-05-11T11:22:27.742Z

# {
#   "_type": "_doc",
#   "_primary_term": 1,
#   "_id": "1",
#   "_shards": {
#     "successful": 1,
#     "total": 1,
#     "failed": 0
#   },
#   "_index": "my_test",
#   "result": "deleted",
#   "_version": 2,
#   "_seq_no": 2
# }
# at 2020-05-11T11:22:27.755Z
# (0.013136s elapsed)

POST /my_test/_refresh
# at 2020-05-11T11:22:27.772Z

# {
#   "_shards": {
#     "successful": 1,
#     "total": 1,
#     "failed": 0
#   }
# }
# at 2020-05-11T11:22:27.787Z
# (0.015387s elapsed)

GET /_cat/indices?v&index=my_test&h=index,health,status,docs.deleted
# at 2020-05-11T11:22:27.787Z

# index   health status docs.deleted
# my_test green  open              2
# 
# at 2020-05-11T11:22:27.793Z
# (0.005123s elapsed)

### NOTE ≥30-second pause here

POST /my_test/_flush
# at 2020-05-11T11:22:58.937Z

# {
#   "_shards": {
#     "successful": 1,
#     "total": 1,
#     "failed": 0
#   }
# }
# at 2020-05-11T11:22:59.037Z
# (0.1003s elapsed)

GET /_cat/indices?v&index=my_test&h=index,health,status,docs.deleted
# at 2020-05-11T11:22:59.037Z

# index   health status docs.deleted
# my_test green  open              1
# 
# at 2020-05-11T11:22:59.041Z
# (0.003604s elapsed)
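
The counting described in this comment can be sketched as a toy model. This is only an illustration of the explanation above, not Elasticsearch's engine: the `ToyShard` class, its `tombstones_counted` flag, and the `expire_tombstones` method are hypothetical names, and the model ignores background refreshes and merges.

```python
class ToyShard:
    """Toy model of the docs.deleted accounting described above.

    Assumption: all names here are hypothetical; the real engine is far
    more involved. Stats only see what a refresh has made visible.
    """

    def __init__(self, tombstones_counted: bool):
        # True mirrors the 7.x behaviour described above, False mirrors 6.x.
        self.tombstones_counted = tombstones_counted
        self.pending = []     # operations not yet refreshed into segments
        self.live = set()     # ids of live docs visible in segments
        self.deleted = 0      # deleted docs still held in segments
        self.tombstones = 0   # tombstone docs still held in segments

    def index(self, doc_id: str) -> None:
        self.pending.append(("index", doc_id))

    def delete(self, doc_id: str) -> None:
        self.pending.append(("delete", doc_id))

    def refresh(self) -> None:
        """Apply pending operations so that they become visible to stats."""
        for op, doc_id in self.pending:
            if op == "index":
                if doc_id in self.live:
                    self.deleted += 1  # the overwritten copy is now a deleted doc
                self.live.add(doc_id)
            elif doc_id in self.live:
                self.live.remove(doc_id)
                self.deleted += 1
                if self.tombstones_counted:
                    self.tombstones += 1  # the tombstone is counted as well
        self.pending.clear()

    def expire_tombstones(self) -> None:
        """Stand-in for the >=30-second pause followed by a flush."""
        self.tombstones = 0

    @property
    def docs_deleted(self) -> int:
        return self.deleted + self.tombstones


# 6.x-style run: nothing is reported until a refresh happens.
six = ToyShard(tombstones_counted=False)
six.index("1")
six.index("2")
six.delete("1")
before_refresh = six.docs_deleted
six.refresh()
after_refresh = six.docs_deleted

# 7.x-style run: the deleted doc and its tombstone are both counted at first.
seven = ToyShard(tombstones_counted=True)
seven.index("1")
seven.index("2")
seven.refresh()
seven.delete("1")
seven.refresh()
with_tombstone = seven.docs_deleted
seven.expire_tombstones()
after_cleanup = seven.docs_deleted

print(before_refresh, after_refresh, with_tombstone, after_cleanup)
```

The four printed values correspond to the 0 and 1 seen in the 6.x runs and the 2 and 1 seen in the 7.x run above.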

DaveCTurner on 11 May 2020

I've marked this for team discussion in order to contemplate whether we can make this behaviour any less surprising without compromising the fidelity of the stats.

DaveCTurner on 11 May 2020


Thanks @kunisen and @DaveCTurner. We can make the deleted count more consistent by excluding tombstone documents. However, I think we should instead explain in the documentation that the doc count and deleted count might include some 'system' documents.

dnhatn on 12 May 2020

If we had a parameter that revealed a further breakdown of the sub-counts per "type"/category of document that make up the overall count, that might make it more obvious to users, similar to how index stats show total count vs deleted count (at the Lucene level).

This might also be useful for users confused about some APIs counting Elasticsearch documents and others counting Lucene documents, where nested documents produce a higher Lucene document count for a smaller Elasticsearch document count.

If we had a detailed or breakdown type of parameter, we could show the counts for each underlying group of documents, be that top-level/ES vs Lucene vs system/tombstone documents, perhaps depending on the context of the API as well (e.g. split by ES vs tombstone/system/other, or split by ES vs Lucene, whichever makes the most sense for the API).

Though if we did this wrong it might make things less clear. It might still be useful, but perhaps as an undocumented expert setting.

geekpete on 13 May 2020

Echoing @geekpete:
I did find a hint in the nodes stats API.
It appears that "indices.docs.deleted" actually comes from Lucene.

Maybe it would be good to add that to the API description page.
https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-stats.html

Also, if there is any difference between _cat/indices and the nodes stats API, it might be good to mention that too, along with a note about tombstones and _flush.

kunisen on 13 May 2020

The simple example I use for checking why doc counts differ depending on the API:

Doc count differences by API

#
# Why does doc count differ depending on API?
#

# using a variant of the nested docs example from the documentation can help highlight the difference
# between top level ES doc counts and lower level Lucene doc counts.
DELETE my_index
PUT my_index
{
  "mappings": {
    "properties": {
      "user": {
        "type": "nested"
      }
    }
  }
}


PUT my_index/_doc/1
{
  "group" : "fans",
  "user" : [
    {
      "first" : "John",
      "last" :  "Smith"
    },
    {
      "first" : "Alice",
      "last" :  "White"
    }
  ]
}

PUT my_index/_doc/2
{
  "group" : "fans",
  "user" : [
    {
      "first" : "Bob",
      "last" :  "Smith"
    },
    {
      "first" : "Harry",
      "last" :  "White"
    },
    {
      "first" : "Terry",
      "last" :  "Arthur"
    }
  ]
}

# flush to ensure docs are searchable/countable
POST my_index/_flush

# index count api shows top level ES docs
# https://www.elastic.co/guide/en/elasticsearch/reference/7.6/search-count.html
GET my_index/_count


# CAT Indices API shows lower level lucene doc count, nested fields are stored as separate lucene docs.
# This behaviour is documented: https://www.elastic.co/guide/en/elasticsearch/reference/current/cat-indices.html#cat-indices-api-desc
GET /_cat/indices/my_index?v

# CAT Count api shows top level ES docs:
# https://www.elastic.co/guide/en/elasticsearch/reference/7.6/search-count.html
GET /_cat/count/my_index?v

# Index stats shows Lucene doc count
GET /my_index/_stats?filter_path=indices.my_index.total.docs

For this example, maybe an additional tombstones count in there would be handy.

{
  "indices" : {
    "my_index" : {
      "total" : {
        "docs" : {
          "count" : 7,
          "deleted" : 0,
          "tombstones" : 0
        }
      }
    }
  }
}

and

health status index    uuid                   pri rep docs.count docs.deleted docs.tombstones store.size pri.store.size
green  open   my_index 31yBbQgxTx6Nu2ObIJMThw   1   0          7            0               0      9.8kb          9.8kb

either with some optional parameter or by default, etc.
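
For the nested example above, the docs.count of 7 can be checked by hand: each top-level document becomes one Lucene document, plus one more per nested object. A minimal sketch of that arithmetic follows, assuming a single level of nesting; the helper name `lucene_docs_for` is made up for illustration and is not part of any Elasticsearch API.

```python
from typing import Any, Dict, Iterable


def lucene_docs_for(doc: Dict[str, Any], nested_fields: Iterable[str]) -> int:
    """Count Lucene docs for one ES doc: the root plus one per nested object.

    Assumption: only one level of nesting is handled.
    """
    count = 1  # the root (top-level) document
    for field in nested_fields:
        value = doc.get(field, [])
        if isinstance(value, dict):
            value = [value]  # a single nested object behaves like a one-item array
        count += len(value)
    return count


# The two documents indexed in the example above.
docs = [
    {
        "group": "fans",
        "user": [
            {"first": "John", "last": "Smith"},
            {"first": "Alice", "last": "White"},
        ],
    },
    {
        "group": "fans",
        "user": [
            {"first": "Bob", "last": "Smith"},
            {"first": "Harry", "last": "White"},
            {"first": "Terry", "last": "Arthur"},
        ],
    },
]

es_count = len(docs)  # what _count and _cat/count report
lucene_count = sum(lucene_docs_for(d, ["user"]) for d in docs)  # what _cat/indices and _stats report

print(es_count, lucene_count)  # prints "2 7"
```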

geekpete on 13 May 2020

DaveCTurner on 13 May 2020


Thanks @DaveCTurner for the pointers!

kunisen on 13 May 2020

Pinging @elastic/es-docs (>docs)

elasticmachine on 13 May 2020
