If a processor fails in an Ingest node pipeline, a 500 status code ("Internal Server Error") is reported for the event at hand. This can be addressed in the ingest node pipeline definition by adding on_failure settings. The pipelines shipped with Beats normally do so, but users with custom pipelines who are not aware of the error might run into infinite retries in Filebeat.
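For illustration, here is a minimal sketch of a pipeline definition with a pipeline-level on_failure handler, written as a Python dict that serializes to the JSON request body. The `lowercase` processor, the field names, and the `build_pipeline` helper are assumptions chosen for the example, not anything taken from a Beats module.

```python
import json


def build_pipeline() -> dict:
    """Return an ingest pipeline body with a pipeline-level on_failure handler.

    Hypothetical example: the processor and field names are placeholders.
    """
    return {
        "description": "Example pipeline that records processor failures",
        "processors": [
            # Fails if the field is missing, which triggers on_failure.
            {"lowercase": {"field": "log.level"}},
        ],
        "on_failure": [
            # Store the failure reason on the event instead of rejecting it.
            {
                "set": {
                    "field": "error.message",
                    "value": "{{ _ingest.on_failure_message }}",
                }
            },
        ],
    }


# JSON body that could be sent when creating the pipeline.
pipeline_json = json.dumps(build_pipeline(), indent=2)
```

With a handler like this, a failing processor should no longer cause the event to be rejected, so the shipper has nothing to retry.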
The bulk API returns a status (and often an error trace) for every single event sent. We should check whether the response contains information that allows us to detect an issue with the Ingest Node pipeline and drop the event.
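As a rough sketch of what inspecting the response could look like, the snippet below walks the `items` array of a bulk response and sorts events by their per-item status. It is written in Python for brevity and is not the Beats implementation; the `classify_bulk_items` name and the retry/drop policy (retry on 429 and 5xx, drop on other 4xx) are assumptions. Note that as long as pipeline failures are reported as 500, a status-only check like this one would still retry them.

```python
import json


def classify_bulk_items(response_body: str):
    """Split bulk response items into (ok, retry, drop) lists of positions.

    Assumed policy: 2xx is success, other 4xx (except 429) is a permanent
    client error and is dropped, everything else is retried.
    """
    response = json.loads(response_body)
    ok, retry, drop = [], [], []
    for position, item in enumerate(response.get("items", [])):
        # Each item is a single-key object such as {"index": {...}}.
        result = next(iter(item.values()))
        status = result.get("status", 0)
        if 200 <= status < 300:
            ok.append(position)
        elif 400 <= status < 500 and status != 429:
            drop.append(position)
        else:
            retry.append(position)
    return ok, retry, drop
```

The returned positions match the order of the events in the bulk request, so the caller can map them back to the original events.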
For reference: https://github.com/elastic/apm-server/issues/2880
This seems to be unexpected behavior on the ES side, see https://github.com/elastic/elasticsearch/issues/48803
> We should have a look if there is some information in the response, that allows us to detect an issue with the Ingest Node pipeline and drop the event.
This sounds like a workaround that would accumulate some technical debt. As @simitt points out, it's an upstream issue that we treat all client errors as server errors. We're working on a fix. See elastic/elasticsearch#48810.
SGTM. Thanks @jasontedor
The upstream fix will be available in the 7.5.0 release.