Beats: [Elastic Agent] Agent datastreams are conflicting with Filebeat setup

Created on 24 Jun 2020 · 29 comments · Source: elastic/beats

While trying to set up filebeat in my 8.0 snapshot cluster, I got this error message. Is it possible filebeat modules are conflicting with agent?

jason@jason-VB-Development:~/filebeat-8.0.0-SNAPSHOT-linux-x86_64$ ./filebeat setup
Overwriting ILM policy is disabled. Set `setup.ilm.overwrite:true` for enabling.
Exiting: failed to check for alias 'filebeat-8.0.0': (status=400) {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"The provided expression [logs-agent-default] matches a data stream, specify the corresponding concrete indices instead."}],"type":"illegal_argument_exception","reason":"The provided expression [logs-agent-default] matches a data stream, specify the corresponding concrete indices instead."},"status":400}: 400 Bad Request: {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"The provided expression [logs-agent-default] matches a data stream, specify the corresponding concrete indices instead."}],"type":"illegal_argument_exception","reason":"The provided expression [logs-agent-default] matches a data stream, specify the corresponding concrete indices instead."},"status":400}
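As an editorial aside, the actionable detail is the bracketed expression in the (doubled) JSON error body. A minimal Python sketch, not part of the original report, that pulls that expression out of such a 400 response (the body is copied from the log above):

```python
import json
import re

# 400 response body from the setup failure above (single root_cause entry).
body = (
    '{"error":{"root_cause":[{"type":"illegal_argument_exception",'
    '"reason":"The provided expression [logs-agent-default] matches a data '
    'stream, specify the corresponding concrete indices instead."}],'
    '"type":"illegal_argument_exception",'
    '"reason":"The provided expression [logs-agent-default] matches a data '
    'stream, specify the corresponding concrete indices instead."},'
    '"status":400}'
)

reason = json.loads(body)["error"]["reason"]
# Elasticsearch brackets the offending name in the reason string.
conflicting = re.search(r"\[([^\]]+)\]", reason).group(1)
print(conflicting)  # -> logs-agent-default
```

The name it extracts, logs-agent-default, follows the Agent's logs-{dataset}-{namespace} data stream naming, which is the first hint that the failure involves Agent-created resources rather than anything Filebeat set up itself.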

For confirmed bugs, please report:

  • Version: 8.0 snapshot
  • Operating System: Ubuntu
  • Steps to Reproduce:

    1. Install Elastic Agent on host, enroll into fleet and run it

    2. Install filebeat

    3. ./filebeat modules enable system

    4. ./filebeat setup

@EricDavisX is it worth having a test case to make sure that Elastic Agent is compatible with Filebeat running on the same cluster? I imagine a large % of customers use Beats.

beta1 Ingest Management bug

All 29 comments

Pinging @elastic/ingest-management (Team:Ingest Management)

@michalpristas What I see above in the error probably comes from the agent / the Filebeat run through the Agent. Is Agent somehow interfering with other filebeat binaries?

I am confused by this.

@mostlyjason Could you share the YML configuration that you are using for filebeat?

I'm just using the default filebeat.yml with the cloud.id and cloud.auth pasted in. It seems to be reproducible, since I tried it with a fresh 8.0 cluster on staging and a fresh directory for filebeat and elastic agent. The error message is generated by standalone filebeat. You can see I'm running the setup command in the directory where I extracted the filebeat tar.gz.

@mostlyjason i'm confused by this as well. are you running agent + filebeat (independent of the filebeat included in agent) on the same machine and then trying to configure the standalone one?

Exiting: failed to check for alias 'filebeat-8.0.0': (status=400) {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"The provided expression [logs-agent-default] matches a data stream, specify the corresponding concrete indices instead."}],"type":"illegal_argument_exception","reason":"The provided expression [logs-agent-default] matches a data stream, specify the corresponding concrete indices instead."},"status":400}: 400 Bad Request: {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"The provided expression [logs-agent-default] matches a data stream, specify the corresponding concrete indices instead."}],"type":"illegal_argument_exception","reason":"The provided expression [logs-agent-default] matches a data stream, specify the corresponding concrete indices instead."},"status":400}

So, taking a step back and looking more deeply in the error messages.

  1. Overwriting ILM policy is disabled: this means there is a matching ILM policy on the remote cluster.
  2. failed to check for alias 'filebeat-8.0.0': Filebeat is indeed running its ILM setup and checking that the write alias filebeat-8.0.0 exists.
  3. logs-agent-default: This is weird, because this new indexing strategy only concerns the Agent; Filebeat has no knowledge of that concept at all.

I do have a theory: there is a conflict between the ILM policy in ingest manager and the one shipped with Filebeat. What points me to this is the data stream error logs-agent-default.
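For illustration only, a toy Python model of the failure mode this theory points at. This is NOT Elasticsearch's actual code; the resolution behavior is an assumption sketched for clarity: during ILM setup, the alias-exists request resolves names in the cluster, and an expression that resolves to a data stream fails the whole request with a 400, even though the alias Filebeat asked about is unrelated. The names and error text are taken from the log above:

```python
# Toy model of the alias-exists check in Beats' ILM setup tripping over an
# Agent data stream. Names and the error text come from the log above; the
# resolution behavior is an assumption for illustration, NOT the real
# Elasticsearch implementation.
DATA_STREAMS = {"logs-agent-default"}  # created once an Agent starts ingesting
ALIASES = set()                        # write alias filebeat-8.0.0 not created yet

def alias_exists(alias, names=None):
    """Simulate the 400 seen in the logs: resolution visits cluster names,
    and hitting a data stream aborts the request before the alias is checked."""
    if names is None:
        names = DATA_STREAMS | ALIASES
    for name in names:
        if name in DATA_STREAMS:
            raise ValueError(
                f"The provided expression [{name}] matches a data stream, "
                "specify the corresponding concrete indices instead.")
    return alias in ALIASES

try:
    alias_exists("filebeat-8.0.0")
except ValueError as err:
    print(f"failed to check for alias 'filebeat-8.0.0': {err}")
```

With no Agent data stream present (e.g. names=set()), the same call simply returns False and setup would proceed to create the alias, which matches the observation in this thread that standalone Beats work fine until an Agent starts ingesting into the cluster.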

I am not sure if this is an error with Agent / Filebeat, but rather with how we deal with packages. @michalpristas can you try to reproduce the above use case?

Thanks @ph, that's a helpful breakdown. @michalpristas yes, what you said is correct

@michalpristas Let's try to reproduce on our end, but I suspect the problem isn't on the agent and Filebeat side but in EPM.

I have found this issue with Metricbeat too:

2020-07-06T18:23:26.628+0200    ERROR   [publisher_pipeline_output] pipeline/output.go:154  Failed to connect to backoff(elasticsearch(http://127.0.0.1:9200)): Connection marked as failed because the onConnect callback failed: failed to check for alias 'metricbeat-8.0.0': (status=400) {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"The provided expression [logs-agent-default] matches a data stream, specify the corresponding concrete indices instead."}],"type":"illegal_argument_exception","reason":"The provided expression [logs-agent-default] matches a data stream, specify the corresponding concrete indices instead."},"status":400}: 400 Bad Request: {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"The provided expression [logs-agent-default] matches a data stream, specify the corresponding concrete indices instead."}],"type":"illegal_argument_exception","reason":"The provided expression [logs-agent-default] matches a data stream, specify the corresponding concrete indices instead."},"status":400}

Once this error appears Metricbeat stops ingesting, so it could affect working deployments when giving the Agent a try.

Steps to reproduce:

  • Have metricbeat running.
  • Start ingesting with an agent in the same cluster.
  • Restart original metricbeat.

Then the original metricbeat cannot ingest anymore. To reproduce this it is not enough to add the configuration; an agent needs to actually ingest data.

I can reproduce it with this metricbeat configuration:

metricbeat.modules:
- module: zookeeper
  metricsets: [mntr, server, connection]
  hosts:
  - localhost:2181

output.elasticsearch:
  hosts: [127.0.0.1:9200]
  username: elastic
  password: changeme

And elastic agent in standalone mode (not managed by fleet); this configuration seems to be enough:

outputs:
  default:
    type: elasticsearch
    hosts: [127.0.0.1:9200]
    username: elastic
    password: changeme

logging.to_stderr: true

inputs:
  - type: system/metrics
    dataset.namespace: default
    use_output: default
    streams:
      - metricset: cpu
        dataset.name: system.cpu
      - metricset: memory
        dataset.name: system.memory
      - metricset: network
        dataset.name: system.network
      - metricset: filesystem
        dataset.name: system.filesystem

@jsoriano I am going to reproduce it, thanks for the config.

@jsoriano The error that you see:

2020-07-06T18:23:26.628+0200    ERROR   [publisher_pipeline_output] pipeline/output.go:154  Failed to connect to backoff(elasticsearch(http://127.0.0.1:9200)): Connection marked as failed because the onConnect callback failed: failed to check for alias 'metricbeat-8.0.0': (status=400) {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"The provided expression [logs-agent-default] matches a data stream, specify the corresponding concrete indices instead."}],"type":"illegal_argument_exception","reason":"The provided expression [logs-agent-default] matches a data stream, specify the corresponding concrete indices instead."},"status":400}: 400 Bad Request: {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"The provided expression [logs-agent-default] matches a data stream, specify the corresponding concrete indices instead."}],"type":"illegal_argument_exception","reason":"The provided expression [logs-agent-default] matches a data stream, specify the corresponding concrete indices instead."},"status":400}

This is in the metricbeat log?

Yes, this is in the Metricbeat log.

@jsoriano Just to clarify the steps.

In your reproducible case, did you ever run kibana / ingest manager?

Yes, I ran Kibana and ES with the 8.0.0-SNAPSHOTs, and Metricbeat and Agent built from master. Not sure though if I opened the ingest manager UI; I think I didn't on my last try with the posted configurations, but I could try again to confirm.

Are you having problems reproducing this?

Checking with @blakerouse we aren't sure how this is possible yet.

trying to repro filebeat issue from both masters i had no issues

i will try metricbeat as described above

@jsoriano do you have logs available from both the original metricbeat and the one run by agent (in data/logs/default/metricbeat)?
Not using zookeeper but the kibana module, and i'm still ingesting

@ph were you able to repro?

assigning me as i'm playing with it

I haven't been able to reproduce it; looking at the error, it's maybe a package that we install via EPM. :(

An update on this: I have tried to reproduce it again with master and with 7.x. The good news is that with 7.x this issue doesn't happen to me. So maybe this is caused by some breaking change in Beats/ES/Kibana for 8.x and we are good with 7.x, but we should be careful in case we backport whatever is causing this.

To reproduce this I am using Elasticsearch and Kibana from the scenario in https://github.com/elastic/integrations/tree/master/testing/environments. I don't open Kibana or install any package at any point.

For master:

  • I run the scenario as is in the integrations repository, that starts the stack using 8.0 snapshots.
  • I run metricbeat built from master branch.
  • I run elastic-agent built from master branch (with mage package) and with the reference config file included in the tar.gz.
  • After running elastic agent, if I restart metricbeat it cannot ingest anymore, with this error:
2020-07-28T11:07:49.883+0200    ERROR   [publisher_pipeline_output] pipeline/output.go:154  Failed to connect to backoff(elasticsearch(http://127.0.0.1:9200)): Connection marked as failed because the onConnect callback failed: failed to check for alias 'metricbeat-8.0.0': (status=400) {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"The provided expression [logs-elastic.agent-default] matches a data stream, specify the corresponding concrete indices instead."}],"type":"illegal_argument_exception","reason":"The provided expression [logs-elastic.agent-default] matches a data stream, specify the corresponding concrete indices instead."},"status":400}: 400 Bad Request: {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"The provided expression [logs-elastic.agent-default] matches a data stream, specify the corresponding concrete indices instead."}],"type":"illegal_argument_exception","reason":"The provided expression [logs-elastic.agent-default] matches a data stream, specify the corresponding concrete indices instead."},"status":400}

Note: running PLATFORMS=linux/amd64 mage package fails for me with current master with the following error, but tar.gz seems to be correctly generated:

COPY failed: stat /var/lib/docker/tmp/docker-builder878593990/beat: no such file or directory
...
Error: failed building elastic-agent type=docker for platform=linux/amd64: failed to build docker: running "docker build -t docker.elastic.co/beats/elastic-agent:8.0.0 build/package/elastic-agent/elastic-agent-linux-amd64.docker/docker-build" failed with exit code 1

For 7.x:

  • I run the scenario as in the integrations repository, but replacing the versions with 7.9.0-SNAPSHOT.
  • I run metricbeat built from 7.x branch.
  • I run elastic-agent built from 7.x branch (with mage package) and with the reference config file included in the tar.gz.
  • Metricbeat has no problem ingesting even after restarting it.

not sure what changed but today i'm hitting some race on ^^^ beat build dir. sometimes it gets built completely, sometimes partially, and sometimes not at all before proceeding to composing the docker image.

edit: i may have a clue actually

@jsoriano try latest master for build issue, should be ok now


It builds now, yes, thanks!

Can we close this?


This issue still happens with master, are we ok with this?

It doesn't seem to affect 7.9/beta1, so in any case I think we can remove the Ingest Management:beta1 label.

If it still happens in master, we should keep it open and continue investigating. I would really like to understand why it happens so we can figure out if 7.x is also affected.

Let's keep it open.

I am worried, are we missing a commit in 7.9?

tried to repro today without luck (using cloud 8.0.0 snapshot for ES and kibana) but i found this which looks exactly like the same issue

https://github.com/elastic/kibana/issues/69061

and their fix here

https://github.com/elastic/kibana/pull/68794

not sure what we can fix, as this appears to manifest during ILM setup while trying to check whether the alias exists

I can confirm that I cannot reproduce this issue anymore with the latest 8.0 snapshots. And I agree that elastic/kibana#69061 looks like the same issue. So let's close this one. Thanks for the investigation!
