Hi,
@ycombinator and I are working on system tests for integrations. While debugging a basic use case with Apache logs, we identified an issue around assigning policy to the Agent:
➜ apache git:(system-test-apache) /Users/marcin.tojek/go/bin/elastic-package test system -v
2020/09/16 15:17:14 DEBUG Enable verbose logging
Run system tests for the package
2020/09/16 15:17:14 INFO setting up service...
2020/09/16 15:17:14 DEBUG setting up service using Docker Compose service deployer
2020/09/16 15:17:14 DEBUG running command: /usr/local/bin/docker-compose -f /Users/marcin.tojek/go/src/github.com/elastic/integrations/packages/apache/_dev/deploy/docker-compose.yml -p elastic-package-service up -d
Creating network "elastic-package-service_default" with the default driver
Creating elastic-package-service_apache_1 ... done
2020/09/16 15:17:16 DEBUG creating temp file /Users/marcin.tojek/.elastic-package/tmp/service_logs/stdout to hold service container elastic-package-service_apache_1 STDOUT
2020/09/16 15:17:16 DEBUG creating temp file /Users/marcin.tojek/.elastic-package/tmp/service_logs/stderr to hold service container elastic-package-service_apache_1 STDERR
2020/09/16 15:17:16 DEBUG redirecting service container elastic-package-service_apache_1 STDOUT and STDERR to temp files
2020/09/16 15:17:16 DEBUG attaching service container elastic-package-service_apache_1 to stack network elastic-package-stack_default
2020/09/16 15:17:16 DEBUG adding service container elastic-package-service_apache_1 internal ports to context
2020/09/16 15:17:16 DEBUG running command: /usr/local/bin/docker-compose -f /Users/marcin.tojek/go/src/github.com/elastic/integrations/packages/apache/_dev/deploy/docker-compose.yml -p elastic-package-service config
2020/09/16 15:17:17 INFO creating test policy...
2020/09/16 15:17:17 DEBUG POST http://127.0.0.1:5601/api/ingest_manager/agent_policies
2020/09/16 15:17:17 DEBUG {"name":"ep-test-system-apache-access-20200916T15:17:17Z","description":"test policy created by elastic-package test system for data stream apache/access","namespace":"ep"}
2020/09/16 15:17:19 INFO adding package datastream to test policy...
2020/09/16 15:17:19 DEBUG POST http://127.0.0.1:5601/api/ingest_manager/package_policies
2020/09/16 15:17:19 DEBUG {"name":"apache-access","description":"","namespace":"ep","policy_id":"f15ef160-f81e-11ea-89b3-47f49158ae7d","enabled":true,"output_id":"","inputs":[{"type":"logfile","enabled":true,"streams":[{"id":"logfile-apache.access","enabled":true,"data_stream":{"type":"logs","dataset":"apache.access"},"vars":{"paths":{"value":["/tmp/service_logs/stdout"],"type":"text"}}}],"vars":{}}],"package":{"name":"apache","title":"Apache","version":"0.1.4"}}
2020/09/16 15:17:28 INFO deleting old data in data stream...
2020/09/16 15:17:28 INFO assigning package datastream to agent...
2020/09/16 15:17:28 DEBUG PUT http://127.0.0.1:5601/api/ingest_manager/fleet/agents/4f9beed8-13a8-4f5b-bf4d-b5aaaf1897ed/reassign
2020/09/16 15:17:28 DEBUG { "policy_id": "f15ef160-f81e-11ea-89b3-47f49158ae7d" }
2020/09/16 15:17:29 INFO checking for expected data in data stream...
2020/09/16 15:17:29 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 15:17:30 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 15:17:31 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 15:17:32 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 15:17:33 DEBUG found 0 hits in logs-apache.access-ep data stream
(...)
2020/09/16 15:19:26 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 15:19:27 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 15:19:28 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 15:19:29 DEBUG found 0 hits in logs-apache.access-ep data stream
System test for apache/access dataset failed
2020/09/16 15:19:30 DEBUG reassigning original policy back to agent...
2020/09/16 15:19:30 DEBUG PUT http://127.0.0.1:5601/api/ingest_manager/fleet/agents/4f9beed8-13a8-4f5b-bf4d-b5aaaf1897ed/reassign
2020/09/16 15:19:30 DEBUG { "policy_id": "5aec14b0-f81e-11ea-89b3-47f49158ae7d" }
2020/09/16 15:19:30 DEBUG deleting test policy...
2020/09/16 15:19:30 DEBUG POST http://127.0.0.1:5601/api/ingest_manager/agent_policies/delete
2020/09/16 15:19:30 DEBUG { "agentPolicyId": "f15ef160-f81e-11ea-89b3-47f49158ae7d" }
2020/09/16 15:19:33 INFO tearing down service...
2020/09/16 15:19:33 INFO tearing down service using docker compose runner
2020/09/16 15:19:33 DEBUG running command: /usr/local/bin/docker-compose -f /Users/marcin.tojek/go/src/github.com/elastic/integrations/packages/apache/_dev/deploy/docker-compose.yml -p elastic-package-service down
Stopping elastic-package-service_apache_1 ... done
Removing elastic-package-service_apache_1 ... done
Removing network elastic-package-service_default
Error: error running package system tests: system test for apache/access dataset failed
The reason behind this is the timeout problem on the Agent side:
{"log.level":"error","@timestamp":"2020-09-16T13:15:00.450Z","log.origin":{"file.name":"application/fleet_gateway.go","file.line":176},"message":"Could not communicate with Checking API will retry, error: fail to checkin to fleet: Post \"http://kibana:5601/api/ingest_manager/fleet/agents/4f9beed8-13a8-4f5b-bf4d-b5aaaf1897ed/checkin?\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)","ecs.version":"1.5.0"}
{"log.level":"debug","@timestamp":"2020-09-16T13:16:45.721Z","log.origin":{"file.name":"kibana/client.go","file.line":170},"message":"Request method: POST, path: /api/ingest_manager/fleet/agents/4f9beed8-13a8-4f5b-bf4d-b5aaaf1897ed/checkin","ecs.version":"1.5.0"}
{"log.level":"error","@timestamp":"2020-09-16T13:18:15.620Z","log.origin":{"file.name":"application/fleet_gateway.go","file.line":176},"message":"Could not communicate with Checking API will retry, error: fail to checkin to fleet: Post \"http://kibana:5601/api/ingest_manager/fleet/agents/4f9beed8-13a8-4f5b-bf4d-b5aaaf1897ed/checkin?\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)","ecs.version":"1.5.0"}
As it's important for the system tests to be quick and simple, the workaround like xpack.ingestManager.fleet.pollingRequestTimeout: 60000 doesn't seem to be the right choice. We appreciate any optimization on the path here to make the test execution shorter.
Pinging @elastic/ingest-management (Team:Ingest Management)
are we sure the policy isn't applied? a lack of data may be related to a lack of activity on the host possibly? I don't think we're not seeing this in manual usage or other e2e tests. @mdelapenya has helped set up the Agent / Fleet e2e and he could help discuss our usage if desired.
I do not thing the fleet checkin timeout is the issue here, the timeout is used in the checkin if there is no config change. If there is a config change the checkin will be immediate. If there is a config change the request will be the timeout duration.
are we sure the policy isn't applied? a lack of data may be related to a lack of activity on the host possibly?
In our setup everything is in place - in this case sample access logs. Once the filebeat picks the config and starts processing files it is matter of seconds.
as noted from @nchaulet the change should be immediate, I would be assuming the _action_store_ has indeed the latest config.
@michalpristas and @nchaulet can you pair on that issue.
@mtojek what is the delay that you are seeing?
Here is a trace from the latest failed test run:
➜ apache git:(system-test-apache) /Users/marcin.tojek/go/bin/elastic-package test system -v
2020/09/16 17:51:05 DEBUG Enable verbose logging
Run system tests for the package
2020/09/16 17:51:05 INFO setting up service...
2020/09/16 17:51:05 DEBUG setting up service using Docker Compose service deployer
2020/09/16 17:51:05 DEBUG running command: /usr/local/bin/docker-compose -f /Users/marcin.tojek/go/src/github.com/elastic/integrations/packages/apache/_dev/deploy/docker-compose.yml -p elastic-package-service up -d
Creating network "elastic-package-service_default" with the default driver
Creating elastic-package-service_apache_1 ... done
2020/09/16 17:51:07 DEBUG creating temp file /Users/marcin.tojek/.elastic-package/tmp/service_logs/stdout to hold service container elastic-package-service_apache_1 STDOUT
2020/09/16 17:51:07 DEBUG creating temp file /Users/marcin.tojek/.elastic-package/tmp/service_logs/stderr to hold service container elastic-package-service_apache_1 STDERR
2020/09/16 17:51:07 DEBUG redirecting service container elastic-package-service_apache_1 STDOUT and STDERR to temp files
2020/09/16 17:51:07 DEBUG attaching service container elastic-package-service_apache_1 to stack network elastic-package-stack_default
2020/09/16 17:51:07 DEBUG adding service container elastic-package-service_apache_1 internal ports to context
2020/09/16 17:51:07 DEBUG running command: /usr/local/bin/docker-compose -f /Users/marcin.tojek/go/src/github.com/elastic/integrations/packages/apache/_dev/deploy/docker-compose.yml -p elastic-package-service config
2020/09/16 17:51:07 INFO creating test policy...
2020/09/16 17:51:07 DEBUG POST http://127.0.0.1:5601/api/ingest_manager/agent_policies
2020/09/16 17:51:07 DEBUG {"name":"ep-test-system-apache-access-20200916T17:51:07Z","description":"test policy created by elastic-package test system for data stream apache/access","namespace":"ep"}
2020/09/16 17:51:10 INFO adding package datastream to test policy...
2020/09/16 17:51:10 DEBUG POST http://127.0.0.1:5601/api/ingest_manager/package_policies
2020/09/16 17:51:10 DEBUG {"name":"apache-access","description":"","namespace":"ep","policy_id":"6f68abe0-f834-11ea-b5c3-6f5f5f464533","enabled":true,"output_id":"","inputs":[{"type":"logfile","enabled":true,"streams":[{"id":"logfile-apache.access","enabled":true,"data_stream":{"type":"logs","dataset":"apache.access"},"vars":{"paths":{"value":["/tmp/service_logs/stdout"],"type":"text"}}}],"vars":{}}],"package":{"name":"apache","title":"Apache","version":"0.1.4"}}
2020/09/16 17:51:19 INFO deleting old data in data stream...
2020/09/16 17:51:19 INFO assigning package datastream to agent...
2020/09/16 17:51:19 DEBUG PUT http://127.0.0.1:5601/api/ingest_manager/fleet/agents/1ccef44c-3a32-4a19-94a9-85674cd3508e/reassign
2020/09/16 17:51:19 DEBUG { "policy_id": "6f68abe0-f834-11ea-b5c3-6f5f5f464533" }
2020/09/16 17:51:20 INFO checking for expected data in data stream...
2020/09/16 17:51:20 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 17:51:21 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 17:51:22 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 17:51:23 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 17:51:24 DEBUG found 0 hits in logs-apache.access-ep data stream
(...)
2020/09/16 17:53:16 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 17:53:17 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 17:53:18 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 17:53:19 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 17:53:20 DEBUG found 0 hits in logs-apache.access-ep data stream
System test for apache/access dataset failed
2020/09/16 17:53:21 DEBUG reassigning original policy back to agent...
2020/09/16 17:53:21 DEBUG PUT http://127.0.0.1:5601/api/ingest_manager/fleet/agents/1ccef44c-3a32-4a19-94a9-85674cd3508e/reassign
2020/09/16 17:53:21 DEBUG { "policy_id": "e0f87020-f833-11ea-b5c3-6f5f5f464533" }
2020/09/16 17:53:21 DEBUG deleting test policy...
2020/09/16 17:53:21 DEBUG POST http://127.0.0.1:5601/api/ingest_manager/agent_policies/delete
2020/09/16 17:53:21 DEBUG { "agentPolicyId": "6f68abe0-f834-11ea-b5c3-6f5f5f464533" }
2020/09/16 17:53:24 INFO tearing down service...
2020/09/16 17:53:24 INFO tearing down service using docker compose runner
2020/09/16 17:53:24 DEBUG running command: /usr/local/bin/docker-compose -f /Users/marcin.tojek/go/src/github.com/elastic/integrations/packages/apache/_dev/deploy/docker-compose.yml -p elastic-package-service down
Stopping elastic-package-service_apache_1 ... done
Removing elastic-package-service_apache_1 ... done
Removing network elastic-package-service_default
Error: error running package system tests: system test for apache/access dataset failed
17:51:19 ~ 17:53:21 - we assumed that 2 minutes should more than enough.
For the second time:
2020/09/16 17:55:44 DEBUG adding service container elastic-package-service_apache_1 internal ports to context
2020/09/16 17:55:44 DEBUG running command: /usr/local/bin/docker-compose -f /Users/marcin.tojek/go/src/github.com/elastic/integrations/packages/apache/_dev/deploy/docker-compose.yml -p elastic-package-service config
2020/09/16 17:55:44 INFO creating test policy...
2020/09/16 17:55:44 DEBUG POST http://127.0.0.1:5601/api/ingest_manager/agent_policies
2020/09/16 17:55:44 DEBUG {"name":"ep-test-system-apache-access-20200916T17:55:44Z","description":"test policy created by elastic-package test system for data stream apache/access","namespace":"ep"}
2020/09/16 17:55:47 INFO adding package datastream to test policy...
2020/09/16 17:55:47 DEBUG POST http://127.0.0.1:5601/api/ingest_manager/package_policies
2020/09/16 17:55:47 DEBUG {"name":"apache-access","description":"","namespace":"ep","policy_id":"14726d60-f835-11ea-b5c3-6f5f5f464533","enabled":true,"output_id":"","inputs":[{"type":"logfile","enabled":true,"streams":[{"id":"logfile-apache.access","enabled":true,"data_stream":{"type":"logs","dataset":"apache.access"},"vars":{"paths":{"value":["/tmp/service_logs/stdout"],"type":"text"}}}],"vars":{}}],"package":{"name":"apache","title":"Apache","version":"0.1.4"}}
2020/09/16 17:55:50 INFO deleting old data in data stream...
2020/09/16 17:55:50 INFO assigning package datastream to agent...
2020/09/16 17:55:50 DEBUG PUT http://127.0.0.1:5601/api/ingest_manager/fleet/agents/1ccef44c-3a32-4a19-94a9-85674cd3508e/reassign
2020/09/16 17:55:50 DEBUG { "policy_id": "14726d60-f835-11ea-b5c3-6f5f5f464533" }
2020/09/16 17:55:51 INFO checking for expected data in data stream...
2020/09/16 17:55:51 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 17:55:52 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 17:55:53 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 17:55:54 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 17:55:55 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 17:55:56 DEBUG found 0 hits in logs-apache.access-ep data stream
(...)
2020/09/16 17:57:45 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 17:57:46 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 17:57:47 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 17:57:48 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 17:57:49 DEBUG found 90 hits in logs-apache.access-ep data stream
System test for apache/access dataset passed!
17:55:50 ~ 17:57:49 - it passed just before timeout.
Here is another successful run:
➜ apache git:(system-test-apache) elastic-package test -v
2020/09/16 18:03:21 DEBUG Enable verbose logging
Run test suite for the package
Run system tests for the package
2020/09/16 18:03:21 INFO setting up service...
2020/09/16 18:03:21 DEBUG setting up service using Docker Compose service deployer
2020/09/16 18:03:21 DEBUG running command: /usr/local/bin/docker-compose -f /Users/marcin.tojek/go/src/github.com/elastic/integrations/packages/apache/_dev/deploy/docker-compose.yml -p elastic-package-service up -d
Creating network "elastic-package-service_default" with the default driver
Creating elastic-package-service_apache_1 ... done
2020/09/16 18:03:23 DEBUG creating temp file /Users/marcin.tojek/.elastic-package/tmp/service_logs/stdout to hold service container elastic-package-service_apache_1 STDOUT
2020/09/16 18:03:23 DEBUG creating temp file /Users/marcin.tojek/.elastic-package/tmp/service_logs/stderr to hold service container elastic-package-service_apache_1 STDERR
2020/09/16 18:03:23 DEBUG redirecting service container elastic-package-service_apache_1 STDOUT and STDERR to temp files
2020/09/16 18:03:23 DEBUG attaching service container elastic-package-service_apache_1 to stack network elastic-package-stack_default
2020/09/16 18:03:23 DEBUG adding service container elastic-package-service_apache_1 internal ports to context
2020/09/16 18:03:23 DEBUG running command: /usr/local/bin/docker-compose -f /Users/marcin.tojek/go/src/github.com/elastic/integrations/packages/apache/_dev/deploy/docker-compose.yml -p elastic-package-service config
2020/09/16 18:03:24 INFO creating test policy...
2020/09/16 18:03:24 DEBUG POST http://127.0.0.1:5601/api/ingest_manager/agent_policies
2020/09/16 18:03:24 DEBUG {"name":"ep-test-system-apache-access-20200916T18:03:24Z","description":"test policy created by elastic-package test system for data stream apache/access","namespace":"ep"}
2020/09/16 18:03:26 INFO adding package datastream to test policy...
2020/09/16 18:03:26 DEBUG POST http://127.0.0.1:5601/api/ingest_manager/package_policies
2020/09/16 18:03:26 DEBUG {"name":"apache-access","description":"","namespace":"ep","policy_id":"26496c90-f836-11ea-b5c3-6f5f5f464533","enabled":true,"output_id":"","inputs":[{"type":"logfile","enabled":true,"streams":[{"id":"logfile-apache.access","enabled":true,"data_stream":{"type":"logs","dataset":"apache.access"},"vars":{"paths":{"value":["/tmp/service_logs/stdout"],"type":"text"}}}],"vars":{}}],"package":{"name":"apache","title":"Apache","version":"0.1.4"}}
2020/09/16 18:03:30 INFO deleting old data in data stream...
2020/09/16 18:03:30 INFO assigning package datastream to agent...
2020/09/16 18:03:30 DEBUG PUT http://127.0.0.1:5601/api/ingest_manager/fleet/agents/1ccef44c-3a32-4a19-94a9-85674cd3508e/reassign
2020/09/16 18:03:30 DEBUG { "policy_id": "26496c90-f836-11ea-b5c3-6f5f5f464533" }
2020/09/16 18:03:31 INFO checking for expected data in data stream...
2020/09/16 18:03:31 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 18:03:32 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 18:03:33 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 18:03:34 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 18:03:35 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 18:03:36 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 18:03:37 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 18:03:38 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 18:03:39 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 18:03:40 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 18:03:41 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 18:03:42 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 18:03:43 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 18:03:44 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 18:03:45 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 18:03:46 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 18:03:47 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 18:03:48 DEBUG found 0 hits in logs-apache.access-ep data stream
2020/09/16 18:03:49 DEBUG found 16 hits in logs-apache.access-ep data stream
System test for apache/access dataset passed!
As you see this one it's relatively quick.
@mtojek let me investigate a little more but looks like there is case I did not cover with config reassignment
Hi @nchaulet, any progress on this issue? Just wondering since I'm about to merge a PR with a workaround in place: https://github.com/elastic/elastic-package/pull/111/files#diff-6d81fc25f6617cc18c88e703716754b9R23-R24. Thanks!
@ycombinator I started on this but is a bigger change than what I thought https://github.com/elastic/kibana/pull/78493
Thanks for the update @nchaulet. I've subscribed to your PR now.
Thanks for getting this fixed @nchaulet. Appreciate it! ❤️
Most helpful comment
@mtojek let me investigate a little more but looks like there is case I did not cover with config reassignment