While testing 1.2.0 I explored the possibility of deploying Elastic Agent as just another Beat. It turned out that, while it is possible to shoehorn Elastic Agent into the existing Beat CRD, there is very little value in it: the config format for outputs differs from the format the Beats use, so there is zero reusable configuration. Also, when integrated with Fleet, output configuration is set via Kibana's agent configuration API.
There are a few challenges when running Elastic Agent on k8s with ECK:
{
  "action": "checkin",
  "success": true,
  "actions": [
    {
      "agent_id": "05da5b91-7134-4d41-b279-e03723f67c25",
      "type": "CONFIG_CHANGE",
      "data": {
        "config": {
          "id": "4bcaca10-b7bb-11ea-b290-cd25dc4a6f57",
          "outputs": {
            "default": {
              "type": "elasticsearch",
              "hosts": [
                "https://o11y-es-http.default.svc:9200"
              ],
              "api_key": "<redacted>"
            }
          },
fleet.yml) which needs to be shared with the main container.
The manifest below contains additional hostPath mounts that I simply copied from the Filebeat manifest; they are not functional in any form atm.
---
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: o11y
spec:
  version: 7.8.0
  nodeSets:
  - name: default
    count: 3
    config:
      # This setting could have performance implications for production clusters.
      # See: https://www.elastic.co/guide/en/cloud-on-k8s/master/k8s-virtual-memory.html
      node.store.allow_mmap: false
---
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: o11y
spec:
  version: 7.8.0
  count: 1
  config:
    xpack.ingestManager.enabled: true
    xpack.ingestManager.fleet.elasticsearch.host: "https://o11y-es-http.default.svc:9200"
    xpack.ingestManager.fleet.kibana.host: "https://o11y-kb-http.default.svc:5601"
  elasticsearchRef:
    name: o11y
  http:
    service:
      spec:
        type: LoadBalancer
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: agent-poc
spec:
  selector:
    matchLabels:
      common.k8s.elastic.co/type: agent
  template:
    metadata:
      labels:
        common.k8s.elastic.co/type: agent
    spec:
      automountServiceAccountToken: true
      terminationGracePeriodSeconds: 30
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      initContainers:
      - name: init-cfg
        command:
        - bash
        - -c
        - touch /usr/share/elastic-agent/config/agent/cfg.yml
        image: docker.elastic.co/beats/elastic-agent:7.8.0
        volumeMounts:
        - mountPath: /usr/share/elastic-agent/config/agent
          name: shared-config
      - name: enroll
        command:
        - elastic-agent
        - enroll
        args:
        - https://o11y-kb-http.default.svc:5601
        - ${ENROLLMENT_TOKEN}
        - -a
        - /usr/share/elastic-agent/config/other/ca.crt
        - -f
        - -c
        - /usr/share/elastic-agent/config/agent/cfg.yml
        - --path.home=/usr/share/elastic-agent/config/agent
        image: docker.elastic.co/beats/elastic-agent:7.8.0
        volumeMounts:
        - mountPath: /usr/share/elastic-agent/config/agent
          name: shared-config
        - mountPath: /usr/share/elastic-agent/config/other
          name: kibana-certs
      containers:
      - name: elastic-agent
        args:
        - run
        - -c
        - /usr/share/elastic-agent/config/agent/cfg.yml
        - --path.home=/usr/share/elastic-agent/config/agent
        command:
        - elastic-agent
        image: docker.elastic.co/beats/elastic-agent:7.8.0
        volumeMounts:
        - mountPath: /usr/share/elastic-agent/data
          name: agent-data
        - mountPath: /usr/share/elastic-agent/config/other
          name: kibana-certs
        - mountPath: /usr/share/elastic-agent/config/agent
          name: shared-config
        - mountPath: /var/lib/docker/containers
          name: varlibdockercontainers
        - mountPath: /var/log/containers
          name: varlogcontainers
        - mountPath: /var/log/pods
          name: varlogpods
        securityContext:
          runAsUser: 0
      volumes:
      - name: agent-data
        emptyDir: {}
      - name: kibana-certs
        secret:
          defaultMode: 420
          secretName: o11y-kb-http-certs-public
      - emptyDir: {}
        name: shared-config
      - hostPath:
          path: /var/lib/docker/containers
          type: ""
        name: varlibdockercontainers
      - hostPath:
          path: /var/log/containers
          type: ""
        name: varlogcontainers
      - hostPath:
          path: /var/log/pods
          type: ""
        name: varlogpods
Nice. Did you test the standalone (non-fleet) mode? Any issues there?
an Enrollment Token needs to be created in Kibana before any agent can be enrolled (I did this manually for the purposes of this test, the same token can be used to enroll multiple agents though)
This is probably something we want the operator to orchestrate.
the Elasticsearch output configuration shipped by Kibana to the agent is unaware of any of ECK's self-signed certificates afaik and therefore non-functional
Kibana already knows about the CAs that are needed here; would it make sense for them to be included in the config that is pushed to the Agent? I.e., is that a feature that Fleet should implement?
Nice. Did you test the standalone (non-fleet) mode? Any issues there?
I did not. But it would allow ECK to configure a correct default output with certificates.
an Enrollment Token needs to be created in Kibana before any agent can be enrolled (I did this manually for the purposes of this test, the same token can be used to enroll multiple agents though)
This is probably something we want the operator to orchestrate.
Yes I think so too.
the Elasticsearch output configuration shipped by Kibana to the agent is unaware of any of ECK's self-signed certificates afaik and therefore non-functional
Kibana already knows about the CAs that are needed here; would it make sense for them to be included in the config that is pushed to the Agent? I.e., is that a feature that Fleet should implement?
Good point. I will raise an issue in the Beats repository.
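For illustration, this is roughly what a standalone-mode default output carrying ECK's CA could look like. It is only a sketch: the `outputs.default` shape is taken from the checkin payload above, `ssl.certificate_authorities` is assumed to behave as it does for the Beats Elasticsearch output, and the credentials and mount path are hypothetical placeholders.

```yaml
# Sketch of a standalone Elastic Agent config fragment (assumptions noted inline).
outputs:
  default:
    type: elasticsearch
    hosts:
      - "https://o11y-es-http.default.svc:9200"
    # Placeholder credentials; ECK would have to provision a real user or API key.
    username: "<user>"
    password: "<password>"
    # Assumption: same TLS setting as the Beats Elasticsearch output.
    # Hypothetical path; presumes the o11y-es-http-certs-public secret
    # is mounted into the Pod at this location.
    ssl.certificate_authorities:
      - /usr/share/elastic-agent/config/es/ca.crt
```

With Fleet, the equivalent would have to come from Kibana, which is the gap discussed above.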
@david-kow Not sure if there's another issue tracking the implementation of the Elastic Agent in ECK, but I was trying to track down the progress on that. The 7.9 Elastic Stack drop has options to run the Elastic Agent as beta.
Hi @SeanPlacchetti, this hasn't gotten much traction so far, but that should change in the following weeks. This is the right issue to track and I'll make sure it's updated with the progress.
FYI we're looking into integrating APM Server with Elastic Agent now: https://github.com/elastic/apm-server/issues/4004. We anticipate that in 8.0 this will be the one and only way of running a fully functioning APM Server, as all index templates, pipelines, etc. would be managed by Fleet.
I played with the Elastic Agent and managed to get it to work in Fleet mode with a hands-off setup. This includes creating the Fleet user, grabbing tokens, enrolling agents and running the Agent, all while using our custom CAs. See the manifest at the bottom.
Some notes:
enroll (I'm not sure if run doesn't already do it) offered a way to retry. This would help greatly when users are deploying the Stack and Agents simultaneously. Instead of the Pod restarting when Kibana is not yet available, the process would keep running and the retries would be visible in the logs. A similar feature is already supported by Beats when connecting to the Elasticsearch output or to Kibana (for dashboard setup). @ruflin - something to consider maybe.
2020-09-18T05:06:47.775Z DEBUG kibana/client.go:170 Request method: POST, path: /api/ingest_manager/fleet/agents/049b65a1-aaa2-41b2-9dd3-f4384a646eac/checkin
2020-09-18T05:06:48.488Z DEBUG application/action_dispatcher.go:81 Dispatch 1 actions of types: *fleetapi.ActionConfigChange
2020-09-18T05:06:48.489Z DEBUG application/handler_action_policy_change.go:23 handlerConfigChange: action 'action_id: 8654298c-7e72-4930-8518-031f9da51634, type: CONFIG_CHANGE' received
2020-09-18T05:06:48.490Z DEBUG application/handler_action_policy_change.go:34 handlerConfigChange: emit configuration for action action_id: 8654298c-7e72-4930-8518-031f9da51634, type: CONFIG_CHANGE
2020-09-18T05:06:48.491Z DEBUG application/emitter.go:39 Transforming configuration into a tree
2020-09-18T05:06:48.491Z DEBUG application/action_dispatcher.go:93 Failed to dispatch action 'action_id: 8654298c-7e72-4930-8518-031f9da51634, type: CONFIG_CHANGE', error: could not create the AST from the configuration: missing field accessing 'inputs'
2020-09-18T05:06:48.491Z ERROR application/fleet_gateway.go:159 failed to dispatch actions, error: could not create the AST from the configuration: missing field accessing 'inputs'
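With this manifest, ECK can deploy Agent, but depending on the configuration set by the user via Fleet, the Pod might need different permissions. Examples would be the right RBAC for Kubernetes API endpoints, mounting the right host path or using hostNetwork when needed. Right now this is a two-step process where first the Pod template needs to be modified to give the Agent the right access and then changes can be made via Fleet. This is not very convenient, so we should think about whether we want to improve this and, if so, how.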
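Until the Agent itself can retry, the wait for Kibana can be approximated in the Pod. A minimal sketch, assuming Kibana's `/api/status` endpoint is an acceptable readiness signal and reusing the secret and volume names from the manifest at the bottom; the container name `wait-for-kibana` is made up for this example.

```yaml
# Hypothetical extra init container: blocks until Kibana answers, logging each retry.
initContainers:
- name: wait-for-kibana
  image: docker.elastic.co/beats/elastic-agent:7.9.0
  command: ["/bin/sh", "-c"]
  args:
  - |
    KIBANA_URL=https://kibana-kb-http.default.svc:5601
    # -f makes curl exit non-zero on HTTP errors; --cacert trusts the ECK-issued CA.
    until curl -s -f -o /dev/null \
        --cacert /usr/share/elastic-agent/config/kb/ca.crt \
        -u "${ELASTICSEARCH_USER}:${ELASTICSEARCH_PASS}" \
        "${KIBANA_URL}/api/status"; do
      echo "Kibana not reachable yet, retrying in 5s"
      sleep 5
    done
  env:
  - name: ELASTICSEARCH_USER
    value: elastic
  - name: ELASTICSEARCH_PASS
    valueFrom:
      secretKeyRef:
        name: elasticsearch-es-elastic-user
        key: elastic
  volumeMounts:
  - mountPath: /usr/share/elastic-agent/config/kb
    name: kb-certs
```

Placed before the setup container, this keeps the Pod in its init phase with the retries visible in the logs instead of restarting. It is a workaround rather than the built-in retry asked for above.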
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch
spec:
  version: 7.9.0
  nodeSets:
  - name: default
    count: 3
    config:
      node.store.allow_mmap: false
---
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: kibana
spec:
  version: 7.9.0
  count: 1
  config:
    xpack.ingestManager.fleet.elasticsearch.host: "https://elasticsearch-es-http.default.svc:9200"
    xpack.ingestManager.fleet.kibana.host: "https://kibana-kb-http.default.svc:5601"
  elasticsearchRef:
    name: elasticsearch
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: agent-poc
spec:
  selector:
    matchLabels:
      common.k8s.elastic.co/type: agent
  template:
    metadata:
      labels:
        common.k8s.elastic.co/type: agent
    spec:
      automountServiceAccountToken: true
      terminationGracePeriodSeconds: 30
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      initContainers:
      - name: agent-setup
        command: ["/bin/sh","-c"]
        args:
        - |
          set -e
          # this file is created when agent runs
          test -f "action_store.yml" && exit 0;
          # we need to trust custom CA from Kibana
          cp /usr/share/elastic-agent/config/kb/ca.crt /etc/pki/ca-trust/source/anchors/
          update-ca-trust
          # hardcoded Kibana service URL
          KIBANA_URL=https://kibana-kb-http.default.svc:5601
          # setup Fleet user
          curl -XPOST -u ${ELASTICSEARCH_USER}:${ELASTICSEARCH_PASS} "${KIBANA_URL}/api/ingest_manager/fleet/setup" -d '{"forceRecreate":false}' -H "kbn-xsrf: reporting" -H "Content-Type: application/json"
          # grab the first (default) enrollment token
          EK_ID=$(curl -u ${ELASTICSEARCH_USER}:${ELASTICSEARCH_PASS} "${KIBANA_URL}/api/ingest_manager/fleet/enrollment-api-keys?page=1&perPage=20" | jq -r .list[0].id)
          TOKEN=$(curl -u ${ELASTICSEARCH_USER}:${ELASTICSEARCH_PASS} "${KIBANA_URL}/api/ingest_manager/fleet/enrollment-api-keys/${EK_ID}" | jq -r .item.api_key)
          # create empty config file as enroll complains if it's not there
          touch /usr/share/elastic-agent/config/agent/elastic-agent.yml
          # enrolls the agent and uses .../agent directory for config (elastic-agent.yml and fleet.yml)
          elastic-agent enroll ${KIBANA_URL} ${TOKEN} -f --path.config /usr/share/elastic-agent/config/agent
        image: docker.elastic.co/beats/elastic-agent:7.9.0
        env:
        - name: ELASTICSEARCH_USER
          value: elastic
        - name: ELASTICSEARCH_PASS
          valueFrom:
            secretKeyRef:
              name: elasticsearch-es-elastic-user
              key: elastic
        volumeMounts:
        - mountPath: /usr/share/elastic-agent/config/agent
          name: shared-config
        - mountPath: /usr/share/elastic-agent/config/kb
          name: kb-certs
        - mountPath: /usr/share/elastic-agent/config/es
          name: es-certs
      containers:
      - name: elastic-agent
        command: ["/bin/sh","-c"]
        args:
        - |
          # trust Kibana (for Fleet) and ES (for Beats)
          cp /usr/share/elastic-agent/config/es/ca.crt /etc/pki/ca-trust/source/anchors/esca.crt
          cp /usr/share/elastic-agent/config/kb/ca.crt /etc/pki/ca-trust/source/anchors/kbca.crt
          update-ca-trust
          # runs the agent and uses elastic-agent.yml and fleet.yml created by init container
          elastic-agent run --path.config /usr/share/elastic-agent/config/agent -e
        image: docker.elastic.co/beats/elastic-agent:7.9.0
        volumeMounts:
        - mountPath: /usr/share/elastic-agent/config/kb
          name: kb-certs
        - mountPath: /usr/share/elastic-agent/config/es
          name: es-certs
        - mountPath: /usr/share/elastic-agent/config/agent
          name: shared-config
        - mountPath: /var/lib/docker/containers
          name: varlibdockercontainers
        - mountPath: /var/log/containers
          name: varlogcontainers
        - mountPath: /var/log/pods
          name: varlogpods
        securityContext:
          runAsUser: 0
      volumes:
      - name: kb-certs
        secret:
          defaultMode: 420
          secretName: kibana-kb-http-certs-public
      - name: es-certs
        secret:
          defaultMode: 420
          secretName: elasticsearch-es-http-certs-public
      - emptyDir: {}
        name: shared-config
      - hostPath:
          path: /var/lib/docker/containers
          type: ""
        name: varlibdockercontainers
      - hostPath:
          path: /var/log/containers
          type: ""
        name: varlogcontainers
      - hostPath:
          path: /var/log/pods
          type: ""
        name: varlogpods
Great to see the progress on this. I think there are at least 2 different aspects of Elastic Agent in ECK:
You mention above that the k8s module does not work. I assume you use the prebuilt integration here (not the module ;-) ).
++ on having retry, I think there are multiple use cases for this. Could you open an issue for this in the Beats repo?
For the generic metrics implementation, we don't have it yet. But I wonder what you need it for in this context?
Thanks a lot for pushing this forward, happy to push forward any changes needed on our end to make it happen.
@david-kow Are your changes by chance in a branch / draft PR somewhere so that playing around with them would be possible? Or does the above snippet already contain all I need? Sorry, not too familiar with ECK yet.
Thanks for your comment @ruflin.
Yes, I was talking about Kubernetes integration, the module works well :)
I've created an issue for retry.
As to the generic metrics, nothing specific right now, but we do use it for Stack Monitoring in our current Beat examples, so I thought I'll mention it.
The above is all you need (assuming ECK is already installed) :)
- Enrolled Elastic Agent into Fleet that is just idle and waiting for policies. This Agent can be used for polling use cases like AWS package, Uptime or getting policies for monitoring some other services running in the same k8s cluster like MySQL.
- Enrolled Elastic Agent that directly monitors K8s/ECK: Here, we need additional volumes and permissions. This Agent would also cover the current MB / FB monitoring of the Stack. I wonder if there should potentially be 2 Agents for this: one monitoring k8s and all its bits in a generic way and one focused on monitoring the Stack.
I think this might end up being more than two "flavours". When we add support for other Beats, we'll need more and more permissions and settings, like mounted host paths, host network/PID access, container capabilities, and the related k8s API permissions and RBAC resources. And then there are some k8s-distribution-specific concerns, mostly OpenShift's security context vs the rest. Also, there will always be a use case/scenario/setup that we didn't think about, so whatever we do, we'll need to leave a way for users to set it up however they want.
The perfect solution would be to translate the config in Fleet into the required k8s config. This would allow for a very smooth experience, but it would require a significant amount of work, as each feature in the Agent (and the underlying Beats) would need a defined set of permissions/config in k8s. It would also require coordination between Fleet and ECK, so that ECK can update the Pod specs as needed. I don't think this is feasible or worth investing in.
We had similar challenges with the current Beats CRD and we ended up with no defaults, no presets, no built-in configurations. We can document (similar to the Beats CRD docs) how to address a few common scenarios for users to build upon.
The "default" config could be a no-config, which would allow some of the Beats, like Heartbeat or possibly Metricbeat, to run. Others would need users to specify some config in the ECK manifest to run correctly. And possibly even more configuration would be required for things like Autodiscover to work (for context: Autodiscover requires access to k8s APIs that can only be granted by creating k8s resources outside of ECK's manifests).
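To make the Autodiscover point concrete, these are the kind of resources a user would have to create alongside the ECK manifest. This is a sketch only: all names are placeholders, and the resource list (pods, nodes, namespaces, read-only) is an assumption about what Autodiscover typically needs, not a verified minimum.

```yaml
# Hypothetical RBAC for Autodiscover; every name here is a placeholder.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elastic-agent
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: elastic-agent-autodiscover
rules:
- apiGroups: [""]          # "" is the core API group
  resources: ["pods", "nodes", "namespaces"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: elastic-agent-autodiscover
subjects:
- kind: ServiceAccount
  name: elastic-agent
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: elastic-agent-autodiscover
```

The Pod template would then opt in with `serviceAccountName: elastic-agent`, which is exactly the kind of setting users need to be able to provide themselves.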
For bare Metricbeat/Heartbeat, on ECK side we could have:
apiVersion: agent.k8s.elastic.co/v1alpha1
kind: Agent
metadata:
  name: uptime
spec:
  version: 7.9.1
  elasticsearchRef:
    name: elasticsearch
  kibanaRef:
    name: kibana
  mode: fleet
  deployment:
    replicas: 1
For Filebeat:
apiVersion: agent.k8s.elastic.co/v1alpha1
kind: Agent
metadata:
  name: o11y
spec:
  version: 7.9.1
  elasticsearchRef:
    name: elasticsearch
  kibanaRef:
    name: kibana
  mode: fleet
  daemonSet:
    podTemplate:
      spec:
        automountServiceAccountToken: true
        terminationGracePeriodSeconds: 30
        dnsPolicy: ClusterFirstWithHostNet
        hostNetwork: true # Allows to provide richer host metadata
        containers:
        - name: filebeat
          securityContext:
            runAsUser: 0
          volumeMounts:
          - name: varlogcontainers
            mountPath: /var/log/containers
          - name: varlogpods
            mountPath: /var/log/pods
          - name: varlibdockercontainers
            mountPath: /var/lib/docker/containers
        volumes:
        - name: varlogcontainers
          hostPath:
            path: /var/log/containers
        - name: varlogpods
          hostPath:
            path: /var/log/pods
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
The above would align well with mode: standalone where enabling it would just mean that config: ... needs to be provided and there are no different defaults and/or assumptions made.
Is there any update on this? I am trying to run Fleet in ECK, prior to this I was following the quickstart and was able to get the elastic-agent to enroll (running directly on the nodes), however, it would seemingly not deploy the beats or send data to Elasticsearch (with nothing in the logs).
It seems like the best way to get logs/metrics out of Kubernetes right now is to manually install and configure Beats on the nodes?
Hey @Just-Insane, no significant updates on this just yet. For making it work today with Elasticsearch/Kibana deployed by ECK, the best I know of is to try the proof-of-concept above.
Note that Elastic Agent/Fleet is different than "raw" Beats. We support the latter, but we don't support the Elastic Agent just yet. The work towards that is currently in progress.
For general logs/metrics using Beats you can check out our quickstart, configuration docs or just apply the examples.
Work for the Elastic Agent CRD in the standalone mode is tracked in a separate issue.
Closing this for now as initial work to support Agent has been completed. Let's open a new issue for Fleet when the time comes.