Ambassador: Fine-grained logging configuration and consistency

Created on 28 Aug 2018 · 22 Comments · Source: datawire/ambassador

One of the most important considerations in running production services is logging. It's common practice to aggregate Kubernetes container logs (either the entire container log from the host system or specific files via a sidecar container) for analysis and search with a tool like Splunk or ElasticSearch. Ambassador's logging is currently pretty sad: it logs only to stdout/stderr, different components use different log formats, and some processes log with no datestamp at all. All of this makes it more difficult than it should be to index Ambassador logs.

Ambassador should add configuration options for logging:

  • Envoy access log format (#701) https://www.envoyproxy.io/docs/envoy/latest/configuration/access_log
  • Log format for Python components (hot-restarter, kubewatch, diagd) using logging module (#441) https://docs.python.org/3/library/logging.html
  • Prefix for all logs in entrypoint.sh (including date(1) formatting for timestamps)
  • Log destination for all components; this should allow writing each component's logs to separate files (at specified paths) in addition to stdout/stderr
  • Ideal: a new LoggingService manifest for configuring direct log forwarding to Splunk HEC/ElasticSearch/etc.

Ambassador's default configuration should log _everything_ with the same format (at the very least with a consistent timestamp prefix, including time zone).
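To illustrate the timestamp-prefix idea for entrypoint.sh, here is a minimal Bash sketch. The helper name `log_prefix` and the component label are hypothetical, and the timestamp layout is only one possible choice; none of this is taken from Ambassador's actual entrypoint.

```shell
#!/bin/bash
# Hypothetical helper (not part of Ambassador): prefix every line read from
# stdin with a timestamp that includes the UTC offset, plus a component label.
log_prefix() {
    local component="$1" line
    # The "|| [ -n "$line" ]" part keeps a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s [%s] %s\n' "$(date '+%Y-%m-%dT%H:%M:%S%z')" "$component" "$line"
    done
}

# Usage: pipe any component's output through the helper.
echo "starting up" | log_prefix entrypoint
```

Calling date(1) once per line is simple but slow for very chatty processes, so a real implementation might prefer the logging facilities of each component instead.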

I've attempted to use the sidecar method with a Splunk forwarder container. I had to write my own Dockerfile that depends on Ambassador and adds a custom entrypoint to tee the logs into files:

Dockerfile

ARG VERSION
FROM quay.io/datawire/ambassador:${VERSION}

RUN apk --no-cache add bash

WORKDIR /ambassador
COPY my_entrypoint.sh .
RUN chmod 755 my_entrypoint.sh

ENTRYPOINT [ "./my_entrypoint.sh" ]

my_entrypoint.sh

#!/bin/bash

mkdir -p /logs

exec > >(tee -a /logs/ambassador_stdout.log)
# send tee's copy back to stderr; otherwise it lands on the already-redirected stdout
exec 2> >(tee -a /logs/ambassador_stderr.log >&2)

# variants with added timestamps, but this causes double timestamps on some logs???
# exec > >(awk '{ print strftime("%Y-%m-%d %H:%M:%S %z"), $0; fflush(); }' | tee -a /logs/ambassador_stdout.log)
# exec 2> >(awk '{ print strftime("%Y-%m-%d %H:%M:%S %z"), $0; fflush(); }' | tee -a /logs/ambassador_stderr.log)

exec ./entrypoint.sh

(Installing Bash is required to get >() process substitution)
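One possible way around the double timestamps mentioned in the script comments is to add a timestamp only to lines that do not already start with one. The sketch below is a guess at that approach: the function name `stamp_if_missing` is hypothetical, and it assumes existing timestamps look like `YYYY-MM-DD HH:MM:SS` (optionally with a `T` separator or a leading `[`), which may not match what every Ambassador component emits.

```shell
#!/bin/bash
# Hypothetical filter: pass through lines that already begin with something
# shaped like "YYYY-MM-DD HH:MM:SS"; prefix all other lines with a timestamp.
stamp_if_missing() {
    local line
    # Assumed timestamp shape; adjust to the formats actually seen in the logs.
    local ts_re='^\[?[0-9]{4}-[0-9]{2}-[0-9]{2}[ T][0-9]{2}:[0-9]{2}:[0-9]{2}'
    while IFS= read -r line || [ -n "$line" ]; do
        if [[ "$line" =~ $ts_re ]]; then
            printf '%s\n' "$line"
        else
            printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S %z')" "$line"
        fi
    done
}

# Possible use in my_entrypoint.sh, in place of the awk variants:
# exec > >(stamp_if_missing | tee -a /logs/ambassador_stdout.log)
```

This leaves the resulting log with mixed timestamp formats, so it is a stopgap for indexing rather than a substitute for consistent formatting at the source.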

All 22 comments

Thanks for the detailed feedback! Definitely something that makes sense and we need to work towards this in future releases.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Please don't close this

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Don't close this. Bots that mark issues stale are a terrible idea. The only thing they improve is the number of open issues, which is not a metric projects should be optimizing for.

I really need this as well in order for Ambassador to be "production ready" for us.

Ideally Ambassador would expose the Envoy config for Access Logs (JSON and Plain) as well as (or perhaps to start with) enable the gRPC Access Logging (https://www.envoyproxy.io/docs/envoy/latest/api-v2/config/accesslog/v2/als.proto) service from Envoy.

My ideal setup (and how Gloo approaches this) is to log the "control plane" logs to stdout (for kubectl logs etc.) and then stream the Access Logs separately to a file (that can be picked up by Filebeat or Fluent Bit etc.) or using Envoy's gRPC ALS service.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Bump so this doesn't get closed >:[

Proactively bumping this to keep it from going stale. I use Ambassador for my day job and side job because it's served me so well. The logging is so far the only glaring issue I've encountered. This may even be something I could work on as a contribution. Thank you for the awesome work on this project.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

boooo

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

nope

definitely a priority item

@richarddli Is it currently possible to filter the access logs that Envoy emits?
For example, I would like to log only 5xx requests in the production environment; logging every request there would not be a good idea.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

nope

need this feature here..

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

nooope

In fact, instead of Envoy generating its own RequestID, can we use the existing TraceID generated by AWS ALBs, so that it is easier to trace requests from cloud resources to applications?

definitely a priority item
