Vector: New `docker` source

Created on 10 Jul 2019 · 11 Comments · Source: timberio/vector

Hello, it would be very nice to see Docker logs collected directly from the socket, for all Docker containers. This would allow logs to be collected dynamically.

feature

All 11 comments

Creating a docker log driver could be an option here: https://docs.docker.com/config/containers/logging/configure/

Another option could be to use the syslog logging driver on the Docker side and the syslog source on the Vector side.

@DenisSlnn, @jszwedko what do you think?

@ktff that sounds like it would work, but I think we wanted something a little more automatic and Docker friendly. I don't know enough about how this would work to give you concrete suggestions, but my hope was that you'd dig in and think about how a user would set up Vector in a Docker environment.

For example, logspout does a pretty good job with this, and you can see how logspout is basically a mini, domain-specific Vector. Our intent would be for a user to install Vector instead of logspout. For example, here are some things logspout does that make it nice in a Docker environment:

  1. The general way in which a user sets up and installs logspout. You can see an example here. Note this description:

    Logspout is a log router for Docker containers that runs inside Docker. It attaches to all containers on a host, then routes their logs wherever you want. It also has an extensible module system.

    And this:

    The simplest way to use logspout is to just take all logs and ship to a remote syslog. Just pass a syslog URI (or several comma separated URIs) as the command. Here we show use of the tls encrypted transport option in the URI. Also, we always mount the Docker Unix socket with -v to /var/run/docker.sock:

  2. The ability to blacklist/whitelist containers by labels.

Both of these are a little beyond me, but you can see the user experience we're trying to create. Everything else they're doing with outputs Vector already has.

Let me know if that helps. I'm happy to discuss with you via a chat to further clarify. But the first step here is to define the steps and requirements, and then we can start on development.

@ktff I've assigned this to you if you feel comfortable proceeding. If not, no problem, we can look at other issues.

I think the specification for this can be best described with the following pseudo documentation.

docker source

The docker source ingests log data from local Docker containers and outputs log events.

Config File (example)

[sources.my_source_id]
  # REQUIRED - General
  type = "docker" # must be: "docker"

  # OPTIONAL

  # Collect logs from these containers.
  include_containers = ["container_name"]

  # Collect logs from containers with any of these labels.
  include_labels = ["docker_label"]

  # Don't collect logs from these containers.
  # Has priority over include_*
  ignore_containers = ["container_name"]

  # Don't collect logs from containers with any of these labels.
  # Has priority over include_*
  ignore_labels = ["docker_label"]

If include_containers and include_labels are both empty, the docker source will collect logs from all applicable containers.
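The selection rules described above can be sketched in Python. This is only an illustration of the proposed semantics, not Vector's implementation: the `DockerSourceConfig` class and `should_collect` function are hypothetical names, names and labels are assumed to match by exact string comparison, and the two include_* options are assumed to combine with OR, which the specification does not state.

```python
from dataclasses import dataclass, field


@dataclass
class DockerSourceConfig:
    """Hypothetical mirror of the four proposed options."""
    include_containers: list = field(default_factory=list)
    include_labels: list = field(default_factory=list)
    ignore_containers: list = field(default_factory=list)
    ignore_labels: list = field(default_factory=list)


def should_collect(config, name, labels):
    """Return True if logs from this container would be collected."""
    labels = set(labels)
    # ignore_* has priority over include_*.
    if name in config.ignore_containers or labels & set(config.ignore_labels):
        return False
    # With no include_* option set, every remaining container is collected.
    if not config.include_containers and not config.include_labels:
        return True
    # Otherwise the container must match at least one include_* entry
    # (assumption: the two include options are combined with OR).
    return name in config.include_containers or bool(
        labels & set(config.include_labels)
    )
```

For example, with include_labels = ["prod"] and ignore_containers = ["db"], a container named "db" is skipped even if it carries the "prod" label, because ignore_* takes priority.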

Requirements

  • The Docker container must be started without the -t option.

  • When using the Docker Community version, the container must use the json-file or journald logging driver. By default Docker uses json-file. On the Enterprise version, that's not an issue.

No other configuration is required.

This can be achieved using the docker logs command, which is why the Requirements section exists.

There are some possible extensions to these options, but I consider the ones mentioned to be enough for now.
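As a rough sketch of that approach, the Python below follows one container's logs by running the docker logs command as a subprocess. The function names `docker_logs_command` and `follow_logs` are hypothetical, and merging the container's stderr into stdout is a simplification made here, not something the proposal specifies.

```python
import subprocess


def docker_logs_command(container, since=None):
    """Build the argument list for following one container's logs."""
    cmd = ["docker", "logs", "--follow", "--timestamps"]
    if since is not None:
        # --since accepts a timestamp or a relative duration such as "10m".
        cmd += ["--since", since]
    cmd.append(container)
    return cmd


def follow_logs(container, since=None):
    """Yield log lines from a container until its log stream ends."""
    proc = subprocess.Popen(
        docker_logs_command(container, since),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # simplification: merge both streams
        text=True,
    )
    try:
        for line in proc.stdout:
            yield line.rstrip("\n")
    finally:
        proc.terminate()
        proc.wait()
```

Calling `follow_logs("container_name")` requires a running Docker daemon and a container whose logging driver supports docker logs, which is where the requirements above come from.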

Alternatives

There are 2 notable alternatives to the docker source:

  1. Using the Docker Engine managed plugin system

    • Requires installing Vector as a Docker plugin.
    • Additional configuration of daemon.json or docker run is required.
    • Possibly more performant than the original docker source proposal.
    • The Requirements section is not necessary.
  2. Using Docker Plugin discovery

    • As with the original docker source proposal, Vector can be installed in any way.
    • Hardest to maintain.
    • Additional configuration of daemon.json or docker run is required.
    • Possibly more performant than the original docker source proposal.
    • The Requirements section is not necessary.

Although both alternatives could possibly be more performant and would not need the Requirements section, I think that ease of use, keeping the configuration in one TOML file, and having the docker source be in line with other sources are of greater value.

And honestly, I think being able to just add

[sources.my_source_id]
  type = "docker"

to configuration, and have access to logs from all containers on the local host, is quite powerful.

Thanks, @ktff this is exactly what we were looking for. I'm going to have @LucioFranco review this and sign off on the specification. We'll try to do that tomorrow morning so that we can unblock you and you can start.

@ktff Overall I'm pretty happy with this solution. One question I had is, would it make sense to use a regex instead of an include/ignore set of options?

I am not quite sure what you mean.

Besides replacing "container_name" and "docker_label" with a regex, I don't see how a regex could replace the options. But it would help if it could.

Could you explain the idea a little more?

@ktff what I mean is containers = ["(some-regex)"] might be better since you can include negative matches in regexes. So you can have one option instead of splitting that functionality into two separate options. Let me know if that makes sense.
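For illustration, a negative match of the kind suggested here could look like the Python sketch below. The pattern and the `matches` helper are hypothetical examples, and lookahead support varies between regex engines, so this only shows the idea of a single option covering both inclusion and exclusion.

```python
import re

# Hypothetical pattern: match every container name except those
# starting with "test-", using a negative lookahead.
pattern = re.compile(r"^(?!test-).+$")


def matches(name):
    """Return True if the container name is selected by the pattern."""
    return pattern.fullmatch(name) is not None
```

Here `matches("web")` selects the container, while `matches("test-web")` does not.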

@LucioFranco yes, it makes sense, but it has downsides.

Approachability and readability will suffer.

  • Using the options would require basic knowledge of regex.

  • If regex is used in any capacity, it could be difficult for readers to reason about what the docker source will actually collect. This is mostly because it will be harder to tell whether any include is present, as its existence changes what the source collects from all containers to only the included ones.

I also doubt that regex would be used to any great extent, since the labeling system already exists and is native to the Docker ecosystem.

Having double the options is more verbose, but I also expect that using more than 2 options at the same time will be rare.

@ktff that totally makes sense! I'm happy with that, let's do it 👍 Thanks for explaining 😄
