Vector: File Source crash on docker

Created on 1 Jun 2020  ·  7 Comments  ·  Source: timberio/vector

I'm trying to run a simple Docker container that reads log files and outputs them to the console.
But when I deploy the container, it crashes with the following output:

$ docker run -v $PWD/vector.toml:/etc/vector/vector.toml:ro timberio/vector:0.9.1-alpine
Jun 01 15:59:29.124  INFO vector: Log level "info" is enabled.
Jun 01 15:59:29.124  INFO vector: Loading configs. path=["/etc/vector/vector.toml"]
Jun 01 15:59:29.126  INFO vector: Vector is starting. version="0.9.1" git_version="v0.9.1" released="Thu, 30 Apr 2020 15:51:58 +0000" arch="x86_64"
Jun 01 15:59:29.127 ERROR vector::topology: Configuration error: Source "file_in": data_dir "/var/lib/vector/" does not exist

The documentation describes the data_dir option as optional. Shouldn't Vector's default data_dir folder already exist, or be created at runtime?

Configurations:

[sources.file_in]
  type = "file"
  include = ["/data/*.data"]
  fingerprinting.strategy = "device_and_inode"
  start_at_beginning = true

[sinks.console_out]
  type = "console"
  inputs = ["file_in"]
  encoding.codec = "json"

All 7 comments

https://vector.dev/docs/reference/sources/file/#data_dir
Indeed, data_dir is optional, but when it is omitted the global data_dir is used, and Vector must have write permission for that directory. The global data_dir defaults to /var/lib/vector/ (the docs are wrong here, I'll push a fix shortly).

A command like this should work:

$ docker run -v $PWD/vector.toml:/etc/vector/vector.toml:ro -v $PWD/vector_data/:/var/lib/vector/:rw timberio/vector:0.9.1-alpine

Edit: PR #2720

Thanks for the quick response.

True, mapping /var/lib/vector/ to the host should work, but shouldn't the folder, with the right permissions, already exist in the Docker image?

Thinking about this, I'm not sure it makes sense to provision a /var/lib/vector on the guest.

When Vector needs this global directory, it means it wants to store state. A Docker container using this as a normal folder is actually pretty dangerous, since if the container restarts, all the logging metadata is lost.

I think the "correct" fix for this issue is adding advice to https://vector.dev/guides/integrate/platforms/docker/#start-the-vector-container about mounting the data_dir as a volume or path.
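
As a rough sketch of what that advice could look like, the helper below mounts a named Docker volume at /var/lib/vector/ instead of a host path. The volume name `vector_data` and the function name `run_vector_with_volume` are hypothetical and do not come from this thread; the image tag and config path are the ones from the original report.

```shell
# Hypothetical volume name; override by exporting VECTOR_VOLUME.
VECTOR_VOLUME="${VECTOR_VOLUME:-vector_data}"

# Start Vector with its state directory backed by a named volume.
# Docker creates the named volume on first use if it does not exist yet,
# and keeps it when the container is removed.
run_vector_with_volume() {
  docker run \
    -v "$PWD/vector.toml:/etc/vector/vector.toml:ro" \
    -v "$VECTOR_VOLUME:/var/lib/vector/" \
    timberio/vector:0.9.1-alpine
}
```

Calling `run_vector_with_volume` from the directory that holds vector.toml starts the container. A bind mount, as in the command earlier in the thread, works as well; a named volume just avoids managing a host directory.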

@Hoverbear it looks like there are places in the docs where we can add this:

  1. https://vector.dev/docs/setup/installation/platforms/docker/
  2. https://vector.dev/guides/integrate/platforms/docker/#start-the-vector-container

Can we close this by updating those places?

Absolutely!

I agree that it'd be atypical not to want to mount the state directory outside of vector in production configurations, but I do think it shouldn't be required and that the docker image should work out-of-the-box without mounting any directories. This is especially useful when simply testing out vector and matches the behavior of other "stateful" images I've used (like postgresql and redis).

I would opt for creating /var/lib/vector in the docker image and simply recommending in the docs that they mount it outside if they want to preserve state across restarts.
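
Until the official image ships the directory, one possible workaround is a small derived image that creates it. This is only a sketch: the function names and the local tag `vector-with-data-dir` are hypothetical, and it assumes the base image contains a shell and `mkdir` (as Alpine-based images normally do) and that Vector runs as a user allowed to write to the directory.

```shell
# Write a Dockerfile for a hypothetical derived image that already
# contains the default data directory, so no mount is required.
write_vector_dockerfile() {
  cat > "${1:-Dockerfile}" <<'EOF'
FROM timberio/vector:0.9.1-alpine
RUN mkdir -p /var/lib/vector
EOF
}

# Build the derived image under a hypothetical local tag.
build_vector_image() {
  write_vector_dockerfile Dockerfile &&
  docker build -t vector-with-data-dir:0.9.1-alpine .
}
```

State kept this way lives in the container's writable layer and disappears when the container is removed, so mounting /var/lib/vector/ from outside is still the safer choice when state has to be preserved.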

That works for me. :)
