I'm trying to run a simple Docker container that reads log files and outputs them to the console.
But when I deploy the container, it crashes with the following output:
$ docker run -v $PWD/vector.toml:/etc/vector/vector.toml:ro timberio/vector:0.9.1-alpine
Jun 01 15:59:29.124 INFO vector: Log level "info" is enabled.
Jun 01 15:59:29.124 INFO vector: Loading configs. path=["/etc/vector/vector.toml"]
Jun 01 15:59:29.126 INFO vector: Vector is starting. version="0.9.1" git_version="v0.9.1" released="Thu, 30 Apr 2020 15:51:58 +0000" arch="x86_64"
Jun 01 15:59:29.127 ERROR vector::topology: Configuration error: Source "file_in": data_dir "/var/lib/vector/" does not exist
The documentation describes the data_dir option as optional. Shouldn't Vector's default data folder already exist, or be created at runtime?
Configurations:
[sources.file_in]
type = "file"
include = ["/data/*.data"]
fingerprinting.strategy = "device_and_inode"
start_at_beginning = true
[sinks.console_out]
type = "console"
inputs = ["file_in"]
encoding.codec = "json"
https://vector.dev/docs/reference/sources/file/#data_dir
Indeed, data_dir is optional, but when it is omitted the global data_dir is used, and Vector needs write permission for that directory. By default, the global data_dir is /var/lib/vector/ (the docs are wrong here, I'll push a fix shortly).
A command like this should work:
$ docker run -v $PWD/vector.toml:/etc/vector/vector.toml:ro -v $PWD/vector_data/:/var/lib/vector/:rw timberio/vector:0.9.1-alpine
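For reference, the directory can also be named explicitly in vector.toml instead of relying on the default. The fragment below is only a sketch based on the global and per-source data_dir options described in the linked docs. The commented-out per-source path is a hypothetical example, and whichever directory is used still has to exist inside the container and be writable by Vector.

```toml
# Global option: where Vector persists state such as file checkpoints.
# This directory must already exist and be writable by the Vector process.
data_dir = "/var/lib/vector"

[sources.file_in]
type = "file"
include = ["/data/*.data"]
# Optional per-source override; when omitted, the global data_dir is used.
# "/var/lib/vector/file_in" is a hypothetical path and must also exist.
# data_dir = "/var/lib/vector/file_in"
```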
Edit: PR #2720
Thanks for the quick response.
True, mapping /var/lib/vector/ to the host should work, but shouldn't the folder, with the right permissions, already exist in the Docker image?
Thinking about this, I'm not sure it makes sense to provision a /var/lib/vector on the guest.
When Vector needs this global directory, it means it wants to store state. Having Docker containers use this as a normal folder is actually pretty dangerous, since if the container restarts, all the logging metadata is lost.
I think the "correct" fix for this issue is adding advice to https://vector.dev/guides/integrate/platforms/docker/#start-the-vector-container about mounting the data_dir as a volume or path.
@Hoverbear it looks like there are places in the docs where we can add this:
Can we close this by updating those places?
Absolutely!
I agree that it'd be atypical not to want to mount the state directory outside of the container in production configurations, but I do think it shouldn't be required and that the Docker image should work out of the box without mounting any directories. This is especially useful when simply testing out Vector, and it matches the behavior of other "stateful" images I've used (like postgresql and redis).
I would opt for creating /var/lib/vector in the Docker image and simply recommending in the docs that users mount it outside if they want to preserve state across restarts.
That works for me. :)