I'm failing to understand why /var/lib/logstash is not used for data on the container version.
I've read https://www.elastic.co/guide/en/logstash/current/dir-layout.html#docker-layout and it doesn't provide any hint on what to do for the data folder.
Similar to "/usr/share/elasticsearch/data isn't in-line with the FHS (filesystem hierarchy standard)?", the container versions seem to use /usr/share instead of /var/lib for data, which makes configuring and understanding where and how to place data volumes, e.g. for dead letter queues, very confusing. And to make things even more interesting, /opt/logstash/data also seems to be used/exist...
/usr/share/logstash/data/ is used even though, according to http://refspecs.linuxfoundation.org/FHS_3.0/fhs-3.0.html#usrshareArchitectureindependentData : "The /usr/share hierarchy is for all read-only architecture independent data files."
Exec into the container and note the data folder is not in the default/expected place
docker exec -it <container name> bash
bash-4.2$ ls /var/lib/logstash
ls: cannot access /var/lib/logstash: No such file or directory
Run the following and see there are two paths for the logstash data location (image was logstash:5.6.3)
$ ls -l {/opt/logstash,/usr/share/logstash}/data/*
-rw-r--r--. 1 logstash logstash 36 Nov 7 12:13 /opt/logstash/data/uuid
-rw-r--r--. 1 logstash logstash 36 Nov 7 12:13 /usr/share/logstash/data/uuid
/opt/logstash/data/dead_letter_queue:
total 0
drwxr-xr-x. 2 logstash logstash 32 Nov 17 14:47 main
/opt/logstash/data/queue:
total 0
/usr/share/logstash/data/dead_letter_queue:
total 0
drwxr-xr-x. 2 logstash logstash 32 Nov 17 14:47 main
/usr/share/logstash/data/queue:
total 0
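The identical sizes and timestamps in the two listings above would be consistent with one path being a symlink to the other, though the listing alone does not confirm that. A minimal sketch for checking it from inside the container, using a hypothetical helper named same_target (not part of Logstash or the image):

```shell
# Hypothetical helper: succeed if both paths resolve to the same location.
# Returns 2 if either path cannot be resolved at all.
same_target() {
  a=$(readlink -f -- "$1") || return 2
  b=$(readlink -f -- "$2") || return 2
  [ "$a" = "$b" ]
}

# The two parent directories seen in the listing above.
if same_target /opt/logstash /usr/share/logstash; then
  echo "same directory (one path is likely a symlink)"
else
  echo "different directories (or a path is missing)"
fi
```

If the two paths turn out to be the same directory, only one of them needs a volume mount.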
I agree the locations are perhaps unexpected and/or confusing.
@jarpy Any thoughts on moving the default path.data to match the rpm/deb packages? ("/var/lib/logstash"). No pressure.
Regarding FHS: FHS is honestly a confusing specification that, in my experience (10+ years in operations), many organizations and Linux distributions ignore or deviate strongly from, and that, when cited, often creates an aggressive environment of debate without resolution. So for the purposes of this issue, I am ignoring FHS.
Let's focus on what is least likely to surprise users and also focus on documenting whatever we decide. If nothing else, a necessary improvement here is to document the default data directory.
Let's focus on what is least likely to surprise users
Thanks :-)
FHS is honestly a confusing specification
Fair enough, regardless, consistency would help. RHEL and Ubuntu more or less manage to keep to the FHS. That said, the FHS doesn't even bother to speak to container use cases...
Thanks for this. I can see now how the documentation bug came about. The Docker image is based on the tarball distribution, not the OS packages, so it follows those directory locations (where they are defined), including the default for path.data. In practice, that means that the directory layout documentation for the Docker image began life as a cut-and-paste of the same documentation for the tarball. Until version 5.6 however, the data directory was not documented for any distribution format. When the docs were updated in 5.6, we didn't add a line for the data directory to the Docker documentation.
Regarding the ultimate choice of file locations within the image, it was necessary to choose between the FHS and the established patterns of the existing Logstash artefacts. Sadly, it's not possible to be consistent with both.
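For anyone who needs to persist data (for example dead letter queues) in the meantime, one option is to mount a volume at the tarball-style location. The sketch below assumes the default path.data in the image is /usr/share/logstash/data, as the listing earlier in this issue suggests for logstash:5.6.3. The function name and the volume name are hypothetical.

```shell
# Assumption: the image's default path.data is /usr/share/logstash/data,
# as observed in the logstash:5.6.3 listing in this issue.
DATA_DIR_IN_CONTAINER=/usr/share/logstash/data

# Hypothetical helper: start Logstash with a named volume on the data directory.
run_logstash_with_data_volume() {
  volume_name=$1   # e.g. "logstash-data" (hypothetical volume name)
  image=$2         # e.g. "logstash:5.6.3", the image mentioned in this issue
  docker run --rm -v "${volume_name}:${DATA_DIR_IN_CONTAINER}" "$image"
}
```

Usage would look like `run_logstash_with_data_volume logstash-data logstash:5.6.3`. If the default data directory is ever moved, only DATA_DIR_IN_CONTAINER needs to change.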