Elasticsearch: 'elasticsearch' user and group IDs should be consistent across multiple nodes

Created on 11 Jun 2015 · 12 comments · Source: elastic/elasticsearch

For filesystem snapshots, if the 'elasticsearch' group and user have differing id numbers between nodes in the system, snapshots will not work. The master will create subdirectories for each index, but the permissions will reflect ownership by the 'elasticsearch' user on the master. If the master's elasticsearch user does not match the elasticsearch user id on another node, that node will not be able to write to those directories.
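
A quick way to confirm the mismatch is to compare the numeric IDs and the repository ownership on each node. A minimal sketch, assuming a shared repository mounted at /mnt/es-backups (hypothetical path):

# run on every node: numeric uid/gid of the elasticsearch account
id elasticsearch
# list the shared repository with numeric owners; index subdirectories
# created by the master will show the master's uid/gid here
ls -ln /mnt/es-backups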

When we create the 'elasticsearch' user and group at installation time, it would be helpful to have a consistent id number. I don't know if the best solution is to pick an arbitrary high user ID number (9200?) and try to use it by default, or just to document that it needs to be set the same across all nodes.
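
For illustration only, pre-creating the account with a chosen ID on every node before installing the package would look roughly like this (9200 is just the arbitrary example above, not a recommendation, and the package install scripts typically reuse an existing elasticsearch user rather than creating a new one):

# run on every node before installing the elasticsearch package
groupadd --system --gid 9200 elasticsearch
# shell path may be /usr/sbin/nologin on Debian-based systems
useradd --system --uid 9200 --gid elasticsearch --shell /sbin/nologin elasticsearch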

:Delivery/Packaging >docs Delivery

Most helpful comment

For anyone that comes across this problem, I solved it by running the following commands on the nodes that had permission issues:

id elasticsearch    # to find your current elasticsearch uid and gid
service elasticsearch stop
usermod -u NEWUID elasticsearch
groupmod -g NEWGID elasticsearch
find / -user OLDUID -exec chown -h NEWUID {} \;
find / -group OLDGID -exec chgrp -h NEWGID {} \;
service elasticsearch start

All 12 comments

I'd consider a fixed user/group ID a bug. For example, we assign ranges for user accounts depending on whether it's a "system" account or a "real user" account, and 9200 would fall smack into our "real user" account range and probably conflict. I know of more organizations that use a similar scheme, so whatever value you choose you'll step on somebody's toes.
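
For context, the reserved ranges on a given Linux host can usually be read from /etc/login.defs; a quick sketch (key names vary slightly between distributions):

# show the system and regular account uid/gid ranges on this host
grep -E '^(SYS_)?[UG]ID_(MIN|MAX)' /etc/login.defs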

Let's opt for documenting it

Hi All,

I am facing an issue while taking an Elasticsearch snapshot in a cluster. I found it is because the elasticsearch user has different IDs on one of my nodes.

I have 2 servers, configured as a cluster:
server1: 10.0.0.0
server2: 10.0.0.1

I have changed the user ID and group ID of the elasticsearch user on server2, so that they are the same across all the nodes.
But after changing them, I am unable to start the elasticsearch service on server2, and the log file is not being updated either.

Please guide me in solving this issue.

Thanks,
Gogul

This has been open for almost 3 years with no other reports of similar issues. I'm going to close this, as filesystem snapshots seem to be used far less than cloud-based snapshots, and those that do use filesystem snapshots do not seem to have had similar problems. If someone does run into this issue again, we would happily accept a docs change for the filesystem snapshot/restore docs.

FYI, this is still an ongoing issue. I have just run into it with 6.2.2 about 30 minutes ago. Filesystem snapshots are still a supported feature, correct? So unless someone fixed the root issue (they haven't), the problem will still be there. I can't fix the code, but I can maybe fix the documentation. Which documentation are you referring to when you mention accepting changes?

When you change the uid/gid, the new uid/gid will have no permission on the original directories/files owned by the old elasticsearch IDs, so you should change the ownership of those directories/files back to elasticsearch.
Command: chown -R elasticsearch:elasticsearch <elasticsearch directories>
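
For the DEB/RPM packages, the directories that usually need re-owning after an ID change are the ones below (assuming the default paths; adjust if path.data or path.logs point elsewhere, and include any shared snapshot repository):

# data and log dirs are owned by the elasticsearch user by default
chown -R elasticsearch:elasticsearch /var/lib/elasticsearch /var/log/elasticsearch
# the config dir is normally root:elasticsearch, so only the group needs fixing
chgrp -R elasticsearch /etc/elasticsearch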

I have solved this issue by creating the elasticsearch user and group prior to installing Elasticsearch, thus setting them to a fixed value throughout all the nodes. If Ansible is used for provisioning, the ansible-elasticsearch role can do this automatically.

I also ran into this issue. I installed ES on six nodes a few months back, then replaced three of them with new nodes a few weeks ago. Since then ES hasn't been able to write to the NFS for snapshotting. It looks like the only way to fix this is to make sure the uid and gid are the same across all nodes. This looks like a chore because I will need to find an unused uid and gid that work across all nodes and then run the commands below:

# make sure the new uid doesn't exist
id -nu NEW_UID
# make sure the new gid doesn't exist
getent group NEW_GID
# this command will take forever to complete (just tried it on 9TB and it took about an hour)
/bin/chown --changes --silent --no-dereference --recursive --from=OLD_UID:OLD_GID NEW_UID:NEW_GID /
# then ensure the chown'd directories are owned by elasticsearch:elasticsearch

What about adding a bootstrap check that ensures the elasticsearch UID and GID are consistent (or at least a warning when there is inconsistency)?
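
Until such a check exists, a manual spot check on each node is straightforward; a sketch, assuming the repository is mounted at /mnt/es-backups (hypothetical path):

# verify the elasticsearch user can write to the shared repository
sudo -u elasticsearch touch /mnt/es-backups/.write-test && echo OK
sudo -u elasticsearch rm -f /mnt/es-backups/.write-test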

This has been open for almost 3 years with no other reports of similar issues. I'm going to close this, as filesystem snapshots seem to be used far less than cloud-based snapshots, and those that do use filesystem snapshots do not seem to have had similar problems. If someone does run into this issue again, we would happily accept a docs change for the filesystem snapshot/restore docs.

I'm still running into this issue. Please make some changes.

Me too, and probably other people as well. We all solve it in-house, as apparently nobody has fixed it in Elasticsearch.

For anyone that comes across this problem, I solved it by running the following commands on the nodes that had permission issues:

id elasticsearch    # to find your current elasticsearch uid and gid
service elasticsearch stop
usermod -u NEWUID elasticsearch
groupmod -g NEWGID elasticsearch
find / -user OLDUID -exec chown -h NEWUID {} \;
find / -group OLDGID -exec chgrp -h NEWGID {} \;
service elasticsearch start
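
One caveat with the recipe above: find / will also descend into network mounts such as the snapshot NFS itself, which can be very slow and may re-own shared data from several nodes at once. Restricting it to local filesystems is usually safer; a sketch:

# stay on the root filesystem only; repeat for other local mount points as needed
find / -xdev -user OLDUID -exec chown -h NEWUID {} \;
find / -xdev -group OLDGID -exec chgrp -h NEWGID {} \;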
