Elasticsearch: Support File Sockets rather than just Ethernet Interfaces

Created on 7 Nov 2016 · 5 Comments · Source: elastic/elasticsearch

Describe the feature:

This is a feature request. Right now it seems like Elasticsearch can only bind port 9200 to a network (TCP/IP) interface. The problem with this is that any non-privileged user on an Elasticsearch host can do destructive things to the cluster (_unless you're running the Shield plugin_).

I would like to configure Elasticsearch to bind to a Unix domain socket, and then control access to that socket with standard Unix file permissions. From there, we can easily use Apache to provide a reasonable level of authenticated access.
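
What this request asks for can be sketched in a few lines of Python. This is a minimal illustration of the general technique, not Elasticsearch code: the function name `bind_unix_socket`, the socket path, and the `0o660` mode are all assumptions chosen for the example. On Linux, connecting to a Unix domain socket requires write permission on the socket file, so the mode decides which local users can reach the service.

```python
import os
import socket
import stat
import tempfile


def bind_unix_socket(path: str, mode: int = 0o660) -> socket.socket:
    """Bind a listening Unix domain socket and restrict it with file permissions.

    `path` and `mode` are illustrative; nothing here is an Elasticsearch API.
    """
    # Remove a stale socket file left behind by a previous run.
    if os.path.exists(path):
        os.unlink(path)
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    # Owner and group get read/write; all other local users get nothing.
    os.chmod(path, mode)
    server.listen()
    return server


if __name__ == "__main__":
    # Hypothetical location; a real service would use a fixed, documented path.
    sock_path = os.path.join(tempfile.mkdtemp(), "search.sock")
    srv = bind_unix_socket(sock_path)
    print(oct(stat.S_IMODE(os.stat(sock_path).st_mode)))
    srv.close()
    os.unlink(sock_path)
```

There is a short window between `bind` and `chmod` in which the process umask decides the permissions, so creating the socket inside a directory that is already restricted is the more robust variant.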

Note about plugins:
I know that there are plugins like https://github.com/sscarduzio/elasticsearch-readonlyrest-plugin that help provide authorization to the REST interface. The problem is that these have to be compiled and managed for each specific version of Elasticsearch, and we're just tired of playing that game. Fronting Elasticsearch with Apache has a ton of advantages (centralized logging being a huge one) ... and it's just a common design.

:DistributeNetwork discuss

All 5 comments

We discussed this internally and there are a couple of issues that we see with adding this. Lemme elaborate:

  • Since we are using network communication internally (transport) on port 9300, this might give you a false sense of security. In such a case, how would forming a cluster work if you are binding to a file socket?
  • It seems like the use case is pretty rare that you want to secure against access from within localhost. In such a case, I think we should not add complexity to ES but rather use some virtualization to hide ES from the users and put Apache HTTPD with mod_proxy in front of it for the extra level of security.

We decided to close this for these reasons for now.

Unix sockets are a very simple (thus reliable) way of enforcing security in a wide variety of situations. The utility of this is rapidly increasing. Not all problems require a cluster, and if you don't HAVE to build a cluster, you certainly do not want to. The number of cores per machine is making clustering across machines less and less necessary, and there are huge incentives to centralize (performance, simplicity of code and administration). We have pretty cheap machines with 128 cores now, and even then, a lot of problems can be decomposed in a way so that it's not the ES that needs to be clustered, but a different layer in the stack.

  • Any time you only need the application to talk with other applications on the same machine, there is no need to configure more complicated network settings (specific IP bindings, firewalls, per-machine firewalld, etc.)
  • Security. Most machine hacks work by port-scanning and exploiting bugs/vulnerabilities in code. So, if you CAN get by with local communication, you absolutely should. Of course you can lock down the machine, which we should be doing anyway, but why leave your jewelry by the front door?
  • Control. You can block access using the full set of file-system and mount options, including chroots.
  • It's much easier to isolate and identify services by location on a hierarchical filesystem than by port number.
  • Using ports adds the problem of port conflicts when running more than one server on a machine.

Specifically, to your points:
1) Providing both socket and network communications should not be hard. Network and socket communication use almost all the same code, at least in C, and there are plenty of libraries of pre-written code for doing sockets in Java (http://stackoverflow.com/questions/170600/unix-socket-implementation-for-java). Having all of the internal communication be over open ports that anyone can connect to is a security concern that should be addressed regardless. Putting that on the people who administer the network is an unnecessary leap in complexity.

2) It is not at all rare. I've worked with hundreds of companies setting up IT infrastructure and have seen many of them using Unix sockets for the same reasons I've outlined above. Using virtualization to hide ES is a huge leap in complexity compared to just having ES communicate over Unix sockets.
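
To illustrate the claim in point 1 that network and Unix socket communication share almost all of their code, here is a small Python sketch using only the standard library. The handler class and helper names (`UpperCaseHandler`, `serve_in_background`, `round_trip`) are made up for this example and have nothing to do with Elasticsearch internals; the point is only that one handler serves both transports unchanged.

```python
import os
import socket
import socketserver
import tempfile
import threading


class UpperCaseHandler(socketserver.StreamRequestHandler):
    """Transport-agnostic handler: read one line, reply with it upper-cased."""

    def handle(self):
        line = self.rfile.readline()
        self.wfile.write(line.upper())


def serve_in_background(server):
    """Run the server's accept loop on a daemon thread and return the server."""
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def round_trip(family, address, payload: bytes) -> bytes:
    """Connect over the given address family, send one line, read one line back."""
    with socket.socket(family, socket.SOCK_STREAM) as client:
        client.connect(address)
        client.sendall(payload)
        with client.makefile("rb") as reader:
            return reader.readline()


if __name__ == "__main__":
    # Same handler class, two transports: TCP on loopback and a Unix socket.
    tcp = serve_in_background(
        socketserver.TCPServer(("127.0.0.1", 0), UpperCaseHandler))
    sock_path = os.path.join(tempfile.mkdtemp(), "demo.sock")  # hypothetical path
    unix = serve_in_background(
        socketserver.UnixStreamServer(sock_path, UpperCaseHandler))

    print(round_trip(socket.AF_INET, tcp.server_address, b"ping\n"))
    print(round_trip(socket.AF_UNIX, sock_path, b"ping\n"))

    for server in (tcp, unix):
        server.shutdown()
        server.server_close()
```

Only the server constructor and the address differ between the two cases; the request handling code is identical.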

Also note that curl (the de facto standard CLI for ES) has supported communication via Unix sockets since version 7.40 (now on 7.52).
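
For clients that are not curl, the same idea is easy to express in code. The following is a rough Python sketch of an HTTP client that speaks over a Unix domain socket, roughly what curl does with its `--unix-socket` option. The class name `UnixHTTPConnection`, the helper `get`, and any socket path passed to them are assumptions for illustration; Elasticsearch itself does not expose such a socket, so this would only work against a server or proxy that does.

```python
import http.client
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that connects to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path: str, timeout: float = 10.0):
        # "localhost" is only used for the Host header; no TCP connection is made.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        # Replace the default TCP connect with an AF_UNIX connect.
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def get(socket_path: str, url_path: str):
    """Issue a GET over the Unix socket and return (status, body bytes)."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", url_path)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

A call such as `get("/run/es/http.sock", "/_cluster/health")` would then behave like an ordinary HTTP request, where `/run/es/http.sock` is a hypothetical path served by whatever process owns the socket.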

Given that securing ES is such a huge issue (https://www.google.com/search?q=securing%20elasticsearch), and that supporting unix sockets is so simple compared to other solutions, it seems like a great feature to add.

Thanks! I am impressed with ES overall.

-Carl

"It seems like the use case is pretty rare that you want to secure against access from within localhost. In such a case, I think we should not add complexity to ES but rather use some virtualization to hide ES from the users and put Apache HTTPD with mod_proxy in front of it for the extra level of security."

Unix domain sockets are simpler than TCP sockets. Also, ES is nice to use with a single-server application that needs cheap text search. I was shocked when I found out that ES doesn't support them. Most servers support them, because they are so fundamental. It is so easy to secure Unix domain sockets using standard file permissions. This is much easier than (mis)configuring authentication, although one would be wise to layer their security.

I just realized that everything I'm saying has been said much better by @CarlEklof.

It has been 2 years since the issue was closed. It may be time to revisit this.

@s1monw do you think that if you discuss this issue internally you might want to consider implementing a unix socket now?

+1 This would be a great feature to have! -cc @s1monw
