Do we really need this?
To me it seems like a bad setting as it stops simple interactions with the cluster and also destroys any ability to do monitoring (especially with Marvel).
I know it's used in wares plugin.
Also It could make sense to disable this on production nodes and only activate it on monitoring nodes.
disabling http does not stop Marvel collecting data and sending it to the monitoring cluster, right?
It also may be useful for people who only want to interact with Elasticsearch using the Java API and therefore have no need for the HTTP protocol
To me, allowing this is akin to keeping _shutdown.
Sure it is useful for a few people, but disabling it breaks too much stuff to be really worth it.
@markwalkom I think you should ask also the question on discuss.elastic.co to see how many users actually are disabling http, for which use case and invite them to comment on this issue.
Beside of using Elasticsearch by Java API only, I like the feature of disabling HTTP port to protect data nodes in a private network from tampering with them by sending spurious HTTP requests. The disabling of HTTP also assures that HTTP-based overhead (including HTTP client traffic and workload) is not present on that nodes, which may be of some importance in a team environment with distributed tasks of searching, indexing, monitoring, operating.
Otherwise, I would have to set up operating system layer TCP/IP filtering services (e.g. RHEL 7 firewalld) on all ports in the range 9200-9299, which is of course possible, but is another thing that can be easily forgotten when adding nodes (and should be documented in case).
Maybe I'm wrong, but I sincerely hope that optional commercial add-ons are not driving Elasticsearch features. As an alternative, I suggest to improve the Marvel product by using Java API only.
I think with our current work towards a Java HTTP client, along with moving as much as possible to APIs, and with not being able to access those APIs hampers any kind of debugging ability, we should remove the setting and enable HTTP everywhere. It's been a while since we've discussed this, does anyone else have thoughts?
If we have an architecture with data nodes and client nodes, would that make sense to still allow disabling http on data nodes?
That being said may be client nodes could be seen/replaced by federated search/index nodes (well tribe nodes nowadays) which I believe would use the REST layer?
IMO that sounds ok to remove http.enabled setting. Would love also to hear from others though.
If we have an architecture with data nodes and client nodes, would that make sense to still allow disabling http on data nodes?
I still think it's valuable, since you may want to retrieve something like GET /_cluster/state?local=true for diagnosis, but you then cannot since HTTP is disabled.
+1 to remove the option of disabling HTTP (I was just about to open a new issue for this). With our continued work on making a REST client be the official Java client, I think this is the way to go.
I am +1 on this too. If somebody doesn't wanna expose stuff to the outer world they can just bind to localhost and they are good?
There seems to be consensus on this and nothing to discuss further. I labelled this 6.0.0 and adoptme thinking that we could start with deprecating the setting in 6.0.0 and remove in 7.0.0.
Before we move forward with this, let's review where this is used across our stack. For example, Logstash depends on this for the sniffing feature in the ES output and we have users out there who wants sniffing but at the same time would like the ability to have LS send events only to a subset of sniffed nodes (by disabling HTTP on specific nodes).
Logstash depends on this for the sniffing feature in the ES output
Logstash (code) does not depend on http.enabled as far as I can tell. We do mention http.enabled it in the documentation to help users select which Elasticsearch nodes they want to talk to (sniffing all nodes, subtract nodes with http disabled). It is not a required feature, though, nor is it a out-of-the-box dependency. For the future, we want to improve our selection criteria (by node tag, etc) for Logstash's sniffing, so we won't need "running http?" as criteria.
However ...
Logstash does depend on the /_nodes/http api in order to perform sniffing. If this API is going away, I cannot tell from the discussion in this ticket.
Can someone tell us if /_nodes/http is being removed as a part of this effort?
Can someone tell us if
/_nodes/httpis being removed as a part of this effort?
We would not, or at least, I would be opposed to removing it.
All our language clients depend on /_nodes/http for sniffing. That will stay. This issue is about removing the ability to disable http on a node, through the http.enabled setting.
I've been thinking that the best thing for most clients to do would be to use node attrs for sniffing. All of the other automatic criteria are problematic. It's best just to let users label nodes how they like and use that info to sniff.
That's probably a larger discussion that should be had among client library authors however.
I've created an issue to discuss some proposed changes to clients to help with this here: https://github.com/elastic/elasticsearch/issues/24871
Most helpful comment
All our language clients depend on
/_nodes/httpfor sniffing. That will stay. This issue is about removing the ability to disable http on a node, through thehttp.enabledsetting.