Elasticsearch version (bin/elasticsearch --version):
6.1.1
Description of the problem including expected versus actual behavior:
406 response when sending a request to _bulk with a Content-Type that includes a charset encoding.
This worked in 5.6.x
Steps to reproduce:
$ curl -XPOST -H 'Content-Type:application/x-ndjson;charset=UTF-8' http://localhost:9200/_bulk --data-binary @mydata
{"error":"Content-Type header [application/x-ndjson; charset=UTF-8] is not supported","status":406}
This is caused by #19388, where we enforced a strict Content-Type header. In 5.x we only emitted deprecation warnings, but since 6.0 we enforce a "correct" Content-Type header.
The current implementation allows only a media type in the Content-Type header, not any parameters (the method in question is RestController#hasContentType()).
Assigning to @jaymode for final clarification on whether this was intentional in the implementation (IMHO it was intentional, and we should not accept parameters such as charset=UTF-8 that we will silently ignore further down the line).
Adding a charset to a Content-Type is correct according to the spec (https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.17). I agree things get fuzzy if the charset ends up being ignored, but I don't like the idea of getting an error from a valid HTTP request.
Edit:
I should mention that the HTTP call is being formed by a 3rd party library (wslite in my case), so this could be something other people face if they build their own client.
This is not the intended behavior as it is breaking. The changes in #19388 / #22691 sparked another issue regarding our handling of charset, #22769. Unfortunately we have not executed the plan outlined in that issue and unintentionally rejected the charset parameter for ndjson, #27065. There is also some discussion around this on a PR, #27301.
I am having the same problem
{"error":"Content-Type header [application/x-ndjson; charset=utf-8] is not supported","status":406}
I tried compiling Elastic from the source code and running it with no changes. I also tried installing 5.5.0, with the identical result.
Here is the HTTP dump:
dic 31, 2018 10:58:27 AM jolie.Interpreter logInfo
INFORMAZIONI: [main_logger.ol] [HTTP debug] Sending:
POST /_bulk HTTP/1.1
Host: localhost
Connection: close
X-Jolie-MessageID: 44
X-Jolie-ServicePath: /
Content-Type: application/x-ndjson; charset=utf-8
Content-Length: 2873
{"create":{"_index":"logger","_type":"log","_id":"1546250307113"}}
{"process_id":"154212001839195487","memory":42979328,"pri":"NORMAL","serv_id":"FileProcessorCubo","ip":"localhost","op_id":"uploadInvoice","message_id":"1834818","message":"","evt_type":"SUCCESS","domain_id":"ArxivarConnector","scope_id":"main","evt_ts":1546250230370,"evt_id":"OperationEnded","value":"","evtarr_ts":1546250307113}
{"create":{"_index":"logger","_type":"log","_id":"1546250307113"}}
{"process_id":"154212001839195488","memory":44332960,"pri":"NORMAL","serv_id":"FileProcessorCubo","ip":"localhost","op_id":"uploadInvoice","message_id":"1834846","message":"","evt_type":"SUCCESS","domain_id":"ArxivarConnector","scope_id":"main","evt_ts":1546250290373,"evt_id":"OperationEnded","value":"<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?><root _jolie_type=\"void\"><message _jolie_type=\"void\"\/><\/root>","evtarr_ts":1546250307113}
{"create":{"_index":"logger","_type":"log","_id":"1546250307208"}}
{"process_id":"15459930671263556","memory":43338824,"pri":"NORMAL","serv_id":"ArchiveLinkImplementation","ip":"localhost","op_id":"infoImpl","message_id":"42914","message":"null","evt_type":"REQUEST","domain_id":"ArxivarConnector","scope_id":"main","evt_ts":1546250305747,"evt_id":"OperationStarted","value":"<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?><root _jolie_type=\"void\"><docId _jolie_type=\"string\">005056AC770F1EE9839C53B368787C67<\/docId><pVersion _jolie_type=\"string\">0046<\/pVersion><message _jolie_type=\"void\"\/><contRep _jolie_type=\"string\">ZA<\/contRep><script _jolie_type=\"string\">ContentService<\/script><\/root>","evtarr_ts":1546250307208}
{"create":{"_index":"logger","_type":"log","_id":"1546250307210"}}
{"process_id":"15459930671263556","memory":43835800,"pri":"NORMAL","serv_id":"ArchiveLinkImplementation","ip":"localhost","op_id":"infoImpl","message_id":"42914","message":"","evt_type":"SUCCESS","domain_id":"ArxivarConnector","scope_id":"main","evt_ts":1546250305748,"evt_id":"OperationEnded","value":"<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?><root _jolie_type=\"void\"><numComps _jolie_type=\"string\"\/><compTimeM _jolie_type=\"string\"\/><timeC _jolie_type=\"string\"\/><docId _jolie_type=\"string\">005056AC770F1EE9839C53B368787C67<\/docId><message _jolie_type=\"void\"\/><compDateC _jolie_type=\"string\"\/><script _jolie_type=\"string\">ContentService<\/script><dateM _jolie_type=\"string\"\/><docStatus _jolie_type=\"string\"\/><timeM _jolie_type=\"string\"\/><compId _jolie_type=\"string\"\/><compTimeC _jolie_type=\"string\"\/><pVersion _jolie_type=\"string\">0046<\/pVersion><dateC _jolie_type=\"string\"\/><rootObject _jolie_type=\"string\"\/><compDateM _jolie_type=\"string\"\/><compstatus _jolie_type=\"string\"\/><contRep _jolie_type=\"string\">ZA<\/contRep><\/root>","evtarr_ts":1546250307210}
dic 31, 2018 10:58:27 AM jolie.Interpreter logInfo
INFORMAZIONI: [main_logger.ol] [HTTP debug] Receiving:
HTTP Code: 406
Resource: null
--> Header properties
content-length: 99
content-type: application/json; charset=UTF-8
--> Message content
{"error":"Content-Type header [application/x-ndjson; charset=utf-8] is not supported","status":406}
Regards
Balint
Interestingly, ES 6.x accepts application/json; charset=utf-8, just not application/x-ndjson; charset=utf-8.
Is there a workaround for this? How can I post a multiline bulk request that is also UTF-8?
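One possible workaround, if your client lets you set the header yourself, is to send the body as UTF-8 bytes but leave the charset parameter off the Content-Type. The sketch below only builds the request and does not send it. The helper name build_bulk_request and the sample documents are hypothetical, and it assumes the server reads the body as UTF-8 when no charset is given.

```python
import json
import urllib.request


def build_bulk_request(url, actions):
    """Build (but do not send) a _bulk request with a parameter-free Content-Type.

    Hypothetical helper for illustration; not part of any Elasticsearch client.
    """
    # NDJSON: one JSON document per line, each line terminated by a newline.
    body = "".join(json.dumps(a, ensure_ascii=False) + "\n" for a in actions)
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),  # the bytes on the wire are UTF-8
        headers={"Content-Type": "application/x-ndjson"},  # no "; charset=..."
        method="POST",
    )


req = build_bulk_request(
    "http://localhost:9200/_bulk",
    [
        {"create": {"_index": "logger", "_type": "log", "_id": "1"}},
        {"message": "caff\u00e8"},  # non-ASCII sample value
    ],
)
# To actually send it: urllib.request.urlopen(req)
```

This does not help when a library adds the charset parameter on its own, which is the situation described for wslite above.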
IMHO this should be re-opened. The referenced ticket #27065 was closed unsolved and references back here.
In 7.3 the issue remains, and it is a problem for those of us (programming ABAP from SAP) who cannot control the headers in fine detail.
We discussed this issue in our weekly meeting. In order to move forward, the plan is to introduce the parsing of the charset parameter. The value will need to indicate a charset of UTF-8, as this is what is supported, in order to be accepted.
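To make the plan concrete, here is a rough sketch of that kind of check: split the header into media type and parameters, and accept a charset parameter only when it names UTF-8. This is not the Elasticsearch implementation (which is Java, around RestController#hasContentType()). The function name, the list of allowed media types, and the choice to reject any parameter other than charset are assumptions made for illustration.

```python
def is_supported_content_type(
    header, allowed=("application/json", "application/x-ndjson")
):
    """Return True if the media type is allowed and any charset is UTF-8.

    Illustrative sketch only; `allowed` is a hypothetical subset of media types.
    """
    media_type, *params = [part.strip() for part in header.split(";")]
    if media_type.lower() not in allowed:
        return False
    for param in params:
        if not param:
            continue  # tolerate a trailing ';'
        name, _, value = param.partition("=")
        if name.strip().lower() != "charset":
            return False  # assumption: parameters other than charset are rejected
        if value.strip().strip('"').lower() != "utf-8":
            return False  # only UTF-8 is accepted
    return True
```

With this check, application/x-ndjson; charset=UTF-8 would pass, while application/x-ndjson; charset=ISO-8859-1 would still be rejected.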
This has been fixed during the 7.x series. There is an open follow-up to use the charset of the request and only accept valid items, but that will be handled separately.