This has been requested in numerous places, but I was reminded of it by https://github.com/elasticsearch/logstash/pull/895
Escaping quotes and control characters is impossible today in Logstash config. "\n" is literally a backslash and a lowercase n. "\"" is literally a backslash and a double quote.
Logstash supports single and double-quoted text values, and we should support escapes in both the same exact way:
Add a new setting, config.support_escapes that, when set, will enable the following behaviors on quoted strings in the Logstash config:
setting => 'I said, "Hello!"'
setting => "I said, 'Hello!'"
setting => "C:\some\path"
setting => "this
is
multiple
lines"
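To make the proposed config.support_escapes behavior concrete, here is a minimal Python sketch of the kind of escape processing such a setting might apply to the body of a quoted string. The escape table (\n, \r, \t, \\, \", \') and the function name process_escapes are assumptions made for illustration; this is not Logstash's actual implementation.

```python
import re

# Assumed escape table for illustration only; the proposal text above
# does not enumerate the exact sequences.
ESCAPES = {
    "n": "\n",   # newline
    "r": "\r",   # carriage return
    "t": "\t",   # tab
    "\\": "\\",  # a single literal backslash
    '"': '"',    # double quote
    "'": "'",    # single quote
}


def process_escapes(raw: str) -> str:
    """Translate backslash escapes in the body of a quoted config string.

    Hypothetical helper: unknown escapes such as \\s are left untouched
    (backslash included), so regex-style strings keep working.
    """
    def replace(match: "re.Match[str]") -> str:
        char = match.group(1)
        return ESCAPES.get(char, "\\" + char)

    # Scan left to right so that backslash-backslash-n yields a backslash
    # followed by n, not a newline.
    return re.sub(r"\\(.)", replace, raw, flags=re.DOTALL)
```

Because the scan consumes a backslash together with the character that follows it, backslash-backslash-n comes out as a literal backslash followed by n, while backslash-n comes out as a newline. Leaving unknown escapes alone is one possible design choice; rejecting them with an error would be another.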
For reference see related issue on jira and linked use cases (csv quote, exec command, mutate string)
@wiibaa Thanks for linking in that jira ticket :heart:
'\' should be text with a single backslash
Seems to be missing from the proposal. Without this, there's no way to encode the sequence backslash-n without getting a newline instead.
'\' should be text with a single backslash
Will be problematic, since that looks exactly like the start of a legitimate multi-line string (single quote, followed by whitespace, then arbitrary text). If multi-line strings aren't legal, or have additional escaping requirements, then that should be documented.
Ugh, markdown ruined my escaping, I guess. I did not intend to propose that
quote backslash backslash quote would be omitted, and I also did not intend
that quote backslash quote would be valid.
Just giving my 2 cents: I have the feeling that the "historical reason" is that a regex string would require escaping every backslash to be interpreted correctly by the filters that use regexes a lot (multiline and grok). That would be silly, as Jordan has stated in the past, so the escaping was built into the config parser, and a TODO was added in Logstash 1.0.x and 1.1.x to handle a proper regex config element.
Maybe allowing a real regex object where it is due, and then allowing a string to be a string (as mostly expected in ruby/java/bash), would be an alternative.
@nigelzor I fixed the rendering problem markdown had. It was turning quote-backslash-backslash-quote into a single backslash.
@wiibaa in logstash 1.2 and beyond you can do if [somefield] =~ /someregex/
I always intended to interpret backslash-as-escapes but never actually got around to it. It burdens users infrequently, but as we have more users, probability makes that affect a larger population of users, so I wanted to file about fixing it now.
@jordansissel what I meant is that you cannot do it inside a filter config to disambiguate between a pure string and a regex. So internally, sometimes a string is interpreted as a regex, and sometimes it is not.
I hope you see what I mean. For me, the following config could be a clearer solution for users than having backslash escaping by default plus exceptions that must be documented and explained (for example, "\s" should also be an exception, and it is not yet listed):
mutate {
  replace => ["message", /\n+/, "\t"]
}
or
grok {
  match => { "message" => /%{DATETIME} %{LOG} \t blabla \n %{GREEDYDATA}/ }
}
is this issue related to this issue?
@splashx yep :)
@jordansissel oh no :laughing:
+1 as I just hit this today too
Hi @jordansissel, is there any progress on this issue? Is there a timetable for when it will be fixed?
+1 hit today (same as https://github.com/elastic/logstash/issues/3239 6 months ago)
This change is likely to be resolved eventually, but not right now. There's a lot of work going on for the new centralized configuration feature that's coming soon and will impact how we store and represent the configuration of a logstash agent. I'd rather fix this problem at _that_ time than try to fix it now and break everyone who is using the current not-so-good behavior.
Thank you for this information :+1:
I'm not sure what to make of this. Could someone confirm that this is the current behavior?
Upon encountering a quote character, the Logstash config parser interprets this as a string literal, which is terminated by the next quote character of the same type as the opening quote. This string literal then represents the string of characters between the start and end quote characters in the surface syntax.
This means that a character sequence seq represents a string str if either
str contains no single-quotes, and seq = "'" + str + "'", or
str contains no double-quotes, and seq = '"' + str + '"'.
If str contains single-quotes and double-quotes, it is unrepresentable as a string literal in Logstash syntax. (Deal with it.)
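The rule described above can be sketched in a few lines of Python. This is only a model of the behavior as this comment describes it, not the parser's real code, and the function name to_literal is hypothetical.

```python
from typing import Optional


def to_literal(s: str) -> Optional[str]:
    """Return a config string literal for s under the no-escape rule.

    Models the behavior described in the comment above: a literal ends at
    the next quote of the same type, and no escape sequences exist.
    Returns None when s cannot be written as a literal at all.
    """
    if "'" not in s:
        return "'" + s + "'"
    if '"' not in s:
        return '"' + s + '"'
    # Both quote types appear in s, so either delimiter would end the
    # literal too early.
    return None
```

Under this model, a string containing a raw newline is still representable (the literal simply spans several lines), but a string containing both quote types is not, which matches the mutate/gsub complaints elsewhere in this thread.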
@jordansissel got any updates on when we could reckon on a solution to this issue?
@sts We can probably have it ready for the next major release of Logstash since that'll be the next time we can really introduce breaking changes.
@suyograo @acchen97 - I think we can ship this improvement with the next major given all the new work going into configuration features.
@sts As a short-term alternative, I'd be willing to add a flag that enables escaping support in the config processor. This would maintain backwards compat _and_ make it available in the Logstash 2.x series.
@jordansissel that would be awesome!
How can I work around this? By using ruby code directly?
Hi @suyograo,
can you tell me the name of the hidden flag that changes the quote escaping?
Thank you in advance!
+1 Would be lovely to have this behave as expected by commonly accepted standards. Makes for painful on-boarding of more users in my organization as developers struggle to figure out that string manipulation in the log file parsing is challenging.
The idea of a flag for switching behavior works for me.
👍
+1 for this issue. Following the configuration page, setting delimiter to its theoretical default value (according to the official documentation) does not work.
delimiter => "\n"
There's a hack around this using the ruby filter and ASCII ordinals https://gist.github.com/andrewvc/1b6e5c72cf124ed39d311f9714dac271
Any update on this? This issue has been open for more than 2 years now.
No update at this time.
I've updated the issue's description to include some possible workarounds for specific characters. See the section under "Workarounds today:"
@suyograo @acchen97 I'm open to adding this as an off-by-default feature as soon as maybe Logstash 5.2 or 5.3 timeframe.
@jordansissel +1 let's try to target 5.2 for this. Can you please own this feature?
Glad this issue is somewhat on your radar. The existing workarounds are helpful, but one thing is apparently still impossible to do with the current configuration parser: using mutate+gsub to replace a match with a double-quote character, i.e.
mutate {
gsub => [
'message', '\"','"'
]
}
just as mentioned in https://github.com/logstash-plugins/logstash-filter-mutate/issues/40.
And this is not a missing feature or cleanup, but a plain bugfix that currently has no workaround.
@jordansissel @acchen97 any news on this?
No news. There are many workarounds that can still be used today. I want to
add this feature, but we are overloaded with tasks as it is and it may take
some time before this gets worked on.
This makes it impossible to insert e.g. control characters.
I have a branch where I am working on this as time permits. No ETA. https://github.com/elastic/logstash/compare/feature/config-string-escapes
I was able to work around this by manually inserting the character literal. 😄
PR is open for this: https://github.com/elastic/logstash/pull/7442
Fixed in #7442
Hmm. I configured a fresh ELK system, and it took me a few days to find out why my text contained \n (a literal \ followed by a literal n).
Well, I wrote
mutate { join => { "somefield" => "\n" } }