Logstash: Upgrade to JRuby 9k

Created on 8 Jul 2016 · 35 comments · Source: elastic/logstash

According to this tweet, JRuby 1.7 will be EOL sometime during 2016.

Let's make sure we have all the bugs/problems identified and get on the path to upgrading to JRuby 9k some time after Logstash 5.0.0.

enhancement v6.0.0

All 35 comments

I can revamp https://github.com/elastic/logstash/pull/3654 in a more structured way, including clarifying possible fast/slow areas due to the change to 9k, so we can work on them.

Put up a PR to track the upgrade at https://github.com/elastic/logstash/pull/5616, but hit an error in our Java event, described in #5617 (a blocker for the upgrade).

Found this on the date filter with 9.1.5.0:

undefined method `time' for "2013-04-21T00:00:00.000Z":String
     # /Users/guy/.gem/jruby/2.3.1/gems/logstash-core-event-java-5.0.0-java/lib/logstash/timestamp.rb:14:in `<=>'
     # uri:classloader:/jruby/kernel/basicobject.rb:4:in `!='
     # /Users/guy/.gem/jruby/2.3.1/gems/insist-1.0.0/lib/insist/comparators19.rb:6:in `!='
     # ./spec/filters/date_spec.rb:339:in `block in (root)'
     # /Users/guy/.gem/jruby/2.3.1/gems/rspec-wait-0.0.9/lib/rspec/wait.rb:46:in `block in (root)'

The same test passes in 1.7.25.

I have begun the upgrade process. All core specs pass with JRuby 9.1.6.0 at the time of posting this comment.

I will be adding notes here about what needs to change in the plugins as I find them (as well as adding an issue to each plugin)

Digest::HMAC was removed in Ruby 2.2; it's suggested that we use OpenSSL::HMAC instead.
Plugins affected:

  • [ ] LogStash::Codecs::Collectd issue 24
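
For illustration, the migration is small; a minimal sketch of the replacement call (key and message here are made up):

```ruby
require 'openssl'

key  = 'secret'
data = 'message'
# OpenSSL::HMAC.hexdigest takes a digest name, a key, and the data,
# and returns the hex-encoded MAC (64 hex chars for SHA-256).
hmac = OpenSSL::HMAC.hexdigest('SHA256', key, data)
puts hmac
```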

Netflow Codec, JRuby 1.7.X quirk.
https://github.com/logstash-plugins/logstash-codec-netflow/issues/66
[Edit] now fixed in master.

CSV output: the multi_receive_encoded method is not implemented. Not specific to JRuby 9k, but it needs fixing. See csv output issue 10.

I have implemented the method in a thread-safe way locally.
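
A minimal sketch of the idea (not the actual plugin code; class and variable names are made up): a mutex serializes writes from concurrent pipeline worker threads.

```ruby
require 'stringio'

# Hypothetical sketch of a thread-safe multi_receive_encoded, the
# batch-oriented output API used by Logstash 5.x outputs.
class CsvOutputSketch
  def initialize(io)
    @io = io
    @mutex = Mutex.new   # serialize writes from concurrent workers
  end

  # events_and_encoded is an array of [event, encoded_line] pairs
  def multi_receive_encoded(events_and_encoded)
    @mutex.synchronize do
      events_and_encoded.each { |_event, encoded| @io.write(encoded) }
      @io.flush
    end
  end
end

out = StringIO.new
CsvOutputSketch.new(out).multi_receive_encoded([[nil, "a,b\n"], [nil, "c,d\n"]])
```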

RabbitMQ output: on_shutdown received twice unexpectedly. [Edit] now fixed in master.

`require_relative` is needed in some plugins when loading a local spec_helper.

@guyboertje https://github.com/elastic/logstash/issues/6576 is also a blocker.

lsperfm 20 second run

9K

name,                    start time, elapsed,  events, avg tps, best tps, avg top 20% tps
simple line in/out,           10.05,   20.15, 1292000,   64133, 74500.00, 74000
simple line in/json out,       8.83,   20.12, 1644000,   81697, 95300.00, 94766
json codec in/out,             9.21,   20.02,  942000,   47059, 53200.00, 52966
line in/json filter/json out,  9.16,   20.03,  594000,   29651, 35400.00, 34666
apache in/json out,            9.00,   20.02, 1582020,   79039, 90900.00, 89733
apache in/grok codec/json out, 9.22,   20.07,  631620,   31465, 41200.00, 41100
syslog in/json out,            9.68,   20.24,  200000,    9881, 12600.00, 12500

1.7.25

name,                    start time, elapsed,  events, avg tps, best tps, avg top 20% tps
simple line in/out,            6.87,   20.13, 1624000,   80664, 91400.00, 90666
simple line in/json out,       6.74,   20.11, 1640000,   81535, 88900.00, 88733
json codec in/out,             6.80,   20.02,  882000,   44049, 51000.00, 50966
line in/json filter/json out,  6.72,   20.08,  620000,   30884, 36500.00, 36400
apache in/json out,            6.69,   20.03, 1556280,   77686, 84300.00, 84100
apache in/grok codec/json out, 6.82,   20.04,  653400,   32605, 39600.00, 38633
syslog in/json out,            7.00,   20.18,  204000,   10111, 12100.00, 12066

Once we have a PR to move Logstash onto JRuby 9k we should also start testing against Java 9, so we can help validate the JRuby support (see https://github.com/jruby/jruby/issues/4111).

New longer lsperfm benchmarks with % change. Averaged 3 x 120 second runs.

_The data in the table that was below is erroneous. Removed table._


longer lsperfm benchmarks with % change, 6 minutes each run

Please scroll to view the rightmost columns. In the configs that include a filter, the difference between 1.7.25 and 9K ranges from -1% to +3%; this is a good sign.

_The data in the table that was below is erroneous. Removed. The comment below refers to the erroneous data._


@guyboertje Any ideas explaining the tremendous discrepancies between 1.7.25 & 9K with the apache in/json out and simple line in/out tests?

I will rerun the tests. On the 1.7.25 instance the log.level was set to warn; I will set both instances to warn.
I am also profiling the apache in/json out config with YourKit Profiler.

NOTE: my current JVM is Azul Zulu for all benchmarks above.

Please scroll to view the rightmost columns. The difference between 1.7.25 and 9K ranges from -4% to +5%; this is a good sign, as there are no wildly differing numbers across the suite. Furthermore, the -4% to +5% spread could be caused by interruptions from other processes on my laptop.

To achieve this, I warmed up the JVM with one 90 second run first, then ran the normal suite for 6 minutes each. Both instances of Logstash had their logging level set to warn.

name | version | start time | elapsed | events | avg tps | best tps | avg top 20% tps | events % change | avg tps % change | best tps % change
---- | ------- | ---------- | ------- | ------ | ------- | -------- | --------------- | --------------- | ---------------- | -----------------
apache in/grok codec/json out | 1.7.25 | 6.78 | 360.06 | 11097900 | 30823 | 35200 | 32360 | | |
apache in/grok codec/json out | 9K | 9.25 | 360.07 | 11660220 | 32383 | 36500 | 33966 | 5.07% | 5.06% | 3.69%
apache in/json out | 1.7.25 | 6.64 | 360.01 | 30183120 | 83841 | 86300 | 85535 | | |
apache in/json out | 9K | 9.63 | 360.03 | 30808800 | 85573 | 89800 | 88350 | 2.07% | 2.07% | 4.06%
json codec in/out | 1.7.25 | 6.94 | 360.02 | 17946000 | 49848 | 120400 | 51726 | | |
json codec in/out | 9K | 9.21 | 360.06 | 18778000 | 52152 | 53600 | 53192 | 4.64% | 4.62% | -55.48%
line in/json filter/json out | 1.7.25 | 6.79 | 360.04 | 12362000 | 34335 | 38400 | 36795 | | |
line in/json filter/json out | 9K | 9.22 | 360.08 | 11862000 | 32943 | 37600 | 35702 | -4.04% | -4.05% | -2.08%
simple line in/json out | 1.7.25 | 6.82 | 360.14 | 31058000 | 86238 | 90900 | 89243 | | |
simple line in/json out | 9K | 9.24 | 360.13 | 32732000 | 90890 | 94400 | 93439 | 5.39% | 5.39% | 3.85%
simple line in/out | 1.7.25 | 6.87 | 360.13 | 33758000 | 93738 | 96600 | 95787 | | |
simple line in/out | 9K | 8.86 | 360.13 | 33196000 | 92178 | 95800 | 94846 | -1.66% | -1.66% | -0.83%
syslog in/json out | 1.7.25 | 6.85 | 360.1 | 4278000 | 11880 | 13000 | 12446 | | |
syslog in/json out | 9K | 9.41 | 360.12 | 4376000 | 12152 | 13000 | 12700 | 2.29% | 2.29% | 0.00%
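
For reference, the % change columns above follow the usual relative-change formula; a small sketch (a hypothetical helper, not part of lsperfm):

```ruby
# % change = (new - old) / old * 100, rounded to two decimals.
def pct_change(old, new)
  ((new - old) / old.to_f * 100).round(2)
end

# e.g. avg tps for apache in/grok codec/json out, 1.7.25 vs 9K:
pct_change(30823, 32383)  # => 5.06, matching the "avg tps % change" column
```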

@guyboertje

Now I'm using Logstash 5.2. To use the output-csv plugin with Logstash 5.x, the `receive` method needs to be replaced by the `multi_receive_encoded` method, as suggested in #11 (by editing csv.rb). But it seems data can be lost compared to the original (broken) version, and some CSV options can't be used either (such as adding a CSV header). Could you please suggest a solution for writing to a CSV file with Logstash 5.x?
BTW, I'm a newbie to Logstash and Ruby.

Thank you!

@gunpix
I will create a PR for CSV output today.

@guyboertje do you have a branch for this work? I'm running into issues on JRuby 9k (testing on Java 9 for #6891).

@jordansissel - I have this PR https://github.com/elastic/logstash/pull/6713.
Perhaps we need a branch.

Adding a few traces I've seen in the bootstrap / running tests.

I will just list a few issues I've seen in the build:

[plugin:install-development-dependencies] Installing development dependencies of all installed plugins
/Users/ph/es/logstash/second/logstash/vendor/bundle/jruby/2.3.0/gems/rubyzip-1.1.7/lib/zip.rb:35: warning: ObjectSpace impacts performance. See http://wiki.jruby.org/PerformanceTuning#dont-enable-objectspace
/Users/ph/es/logstash/second/logstash/logstash-core/lib/logstash/api/commands/stats.rb:57: warning: key (SymbolNode:heap_used_in_bytes 60) is duplicated and overwritten on line 57
/Users/ph/es/logstash/second/logstash/logstash-core/lib/logstash/api/commands/node.rb:41: warning: key (SymbolNode:vm_name 45) is duplicated and overwritten on line 43

With the update of RSpec we also see this in the suite; we will need to be more explicit in our expectations when using `raise_error`:

WARNING: Using the `raise_error` matcher without providing a specific error or message risks false positives, since `raise_error` will match when Ruby raises a `NoMethodError`, `NameError` or `ArgumentError`, potentially allowing the expectation to pass without even executing the method you are intending to call. Actual error raised was #<ArgumentError: Setting "string.1" hasn't been registered>. Instead consider providing a specific error class or message. This message can be suppressed by setting: `RSpec::Expectations.configuration.on_potential_false_positives = :nothing`. Called from /Users/ph/es/logstash/second/logstash/logstash-core/spec/logstash/settings_spec.rb:54:in `block in (root)'.
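
The warning is worth heeding: a bare "any error was raised" check passes even when a typo means the intended method never ran. A plain Ruby illustration (all names here are made up, not from the Logstash specs):

```ruby
# The deliberate typo below raises NoMethodError, not the ArgumentError we
# meant to test for, yet an "any error" check would still pass.
def get_setting(name)
  raise ArgumentError, %(Setting "#{name}" hasn't been registered)
end

any_error_check = false
specific_check  = false
begin
  get_seting("string.1")  # deliberate typo in the method name
rescue StandardError => e
  any_error_check = true                     # bare raise_error would pass here
  specific_check  = e.is_a?(ArgumentError)   # raise_error(ArgumentError) would not
end
```

Passing a specific class or message to `raise_error` catches exactly this kind of false positive.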

The first run doesn't show too many errors:

Finished in 5 minutes 27 seconds (files took 48.92 seconds to load)
2107 examples, 7 failures, 2 pending

complete report

Nothing really major. JRuby 9k seems to have switched to java.time for the implementation of the Time class, which means it now has nanosecond precision. Since we still rely on Joda-Time in the timestamp class, we can only provide millisecond precision. I think the best approach right now is to fix the tests to make sure we only assert millisecond precision, and to create an issue to refactor the timestamp class to use java.time so it can provide nanosecond precision.

@ph - I have discovered some weirdnesses with Joda versions and the date filter. I am adding a general LS issue to cover a move from Joda to Java 8 Time across the whole of LS.

@guyboertje Yup, I believe this move can be done after we move to JRuby 9K. The case that comes to my mind is the following:

jruby-9.1.9.0 :001 > Time.now.to_f
 => 1495118655.8176029
jruby-9.1.9.0 :002 > DateTime.now.to_time.to_f
 => 1495118663.463
jruby-9.1.9.0 :003 >
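
Until Timestamp moves to java.time, specs can sidestep the precision mismatch by comparing at millisecond granularity; a sketch (`to_millis` is a hypothetical helper, not from the codebase):

```ruby
# Truncate a Time to whole milliseconds so a nanosecond-precision value
# (JRuby 9k) compares equal to a millisecond-precision one (Joda-backed).
def to_millis(time)
  (time.to_f * 1000).round
end

to_millis(Time.at(2.5))  # => 2500
```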

Everything is passing locally with the master branch; the move was pretty smooth. I am running the tests of all the plugins against core and will update this issue with the changes.

qa/integration passes. I am running into an issue with qa/acceptance: I can't build a Debian artifact. I think it's not a new problem; I believe the error was already there in 1.7.25 but was only causing a debug statement.

jruby 1.7.25

 Exception: java.lang.ThreadDeath thrown from the UncaughtExceptionHandler in thread "Thread-17"

jruby 9k

 rake aborted!
IOError: closed stream
org/jruby/RubyIO.java:2189:in `flush'
/Users/ph/es/logstash/vendor/bundle/jruby/2.3.0/gems/childprocess-0.7.0/lib/childprocess/jruby/pump.rb:45:in `pump'
/Users/ph/es/logstash/vendor/bundle/jruby/2.3.0/gems/childprocess-0.7.0/lib/childprocess/jruby/pump.rb:18:in `block in run'
Tasks: TOP => artifact:deb

My theory is there is a bug in fpm and we close the pipe/channel at the wrong time.

We will need to update to http://jruby.org/2017/05/25/jruby-9-1-10-0

Also when building plugins we see this warning:

jruby: warning: unknown property jruby.cext.enabled
jruby: warning: unknown property jruby.cext.enabled
jruby: warning: unknown property jruby.cext.enabled
jruby: warning: unknown property jruby.cext.enabled
jruby: warning: unknown property jruby.cext.enabled
jruby: warning: unknown property jruby.cext.enabled

The previous warning only seems to occur when building plugins.

I've found another regression.

9k

jruby-1.7.25 :002 > LogStash::Event.new.to_hash
 => {"@timestamp"=>LogStash::Timestamp, "@version"=>"1"}

master

jruby-1.7.25 :002 > LogStash::Event.new.to_hash
 => {"@timestamp"=>2017-06-06T19:31:11.065Z, "@version"=>"1"}

5.4.0

irb(main):002:0> LogStash::Event.new.to_hash
=> {"@timestamp"=>2017-06-06T19:19:12.403Z, "@version"=>"1"}

EDIT: fixed by making sure we implement inspect on Timestamp.
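
A sketch of what such a fix could look like (a hypothetical stand-in class, not the actual Java-backed Timestamp): defining `inspect` (and `to_s`) so the value renders as its ISO8601 string instead of the class name in `to_hash` output.

```ruby
# Hypothetical Timestamp-like class illustrating the inspect fix.
class TimestampSketch
  def initialize(time = Time.now.utc)
    @time = time.utc
  end

  # ISO8601 with millisecond precision, e.g. "2017-06-06T19:31:11.065Z"
  def to_iso8601
    @time.strftime('%Y-%m-%dT%H:%M:%S.%LZ')
  end

  # Without this, `{"@timestamp" => ts}.inspect` would print the class name.
  def inspect
    to_iso8601
  end
  alias_method :to_s, :inspect
end
```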

#7349 is merged.
