Logstash: [META] Java Execution as Default in 6.x

Created on 12 Mar 2018  路  31Comments  路  Source: elastic/logstash

This issue is for us to discuss whether we should flip the switch and make Java execution the default in 6.4.

@original-brownbear @danhermann @jakelandis @yaauie I would love your input on this topic. I myself need a little time to re-familiarize myself with the code.

Blockers:

Java Execution discuss meta

Most helpful comment

@original-brownbear - apologizes, I was only looking in the Junit tests and missed the Ruby specs that cover the cases (at the unit level) I was referencing. I haven't gone through these in detail, but on the surface they look pretty thorough and resolves my concerns of shipping as the default.

:shipit:

All 31 comments

The Java execution code has been pretty carefully reviewed and fairly heavily exercised during the development process. It also has the same rspec test coverage as the Ruby execution along with some additional Java JUnit tests and it has worked entirely without issue for me. I'm not sure how feasible it is, but it would be nice if we could establish some sort of objective criteria for its stability.

+1 to what @danhermann says. I don't necessarily know how to make this "more stable" without concrete input. I know of no bugs in it as of right now but know of bugs in the Ruby exec that are only fixed in the Java exec, so I'm optimistic about moving to it as the default for 6.4.

I think it's not quite ready (see #9198), things to wish for:

  • parallel compilation (as many as the number of workers configured?)
  • more tests which has many rules and conditionals since it looks like that haven't been tested yet.

After an hour it still wasn't done compiling everything so I gave up.

@simmel, thank you for trying out the Java execution. Could you post the config that wasn't working for you so we can debug it?

@simmel the other thing that would help is if you could take a thread dump while java execution is running. If you could take 3-4 of those over the course of a few minutes that'd be very useful for us to find the source of the problem.

@danhermann WRT objective criteria, I'm not sure that exists. There's an infinite number of possible Logstash configs, so this feels to me like more of a judgement call.

I would say, given @simmel 's report, that we definitely shouldn't ship this for 6.3 of course.

but it would be nice if we could establish some sort of objective criteria for its stability.

Unit level code [1] coverage % is often used as the objective criteria. I know that we exercise a lot of the code via some of the (non-unit level) testing , but iirc those tests don't make the same level of correctness assertions that you would at a unit level. It seems that there are many unit level paths of execution that could be tested but aren't currently.

[1] Unit level - for example asserting that (only) an in condition works properly, as opposed to compiling the whole pipeline and asserting the final result. Ideally we would identify all the units of the config language and have corresponding tests that the execution is correct.

@jakelandis thanks for chiming in here. Last I looked it seemed there were a lot of opportunities to add more unit tests. Functional tests don't catch everything, and given the importance of this code I think we want to be extra certain here.

Additionally, those tests make refactoring easier.

@andrewvc @simmel @jakelandis do you have an example of what conditional doesn't have tests? It seems to me that the java pipeline specs (conditionals_spec.rb) basically covers all supported operators I could think of.
Additionally java_pipeline_spec.rb should cover all possible conditional operations together with flushing events now as well.
I'd really need something more concrete here to add more tests.

Also, I would definitely not start adding unit level tests to the EventCondtition classes if that was the suggestion. This would make future refactoring horribly complex and further changes to that code will be incoming eventually with the Java API on the horizon.

@original-brownbear - apologizes, I was only looking in the Junit tests and missed the Ruby specs that cover the cases (at the unit level) I was referencing. I haven't gone through these in detail, but on the surface they look pretty thorough and resolves my concerns of shipping as the default.

:shipit:

OK, glad that you're on board. I had the same concerns, but I haven't looked in a bit.

I think we should target 6.4 given where we currently are with #9198 . Hopefully we'll hear back from @simmel there. If not, we'll have to find a way to repro his report.

Hey all, sorry for the late reconnect; had to take care of sick kids.

I've opened up #9255 for the slow compilation issue so we don't pollute this thread.

On Tue, 2018-03-13 at 05:51:42 +0000, Armin Braun wrote:

@andrewvc @simmel @jakelandis do you have an example of what conditional doesn't have tests? It seems to me that the java pipeline specs (conditionals_spec.rb) basically covers all supported operators I could think of.
Additionally java_pipeline_spec.rb should cover all possible
conditional operations together with flushing events now as well.
I'd really need something more concrete here to add more tests.

Ah, I didn't know about the java_pipeline_spec.rb test. What I was
referring to was things like my related #9255 and other things where you
have a large ruleset and using lots of different plugins.

I know writing tests for edge cases is hard but sometimes I feel like
there are a lot of issues in production which I feel should have been
caught in testing before shipping.

@simmel @andrewvc I get the pain here (I really do). Unfortunately, I don't think the problem is necessarily a result of the Java exec (it just makes it a little worse potentially).
In the end, the performance of the compiler is mainly gated by treetop as far as I can see.

@simmel what's the difference between Ruby and Java exec in terms of reload times? (in terms of the ratio of the two). If this isn't somewhere in the vicinity of ~1.5x+ there isn't much point in optimizing the compiler for performance I'm afraid. In that case, we should look into Treetop (as painful as it is).

On Mon, 2018-03-19 at 04:41:50 -0700, Armin Braun wrote:

@simmel what's the difference between Ruby and Java exec in terms of
reload times? (in terms of the ratio of the two).

I haven't been able to start with java exec because of #9255 and thus I
haven't been able to reload it.

somewhere in the vicinity of ~1.5x+ there isn't much point in
optimizing the compiler for performance I'm afraid. In that case, we
should look into Treetop (as painful as it is).

Are you talking about reloading here or startup? If startup, as stated
in #9255, logstash still hadn't started after 50 minutes.

@original-brownbear @danhermann @yaauie FYI I've added a list of blocking issues to the top of this ticket with checkboxes. Nothing huge here luckily :)

@andrewvc checked one of them since #9255 was fixed :) The other one has an open PR now https://github.com/elastic/logstash/pull/9282 which should be trivial to review :)

@andrewvc both fixed :P

@andrewvc given that these are fixed, what's the plan here? Do we want to just make the Java exec the default in 6.x as it already is in master or do we want to go for removing the Ruby exec code outright (which would be my preference)?

Potentially new blocker: https://github.com/elastic/logstash/issues/9255#issuecomment-376127356 trying to track this down to open an issue for it.

9289 is merged :)

Having discussed this with @original-brownbear @yaauie and @danhermann we've decided that with these last bugs resolved java execution is LGTM for the 6.3.0 release barring any new discoveries.

@original-brownbear once you've made java execution the default in 6.x please feel free to close this ticket.

@andrewvc not so fast apparently https://github.com/elastic/logstash/issues/9255#issuecomment-376451019 :( Looking into that now

After some discussion, we've added to this list and are targeting 6.4 instead.

FWIW with all fixes our ruleset works now = ) Thanks for all the hardwork @original-brownbear!

@simmel no problem and thank you for finding/reporting/testing these :)

I'm getting flashbacks ; ( We just upgraded from 6.3.0 to 6.4.2 and got a whole new array of issues with java execution. Tracking this in #10071

The Java execution engine was made default in the 7.0 release.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

packetrevolt picture packetrevolt  路  3Comments

molitoris picture molitoris  路  3Comments

suyograo picture suyograo  路  5Comments

bobbyhubbard picture bobbyhubbard  路  3Comments

max-wittig picture max-wittig  路  4Comments