Logstash: Queue Space in Bytes Metric Breaks Tests

Created on 18 May 2017  路  14Comments  路  Source: elastic/logstash

See: https://travis-ci.org/elastic/logstash/jobs/233693574

  1) Test Monitoring API can retrieve queue stats
     Failure/Error: expect(result["pipeline"]["queue"]["data"]["free_space_in_bytes"]).not_to be_nil

     NoMethodError:
       undefined method `[]' for nil:NilClass
     # ./specs/monitoring_api_spec.rb:64:in `(root)'
     # /home/travis/.rvm/gems/jruby-1.7.25/gems/stud-0.0.22/lib/stud/try.rb:79:in `try'
     # /home/travis/.rvm/gems/jruby-1.7.25/gems/stud-0.0.22/lib/stud/try.rb:95:in `try'
     # /home/travis/.rvm/gems/jruby-1.7.25/gems/stud-0.0.22/lib/stud/try.rb:91:in `try'
     # /home/travis/.rvm/gems/jruby-1.7.25/gems/stud-0.0.22/lib/stud/try.rb:123:in `try'
     # ./specs/monitoring_api_spec.rb:56:in `(root)'
     # /home/travis/.rvm/gems/jruby-1.7.25/gems/rspec-wait-0.0.9/lib/rspec/wait.rb:46:in `(root)'
Finished in 10 minutes 45 seconds (files took 6.01 seconds to load)
32 examples, 1 failure
Failed examples:
rspec ./specs/monitoring_api_spec.rb:51 # Test Monitoring API can retrieve queue stats

Edit by @original-brownbear:

Retagged as bug on account of https://github.com/elastic/logstash/issues/7154#issuecomment-302655480

bug

All 14 comments

note: this test should fail less often now since https://github.com/elastic/logstash/pull/7144 has been introduced. this PR triggers KeyError when fetching data from the hash, causing Stud.try to retry the block

@jsvd argh I thought it retried on NoMethodError too? At least it claims to? :D

@jsvd oh :D that makes my PR https://github.com/elastic/logstash/pull/7157 very pointless then and explains everything :)
I think we can close here in that case?

@jsvd I guess we should also fix this kind of thing:

Stud.try(max_retry.times, [NoMethodError, RSpec::Expectations::ExpectationNotMetError])

and remove the NoMethodError, that's really misleading :D

Well, there's an argument to be made that once the webserver is up the metrics should be available. These failures actually expose a race condition in core, even if it's a mostly harmless one.
Fixing this will require some core refactor to only respond to requests after the some sort of "metric subsystem bootstrap check".

Maybe we can remove the test failure tag and promote this to a bug?

and remove the NoMethodError, that's really misleading :D

@original-brownbear indeed: https://github.com/elastic/logstash/pull/7145/files
sorry for that :(

@jsvd well the metrics for this thing are generated in an async and periodic manner, does it make sense to force holding the startup for that?

PS: No need to be sorry, I learned something about try that may have taken quite a while to get right on my own :D

interpreted the emoji to mean close :) Fixed by #7145

@original-brownbear fair point. I think we can just acknowledge that this issue exist and tag it accordingly, but there's no need to tackle this now

[edit] too fast for me :D

let's leave it open then :)

@original-brownbear @jsvd I would also argue that until #7155 is implemented this test can't be written correctly. Once that's done then the metric should always exist so long as the webserver is up, but just be set to some initial value (maybe nil?)

@andrewvc makes sense :)

Closing this issue as the original test failure hasn't been seen for quite some time, many changes to the metrics have been made since this issue has been opened, and more changes are coming.

e.g. I don't see any actionable items as direct result of this issue not covered by other issues.

Was this page helpful?
0 / 5 - 0 ratings