I am currently using a custom jruby-1.7.25 install with the $USE_RUBY flag set to 1. Strangely enough, even with the # encoding: utf-8 magic comment at the top of my spec.rb, I am _still_ hitting exactly the same issue as the one referenced in https://github.com/elastic/logstash/issues/3448.
Failures:
1) Test event "{"message":"2015-21-01 12:01:02.003 UTC Hello 42 3...." when processed
Failure/Error: insist { subject["timestamp"] } == "2015-21-01 12:01:02.003 UTC"
Insist::Failure:
Expected "2015-21-01 12:01:02.003 UTC", but got nil
# /vagrant/logstash_test_stuff/test_spec.rb:83:in `(root)'
Finished in 0.534 seconds (files took 5.99 seconds to load)
1 example, 1 failure
Failed examples:
rspec /vagrant/logstash_test_stuff/test_spec.rb:72 # Test event "{"message":"2015-21-01 12:01:02.003 UTC Hello 42 3...." when processed
Randomized with seed 38731
Logstash 1.5 is technically not being worked on any more. Is there a reason you can't use 2.4?
My bad, @untergeek. I was actually using Logstash v2.3.0. I've updated the title of the issue.
So what does _reliably_ mean in this context? Does it work sometimes and not others? To me, reliably would indicate that it always passes or always fails.
@untergeek: by _reliably_ I mean that this test (below) consistently works:
describe "simple test" do
config <<-CONFIG
filter {
grok {
match => { "message" => "am%{GREEDYDATA:foo}" }
}
}
CONFIG
sample 'amessage' do
insist { subject["foo"] } == "essage"
insist { subject["message"] } == "amessage"
end
end
...but this one does not:
describe "intermediate test" do
config <<-CONFIG
filter {
grok {
match => [ "message", "%{TIMESTAMP_ISO8601:timestamp} %{WORD:word1} %{INT:int1:int} %{NUMBER:[inner][float1]:float}" ]
tag_on_failure => [ "_grokparsefailure_test" ]
}
}
CONFIG
message = '2015-21-01 12:01:02.003 UTC Hello 42 3.14159'
sample("message" => message, "type" => "test") do
insist { subject["type"] } == "test"
insist { subject["timestamp"] } == "2015-21-01 12:01:02.003 UTC"
insist { subject["word1"] } == "Hello"
insist { subject["int1"] } == 42
insist { subject["inner"]["float1"] } == 3.14159
end
end
@untergeek: my gut is telling me this has something to do with access to the patterns. Strange, though, because the top of my spec.rb looks like this:
require "spec_helper"
require "logstash/filters/grok"
require "logstash/patterns/core"
What's the failure in the second test?
@untergeek, the failure is as follows:
Failures:
1) intermediate test "{"message":"2015-21-01 12:01:02.003 UTC Hello 42 3...." when processed
Failure/Error: insist { subject["timestamp"] } == "2015-21-01 12:01:02.003 UTC"
Insist::Failure:
Expected "2015-21-01 12:01:02.003 UTC", but got nil
# ./spec/test_spec.rb:63:in `(root)'
Finished in 0.33 seconds (files took 6.7 seconds to load)
1 example, 1 failure
...no match seems to have been made.
We're moving away from the RSpec 2.0 insist terminology to RSpec 3's expect. You should look at how we're doing RSpec 3 tests in other plugins and follow that, mostly because we may pull the plug on RSpec 2.0 usage at any time. Here's an example using expect from the logstash-patterns-core repository: https://github.com/logstash-plugins/logstash-patterns-core/blob/master/spec/patterns/httpd_spec.rb
That said, I think the problem is that UTC is not a valid timezone for ISO8601.
From: https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns#L69-L71
ISO8601_TIMEZONE (?:Z|[+-]%{HOUR}(?::?%{MINUTE}))
ISO8601_SECOND (?:%{SECOND}|60)
TIMESTAMP_ISO8601 %{YEAR}-%{MONTHNUM}-%{MONTHDAY}[T ]%{HOUR}:?%{MINUTE}(?::?%{SECOND})?%{ISO8601_TIMEZONE}?
So an appropriate ISO8601_TIMEZONE can be Z for Zulu time (which is UTC), or an offset such as -0700, -07:00, +1000, or +10:30, but the literal string UTC is not valid. See: https://en.wikipedia.org/wiki/ISO_8601#UTC
That seems to be why you're getting nil instead of the expected timestamp.
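The point about the timezone can be checked with plain Ruby, no Logstash required. This is a minimal sketch that hand-expands the ISO8601_TIMEZONE pattern quoted above, substituting the HOUR and MINUTE sub-patterns from the same grok-patterns file:

```ruby
# HOUR and MINUTE as defined in grok-patterns, expanded by hand into
# the ISO8601_TIMEZONE pattern quoted above.
HOUR   = /(?:2[0123]|[01]?[0-9])/
MINUTE = /(?:[0-5][0-9])/
ISO8601_TIMEZONE = /\A(?:Z|[+-]#{HOUR}(?::?#{MINUTE}))\z/

# Zulu time and numeric offsets match; the literal string "UTC" does not.
%w[Z -0700 -07:00 +1000 +10:30 UTC].each do |tz|
  puts format("%-6s matches: %s", tz, !!(tz =~ ISO8601_TIMEZONE))
end
```

Running this shows every numeric offset (and Z) matching while UTC comes back false, which is exactly why the capture is nil.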
@emmanuellyautomated In addition to the good point made by @untergeek, when you design tests against grok I would recommend always asserting that the tags field does not contain the failure tag, e.g. (with the old syntax) insist { subject["tags"] }.nil?.
It helps you reconsider both your patterns/regexes and your input early, at least until you trust the grok filter enough. Also, because your input string is 2015-21-01 and ISO8601 is yyyy-mm-dd, the regex will already fail there, even before the timezone issue :D
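@wiibaa's second point can also be demonstrated with stdlib-only Ruby: the month field of 2015-21-01 can never match grok's MONTHNUM sub-pattern, so TIMESTAMP_ISO8601 fails before the timezone is even considered. This sketch copies MONTHNUM from the same grok-patterns file linked above:

```ruby
# MONTHNUM as defined in grok-patterns: months 1-12, optional leading zero.
MONTHNUM = /\A(?:0?[1-9]|1[0-2])\z/

# "2015-21-01" puts "21" in the month position; there is no month 21.
puts "21 is a valid month: #{!!('21' =~ MONTHNUM)}"
puts "01 is a valid month: #{!!('01' =~ MONTHNUM)}"
```

So even with a valid timezone, the sample date would still need to be something like 2015-01-21 for the pattern to match.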
@untergeek, @wiibaa: I ended up using a test structure that looks like this:
# encoding: utf-8
require "spec_helper"
require "logstash/filters/grok"
require "logstash/patterns/core"
describe "cisco example -- CISCOFW737016" do
let(:pattern) { "CISCOFW737016" }
context "src_ip error message for 737016 can be returned" do
let(:message) { '<158>Sep 01 2016 22:04:07 10.9.2.253 : %ASA-6-737016: IPAA: Freeing local pool address 172.31.7.19' }
subject { grok_match(pattern, message) }
it "should render a hash from matching" do
expect(subject['src_ip']).to include('172.31.7.19')
end
end
end
...it turns out that switching to a more explicit testing style fixed the problem. Thanks for the suggestion, @untergeek!