Elastalert: num_hits vs num_matches

Created on 2 Aug 2017 · 4 Comments · Source: Yelp/elastalert

Hi all,

I have a frequency-type rule with a terms query.

use_terms_query: true
query_key: clientIP
filter:
- query:
    match:
        detail: "some text I match against"

In the alert I have these fields:

num_hits: 1
num_matches: 1

Sometimes the numbers differ, and when they do it's always num_hits that is larger.

Could you please explain what num_hits and num_matches really mean?

Thank you,
Ádám

All 4 comments

num_hits is the number of documents returned by Elasticsearch for a given query. num_matches is how many times that data matched your rule, with each match potentially generating an alert.

If ElastAlert makes a query over a 10-minute range and gets 10 hits, and you have

type: frequency
num_events: 10
timeframe:
  minutes: 10

then you'll get 1 match.
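
To make the hits-versus-matches distinction concrete, here is a minimal Python sketch of a frequency-style counter. It is a simplified model written for illustration, not ElastAlert's actual implementation: the function name count_matches and the sample timestamps are made up, and the assumption that the window is emptied after each match is a simplification.

```python
from collections import deque
from datetime import datetime, timedelta


def count_matches(timestamps, num_events, timeframe):
    """Simplified model: one match each time num_events hits fall within timeframe."""
    window = deque()
    matches = 0
    for ts in sorted(timestamps):
        window.append(ts)
        # Drop hits that are older than the timeframe relative to the newest hit.
        while ts - window[0] > timeframe:
            window.popleft()
        if len(window) >= num_events:
            matches += 1
            window.clear()  # assumption: counting starts afresh after a match
    return matches


# Hypothetical data: 10 hits spread over roughly 7.5 minutes.
start = datetime(2017, 8, 2, 12, 0)
hits = [start + timedelta(seconds=50 * i) for i in range(10)]

num_matches = count_matches(hits, num_events=10, timeframe=timedelta(minutes=10))
print(len(hits), "hits ->", num_matches, "match(es)")
```

With these inputs the 10 hits produce a single match, which mirrors the "10 hits, 1 match" case described above.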

Great, thanks for the explanation!

I really thought I understood it, but whenever I try to wrap my head around these cases I feel like I'm losing it:

  • 3 hits 1 matches
  • 21 hits 18 matches

I have it configured like you described.

type: frequency
num_events: 5
timeframe:
  minutes: 5

(Plus the stuff in my opening comment.)

So how can the last case happen for example?
There were 28*5 hits, right?
Since I aggregate against the clientIP, there were only 18 different clientIPs, so 18 matches?

What does "I aggregate against the clientIP" mean?
Does it mean you are using use_terms_query: true?

If so, "hits" in this case actually means "buckets". The log says buckets while the match still says num_hits, which is a bit confusing, I suppose. Each bucket has a count, which isn't currently logged.

In that case, it means that 18 of the 21 unique clientIPs had at least 5 documents.
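
The bucket interpretation can be sketched in a few lines of Python. This is only an illustration under assumptions, not ElastAlert code: summarize_buckets is a hypothetical helper, and the IP addresses and per-IP document counts are invented so that the totals come out to the 21 and 18 reported above.

```python
from collections import Counter


def summarize_buckets(client_ips, num_events):
    """Return (num_hits, num_matches) for a terms-style aggregation on clientIP."""
    buckets = Counter(client_ips)  # one bucket per unique clientIP
    num_hits = len(buckets)  # with a terms query, "hits" counts buckets
    # A bucket matches when its document count reaches the threshold.
    num_matches = sum(1 for count in buckets.values() if count >= num_events)
    return num_hits, num_matches


# Made-up data: 18 IPs with 5 documents each, 3 IPs with only 2 each.
docs = []
for i in range(18):
    docs += [f"10.0.0.{i}"] * 5
for i in range(18, 21):
    docs += [f"10.0.0.{i}"] * 2

num_hits, num_matches = summarize_buckets(docs, num_events=5)
print(num_hits, "hits,", num_matches, "matches")
```

Here the three IPs with fewer than 5 documents still count as hits (buckets) but never become matches, which is why num_hits is always at least as large as num_matches.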

Yep, that's what I meant. :)

use_terms_query: true
query_key: clientIP

Thank you, it is clear now.
