Hello Honza, I'm hoping you might be able to help me out with one last issue I'm having. I need to query for all articles that have a 404 error in the response field, and I also need to restrict the results to the last 15 minutes. The entries show up inside Kibana, but they are not showing up in my queries.
I've been having a really hard time getting this to work and there aren't any clear examples of querying in a specific range of time in the docs.
Here's my script.
s = Search(using=es, index= "_all") \
.query("match", response = "404")\
.filter('range', timestamp={'from': datetime.datetime.now() - datetime.timedelta(minutes=15), 'to' : datetime.datetime.now() }) #, 'lte': datetime(2010, 10, 9)})
#.filter("range", '@timestamp': {"gte": "now-15m", "lte": "now"}) #, "lt" : "2014-12-31 1:00:00"})
#filter is for a range of possible times.
#.filter("range", timestamp={ "gt":"now -5m","lt":"now" })
response = s.execute()
Try this instead:
from_time = (datetime.datetime.now() - datetime.timedelta(minutes=15)).strftime('%Y-%m-%dT%H:%M:%S')
to_time = datetime.datetime.now().strftime('%Y-%m-%dT%H:%M:%S')
The other things to check are if the date type is being set on the mappings correctly when it's being loaded into the index and also timezone information and daylight savings.
@aphillipo there is no need to convert the datetimes to strings, elasticsearch-py already does that for you.
@DavidAwad what is the mapping for the field, and what is its name? In your examples you are using both 'timestamp' and '@timestamp', which are different.
What's the difference between the two? I was under the impression they're simply formatted differently. I just need to use either one so that I get all 404 responses from the last 15 minutes. I know that I need a greater-than filter and a less-than time.now() filter.
My problem is with the exact syntax I'd need to use, as I know exactly what I'm trying to do here. Thank you again for all the help.
@DavidAwad @ is an allowed character so @timestamp and timestamp are two entirely different fields. The exact syntax is (assuming the field name doesn't contain @):
s = Search(using=es).filter('term', response=404).filter('range', timestamp={'gte': 'now-5m', 'lt': 'now'})
There is no need to calculate the time in python, elasticsearch can do it. The response condition should also probably be a filter since it's an exact value lookup.
Awesome! I'm sorry to have to keep throwing questions at you!
Could I give a datetime object to Elasticsearch? This is a silly example, but if I have any random datetime object, can I feed that to the API?
timevar = datetime.datetime.now()
sleep(3000)
# ... code here #
s = Search(using=es).filter('term', response=404).filter('range', timestamp={'gte': timevar , 'lt': 'now'})
Yes, you can. See https://github.com/elasticsearch/elasticsearch-dsl-py/issues/49#issuecomment-68568617
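A minimal sketch of the datetime variant, using only the standard library to build the range bounds. It assumes the indexed dates are stored in UTC, so the lower bound is created as a timezone-aware UTC datetime rather than with a naive `datetime.now()`, which returns local time. `window_start` is a hypothetical helper name, not part of any library.

```python
import datetime


def window_start(minutes, now=None):
    """Return a timezone-aware UTC datetime `minutes` before `now`."""
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc)
    return now - datetime.timedelta(minutes=minutes)


# Bounds for a range filter; 'now' is left for Elasticsearch to evaluate.
bounds = {'gte': window_start(15), 'lt': 'now'}
print(bounds['gte'].isoformat())
```

The `bounds` dict can then be passed as the value of the range filter in place of a literal such as `{'gte': 'now-15m', 'lt': 'now'}`.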
closing this ticket since issue has been resolved
Cool, Thanks!
Alright, so I've been manually inserting issues myself and can't query for them using your exact syntax, and this is what's getting to me.
es = elasticsearch.Elasticsearch([{u'host': u'192.168.4.151', u'port': b'9200'}])
#es = elasticsearch.Elasticsearch([{u'host': u'192.168.4.151'}])
#s = Search(using=es, index= "_all") \
.query("match", request="\COIL")
#.filter("range", timestamp={"gt": "2011-08-30 00:00:00" , "lt" : "2014-08-30 13:00:00"})
s = Search(using=es).filter('term', response = 404).filter('range', timestamp={'gte': 'now-5m', 'lt': 'now'})
response = s.execute()
if response.hits:
    for hit in response:
        print('\033[94m' + "Script HIT. " + '\033[0m')
Is there something I'm not getting here?
*also this is a test ip, feel free to try this code to recreate the issue *
http://192.168.4.151:9292 is a link to the kibana instance for proof that recent 404's have gotten there.
I am sorry, I cannot help you if you don't give me the basic information - what do your mappings look like? What documents do you have in your index? What exactly is wrong with this last example you have provided?
The problem:
The filter we have for just the last 15 minutes keeps removing all of the logs from the elasticsearch results.
So what's happening is I'm writing a script that calls a function every time a 404 error happens on a webpage. I need a filter that looks at all of the 404 errors in the last 15 minutes, or in whatever window is defined by some variable Python datetime object. You've already answered that question and I know I can simply pass a datetime object to it, but before I even get that working I want to see that the filtering works at all, which is where we've gotten stuck.
I can find individual logs from the last 15 minutes by querying for hard coded strings that I know are there. (e.g. "/COIL" is a 404 log in that kibana instance) but when we add the time based filter the results are empty.
Is there any reason that filter would be removing the entries I'm interested in?
Here's an example of my dataset.
{
  "_index": "logstash-2015.01.05",
  "_type": "apache",
  "_id": "3Oro28ckTgmFSVXc2mR61g",
  "_score": 0.2964111,
  "_source": {
    "message": "104.236.36.14:80 65.196.71.34 - - [05/Jan/2015:11:47:30 -0500] \"GET /COIL HTTP/1.1\" 404 403 \"-\" \"Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:31.0) Gecko/20100101 Firefox/31.0\"",
    "@version": "1",
    "@timestamp": "2015-01-05T16:47:30.000Z",
    "type": "apache",
    "file": "/var/log/apache2/other_vhosts_access.log",
    "host": "addteq.com",
    "offset": "5872",
    "clientip": "65.196.71.34",
    "ident": "-",
    "auth": "-",
    "timestamp": "05/Jan/2015:11:47:30 -0500",
    "verb": "GET",
    "request": "/COIL",
    "httpversion": "1.1",
    "response": "404",
    "bytes": "403",
    "referrer": "\"-\"",
    "agent": "\"Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:31.0) Gecko/20100101 Firefox/31.0\""
  }
}
The problem is that you have a field 'timestamp' that is a string, not a date so a range filter won't work on it as you expect. In that case you need to use the '@timestamp' field in your filter which _looks_ like it is a date. To say more I'd have to see the mappings
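To illustrate the difference using only the sample document above: the `timestamp` value is the raw Apache log text, while `@timestamp` is the same instant normalized to UTC. The sketch below parses the string with the standard library. The `%d/%b/%Y:%H:%M:%S %z` format is a Python rendering of the Apache layout and assumes an English month-name locale; the second comparison uses a made-up later date purely to show that text ordering is not chronological.

```python
import datetime

raw = "05/Jan/2015:11:47:30 -0500"      # the string 'timestamp' field
indexed = "2015-01-05T16:47:30.000Z"    # the '@timestamp' field

# Parse the Apache-style text, then normalize it to UTC.
parsed = datetime.datetime.strptime(raw, "%d/%b/%Y:%H:%M:%S %z")
as_utc = parsed.astimezone(datetime.timezone.utc)

# Both fields describe the same instant.
same_instant = as_utc.strftime("%Y-%m-%dT%H:%M:%S.000Z") == indexed
print(same_instant)

# Compared as plain text, the Apache format sorts by day-of-month first,
# so 1 February compares as "smaller" than 5 January.
text_order_wrong = "01/Feb/2015:00:00:00 -0500" < raw
print(text_order_wrong)
```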
AHH! In that case how would we adapt our range filter to use the @timestamp field? I pathetically attempted
s = Search(using=es).filter('term', response=404).filter('range', '@timestamp'={'gte': 'now-5m' , 'lt': 'now'})
and I get SyntaxError: keyword can't be an expression
Also this is the logstash config, hopefully this helps. If not is there another way to find the mapping? I don't actually know what that is.
filter {
  if [type] == "apache" {
    grok {
      # See the following URL for a complete list of named patterns
      # logstash/grok ships with by default:
      # https://github.com/logstash/logstash/tree/master/patterns
      #
      # The grok filter will use the below pattern and on successful match use
      # any captured values as new fields in the event.
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    date {
      # Try to pull the timestamp from the 'timestamp' field (parsed above with
      # grok). The apache time format looks like: "18/Aug/2011:05:44:34 -0700"
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  }
}
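On the question of what a mapping is: it is the per-index schema that records the type Elasticsearch assigned to each field, and elasticsearch-py can fetch it with `es.indices.get_mapping(...)`. The layout of the returned dict varies between Elasticsearch versions, so the sketch below uses a hypothetical helper, `find_field_types`, that simply walks whatever nested dict it is given and collects the `type` of every `properties` entry with the requested name. The sample dict is made up for illustration and is not taken from the cluster discussed here.

```python
def find_field_types(mapping, field):
    """Collect the declared types of `field` found anywhere in a mapping dict."""
    found = []
    if isinstance(mapping, dict):
        props = mapping.get('properties')
        if isinstance(props, dict) and isinstance(props.get(field), dict):
            found.append(props[field].get('type'))
        # Keep descending: the nesting depth differs between versions.
        for value in mapping.values():
            found.extend(find_field_types(value, field))
    return found


# Illustrative shape only (an assumption); a real dict would come from
# es.indices.get_mapping(index='logstash-2015.01.05')
sample = {
    'logstash-2015.01.05': {'mappings': {'apache': {'properties': {
        '@timestamp': {'type': 'date'},
        'timestamp': {'type': 'string'},
    }}}}
}

print(find_field_types(sample, '@timestamp'))
print(find_field_types(sample, 'timestamp'))
```

A field reported as a date type can be used with date math such as `now-5m`; a field reported as a string cannot be expected to behave that way.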
use **{'@timestamp': {'gte': 'now-5m' , 'lt': 'now'}}
you meant this right?
s = Search(using=es).filter('term', response=404).filter('range' , **{'@timestamp': {'gte': 'now-5m' , 'lt': 'now'}})
Because that seems to be doing the same thing.
After some guessing, I eventually ended up with a large number of failed requests and a nested: QueryParsingException[[logstash-2011.08.31] [range] filter does not support [@timestamp]]; }]')
This was the line I used. I think I'm just a syntax fix away.
.filter( 'range', {'@timestamp':{'gte': 'now-5m' , 'lt': 'now'} } )
I think this is actually a logstash problem and not one of this API itself?
No, this is neither a logstash problem, nor is it a problem with the library. You are missing ** in front of the dict, it should be:
.filter('range', **{'@timestamp':{'gte': 'now-5m' , 'lt': 'now'}})
Also please stop appending questions to this ticket, this is really not the place to help you and I will not respond to any more questions here. Feel free to contact me another way. Thank you
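The `**` requirement is plain Python rather than anything specific to the library: a keyword argument written literally must be a valid identifier, which `@timestamp` is not, but a dict unpacked with `**` may carry any string keys into a function that accepts `**kwargs`. A minimal sketch with a stand-in function (`make_filter` is hypothetical and only mimics the call shape of `.filter(...)`):

```python
def make_filter(name, **params):
    """Stand-in for a keyword-argument API: returns {name: params}."""
    return {name: params}


# Writing make_filter('range', '@timestamp'={...}) is a SyntaxError,
# because '@timestamp' is not a valid Python identifier.

# Unpacking a dict passes '@timestamp' through as the keyword name.
body = make_filter('range', **{'@timestamp': {'gte': 'now-5m', 'lt': 'now'}})
print(body)

# For field names that are valid identifiers, both spellings are equivalent.
plain = make_filter('range', timestamp={'gte': 'now-5m', 'lt': 'now'})
unpacked = make_filter('range', **{'timestamp': {'gte': 'now-5m', 'lt': 'now'}})
print(plain == unpacked)
```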
No, Sincerely Honza thank you. You have dealt with all the tedious requests on my part and I can't thank you enough for all of your help. It works and that is completely thanks to you. Have a great day,
thanks again for everything!
I'll try to submit a PR with some documentation of the stuff we've talked about here, it's the least I can do.
Documentation would be amazing and I would be very grateful for it, thanks!
Hi All,
I am a newbie to Elasticsearch and its Java API. I want to search for results that were logged within a specific range of dates and times. I posted this question on Stack Overflow but have had no response so far. You can see the question from
I know the most relevant discussion has already taken place on this page, but I still could not get it to run. Kindly help me get off to a good start with Elasticsearch and the Java API.
Best Regards,
@jawadyousaf I am sorry, but this is an issue tracker for the Python client to Elasticsearch. You will probably have better luck in the mailing list or irc. Either way I have to ask you not to hijack the discussion here. Thank you
To point you in the right direction, you want a FilteredQuery with a RangeFilter as its filter element. The JSON equivalent would be:
{
  "query": {
    "filtered": {
      "query": {},
      "filter": {
        "range": {"@timestamp": {"gt": "now-1d"}}
      }
    }
  }
}