Name              # reqs     # fails     Avg     Min     Max  |   Median   req/s
--------------------------------------------------------------------------------
POST long              1    0(0.00%)    6067    6067    6067  |     6100    0.00
POST short             1    0(0.00%)    1239    1239    1239  |     1200    0.20
--------------------------------------------------------------------------------
Total                  2    0(0.00%)                                        0.20

Name              # reqs     # fails     Avg     Min     Max  |   Median   req/s
--------------------------------------------------------------------------------
POST medium            7    0(0.00%)     468     199    1006  |      360
POST short             2    0(0.00%)    1063     574    1553  |      570    0.33
--------------------------------------------------------------------------------
Total                  9    0(0.00%)                                        1.00

This only seems to be an intermittent issue, but notice how the median response times are not within the bounds of min and max.
Has anyone else run into a similar issue?
Yes! Someone at the office showed me a case where they ran 10 requests and the average was way outside the bounds of what the logs showed. So it's probably not just an issue with the median; it seems to be an issue with reporting in general. It's intermittent for us too, so I haven't dug very deep into it yet.
In order to calculate the median response time, as well as response times for specific percentiles, without storing the response time for every single request, we keep a dict of the following format: {response_time: number_of_requests}. In order to save memory we round the response time to two significant digits (so that 6067 becomes 6100, 1239 -> 1200, 574 -> 570, and so on). Since that dict is used to calculate the median response times, while the exact response times are used when calculating min/max, the median can end up outside of the min/max boundaries (especially when there are only a few requests).
I guess it would be good to mention, in the docs and the web UI, that we only use two digits of precision for median & percentile response times.
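To make the explanation above concrete, here is a minimal sketch of the bucketing idea. It is not Locust's actual code: the function names, the round-half-up rule, and the choice of the lower median for an even number of samples are all assumptions made for illustration.

```python
from collections import Counter


def round_to_two_digits(response_time_ms):
    """Round a positive integer to two significant digits, e.g. 6067 -> 6100.

    Hypothetical helper: integer round-half-up is an assumption, not
    necessarily the rounding rule the tool itself uses.
    """
    magnitude = 10 ** max(len(str(response_time_ms)) - 2, 0)
    return (response_time_ms + magnitude // 2) // magnitude * magnitude


def median_from_buckets(buckets):
    """Return the lower median from a {rounded_response_time: count} dict."""
    total = sum(buckets.values())
    if total == 0:
        raise ValueError("no samples recorded")
    middle = (total - 1) // 2  # zero-based index of the median sample
    seen = 0
    for value in sorted(buckets):
        seen += buckets[value]
        if seen > middle:
            return value


def summarize(samples):
    """Min and max use the exact samples; the median uses rounded buckets."""
    buckets = Counter(round_to_two_digits(s) for s in samples)
    return min(samples), max(samples), median_from_buckets(buckets)


if __name__ == "__main__":
    print(summarize([6067]))       # (6067, 6067, 6100): median above max
    print(summarize([574, 1553]))  # (574, 1553, 570): median below min
```

With the sample values from the tables above, the sketch reproduces the reported symptom: min and max come from the exact values, while the median comes from the rounded keys.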
I would like to know the status of this issue. I understand it is a precision problem because only 2 digits are kept, but I don't understand why this has not been improved. A median above the max or below the min is not what is expected from a tool that's used to measure performance. Wouldn't it be possible to use more precision, or at least to round the median to the max or min value in these cases?
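The "round it to the max or min value" idea could look roughly like the sketch below. The function name and signature are hypothetical, and this only illustrates the suggestion made in the comment; it is not taken from any Locust change.

```python
def clamp_median(median, min_response_time, max_response_time):
    """Force a bucket-derived median back into the exact [min, max] range.

    Hypothetical helper illustrating the clamping suggestion only.
    """
    return max(min_response_time, min(median, max_response_time))


if __name__ == "__main__":
    print(clamp_median(6100, 6067, 6067))  # 6067: pulled down to max
    print(clamp_median(570, 574, 1553))    # 574: pulled up to min
    print(clamp_median(360, 199, 1006))    # 360: already in range, unchanged
```

Note that clamping only hides the out-of-range cases; a median that is inside the range is still only accurate to two digits.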
"Having values of median over max and min is not what it's expected in a tool that's used to measure performance"
Agreed, but is #790 really any better? It just masks the issue rather than adding precision. Why don't you submit a PR that uses 3-digit precision and see how that works? (We would get a more precise result at the expense of memory used.)
Yes, I think #790 is better than the current situation. Of course using 3-digit precision would be better, but I am not involved in locustio development and I didn't want to make such far-reaching changes.
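For a rough feel of the precision/memory trade-off mentioned here, the sketch below counts how many distinct dict keys are possible at 2 versus 3 significant digits. The helper names and the 60-second upper bound are assumptions for illustration, not anything taken from Locust.

```python
def round_sig(value, digits):
    """Round a positive integer to `digits` significant digits (half up).

    Hypothetical helper; the real rounding rule may differ.
    """
    magnitude = 10 ** max(len(str(value)) - digits, 0)
    return (value + magnitude // 2) // magnitude * magnitude


def bucket_count(digits, max_ms=60_000):
    """Number of distinct dict keys for response times in 1..max_ms."""
    return len({round_sig(ms, digits) for ms in range(1, max_ms + 1)})


if __name__ == "__main__":
    for digits in (2, 3):
        # Columns: digits, possible buckets, what 6067 ms is stored as.
        print(digits, bucket_count(digits), round_sig(6067, digits))
```

More digits means more possible keys, which is the memory cost. It also does not remove the symptom entirely: under these assumptions 6067 would be stored as 6070 at 3 digits, which is still above a max of 6067 when there is a single request.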
#790 gives something that's logical from a stats point of view instead of something that's an impossible value.
It might make it look logical, but the stats are still not correct. That's slightly better than the current situation, I suppose, but I'd rather someone fixed the actual issue.
Well, they are correct when you have only one hit, which is already better.
I agree it's better to fix the actual issue, but this is still a slight improvement. Currently you can't show the stats to anyone, because they will immediately tell you they are wrong.
So, until someone has the time or resources to provide an exact solution, I think #790 can save Locust users some red faces.