K6: Some metrics are returning 0 as minimum value

Created on 11 Jul 2018  Â·  4Comments  Â·  Source: loadimpact/k6

Scenario:
When running load tests with more than 1 VU, the some results contain 0 as minimum value:
Version 0.22.0

k6 run --address 127.0.0.1:6566 --tag script=script2 --out influxdb=http://localhost:8086/k6db script2.js



  execution: local
     output: influxdb=http://localhost:8086/k6db (http://localhost:8086)
     script: script2.js

    duration: -,  iterations: -
         vus: 1, max: 20

    done [==========================================================] 50s / 50s

    ✓ status was 200

    checks.....................: 100.00% ✓ 195  ✗ 0   
    data_received..............: 4.9 MB  98 kB/s
    data_sent..................: 30 kB   608 B/s
    http_req_blocked...........: avg=3.78ms   min=0s       med=6.77µs   max=105.45ms p(90)=20.67µs  p(95)=34.93ms 
    http_req_connecting........: avg=666.65µs min=0s       med=0s       max=52.45ms  p(90)=0s       p(95)=1.8ms   
    http_req_duration..........: avg=1.63s    min=0s       med=1.65s    max=14.29s   p(90)=2.73s    p(95)=3.1s    
import http from "k6/http";
    http_req_receiving.........: avg=105.46ms min=0s       med=107.89ms max=370.95ms p(90)=207.78ms p(95)=247.68ms
    http_req_sending...........: avg=41.98µs  min=0s       med=34.74µs  max=177.4µs  p(90)=75.56µs  p(95)=96.34µs 
    http_req_tls_handshaking...: avg=3.08ms   min=0s       med=0s       max=64.83ms  p(90)=0s       p(95)=29.69ms 
    http_req_waiting...........: avg=1.52s    min=0s       med=1.53s    max=14.26s   p(90)=2.57s    p(95)=2.97s   
    http_reqs..................: 215     4.299993/s
    iteration_duration.........: avg=2.5s     min=119.84µs med=2.63s    max=10.15s   p(90)=3.72s    p(95)=3.97s   
    iterations.................: 212     4.239993/s
    vus........................: 6       min=1  max=20
    vus_max....................: 20      min=20 max=20

On version 0.21.1 this doesn't happen:

~/k6⟫ ../backup/k6-v0.21.1-linux64/k6 run --address 127.0.0.1:6567  --tag script=script2 --out influxdb=http://localhost:8086/k6db script2.js

          /\      |‾‾|  /‾‾/  /‾/   
     /\  /  \     |  |_/  /  / /   
    /  \/    \    |      |  /  ‾‾\  
   /          \   |  |‾\  \ | (_) | 
  / __________ \  |__|  \__\ \___/ .io

  execution: local
     output: influxdb=http://localhost:8086/k6db (http://localhost:8086)
     script: script2.js

    duration: -,  iterations: -
         vus: 1, max: 20

    done [==========================================================] 50s / 50s

    ✓ status was 200

    checks.....................: 100.00% ✓ 308  ✗ 0   
    data_received..............: 7.7 MB  154 kB/s
    data_sent..................: 42 kB   836 B/s
    http_req_blocked...........: avg=1.5ms    min=2.9µs    med=6.5µs    max=43.96ms  p(90)=14.64µs p(95)=14.95ms 
    http_req_connecting........: avg=120.53µs min=0s       med=0s       max=4.97ms   p(90)=0s      p(95)=1.26ms  
    http_req_duration..........: avg=774.55ms min=135.95ms med=626.94ms max=9.23s    p(90)=1.28s   p(95)=1.37s   
    http_req_receiving.........: avg=42.94ms  min=5.09ms   med=34.12ms  max=224.73ms p(90)=90.38ms p(95)=117.89ms
    http_req_sending...........: avg=39.06µs  min=14.49µs  med=31.81µs  max=171.61µs p(90)=61.05µs p(95)=85.2µs  
    http_req_tls_handshaking...: avg=1.36ms   min=0s       med=0s       max=39.72ms  p(90)=0s      p(95)=11.42ms 
    http_req_waiting...........: avg=731.56ms min=129.92ms med=588.45ms max=9.12s    p(90)=1.19s   p(95)=1.27s   
    http_reqs..................: 308     6.15999/s
    iteration_duration.........: avg=1.78s    min=1.13s    med=1.63s    max=10.23s   p(90)=2.28s   p(95)=2.37s   
    iterations.................: 320     6.399989/s
    vus........................: 6       min=1  max=20
    vus_max....................: 20      min=20 max=20

script:

import http from "k6/http";
import { check, sleep } from "k6";

export let options = {
  stages: [{ duration: "30s", target: 20 }, { duration: "20s", target: 5 }],
  insecureSkipTLSVerify: true
};

export default function() {
  let res = http.get("https://example.local");
  check(res, {
    "status was 200": r => r.status == 200
  });
  sleep(1);
}
bug high prio

All 4 comments

As I mentioned in the slack chat, 0 is a normal minimum value for http_req_connecting and http_req_tls_handshaking when there are reused connections from keep-alive requests. So the 0.21.1 results looks perfectly normal, while the 0.22.0 summary is very strange.

Here are some results from 0.22.0 and 0.21.1 from me as well, running the same script on a localhost https server. For 0.22.0:

    checks.....................: 100.00% ✓ 570  ✗ 0   
    data_received..............: 102 kB  2.0 kB/s
    data_sent..................: 68 kB   1.4 kB/s
    http_req_blocked...........: avg=146.4µs  min=0s       med=16.28µs  max=7.78ms   p(90)=25.94µs  p(95)=76.08µs 
    http_req_connecting........: avg=10.15µs  min=0s       med=0s       max=862.79µs p(90)=0s       p(95)=0s      
    http_req_duration..........: avg=779.76µs min=0s       med=817.67µs max=2.97ms   p(90)=1.27ms   p(95)=1.34ms  
    http_req_receiving.........: avg=131.53µs min=0s       med=121.23µs max=454.45µs p(90)=239.31µs p(95)=272.46µs
    http_req_sending...........: avg=53.38µs  min=0s       med=50.03µs  max=279.11µs p(90)=95.89µs  p(95)=127.73µs
    http_req_tls_handshaking...: avg=112.54µs min=0s       med=0s       max=6.71ms   p(90)=0s       p(95)=0s      
    http_req_waiting...........: avg=594.85µs min=0s       med=573.76µs max=2.55ms   p(90)=979.33µs p(95)=1.04ms  
    http_reqs..................: 590     11.799933/s
    iteration_duration.........: avg=953.99ms min=147.34µs med=1s       max=1.01s    p(90)=1s       p(95)=1s      
    iterations.................: 584     11.679934/s
    vus........................: 6       min=1  max=20
    vus_max....................: 20      min=20 max=20

for 0.21.1:

    checks.....................: 100.00% ✓ 555  ✗ 0   
    data_received..............: 100 kB  2.0 kB/s
    data_sent..................: 66 kB   1.3 kB/s
    http_req_blocked...........: avg=156.63µs min=4.23µs   med=7.98µs   max=7.53ms   p(90)=24.26µs  p(95)=29.61µs 
    http_req_connecting........: avg=12.3µs   min=0s       med=0s       max=772.29µs p(90)=0s       p(95)=0s      
    http_req_duration..........: avg=680.61µs min=258.53µs med=442.98µs max=2.62ms   p(90)=1.21ms   p(95)=1.3ms   
    http_req_receiving.........: avg=115.4µs  min=32.23µs  med=81.4µs   max=526.88µs p(90)=220.52µs p(95)=267.73µs
    http_req_sending...........: avg=47.6µs   min=14µs     med=34.35µs  max=465.45µs p(90)=77.94µs  p(95)=112.71µs
    http_req_tls_handshaking...: avg=124.24µs min=0s       med=0s       max=6.59ms   p(90)=0s       p(95)=0s      
    http_req_waiting...........: avg=517.61µs min=148.68µs med=339.66µs max=2.34ms   p(90)=939.4µs  p(95)=983.98µs
    http_reqs..................: 555     11.09996/s
    iteration_duration.........: avg=1s       min=1s       med=1s       max=1.01s    p(90)=1s       p(95)=1s      
    iterations.................: 564     11.279959/s
    vus........................: 6       min=1  max=20
    vus_max....................: 20      min=20 max=20

I'm pasting those, because it seems that when running on localhost (where there's no network latency, compared to @ampc results in a LAN), the difference between the average values for a particular metric between 0.22.0 and 0.21.1 isn't that much. It seems likely that there are a few zero-filled http trail metrics (or even just a single one) that manage to skew the results so there's a 0 for the minimum value everywhere.

Debugging this more, the bug seems to happen once the ramp-down phase starts, i.e. after 30 seconds have passed in this particular example script. The problem disappears with any of these script modifications:

  • if I remove stages and just use {vus: 20, duration: "50s"} for options
  • if I leave only the first ramp-up stage
  • if I leave both stages, but use a value of 20 or higher for the target of the second stage

I currently suspect it's likely that have I've broken something with the massive refactoring I did in this pull request...

Current working hypothesis: when scaling down, the local executor cancel's a vu's context and thus the HTTP request. But the surrounding http request code doesn't check that and pushes an empty netext.Tracer in the metric samples channel.

As seen in the above pull request, my hypothesis from yesterday seems to be correct. The fix needs more polishing and testing, including some thinking if there are metrics we actually want to emit even in VUs that are winding down.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

Julianhm9612 picture Julianhm9612  Â·  4Comments

euclid1990 picture euclid1990  Â·  3Comments

na-- picture na--  Â·  3Comments

if-kenn picture if-kenn  Â·  4Comments

Jonne picture Jonne  Â·  4Comments