K6: Cleanup of the custom metrics

Created on 7 May 2020  路  4Comments  路  Source: loadimpact/k6

Custom metrics (Counter, Gauge, Trend, Rate) behave unexpectedly when non-numeric values are added. Some non-numeric values are casted to numeric values (true => 1) and some break the internal state of the metric.

Currently, any value passed to .add() is accepted and treated in some way towards the internal state.

Regardless of what is passed to .add(), k6 doesn't raise exceptions. This is the correct behavior of a load testing tool because the tool needs to be resilient against failing SUT.
SUT failures are expected and should not cause the test execution to be unreliable.
Non-numeric values should not be counted towards the internal state, but they should not raise exceptions either.

Rate

Rate considers undefined, NaN and "strings" as positive values and counts them towards the internal state.

It considers false and null as negative values and marshals both to 0. I think this is probably correct.

I have to say that it's rather funny that value -1 is considered positive and is marshaled to +1. Perhaps this is by design, but I believe that negative values should be counted towards the "negative" side of the metric.

All non-numeric values fail JSON marshaling when the JSON output is used k6 run -o json

WARN[0001] JSON: Sample couldn't be marshalled to JSON   error="json: unsupported value: NaN" filename=

Proposed refactoring

Positive values: true, 1,2,3..., 0.1,...,
Negative values: null, false, 0, -1, -2,..., -0.1,...
Error values: Nan, undefined, "string", any other non-numeric value

All positive values should be marshaled to 1, all negative values should be marshaled to 0.

When an error value is added to the metric, k6 should produce a warning, increment an invalid_value counter for the metric, and continue with the iteration.
k6 should not raise Exception and abort the iteration.

Example:

rate.add( response.json("crocodile.0.is_happy"); // no ifs/validation is required from the user, and the execution of the iteration will continue when the "is_happy" is "undefined". 
// same as rate.add(undefined);
WARN[000X] Rate metric: unsupported value: undefined

The result of this should look similar to:
image

I think this behavior makes k6 more resilient when executing stress tests. Ideally, the user should not need to worry about breaking iterations when the SUT is crashing.

Counter

The current default behavior of the Counter metric is not to count but to sum. I believe we should consider it a bug.

Example:

let c = Counter("my_counter");
c.add(-200);
c.add(200);

// c.count is `0` while it should be `2`.
// c.sum should be `0`

The original metrics implementation in k6 was partially based on Prometheus (https://github.com/loadimpact/k6/issues/429#issuecomment-352901424) which has a Counter type that is cumulative (https://prometheus.io/docs/concepts/metric_types/#counter).
As the WIP Openmetrics project is going to be some extension of the Prometheus format, it should be considered in the context of k6 as well (https://github.com/loadimpact/k6/issues/858, although we don鈥檛 have to follow Prometheus of course if we don鈥檛 think it makes any sense).

The proposed change adds functionality to the Counter metric and fixes bugs, but doesn't break the loose compatibility with Prometheus definition of the metric.

proposed refactoring.

The Counter metric should have 2 default aggregator methods: count and sum.

count should be incremented on every call to .add().
sum should be modified on every numeric value .add(numeric_value).

image

 c.add(false);     // increment count, don't modify sum
 c.add(true);      // increment count, don't modify sum
 c.add(0);         // increment count, don't modify sum
 c.add(1);         // increment count, modify sum
 c.add(-1);        // increment count, modify sum
 c.add(0.1);        // increment count, modify sum
 c.add(null);     // increment count, don't modify sum
 c.add(undefined); // increment count, don't modify sum
 c.add(NaN);       // increment count, don't modify sum
 c.add("string");  // increment count, don't modify sum
 c.add(r);         // increment count, don't modify sum

Gauge

proposed refactoring.

 g.add(false);     // increment `invalid_value` counter. Reset gauge to 0.
 g.add(true);      // increment `invalid_value` counter. Reset gauge to 0.
 g.add(0);         // state: 0  // CORRECT // marshalls to 0
 g.add(1);         // state: 1  // CORRECT // marshalls to 1
 g.add(-1);        // state: -1 // CORRECT // marshalls to -1
 g.add(0.1);        // state: 0.1 // CORRECT // marshalls to 0.1
 g.add(null);      // increment `invalid_value` counter. Reset gauge to 0.
 g.add(undefined); //  increment `invalid_value` counter. Reset gauge to 0.
 g.add(NaN);       //  increment `invalid_value` counter. Reset gauge to 0.
 g.add("string");  //  increment `invalid_value` counter. Reset gauge to 0.
 g.add(r);         // increment `invalid_value` counter. Reset gauge to 0.

image

Trend

The trend metric is currently the most resilient metric type we have. It doesn't completely die when invalid values are added. Nevertheless, it exhibits incorrect behavior in several cases.

 t.add(false);      // increment `invalid_value` counter. Don't modify the internal state
 t.add(true);      // increment `invalid_value` counter. Don't modify the internal state
 t.add(0);         // state: 0  // CORRECT // marshalls to 0
 t.add(1);         // state: 1  // CORRECT // marshalls to 1
 t.add(-1);        // state: -1 // CORRECT // marshalls to -1
 t.add(null);     // increment `invalid_value` counter. Don't modify the internal state
 t.add(undefined); // increment `invalid_value` counter. Don't modify the internal state
 t.add(NaN);      // increment `invalid_value` counter. Don't modify the internal state
 t.add("string"); // increment `invalid_value` counter. Don't modify the internal state
 t.add(r);        // increment `invalid_value` counter. Don't modify the internal state

Current Behavior

import {sleep} from "k6"
import { Counter, Gauge, Trend, Rate } from "k6/metrics";

let r = Rate("my_rate");
let c = Counter("my_counter");
let g = Gauge("my_gauge");
let t = Trend("my_trend");

function testRate(){
 r.add(false);     // state -1 //    ?    // marshalls to 0
 r.add(true);      // state +1 //    ?    // marshalls to 1
 r.add(0);         // state -1 // CORRECT // marshalls to 0
 r.add(1);         // state +1 // CORRECT // marshalls to 1
 r.add(-1);        // state +1 // WAT?    // marshals to -1
 r.add("string");  // state +1 //    ?    // fails json marshaling
 r.add(undefined); // state +1 //  BUG    // fails json marshaling
 r.add(null);      // state -1 //    ?    // marshals to "0"
 r.add(NaN);       // state +1 //    ?    // fails JSON marshaling
 r.add(r);         // state +1 //  BUG    // fails JSON marshaling
}

function testCounter(){
 c.add(false);     // no change // CORRECT // marshalls to 0
 c.add(true);      // state +1  // CORRECT // marshalls to 1
 c.add(0);         // state -1  // CORRECT // marshalls to 0
 c.add(1);         // state +1  // CORRECT // marshalls to 1
 c.add(-1);        // state -1  // CORRECT // marshals to -1
 c.add(null);      // no change //   ?     // marshals to "0"
 c.add(undefined); // BUG - state change to NaN and the entire metric is broken
 c.add(NaN);       // BUG - the same
 c.add("string");  // BUG - the same
 c.add(r);         // BUG - the same
}

function testGauge(){
 g.add(false);     // state: 0  //   ?     // marshalls to 0 - should it?
 g.add(true);      // state: 1  //   ?     // marshalls to 1
 g.add(0);         // state: 0  // CORRECT // marshalls to 0
 g.add(1);         // state: 1  // CORRECT // marshalls to 1
 g.add(-1);        // state: -1 // CORRECT // marshals to -1
 g.add(null);      // state: 0  //   ?     // marshals to 0
 g.add(undefined); // BUG - state change to NaN and the entire metric is broken
 g.add(NaN);       // BUG - the same
 g.add("string");  // BUG - the same
 g.add(r);         // BUG - the same
}

function testTrend(){
 t.add(false);     // state: 0  //   invalid     // marshalls to 0 - should it?
 t.add(true);      // state: 1  //   invalid     // marshalls to 1
 t.add(0);         // state: 0  // CORRECT // marshalls to 0
 t.add(1);         // state: 1  // CORRECT // marshalls to 1
 t.add(-1);        // state: -1 // CORRECT // marshals to -1
 t.add(null);      // state: 0  //   invalid     // marshals to 0
 t.add(undefined); // BUG - avg changes to NaN - other aggregation functions seem unaffected?
 t.add(NaN);       // BUG - the same
 t.add("string");  // BUG - the same
 t.add(r);         // BUG - the same
}

export default function() {

 testRate();
 testCounter();
 testGauge();
 testTrend();

 sleep(0.1);
}

image

Prevent users from redefining built-in metrics

Example script.

import http from 'k6/http';
import {sleep} from 'k6';
import { Counter } from 'k6/metrics';
export let options = {
  iterations: 10,
  vus: 1,
};
let durationCounter = new Counter('http_req_duration');
export default function() {
  http.get('https://test-api.k6.io/public/crocodiles/1/');
  durationCounter.add(__ITER * 10000);
  sleep(1);
};

Produces Trend metric with values from both, built-in Trend metric and the custom Counter metric.
image
The k6 cloud backend is also confused, showing the values of the built-in Trend metric on the main chart and Counter values in the Analysis Panel.

Suggested error:
ERRO: you can't redefine the 'http_req_duration' built-in metric.

Additional options

(To be discussed)

Allow users to specify the initial value for metrics

new Counter('http_req_errors', {initial: 0, });
bug high prio

Most helpful comment

OpenMetrics spec was released a few weeks ago.

Probably it's interesting to give it a look while working on this ticket! (Pawel said that k6 metrics implementation was more-or-less based on Prometheus format).

If our metrics adhere to the spec and we follow the best practices, there is less context switch for users working with k6 that come from OTel or Prometheus.

All 4 comments

Found a previous issue of yours on the same topic :sweat_smile: https://github.com/loadimpact/k6/issues/908

This is also causing issues when streaming metrics to the cloud (and probably other outputs), we get NaN as the value.

OpenMetrics spec was released a few weeks ago.

Probably it's interesting to give it a look while working on this ticket! (Pawel said that k6 metrics implementation was more-or-less based on Prometheus format).

If our metrics adhere to the spec and we follow the best practices, there is less context switch for users working with k6 that come from OTel or Prometheus.

https://github.com/loadimpact/k6/issues/1832 (or something like it) is a prerequisite for the current issue.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

StephenRadachy picture StephenRadachy  路  3Comments

sniku picture sniku  路  3Comments

ppcano picture ppcano  路  3Comments

na-- picture na--  路  4Comments

caalle picture caalle  路  4Comments