K6: Proposal for adding a new standard `http_req_failures` metric

Created on 28 Jan 2021  ยท  27Comments  ยท  Source: loadimpact/k6

I'm opening a new issue instead of building on top of https://github.com/loadimpact/k6/issues/1311 to start the discussion with a clean slate. I believe this proposal addresses most if not all issues brought up in the original proposal.

Before discussing the implementation details and effort required to build this feature, let's discuss why we want this feature at all.

  1. Users want to know if their http requests fail or not, without adding boilerplate code.
  2. Users want to see the http timing metrics for successful requests.
  3. Users want to see the ratio between successful and failed requests (0.1% failures is often acceptable)
  4. Some users may want to know the absolute number of failed requests (15 requests failed)

Basic requirements

This basic script must:

  1. show the number of failed requests,
  2. show response time for successful requests,
  3. test must exit with non-0 exit code because the threshold is crossed.
import { sleep } from 'k6'
import http from 'k6/http'

export let options = {
  thresholds: {
    // test fails if more than 10% of HTTP requests fail. 
    // default http_success_hook is used to determine successful status codes.
    http_reqs_failure: ['rate < 0.1'],
  }
};

export default function() {
 let response_success  = http.get("https://httpbin.test.k6.io/status/200");
 let response_failure1 = http.get("https://httpbin.test.k6.io/status/503");
 let response_failure2 = http.get("https://httpbin.test.k6.io/status/503");
}

Discussion about the metric type for http_reqs_failure

There are two possible metric types for the new http_reqs_failure. Rate or Counter.
Both types have their advantages, and it's not entirely clear which one is better for this use case.

Advantages of Rate:

  • ability to easily configure thresholds http_reqs_failure: rate<0.1
  • shows rate...

Advantages of Counter:

  • shows failures per second
  • shows the total number of failed requests (might be useful to some users)
  • is consistent with http_reqs metric.

The end-of-test summary for this test should look similar to this:

Output when using Counter metric

http_reqs..................: 3      3.076683/s
http_reqs_failure..........: 2      2.036683/s
http_reqs_success..........: 1      1.036683/s

Output when using Rate metric

http_reqs..................: 3      3.134432/s
http_reqs_failure..........: 33.33% โœ“ 1 โœ— 2
http_reqs_success..........: 66.66% โœ“ 2 โœ— 1

Neither Rate nor Counter covers all possible use-cases. I think Rate is preferable over Counter.

If we added count to the Rate metric, the output could possibly look similar to this

http_reqs..................: 3      0.136/s    
http_reqs_failure..........: 33.33% โœ“ 2 (66.66%) 3.136/s   โœ— 1 (33.33%) 0.136/s    
http_reqs_success..........: 66.66% โœ“ 1 (33.33%) 3.136/s   โœ— 2 (66.66%) 0.136/s    

Note, I'm not really advocating for this, just pointing out that neither Rate nor Counter cover all use cases.

Why do we have failures and successes as separate metrics?

The obvious critique of the above suggestion is to say that http_reqs_success is unnecessary because it's opposite of http_reqs_failure. This is true, but some outputs don't allow to define logic, and therefore it's not possible to show http_reqs_success unless k6 itself produces it.

Once the metric filtering feature is developed, I would suggest we exclude http_reqs_success by default.

http_req_duration and other http_req_* metrics.

The core requirement of this feature is to be able to see http_req_duration for successful requests only.

There are two possibilities here:

  1. Don't emit http_req_duration for failures
  2. Tag http_req_duration with failed:true|false tag and display filtered values.

Let's discuss both approaches

Don't emit http Trend metrics for failed requests

In this approach, http_req_duration and other http metrics won't include failed requests towards the metric's internal state.

Users who want to track error timings can define custom metrics like this:

  let http_4xx = new Trend('http_4xx');

  let res = http.get('http://test.k6.io');
  if(res.status > 400 && res.status <= 499){
    http4xx.add(res.timings.http_req_duration);
  }

Tag http_req_duration with failed:true|false tag and display filtered values.

With this approach, we would emit the http_req_duration and friends as we used to, but we will tag values with failed:true|false

The default-end-of-test summary would display only successful requests like this:

http_req_duration{failed:false}...: avg=132.76ms min=127.19ms med=132.76ms max=138.33ms p(90)=137.22ms 
http_reqs........................: 3      3.076683/s
http_reqs_failure................: 2      2.036683/s
http_reqs_success................: 1      1.036683/s
iteration_duration...............: avg=932.4ms  min=932.4ms  med=932.4ms  max=932.4ms  p(90)=932.4ms  p(95)=932.4ms 
iterations.......................: 1      1.038341/s

The most problematic issue with this approach is that some outputs don't ingest tags and won't be able to display http_req_duration for successful requests only.

Examples of http_req_duration and http_reqs_failure k6 should produce with this approach.

{
  "type": "Point",
  "metric": "http_req_duration",
  "data": {
    "time": "2021-01-22T12:40:08.277031832+01:00",
    "value": 0.032868,
    "tags": {
      "error_code": "1501",
      "group": "",
      "method": "GET",
      "name": "https://httpbin.test.k6.io/status/501",
      "proto": "HTTP/2.0",
      "scenario": "default",
      "status": "501",
      "tls_version": "tls1.2",
      "url": "https://httpbin.test.k6.io/status/501",
      "failed": true,  // see reasoning below in the "cloud support" section
    }
  }
}

{
  "type": "Point",
  "metric": "http_reqs_failure",
  "data": {
    "time": "2021-01-22T12:40:08.277031832+01:00",
    "value": 1,
    "tags": {
      // same tags as for http_req_duration
      "error_code": "1501",
      "status": "501",
      "name": "https://httpbin.test.k6.io/status/501",
      "group": "",
      "method": "GET",
      "proto": "HTTP/2.0",
      "scenario": "default",
      "tls_version": "tls1.2",
    }
  }
}

Cloud support

There are additional considerations for the k6 cloud support.

Performance insights

The "performance insights" feature and web app currently assume that successful requests have status 200-399.

  • The "URL table" displays statuses >=400 with red background
  • There are several performance alerts that show up when there are sufficiently many requests with statuses >=400

Both approaches listed above solve these problems, although in different ways.

In approach 1, we would only get timings for successful requests and therefore we won't show timings for failed requests.
We will still get tagged http_reqs_failure metrics and therefore will be able to show errors without timings.
image
We would probably redesign this UI to separate failures from successes in a better way.

In approach 2, we would get a new standard tag called failed to all http_req_* metrics, including http_req_li_all.
Timings would still be shown for errors (although probably not useful), but the background of the row would be determined by the failed tag.

image

  {
    "type": "Points",
    "metric": "http_req_li_all",
    "data": {
      "time": "1604394111659104",
      "type": "counter",
      "tags": {
        "tls_version": "tls1.2",
        "group": "",
        "scenario": "default",
        "url": "https://test.k6.io",
        "name": "https://test.k6.io",
        "method": "GET",
        "status": "200",
        "proto": "HTTP/2.0",
        "failed": false
      },
      "values": {
        "http_req_waiting": 123.88875,
        "http_req_receiving": 0.215741,
        "http_req_duration": 124.419757,
        "http_req_blocked": 432.893314,
        "http_req_connecting": 122.01245,
        "http_req_tls_handshaking": 278.872101,
        "http_req_sending": 0.315266,
        "http_reqs": 10,
        "http_reqs_success": 10
      }
    }
  },
  {
    "type": "Points",
    "metric": "http_req_li_all",
    "data": {
      "time": "1604394111659104",
      "type": "counter",
      "tags": {
        "tls_version": "tls1.2",
        "group": "",
        "scenario": "default",
        "url": "https://test.k6.io",
        "name": "https://test.k6.io",
        "method": "GET",
        "status": "200",
        "proto": "HTTP/2.0",
        "failed": true
      },
      "values": {
        "http_req_waiting": 23.88875,
        "http_req_receiving": 0.215741,
        "http_req_duration": 24.419757,
        "http_req_blocked": 32.893314,
        "http_req_connecting": 22.01245,
        "http_req_tls_handshaking": 78.872101,
        "http_req_sending": 0.315266,
        "http_reqs": 10,
        "http_reqs_failure": 10
      }
    }
  }

(alternative) Why don't we skip the new metrics and purely rely on failed tag?

It's possible to extend the existing http_reqs counter metric by tagging requests with failed and changing the metric type to Rate. If that's done, the following script would be possible,

import { sleep } from 'k6'
import http from 'k6/http'

export let options = {
  thresholds: {
    'http_reqs{failed:true}': ['rate < 0.1'],
  }
};

export default function() {
 let response_success  = http.get("https://httpbin.test.k6.io/status/200");
 let response_failure1 = http.get("https://httpbin.test.k6.io/status/503");
 let response_failure2 = http.get("https://httpbin.test.k6.io/status/503");
}

Possible end-of-test summary:

http_reqs...................: 3      โœ“ 1 โœ— 2     3.134432/s
http_reqs{failed:true}..: 33.33% โœ“ 1 โœ— 2     1.034432/s
http_reqs{failed:false}.: 66.66% โœ“ 2 โœ— 1     2.104432/s

Possible problems with this approach are:

  • some outputs don't ingest tags. It would not be possible to use this functionality with statsd
  • http_reqs would be backwards incompatible unless we combine rate and counter into a new metric type.
  • some "special case" handling would be required for displaying.

I'm (currently) against this alternative.

Defining failure

To determine if a request has failed or succeeded, a JavaScript hook function is invoked after the request, but before the metrics emission. This proposal builds on https://github.com/loadimpact/k6/issues/1716

import http from 'k6/http'

export let options = {
  hooks: {
    http: {
      successHook: 'myHttpSuccessHook',
    }
  }
};

export function myHttpSuccessHook(response){
  // returns boolean true|false
  // adds failed = true|false tag
  // decides if the metric goes into http_req_duration.
  // default implementation: return response.status >= 200 && response.status <= 399
  return response.status >= 200 && response.status <= 204
}

export default function() {
 let response_success  = http.get("https://httpbin.test.k6.io/status/200");
 let response_failure1 = http.get("https://httpbin.test.k6.io/status/503");
 let response_failure2 = http.get("https://httpbin.test.k6.io/status/503");
}

per-request handling

Sometimes users need to handle special cases.

Alternative 1 - handle inside the hook

import http from 'k6/http'

export let options = {
  hooks: {
    http: {
      successHook: 'myHttpSuccessHook',
    }
  }
};

export function myHttpSuccessHook(response){
  if(response.request.name === 'https://httpbin.test.k6.io/status/403'){
    return response.status === 403 // expecting 403 for this specific URL
  }
  return response.status >= 200 && response.status <= 204
}

export default function() {
 let response_success  = http.get("https://httpbin.test.k6.io/status/200");
 let response_failure1 = http.get("https://httpbin.test.k6.io/status/503");
 let response_failure2 = http.get("https://httpbin.test.k6.io/status/503");
}

Alternative 2 - override the hook per request

import { sleep } from 'k6'
import { Rate } from 'k6/metrics'
import http from 'k6/http'

export default function() {
 let response_failure1 = http.get("https://httpbin.test.k6.io/status/503");
 let response_failure2 = http.get("https://httpbin.test.k6.io/status/503");
 let response_success  = http.get("https://httpbin.test.k6.io/status/403", {
  successHook: (r) => r.status===403
 });
}

What about the redirect chains?

This is up for discussion.

import { sleep } from 'k6'
import { Rate } from 'k6/metrics'
import http from 'k6/http'

export default function() {
 let response_success  = http.get("http://httpbin.test.k6.io/absolute-redirect/5", {
  successHook: (r) => r.status===200
 });
}

Should the hook fire on every request in the chain or only on the last one?

Concerns

  1. the performance penalty of executing the js hook function on every request.
  2. the performance penalty of adding more data to http_req_li_all and other http_req_* metrics.
feature

Most helpful comment

So I'm just thinking a bit of how to maintain a JavaScript feel to it, and my suggestion might be way of. But here goes.

As for the interface for manipulating the response event (or what you'd call it) I would go for something like the following:

const redirect = new Counter("redirect")

http.on("response", (ev) => {
  console.log("default metric: ", ev.metric.name); // whatever k6 thinks it should be

  if (ev.request.url === "https://test.k6.io") {
    // You go ahead and handle this for me!
    return 
  }

  const status = ev.response.status

  if (status >= 400) {
    // Change ev.metric to http_reqs_failure
    ev.fail()
  } else if (status >= 300) {
    redirect.add(1)
    ev.preventDefault() // tell k6 to ignore this ev
  } else {
    // Change ev.metric to http_reqs_success
    ev.succeed()
  }
})

The above just makes sense to me as a JavaScript-developer. It matches how NodeJS and browsers handle these kinds of events. For instance, in the browser anchor.addEventListener("click", ev => ev.preventDefault()) tells the browser to ignore handling the click (the on/off naming is node-like and much nicer than `addEventListener).

We could also have specific events for success/failure in case you know that you only want to modify failure into success or vice versa. Could improve performance a bit, by not being a catch-all.

// An evil API that has reversed error with success status
http.on("failure", ev => {
  ev.succeed()
})

http.on("success", ev => {
  ev.fail()
})

The list goes on what would be possible:

http.on("request", ev => {
    // I'm lazy and don't want to add this everywhere
    ev.request.headers.add("X-Request-Id", uuid())

    // Add tags for every request
    ev.addTags({

    })
})

I could go on and on. ๐Ÿ˜ฌ

All 27 comments

I think the metricEmissionHooks option is unnecessary.

```js

import http from 'k6/http'

export let options = {
metricEmissionHooks: {
http: {
successHook: 'myHttpSuccessHook',
}
}
};

export function myHttpSuccessHook(response){
return response.status >= 200 && response.status <= 204
}

export default function() {
http.get("https://httpbin.test.k6.io/status/200");
}

 Instead, we can define a name convention for the hook method. For example,`didSuccessRequest`.

  ```js

 import http from 'k6/http'

export function didSuccessRequest(response){
    return response.status >= 200 && response.status <= 204
}

export default function() {
    http.get("https://httpbin.test.k6.io/status/200");
}

I'm not that fond of metricEmissionHooks either, but I'm equally skeptic against using naming conventions to define behavior.

I'd much rather see something along the lines of:

export let options = {
  hooks: {
    http: {
      after: 'myHttpSuccessHook' // after might be a bad name, but you get the gist.
    }
  } 
};

(without having to quote the name, if that's even possible, but I have some vague memory of the interop layer requiring it to be a string?)

The issue with these "named exports", is that we really can't guard against errors in a good way.

If the user were to write didSuccesRequest, it wouldn't work and it would be quite hard to spot. If we put it in the options, at least we'd be able to set up types for it, gently pushing the user towards the right syntax.

I've had issues myself more than once due to the current types being too permissive, not warning when you write threshold instead of thresholds for instance.

I am definetely for it not being 1 more magically named one. I was against the handleSummary just being magically named one instead of configurable.

Also arguably it will be nice if you can configure different ones for different scenarios for example.

(without having to quote the name, if that's even possible, but I have some vague memory of the interop layer requiring it to be a string?)

The "options" ultimately need to end up as a JSON and be transported around as such (and to not be recalculated) which means that w/e you have in it needs to be able to get to JSON and back in a way that we can figure out what you meant. We can try to do "magical" stuff so when you wrie a name of a function we automatically figure it out(no idea how, and probably will be finicky), but then writing just the function definition there won't work, which is strange ... so, yeah - we need it to be a string.

Also arguably it will be nice if you can configure different ones for different scenarios for example

Do we really want to provide this?

If you allow the user to set a callback, they will likely have enough flexibility to change the logic based on the current scenario execution. In this case, it is not compulsory to serialize the string name.

1 more magically named one

What you call a "magically named", is just a documented API for me.

@simskij provided another alternative something like http.default.setSuccessHook(callback)

IMO, this type of dynamic string execution looks not very intuitive and error-prone. cc @legander

I'm quite fond of the tagging approach since it avoids adding additional metrics for each request, which are costly to process as it is. After https://github.com/loadimpact/k6/issues/1321 is implemented this might not be as important, but we should still try to minimize adding new metrics if we can avoid it.

Are we sure that statsd doesn't support tags still? It seems tag support was added last year in https://github.com/statsd/statsd/pull/697, released in v0.9.0. If this was the only output that didn't support them, and if we're moving towards supporting OpenMetrics as a generic output (which does support tags from what I've seen), then it might be worth it to consider the tag approach after all.

The performance penalty of executing the JS hook on each request is certainly a concern. Are we sure we need these checks to be dynamic? If status codes are the only metric for determining this, maybe we can avoid the hooks and allow configuring them with something like:

export let options = {
  http: {
    success: {
      status: ['>=200', '<400'],
    }
  }
}

Though I'm not sure if this would perform much better.

OTOH if we need the flexibility for custom failure states determined by headers or something else in the response, then I guess we need the hook based approach. I'm ambivalent about the syntax, and we can certainly refine it later, though I'd also like to avoid using naming conventions for it.

A few short comments to start:

  • Rate will be strictly superior to Count, once we have https://github.com/loadimpact/k6/issues/1312, so my vote is for that
  • http_req_failures seems better than http_reqs_failure
  • About the hook: I am very much against a global exported function with a special name for a hook; a string function in the global options is only _slightly_ better... :disappointed:
  • @simskij's / @ppcano's example above of having a function in the HTTP module to set the hook callback (http.default.setSuccessHook(callback)) seems to be the best proposal to me, though we need to evaluate if we can do it on a per-VU level so we can have different hooks for different VUs/scenarios.
  • Also, I'd prefer that, if we have a hook/callback, it actually be capable of doing more things than determine if the request is "successful"... There are way more use cases for such a hook, so I'd prefer we don't tie our hands with just it returning true/false and being incapable of anything else...

So, to step back the details and speak in more general terms... This is a big feature that, even if we agree on everything (and I have plenty of objections :sweat_smile: ), is unlikely to land in k6 v0.31.0. But given that this feature completely depends on the hook/callback (https://github.com/loadimpact/k6/issues/1716), and that the hook is actually the more complicated part, I think we should focus solely on that as the MVP for v0.31.0. I am not sure we'll manage to do even that for v0.31.0, but if we reach a consensus, I think for sure it's the place we should focus our energies first.

And I also mean the "viable" part in "MVP" :sweat_smile: If the hook is a bit more generic, everything else here might be completely unnecessary... If we have a generic post-request hook, it can be used for the following three things:

  • have all of the possible information to determine if a request is a "failure" or not - status, headers, even request body
  • suppress metrics from some HTTP requests from being emitted (i.e. do not pollute http_req_duration with failures)
  • add extra tags to the metrics (e.g. add failure: true|false to the metrics, one of the suggested approaches above)
  • emit custom metrics (e.g. emit http_req_failures for failed requests, the other suggested approach above)
  • do anything else our users might want...

So, it gives us the most flexibility. And for k6 v0.32.0, when we have had a bit of time to carefully consider things, we can simply set the default such callback in Go to something that we think is reasonable. For example, "do not emit http_req_* metrics for status >= 400 and emit a http_req_failures Rate instead". Or anything else. And then encode that in the cloud, if we want to have smarter results.

Example code:

import http from 'k6/http';
import { Rate } from 'k6/metrics'


let requestFailures = new Rate('http_req_failures')

export let options = {
    thresholds: {
        http_req_failures: ['rate < 0.1'],
    },
};

http.setResponseCallback(function (response, metricsObject) {
    let requestFailed = response.status >= 399 || response.status === 0;
    requestFailures.add(requestFailed);
    if (requestFailed) {
        metricsObject.doNotEmitMetrics(); // not a final API, just illustrative :D
    }
})

export default function () {
    http.get("https://httpbin.test.k6.io/status/200");
    http.get("https://httpbin.test.k6.io/status/503");
    http.get("https://httpbin.test.k6.io/status/503");
    http.get("https://httpbin.test.k6.io/status/403");
}

@na--

I share your views! Only thing I'd like to expand on is that I think it would make sense to be able to define both default and per-call callbacks. As in:

import { skipCallback } from 'http'

http.setDefaultResponseCallback(function (response, metricsObject) {
    let requestFailed = response.status >= 399 || response.status === 0;
    requestFailures.add(requestFailed);
    if (requestFailed) {
        metricsObject.doNotEmitMetrics(); // not a final API, just illustrative :D
    }
})

export default function () {
    http.get("https://httpbin.test.k6.io/status/200"); // Uses the one defined above
    http.get("https://httpbin.test.k6.io/status/503"); // Uses the one defined above
    http.get("https://httpbin.test.k6.io/status/503", { responseCallback: skipCallback }); // Overrides the one above with noop
    http.get("https://httpbin.test.k6.io/status/403", { responseCallback: (res) => { /* ... /*} }); // Overrides the one above with something else
}
  1. Most calls should not have a callback -> per-call
  2. Most calls should have a callback -> default + skips
  3. Most calls should have a callback, but some should have different ones -> default + per-call + skips
  4. All calls should have a callback -> default
  5. All calls should have a callback, but some should have different ones -> default + per-call

While switch cases and if's in the default callback sure is an option here, I'd argue that being able to do this per invocation makes sense, given that the URLs and other characteristics of the request might be dynamic, but the invocations themselves likely will be following a pattern you can reason about.

I like the solution @na-- posted, it seems to provide most bang for the buck! (although initially requiring some setup on the user end), _having a nice default_ (which is also brought up) would be nice. Perhaps the naming should indicate more clearly that the responseCallback is being processed before metrics are emitted? Like preResponseCallback or responseInterceptor idk. Could also be solved by docs I guess ๐Ÿคทโ€โ™‚๏ธ

:+1: for per-request callbacks. Not sure if having a special skipCallback thing in the http module is better than just null or function() {}, but those are implementation details we can discuss in https://github.com/loadimpact/k6/issues/1716

:+1: for a better name as well, my opinion here is :+1: for preResponseCallback, :-1: for responseInterceptor :smile: another possible suggestion postRequestCallback or postRequestHook, with a slight preference for the hook variant

There are many good things with a generic callback that can manipulate all metrics, but there are bad things here as well:

  1. we lose the "standard" metric name for http_req_failures because the user can define whatever name they want, and cloud won't know the meaning of it.
  2. It's more complex to implement for the users. User is now responsible for emitting metrics, tagging other metrics and sometimes preventing emission. This code looks like library code more than client-side code.
  3. The implementation of a generic hook that can do anything makes it easy for the user to shoot themselves in a foot.

I think we should have a generic hook as well, but I don't think it's required for this feature. We can have many hooks for many different purposes.

@sniku Isn't it specifically what is being mentioned here or did I misinterpret that?

And for k6 v0.32.0, when we have had a bit of time to carefully consider things, we can simply set the default such callback > in Go to something that we think is reasonable. For example, "do not emit http_req_* metrics for status >= 400 and emit a >http_req_failures Rate instead". Or anything else. And then encode that in the cloud, if we want to have smarter results.

Agree with @sniku . IMO, one of the main advantages of this feature is to define a threshold easily on the error rate.

import http from 'k6/http';

export let options = {
    thresholds: {
        http_req_failures: ['rate < 0.1'],
        // or 
        'http_reqs{failed:true}': ['rate < 0.1'],
    },
};


export default function () {
    http.get("https://httpbin.test.k6.io/status/200");
    http.get("https://httpbin.test.k6.io/status/503");
    http.get("https://httpbin.test.k6.io/status/503");
    http.get("https://httpbin.test.k6.io/status/403");
}

Yes, what @legander said :smile: And, I think we've discussed this before, but as an API design principle, if you can choose between:

  1. having a bunch of composable components (e.g. "users can intercept and change HTTP responses and emitted metrics with custom hooks") and a good and simple default on top of them (e.g "_by default_, we emit http_req_failures and do not emit request duration for failed metrics") that is good enough for the majority of people, but can be changed
  2. having a single inflexible solution for the problem the majority of users have (e.g. "we will always emit http_req_failures, but we'll allow users to tweak which requests are considered failure")

In my mind, the first is always better, since it delivers the same benefits but covers corner cases and alternative use cases much, much better. A huge percentage of the open k6 issues and existing technical debt are because we've chosen the second option far too often. The whole k6/http module is one huge example of it, in fact...

For example, the majority of people probably want http_req_failures, but some might prefer tagged metrics with failed: true when they are emitting them to an external output that is not our cloud. With my suggestion, both use cases would be covered, whereas with what you propose, only a single one will be and we'll have a bunch of open issues we'll never solve and a bunch of PRs we'll never merge :wink:

@ppcano, please explain how my suggestion precludes easily defining a threshold on the error rate?

import http from 'k6/http';
import { Rate } from 'k6/metrics'


let requestFailures = new Rate('http_req_failures')

export let options = {
    thresholds: {
        http_req_failures: ['rate < 0.1'],
    },
};

http.setResponseCallback(function (response, metricsObject) {
    let requestFailed = response.status >= 399 || response.status === 0;
    requestFailures.add(requestFailed);
    if (requestFailed) {
        metricsObject.doNotEmitMetrics(); // not a final API, just illustrative :D
    }
})

export default function () {
    http.get("https://httpbin.test.k6.io/status/200");
    http.get("https://httpbin.test.k6.io/status/503");
    http.get("https://httpbin.test.k6.io/status/503");
    http.get("https://httpbin.test.k6.io/status/403");
}

@na-- I might have misunderstood your example, but it looks the user needs to create a Rate Metric, implement the hook and track the values.

It is a big difference in comparison with only:

export let options = {
    thresholds: {
        http_req_failures: ['rate < 0.1'],
    },
};

Also, it would be more difficult to set a threshold on the error rate with the Test Builder.

@ppcano , you didn't misunderstand my example, you probably just didn't read the text right above it... and when @legander copy-pasted it...

And for k6 v0.32.0, when we have had a bit of time to carefully consider things, we can simply set the default such callback in Go to something that we think is reasonable. For example, "do not emit http_req_* metrics for status >= 400 and emit a http_req_failures Rate instead". Or anything else. And then encode that in the cloud, if we want to have smarter results.

So, when we set such a default (be it k6 v0.32.0 or v0.31.0 if we manage it), unless the user has some specific requirements that would force them to define a custom post-request hook, setting a threshold on http_req_failures will look like this with my suggestion:

import http from 'k6/http';

export let options = {
    thresholds: {
        http_req_failures: ['rate < 0.1'],
    },
};

export default function () {
    http.get("https://httpbin.test.k6.io/status/200");
    http.get("https://httpbin.test.k6.io/status/503");
    http.get("https://httpbin.test.k6.io/status/503");
    http.get("https://httpbin.test.k6.io/status/403");
}

This is the whole script - the _default_ post-request hook takes care of updating http_req_failures...

Btw, FWIW, having a default post-request hook also makes the breaking change much smaller. We all agree that a breaking change is necessary - the vast majority of people probably wouldn't want to count failed requests in http_req_duration and would benefit from http_req_failures. But for anyone who doesn't, it would be easy to opt out and keep the current k6 behavior by just doing something like this:

import http from 'k6/http';
http.setDefaultPostRequestHook(null);

And I think another good API design principle is that, when you're making a breaking change, you should strive to at least provide an easy solution for people who don't like it and want to go back to how things were... In this specific case, I imagine some people who have adopted their own solution for tracking HTTP errors might make use of it while they transition to our new hook-based approach.

btw I'm starting to dislike postRequestHook as the name, it might be confused with a hook for POST requests... :sweat_smile: So, changing my vote, I am not in favor of preResponseCallback or preResponseHook as the name.

I was re-reading the issue and I see I didn't address two parts of @sniku argument:

The implementation of a generic hook that can do anything makes it easy for the user to shoot themselves in a foot.

I don't accept the premise here, this can literally be an argument against almost any change. Especially everything k6 does above and beyond tools like wrk or ab... :stuck_out_tongue_winking_eye: We can discuss if a pre-response hook is something very unusual (I don't think it is), or if it violates POLA, but as long as the API and defaults are reasonably good, I can't think how giving our users more power and flexibility is bad.

I think we should have a generic hook as well, but I don't think it's required for this feature. We can have many hooks for many different purposes.

I strongly disagree on both counts. Besides thinking that it's a better solution for the specific problem we are discussing here, as I've expressed above, generic callbacks will solve plenty of other issues real k6 users have had. For example, this forum post linked in https://github.com/loadimpact/k6/issues/1716. Or https://github.com/loadimpact/k6/issues/927 / https://github.com/loadimpact/k6/pull/1171, https://github.com/loadimpact/k6/issues/800 (custom callback on the first HTTP digest request that has expected 400 response), https://github.com/loadimpact/k6/issues/884 (if we have http.setDefaultPostRequestHook() can have easily also have http.getCurrentDefaultPreResponseHook() and have a way to temporarily wrap the hook in another hook and then restore it), https://github.com/loadimpact/k6/issues/1298, this and this issues from the forum can be also solved by a per-request callback. And I'm probably missing some...

As to the "many hooks for many different purposes" - this will be quite difficult to do technically (please read through some of the objections in https://github.com/loadimpact/k6/issues/1716, specifically how it's already tricky when you have http.batch() in a single-threaded JS runtime). But I also think it's bad UX...

When writing this issue I explored the generic hook idea as well but I couldn't come up with API that was intuitive and didn't require the user to define/re-define the built-in http_req_failures metric.

I would really like to see a concrete proposal for the generic hook to solve this specific problem before making up my mind.

Here are some comments on the proposal from @na--

let requestFailures = new Rate('http_req_failures'); // redefinition of the built-in-metric should not be allowed like this (we have an issue for this)

http.setResponseCallback(function (response, metricsObject) {
    let requestFailed = response.status >= 399 || response.status === 0;
    requestFailures.add(requestFailed); // user responsible for manipulating a built-in metric feels wrong, but acceptable. Tags to this metric are also missing (status, url, etc)
    if (requestFailed) {
        metricsObject.doNotEmitMetrics(); // not convinced this is a good API :-) We do want to emit _some_ metrics.
    }
    // tagging of http_reqs is missing. I could not come up with a good suggestion for this. 
})

Even if we do come up with a nice implementation for the default hook, the user will have to copy-paste our entire default implementation just to change the status ranges (let's say 403 is acceptable by the user). It's not possible to extend the default implementation, it needs to be fully replaced by the user.

I like the http.setResponseCallback() API. That makes sense to me.

At the moment, I still think that boolean httpResponseSuccess() hook is easier for the user to understand and change. I think having more hooks is better UX than having _one hook to rule them all_, but I'm willing to be convinced by a nice API proposal.

I couldn't come up with API that was intuitive and didn't require the user to define/re-define the built-in http_req_failures metric.

I'll create an issue for a metric refactoring proposal, hopefully later today, for a prerequisite we need to implement https://github.com/loadimpact/k6/issues/1435, https://github.com/loadimpact/k6/issues/572, and https://github.com/loadimpact/k6/issues/1443 / https://github.com/loadimpact/k6/issues/961#issuecomment-557482834. In essence, it will make it so that it doesn't matter if the user has let requestFailures = new Rate('http_req_failures') in their script, or if we've defined it ourselves in the default preResponseCallback we have written in pure Go.

I would really like to see a concrete proposal for the generic hook to solve this specific problem before making up my mind.

Fair enough. I guess this is the next step for this issue, or maybe we should put this on hold and "go back" to https://github.com/loadimpact/k6/issues/1716 to discuss it.

Even if we do come up with a nice implementation for the default hook, the user will have to copy-paste our entire default implementation just to change the status ranges (let's say 403 is acceptable by the user). It's not possible to extend the default implementation, it needs to be fully replaced by the user.

I am not sure this is the case. It depends on how we design the API, of course, but I think it might be possible to make it reasonably composable, we'll see. And even if it's not possible, the default implementation I suggested above would be something like these whooping... 6 lines of code :smile:

let requestFailures = new Rate('http_req_failures')
http.setResponseCallback(function (response, metricsObject) {
    let requestFailed = response.status !== 403 && (response.status >= 399 || response.status === 0);
    requestFailures.add(requestFailed);
    if (requestFailed) {
        metricsObject.doNotEmitMetrics(); // not a final API, just illustrative :D
    }
})

Having to copy-paste them is not perfect, but it's reasonably compact and understandable and, most importantly, it lacks any magic at a distance... And we can easily write a jslib wrapper that just accepts the requestFailed condition as a lambda and deals with everything else, for people who want one-liners...

Sorry, @sniku, I completely missed the code comments you added. To respond:

redefinition of the built-in-metric should not be allowed like this (we have an issue for this)

I already discussed some of the current problems with metrics in that internal discussion we have, but I will elaborate more in the new k6 issue I plan to write today or on Monday about the metric refactoring that's a prerequisite for https://github.com/loadimpact/k6/issues/1435, https://github.com/loadimpact/k6/issues/572, and https://github.com/loadimpact/k6/issues/1443 / https://github.com/loadimpact/k6/issues/961#issuecomment-557482834. In summary, "creating" a new metric with the same name as a built-in system metric (e.g. new Rate('http_req_failures')) will just return the same metric (i.e. internally, a Go pointer to it) if the type matches, and throw an exception if it doesn't. https://github.com/loadimpact/k6/issues/1435 is the issue you probably meant.

user responsible for manipulating a built-in metric feels wrong, but acceptable. Tags to this metric are also missing (status, url, etc)

Also related to above, but the distinction between built-in and custom metric is quite artificial. And the API can be like this, Rate.add() accepts tags:

requestFailures.add(requestFailed, metricsObject.getTags());

not convinced this is a good API :-) We do want to emit _some_ metrics.
tagging of http_reqs is missing. I could not come up with a good suggestion for this.

I said it was illustrative, and I agree, it's not a good API. But what metrics exactly do you think we should emit? http_reqs?

In summary, "creating" a new metric with the same name as a built-in system metric (e.g. new Rate('http_req_failures')) will just return the same metric (i.e. internally, a Go pointer to it) if the type matches, and throw an exception if it doesn't. #1435 is the issue you probably meant.

If we must allow users to manipulate built-in metrics, wouldn't it be better to allow users to import them rather than redefine them?

import {http_req_duration} from 'k6/metrics/built-in';
http_req_duration.add(1);

Also related to above, but the distinction between built-in and custom metric is quite artificial.

I have a different opinion. Built-in metrics have some guarantees:

  • system tags are guaranteed to be present on built-in metrics.
  • users can't redefine the contains parameter.

k6 cloud depends on the first guarantee to work correctly.

I said it was illustrative, and I agree, it's not a good API. But what metrics exactly do you think we should emit? http_reqs?

I understand that it was a quick/initial mockup, but I really think we should get a reasonably final API proposal before deciding how to go forward. I sincerely doubt that the final implementation will be a "whooping... 6 lines of code".

  1. I would like to see how the metricsObject is structured. Does it hold all built-in metrics for the request? (http_reqs, http_req_duration, http_req_blocked, ...)
  2. Does k6 cloud get http_req_li_all inside of metricsObject while k6 run gets unagreggated metrics?
  3. Why isn't http_req_failures inside of metricsObject but must be outside?
  4. How can user tag http_reqs metric with a standard failed tag?
  5. How can user prevent only http_req_duration+friends from being emitted, but allow http_reqs to be emitted?

The fact that we put the burden of all this on the user doesn't feel like good UX to me, but again, I would like to see a good API proposal before I agree/disagree :-)

I like the import of metrics, though that's not going to solve the issue of add()-ing metrics without all of the required tags. In that respect, import {http_req_duration} from 'k6/metrics/built-in' is exactly the same as new Trend('http_req_duration') returning the built-in metric, and the contains parameter won't be over-written in any case. To solve this issue, we probably need to expand https://github.com/loadimpact/k6/issues/1435 and validate that the required tags for the metrics are present.

I understand that it was a quick/initial mockup, but I really think we should get a reasonably final API proposal before deciding how to go forward.

Me too :smile: That was the whole reason for me pushing to define this task before we prioritize it... I guess now it's my responsibility to come up with this API, given that I dislike the original proposal and have suggested an alternative. I'll try to think about it some more and propose something over the next week, but here are some _very preliminary_ answers to the questions you posed:

  1. no idea yet, we probably won't even have a dedicated metricsObject as a second parameter; instead a parameter that's an object containing the response (which has the actual timings), something that controls what metrics will be emitted, as well as the system tags for the metrics.
  2. there would be no differences between k6 cloud and k6 run, the callback will get called with exactly the same data; however, because of our hack with http_req_li_all for k6 run -o cloud, we have to figure out how to accommodate httpext.Trail and aggregation in a non-breaking way (:heart: the tech debt :smile:)
  3. It might be, depending on exactly what the API ends up being. If it's a conceptual API working with metric types, it probably will be. If it's an API that directly deal with metric samples (i.e. values), it probably _can't_ be inside, since it'd be a bit of a chicken and egg problem - how would the http_req_failures value already exist when we're in the callback to determine it :smile:
  4. and 5. Not sure yet. I would have said by accessing the built-in metrics and sending values on them when suppressing the normal sending, but that would be too cumbersome. Will think about it some more.
  5. (I realized that I'd forgotten to respond to your original question about redirect chains) Yes, we should fire any hook on every request in a redirect chain, otherwise we're going to have a bunch of problems.

Anyway, I'll let this percolate somewhat and hopefully get back to you sometime next week.

So I'm just thinking a bit of how to maintain a JavaScript feel to it, and my suggestion might be way of. But here goes.

As for the interface for manipulating the response event (or what you'd call it) I would go for something like the following:

const redirect = new Counter("redirect")

http.on("response", (ev) => {
  console.log("default metric: ", ev.metric.name); // whatever k6 thinks it should be

  if (ev.request.url === "https://test.k6.io") {
    // You go ahead and handle this for me!
    return 
  }

  const status = ev.response.status

  if (status >= 400) {
    // Change ev.metric to http_reqs_failure
    ev.fail()
  } else if (status >= 300) {
    redirect.add(1)
    ev.preventDefault() // tell k6 to ignore this ev
  } else {
    // Change ev.metric to http_reqs_success
    ev.succeed()
  }
})

The above just makes sense to me as a JavaScript-developer. It matches how NodeJS and browsers handle these kinds of events. For instance, in the browser anchor.addEventListener("click", ev => ev.preventDefault()) tells the browser to ignore handling the click (the on/off naming is node-like and much nicer than `addEventListener).

We could also have specific events for success/failure in case you know that you only want to modify failure into success or vice versa. Could improve performance a bit, by not being a catch-all.

// An evil API that has reversed error with success status
http.on("failure", ev => {
  ev.succeed()
})

http.on("success", ev => {
  ev.fail()
})

The list goes on what would be possible:

http.on("request", ev => {
    // I'm lazy and don't want to add this everywhere
    ev.request.headers.add("X-Request-Id", uuid())

    // Add tags for every request
    ev.addTags({

    })
})

I could go on and on. ๐Ÿ˜ฌ

look at that, @allansson coming into the thread killing it with good suggestions! ;)

I agree with my proposal :). Thanks @na-- for bringing it up instead of I having to do it.

As for a compromise so that we don't have to have any JS callbacks for the first version (which likely will still not be in v0.31.0)

I propose we have an additional built-in callback that is exteremely "special" object that can't be used for anything else (at least for now). But you can register it as the "callback" of http.setResponseCallback or w/e we will end up calling it, as well as per request. This callback will be able to be used even after we start supporting JS ones and will be more performant. I actually meant to propose we have some built-in ones like this so that we don't impact performance as much for common cases, but now it seems like it will be good for other things as well ;).

My proposals for the one that should be included is:

http.setResponseCallback(http.expectedStatuses([200,299], [301,302], 308, 418); 

The arguments are either an array of 2 ints or an int. If an int they are just the status code we expect, if they are an array they are the inclusive start and end of a range of codes we expect. This one callback will be easy to implement, is likely going to cover 95% of all use cases ever(famouse last words), does not stop us from having JS callbacks later, will mean that the performance for this simple one will likely be quite close to what it would be if we just hardcode it.

I would say that this way we at least won't need to figure out what the arguments to the JS callback need to be now. While still not leaving us with more technical debt, except having to support the built-in one, which I would've argued in the first place.

When we have JS callbacks the code will just check whether it gets an built-in callback or a not built in one. Also possibly the built-in ones will be able to be called with the same arguments as the JS ones and do what they usually do, possibly being used inside a JS callback - not really certain if that will really be all that useful, as at least for performance - calling in JS -> GO -> JS is probably going to be more work then just the ifs that will be saved :shrug:.

:+1: for different API @allansson , we actually internally discussed that this will likely be useful for before doing requests, especially in redirect chains. So it is likely that we will need to either have to have multiple callback regsiters, or have an 'on' one.

I am somewhat of the opinion that it will be better if this is with a tag "failed" then with a new metric, and then omitting the failed request metrics, as this way you won't know the performance of your failing requests(or you won't be able to mark them as failed). I think that it is fairly importatnt to know that your failed request also don't take forever and will is likely to be helpful when finding out what is going on when they are failing to know that they are failing in 10 seconds instead of 1 or that the tls handshake took 20 seconds.

I would also again like to bring everybodies attention to the fact that k6 supports not only HTTP, so when we start moving this to other protocols, and add protocols, adding additional metrics or having HTTP specific names is probably not going to be ... great.

To actually address @sniku concerns with this:

some outputs don't ingest tags. It would not be possible to use this functionality with statsd
some "special case" handling would be required for displaying.|

This is also true for having more then 1 URL being hit by the script, as the URL is a tag again. So I am of the opinion that if tags aren't supported by an output, that output should probably not be recommended/supported.

http_reqs would be backwards incompatible unless we combine rate and counter into a new metric type.

http_reqs is still a Counter, it is just that it now has a tag "failed" and we have to have a way to print the number/percentage of failed/not failed in the terminal, but that doesn't make it a Rate. k6 will just need to make a Rate like thing/metric out of it when displaying

if they are an array they are the inclusive start and end of a range of codes we expect.

I'm strongly against this. Conditional logic in argument usage is... weird and reaaaally error-prone. Especially if you don't label it. If you really want to be able to mix entries with min-max style ranges, I'd definitely vote for having a separate object you could pass for that, with min and max as properties, like:

http.setResponseCallback(
  http.expectedStatuses(
    {min: 200, max: 299}, 
    308, 
    /* ... */
  )); 

If it takes an array as well, that array should be considered a list of single status codes.

Other than that, I think having an optimized, predefined status comparator makes perfect sense.

I like almost all the ideas @MStoykov brought up here :+1: :+1: :+1:

:heart: for being the voice of reason and cutting down the scope while not compromising on future extendability.

I'm also for the slight API modification proposed by @simskij

http.setResponseCallback(http.expectedStatuses([{min: 200, max: 299}, {min: 301, max: 302}, 308, 418]); 

It makes it less error-prone.

As far as I can tell, this proposal solves several issues:

  • it's going to be performant.
  • we don't need to implement a full-blown generic callback mechanism (which would likely delay the feature considerably)
  • it's user-friendly - it's easy to modify this callback without copy-pasting default callback code from k6
  • user can't shoot themselves (and us) in the foot by forgetting to tag or emit metrics.
  • it solves 95% of the problem and allows us to fix the remaining 5% later.

I also very much like the callback API proposed by @allansson :heart: but I think that proposal should be copy-pasted into https://github.com/loadimpact/k6/issues/1716 rather than be a part of this issue.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

na-- picture na--  ยท  3Comments

ppcano picture ppcano  ยท  3Comments

if-kenn picture if-kenn  ยท  4Comments

caalle picture caalle  ยท  4Comments

gushengyuan picture gushengyuan  ยท  3Comments