Kibana: [ML] Solutions: Request for mock ML job data for testing

Created on 3 Sep 2020 · 9 comments · Source: elastic/kibana

Request for mock ML job data for use in plugins integrating with ML.

As we work toward being more Solutions-oriented, more plugins are integrating with ML. It would be great to provide mock data for plugins to use when they create their own functional tests - there is no need for them to actually go through the whole ML job flow, as we cover that in the ML plugin tests.

cc @pheyos

:ml Functional Testing v7.11.0

All 9 comments

Pinging @elastic/ml-ui (:ml)

Hello! When we want to test anomalies and other ML integrations in Observability, we often just want to make sure our graphs render correctly given expected data. To that end, it'd be great to have a way to access fixtures of well-tested, maintained API response data for the various endpoints that we as solutions interact with, so we can use those instead of setting up real jobs, waiting for them to produce results, etc.

In addition, it would also be nice to have document templates of some kind, or some other way to write data into the right indices so that these API endpoints produce the desired results, for times when we want to set up a cluster that is already in a given state rather than hoping and/or waiting for ML jobs to produce that state reliably.

I think these are two different use cases and I'm happy to provide more information on both! Thanks!!

Update: since there doesn't appear to be an API for retrieving anomalies, mock API response data is probably not going to help our anomaly cases. We appear to be using the mlAnomalySearch method, which looks like it is mostly a thin wrapper around the ES client, so we are more likely to need document templates so that we can fill ES with anomaly data.

For added context, we are already able to inject anomaly data by manually loading documents into the results index. It would be nice, though, to somehow generate them such that they are guaranteed to be consistent.
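
For illustration, here is a minimal sketch of what that manual injection can look like, using the Elasticsearch JavaScript client. The index name, field values, and the idea of writing to the shared results index are assumptions based on how ML record results are typically stored; as noted later in this thread, the structure of the .ml-* indices is an implementation detail and can change between versions.

```ts
// Sketch: manually inject a synthetic anomaly "record" result document.
// Assumptions: the job writes to the shared results index (.ml-anomalies-shared)
// and the document shape matches the current ML record result format.
import { Client } from '@elastic/elasticsearch';

const es = new Client({ node: 'http://localhost:9200' });

async function injectAnomalyRecord(jobId: string, timestamp: number, score: number) {
  await es.index({
    index: '.ml-anomalies-shared', // implementation detail, may change between versions
    body: {
      job_id: jobId,
      result_type: 'record',
      timestamp,                   // epoch millis, start of the anomalous bucket
      bucket_span: 900,            // seconds, must match the job's bucket_span
      detector_index: 0,
      record_score: score,         // e.g. 90 to simulate a critical anomaly
      initial_record_score: score,
      probability: 0.000001,
      is_interim: false,
    },
    refresh: 'wait_for',           // make the document visible to the next search
  });
}
```

Whether the UI actually picks such a document up still depends on the referenced job existing, so this only covers the "results are already there" half of the problem.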

An even bigger problem is getting the jobs themselves into the desired state for testing. This includes, but is not limited to:

  • forcing specific stages in the life-cycle
  • forcing a "memory limit reached" situation
  • forcing categorizer warnings to occur
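
As an illustration of the "memory limit reached" case, one real lever is the job's analysis_limits.model_memory_limit setting: configuring it very low and then feeding enough high-cardinality data typically drives the job's memory_status to hard_limit. The job id, detector, and field names below are made up for the example.

```ts
// Sketch: a job configuration intended to provoke a memory-limit condition.
// Setting model_memory_limit to the minimum and using a high-cardinality
// partition field usually pushes memory_status from "ok" to "hard_limit".
const lowMemoryJobConfig = {
  job_id: 'test-memory-limit',           // hypothetical job id
  analysis_config: {
    bucket_span: '15m',
    detectors: [
      { function: 'count', partition_field_name: 'host.name' },
    ],
    influencers: ['host.name'],
  },
  data_description: { time_field: '@timestamp' },
  analysis_limits: {
    model_memory_limit: '1mb',           // minimum allowed value, hits the limit quickly
  },
};

// Create the job via the ML APIs (PUT _ml/anomaly_detectors/test-memory-limit),
// start a datafeed over a dataset with many distinct host.name values, then poll
// the job stats until model_size_stats.memory_status reports "hard_limit".
```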

cc @pheyos

it would be nice, though, to somehow generate them such that they are guaranteed to be consistent.

For any ML mock data exercise to have a chance of being effective, we would need mock data coming out of the agent(s) so we can close the circle. Do we know if this is available?

In the case of logs we can generate documents or load them from a fixture.
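
For the log case, a minimal sketch of generating documents programmatically could look like the following; the index name, field names, and message shape are made up for the example and would need to match whatever the ML job and datafeed actually expect.

```ts
// Sketch: generate synthetic log documents spread over a time range and bulk-index them.
import { Client } from '@elastic/elasticsearch';

const es = new Client({ node: 'http://localhost:9200' });

async function generateLogs(index: string, from: number, to: number, count: number) {
  const step = (to - from) / count;
  const body = [];
  for (let i = 0; i < count; i++) {
    body.push({ index: { _index: index } });
    body.push({
      '@timestamp': new Date(from + i * step).toISOString(),
      'host.name': `host-${i % 5}`,                            // a handful of hosts
      'log.level': i % 100 === 0 ? 'error' : 'info',           // sprinkle in some errors
      message: i % 100 === 0 ? 'ERROR something broke' : 'INFO all good',
    });
  }
  await es.bulk({ body, refresh: true });
}

// Example: one day of data, one document per minute.
// await generateLogs('filebeat-test', Date.now() - 86400000, Date.now(), 1440);
```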

Thanks for the feedback! We had a few discussions around this topic and here's a quick summary:

  • First of all: We understand the need for good and reliable test data, we're all in the same boat. But currently, we don't see a way without downsides, so this is just a first step and we'll continue to search for a better workflow (acknowledging this is not ideal).
  • We would like to avoid snapshotting .ml-* indices. The reason for this is that there are no guarantees about the structure of these indices; they're basically an implementation detail that we shouldn't rely on in functional UI tests. Things can easily change in these indices, even across minor versions, so it would be challenging to keep the snapshots up to date.
  • We don't want to mock API responses in end-to-end tests for a similar reason: the real API response could change and the tests wouldn't catch that, as they would still run against the old mock data.
  • For the moment, our suggestion is to actually run the ML jobs during tests with prepared test data and job configuration such that they give the expected results. As some details like anomaly scores can easily vary between runs, it's best not to check for exact values, but rather for something like "there's a critical anomaly in time range X" (a rough sketch of this pattern follows after this list).
  • There are already a number of helper methods in the ml.api service that let you create and run ML jobs as well as check job states and results. If you need additional methods in there, let us know.
  • We will work on adding more datasets and job configurations to cover more use cases. This will be on a general basis and won't necessarily suit all solutions, but we still expect to see some synergies.
  • We will create documentation with a collection of tips and tricks on how to, for example, downsample datasets for use in functional tests and how to modify data and/or configure jobs to force certain behavior (like an anomaly in a certain time range or a specific job state). This will help solutions teams, who are the experts on their data, prepare their tests.
  • Everyone is welcome to reach out if they need help in this process.
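
To make the suggested pattern concrete, here is a rough sketch of such an assertion. The get-records API (GET _ml/anomaly_detectors/&lt;job_id&gt;/results/records) is a real Elasticsearch ML API; the job id, time range, and threshold are placeholders, the 7.x JavaScript client response shape is assumed, and the @kbn/expect import assumes the test runs inside the Kibana repo. In an actual functional test, the equivalent ml.api service helpers should be checked and preferred where they exist.

```ts
// Sketch: after running an ML job over prepared test data, assert that at least one
// critical anomaly exists in the expected time range instead of checking exact scores.
import { Client } from '@elastic/elasticsearch';
import expect from '@kbn/expect';

const es = new Client({ node: 'http://localhost:9200' });

async function expectCriticalAnomalyInRange(jobId: string, start: string, end: string) {
  // GET _ml/anomaly_detectors/<job_id>/results/records with a record_score threshold
  const { body } = await es.ml.getRecords({
    job_id: jobId,
    body: {
      start,              // e.g. '2020-09-01T00:00:00Z'
      end,                // e.g. '2020-09-02T00:00:00Z'
      record_score: 75,   // "critical" severity threshold used by the ML UI
    },
  });

  expect(body.count).to.be.greaterThan(0);
}
```
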
  • We don't want to mock API responses in end-to-end tests for a similar reason: the real API response could change and the tests wouldn't catch that, as they would still run against the old mock data.

I understand this concern, but I wonder if it may be time for us to consider our Kibana APIs to be more than hidden implementation details and treat them as exposed APIs that are subject to some amount of backwards-compatible versioning. I know that in Logs and Metrics we will need to run tests that don't rely on running ML jobs. The idea with this ticket, from our end, was to avoid this very problem of stale data, because the mocks would be controlled by the ML team directly and updated as part of the overall process.
