Kibana: [Logs UI] Create screen to set up analysis ML jobs

Created on 24 Jul 2019 · 18 Comments · Source: elastic/kibana

Summary

The ML-based log analysis functionality (#41505) requires initial setup by the user before it can become active. The screen that contains the setup controls will be shown when the user activates the "Analysis" tab while the ML jobs are not (yet) set up correctly.

Acceptance criteria

  • Mockups exist for the various screen states (in progress, success, error, etc.) including the transitions between them. elastic/kibana#41497 tracks the design work itself.
  • The screen is shown on the "Analysis" tab if the job status API reports that not all jobs are set up correctly.
  • The user is required to select a time range to limit the analysis to.
  • The user can choose to let the ML job run continuously as new data come in.
  • The ml plugin's module setup API is used to create the jobs (see elastic/kibana#42593 and implementation notes below).
  • A blocking loading indicator is shown while the creation is in progress.
  • If the creation fails, the user is informed and has the option to change the parameters before retrying.
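
As a rough illustration of the screen states these criteria imply, here is a minimal TypeScript sketch of a state reducer. The names (`SetupState`, `SetupAction`, `reduceSetupState`, `canSubmit`) are hypothetical and are not taken from the Kibana codebase; the actual implementation may model this differently.

```typescript
// Hypothetical screen states for the setup flow; names are illustrative only.
type SetupState =
  | { status: 'required' } // job status API reports that jobs are missing
  | { status: 'pending' } // creation in progress, blocking indicator shown
  | { status: 'succeeded' } // jobs created, analysis can be shown
  | { status: 'failed'; error: string }; // user may change parameters and retry

type SetupAction =
  | { type: 'submit' }
  | { type: 'success' }
  | { type: 'failure'; error: string };

function reduceSetupState(state: SetupState, action: SetupAction): SetupState {
  switch (action.type) {
    case 'submit':
      // Submitting is only allowed initially or when retrying after a failure.
      return state.status === 'required' || state.status === 'failed'
        ? { status: 'pending' }
        : state;
    case 'success':
      return state.status === 'pending' ? { status: 'succeeded' } : state;
    case 'failure':
      return state.status === 'pending'
        ? { status: 'failed', error: action.error }
        : state;
  }
}

// The setup controls are interactive only when no creation is in flight.
function canSubmit(state: SetupState): boolean {
  return state.status === 'required' || state.status === 'failed';
}
```

Keeping the transitions in one pure function makes the "retry after failure" path explicit and easy to test independently of the UI.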

Implementation notes

  • The ml plugin's module setup API at /api/ml/modules/setup/:moduleId will be enhanced by the ML team (#42409) to allow us to:

    • dynamically set the Elasticsearch index patterns without requiring a Kibana index pattern

    • override the timestamp and other field names with those matching our source configuration

    • The enhancement of this API by the ML team has a PR open at elastic/kibana#42946

    • The creation / adding of a ML Logs UI module is open at elastic/kibana#42872

  • The check for existence of the jobs is included in #41879.
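
To make the intended use of the enhanced endpoint more concrete, here is a hedged TypeScript sketch that builds a request for it. Only the route `/api/ml/modules/setup/:moduleId` comes from the notes above; the body field names (`indexPattern`, `timestampField`) and the helper `buildModuleSetupRequest` are assumptions for illustration, since the final API shape is still being worked out in the linked PRs.

```typescript
// Hypothetical override options; these field names are assumptions,
// not the confirmed request body of the ML module setup API.
interface ModuleSetupOverrides {
  indexPattern: string; // Elasticsearch index pattern, no Kibana index pattern needed
  timestampField: string; // timestamp field name from the source configuration
}

interface ModuleSetupRequest {
  path: string;
  body: ModuleSetupOverrides;
}

function buildModuleSetupRequest(
  moduleId: string,
  overrides: ModuleSetupOverrides
): ModuleSetupRequest {
  return {
    // Follows the /api/ml/modules/setup/:moduleId route named in the notes.
    path: `/api/ml/modules/setup/${encodeURIComponent(moduleId)}`,
    body: { ...overrides },
  };
}
```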

Mockups

:warning: This is only a preliminary mockup from the initial proposal, which will have to be refined according to the acceptance criteria.

[Image: preliminary mockup of the setup screen]

Logs UI logs-metrics-ui v7.4.0

All 18 comments

Pinging @elastic/infra-logs-ui

@hbharding do we have a more up-to-date mockup of the setup screen that we could include in the description?

What should go under advanced settings?

Also Time Range is a dropdown, right? What options go under there? I assume same as what you can select in the Stream but just want to confirm.

@Zacqary no "advanced settings" needed, that's a preliminary mockup for a general feel. @hbharding do we have a more up to date mockup for this?

Also, I'm not sure if we're coming out of the gate with the "field to categorize" -- @Kerry350 for 7.4 I think we should leave that out, right? Doesn't make much sense to choose that if we only show log rate which doesn't use it... but I may be missing something there.

@Zacqary

Here are the latest setup screen designs I can see on Figma:

Screenshot 2019-08-13 20 52 57

Screenshot 2019-08-13 20 53 30

@jasonrhodes is correct, there's no need for "Advanced settings" in this iteration. Also no need for selecting a field to categorise, as we're only providing the log rate analysis at this stage.

In fact, I don't think we need the user to configure anything at this stage, unless we wanted to let them configure the bucket span for the job. But I think we were always going to use 15 minutes, as that's what the ML team recommended.

It would be good to look at #43050 too. I've opened that up for review now. This is a React hook that Felix set up - it deals with all of the parameters needed to set up the new module that exists in #42872 (which is awaiting ML review). There's no need to get bogged down in that review, but it doesn't make sense to throw that work away - it can be used directly for implementation work on this ticket. This ticket should be a case of building out the UI that connects with that React hook.

To set up the module as it exists we need the sourceId, spaceId, indexPattern, timeField, and bucketSpan - all of these things are inferred (from the source configuration, active space, etc.), except for bucket span, which is set to 15 minutes.
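
A minimal sketch of how those five values might be assembled, assuming a simplified source configuration shape. The `SourceConfiguration` interface, the `inferSetupParams` helper, and the choice of milliseconds as the bucket span unit are all assumptions for illustration; the hook in #43050 is the actual reference.

```typescript
// Hypothetical, simplified view of the source configuration.
interface SourceConfiguration {
  id: string;
  logIndexPattern: string;
  timestampField: string;
}

interface ModuleSetupParams {
  sourceId: string;
  spaceId: string;
  indexPattern: string;
  timeField: string;
  bucketSpan: number; // milliseconds (the unit is an assumption)
}

// Fixed bucket span of 15 minutes, as recommended by the ML team.
const FIFTEEN_MINUTES_MS = 15 * 60 * 1000;

function inferSetupParams(
  source: SourceConfiguration,
  spaceId: string
): ModuleSetupParams {
  return {
    sourceId: source.id,
    spaceId,
    indexPattern: source.logIndexPattern,
    timeField: source.timestampField,
    bucketSpan: FIFTEEN_MINUTES_MS, // not user-configurable at this stage
  };
}
```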

For clarification:

What I'd expect to see at this stage is basically explanatory text about what's going to happen (we're going to set up some ML jobs), a button to explicitly confirm that action, and a blocking loading state whilst this setup is happening.

There's a lot going on in Figma so I'm always nervous to dive in there haha. Thanks for the shots. I thought the "time range" was something the ML job needed to know how far back to consider for data or something along those lines, but is that not right?

If we don't need time range then yeah, this ticket becomes a simple block of text + a button that uses the work done in #43050 to make the ML API call ...

@jasonrhodes

I thought the "time range" was something the ML job needed to know how far back to consider for data or something along those lines, but is that not right?

Ah, yes, you're right. There were two parts to this: A) how far back to consider data from and B) whether to collect data indefinitely or use an end date. These are the start and end dates of the datafeed when using the actual ML UI. I can't see anything in our module definition that pertains to start / end dates, so I'm not sure if by default that's set to do "from now, and indefinitely". Let me take a look into this bit.

@Kerry350
If you don't specify a start time, the job will start at the beginning of the data in the index.
If you don't specify an end time, it will continue running until stopped and any new data added to the index will be analysed.

Okay, so, the setup endpoint takes in the start and end directly.

The JSON body in #43050 should just need to be extended to include a start and end (Felix had said that hook PR was to be used as inspiration, rather than a perfect finished product, which is why it was a draft, so this is something I've overlooked).

I think in terms of UI it's probably better to have an explicit start calendar picker and end calendar picker. If start isn't selected it will gather data from the start of the index, and if end isn't selected data will be collected indefinitely. So they're both optional.

@jgowdyelastic beat me to it 😅

(We could of course just use the defaults - start of index and continuously collecting data - and not add these parts to the UI to keep it even simpler. But I'm thinking this might put people off using the feature altogether if they have large indexes, or just want to glean information from a very specific point in time).
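
The optional start and end described above could be handled roughly as follows. This is a sketch only: `AnalysisTimeRange` and `buildTimeRange` are hypothetical names, and the ordering check is an added assumption rather than something the endpoint is known to require. Unset values are left out entirely so that the defaults (start of the index, run continuously) apply.

```typescript
// Both bounds are optional; an omitted bound falls back to the ML default.
interface AnalysisTimeRange {
  start?: number;
  end?: number;
}

function buildTimeRange(start?: number, end?: number): AnalysisTimeRange {
  // Assumed sanity check: a range that ends before it starts is rejected.
  if (start !== undefined && end !== undefined && start >= end) {
    throw new Error('start must be earlier than end');
  }
  const range: AnalysisTimeRange = {};
  if (start !== undefined) {
    range.start = start; // otherwise: beginning of the data in the index
  }
  if (end !== undefined) {
    range.end = end; // otherwise: keep running as new data comes in
  }
  return range;
}
```

Omitting the keys, rather than sending `null` or `undefined`, keeps the serialized JSON body free of fields the user did not set.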

So to clarify:

  • Use the Figma designs
  • But use #43050 to set the value of the time range field instead of making it an input field

Is that about right?

Actually I'm re-reading this and realizing we don't actually need any input fields. Should I just replace the Get Started button with one that says Create ML job and ignore the second config screen?

@Zacqary

Use the Figma designs, but more for inspiration than anything. We don't need to adhere to the designs 100% at the moment. And utilise EUI components as much as possible.

The screen should allow you to configure an optional start and end. The hook in #43050 needs a slight extension to accept a start and end parameter; the endpoint itself is already set up to accept them, as explained above.

We should have some explanatory text about what is happening, i.e. you're setting up machine learning jobs to analyse your logs for anomalies.

The button should be disabled whilst this setup takes place.

Yeah, one screen will work for now.

Are start and end in the form of seconds or milliseconds? Or an ISO string?

@Zacqary milliseconds
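
Since the values are expected in milliseconds, a date picker value would need converting before it is sent. A small sketch, assuming the picker yields either a `Date` or an ISO 8601 string; `toEpochMillis` is a hypothetical helper name.

```typescript
// Converts a picker value to milliseconds since the Unix epoch.
function toEpochMillis(value: Date | string): number {
  const millis = value instanceof Date ? value.getTime() : Date.parse(value);
  if (isNaN(millis)) {
    // Both an invalid Date and an unparseable string end up here.
    throw new Error(`Invalid date value: ${String(value)}`);
  }
  return millis;
}
```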
