The ML-based log analysis functionality (#41505) requires initial setup by the user before it becomes active. The screen containing the setup controls will be shown when the user activates the "Analysis" tab while the ML jobs are not (yet) set up correctly.
The ml plugin's module setup API is used to create the jobs (see elastic/kibana#42593 and the implementation notes below). The API at /api/ml/modules/setup/:moduleId will be enhanced by the ML team (#42409) to allow us to:

:warning: This is only a preliminary mockup from the initial proposal, which will have to be refined according to the acceptance criteria.

Pinging @elastic/infra-logs-ui
@hbharding do we have a more up-to-date mockup of the setup screen that we could include in the description?
What should go under advanced settings?
Also, Time Range is a dropdown, right? What options go under there? I assume it's the same as what you can select in the Stream, but I just want to confirm.
@Zacqary no "advanced settings" needed, that's a preliminary mockup for a general feel. @hbharding do we have a more up to date mockup for this?
Also, I'm not sure if we're coming out of the gate with the "field to categorize" -- @Kerry350 for 7.4 I think we should leave that out, right? Doesn't make much sense to choose that if we only show log rate which doesn't use it... but I may be missing something there.
@Zacqary
Here are the latest setup screen designs I can see on Figma:

[Figma screenshots of the setup screen mockups]
@jasonrhodes is correct, there's no need for "Advanced settings" in this iteration. Also no need for selecting a field to categorise, as we're only providing the log rate analysis at this stage.
In fact, I don't think we need the user to configure anything at this stage - unless we wanted to let them configure the bucket span for the job, but I think we were going to always use 15 minutes, as that's what the ML team recommended.
It would be good to look at #43050 too. I've opened that up for review now. This is a React hook that Felix set up - it deals with all of the parameters needed to set up the new module that exists in #42872 (which is awaiting ML review). There's no need to get bogged down in that review, but it doesn't make sense to throw that work away - it can be used directly for implementation work on this ticket. This ticket should be a case of building out the UI that connects with that React hook.
To set up the module as it exists we need the sourceId, spaceId, indexPattern, timeField, and bucketSpan - all of these are inferred (from the source configuration, the active space, etc.), except for the bucket span, which is set to 15 minutes. There's a rough sketch of the resulting request below.
For clarification:
What I'd expect to see at this stage is basically explanatory text about what's going to happen (we're going to set up some ML jobs), a button to explicitly confirm that action, and a blocking loading state whilst the setup is happening.
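For illustration, the module setup request built from those inferred values could look roughly like this. This is a minimal sketch only: the module id (logs_ui_analysis), the job id prefix scheme, and the override fields are my assumptions here, not the final implementation - the hook in #43050 holds the real shape.

```typescript
// Hypothetical sketch of the module setup call. The module id, the prefix
// scheme, and the exact override fields are assumptions, not final.
interface SetupModuleArguments {
  sourceId: string;
  spaceId: string;
  indexPattern: string;
  timeField: string;
  bucketSpan: number; // in milliseconds, e.g. 15 * 60 * 1000
}

const setupLogAnalysisModule = async ({
  sourceId,
  spaceId,
  indexPattern,
  timeField,
  bucketSpan,
}: SetupModuleArguments) => {
  const response = await fetch('/api/ml/modules/setup/logs_ui_analysis', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'kbn-xsrf': 'true', // required by Kibana's HTTP API
    },
    body: JSON.stringify({
      prefix: `kibana-logs-ui-${spaceId}-${sourceId}-`, // assumed prefix scheme
      indexPatternName: indexPattern,
      startDatafeed: true,
      jobOverrides: [
        {
          job_id: 'log-entry-rate',
          analysis_config: { bucket_span: `${bucketSpan}ms` },
          data_description: { time_field: timeField },
        },
      ],
    }),
  });

  if (!response.ok) {
    throw new Error(`ML module setup failed with status ${response.status}`);
  }
  return response.json();
};
```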
There's a lot going on in Figma so I'm always nervous to dive in there haha. Thanks for the shots. I thought the "time range" was something the ML job needed to know how far back to consider for data or something along those lines, but is that not right?
If we don't need a time range then yeah, this ticket becomes a simple block of text + a button that uses the work done in #43050 to make the ML API call ...
@jasonrhodes
I thought the "time range" was something the ML job needed to know how far back to consider for data or something along those lines, but is that not right?
Ah, yes, you're right. There were two parts to this: A) how far back to consider data from, and B) whether to collect data indefinitely or use an end date. These are the start and end dates of the datafeed when using the actual ML UI. I can't see anything in our module definition that pertains to start / end dates, so I'm not sure whether that defaults to "from now, and indefinitely". Let me take a look into this bit.
@Kerry350
If you don't specify a start time, the job will start at the beginning of the data in the index.
If you don't specify an end time, it will continue running until stopped and any new data added to the index will be analysed.
Okay, so, the setup endpoint takes in the start and end directly.
The JSON body in #43050 should just need to be extended to include a start and end (Felix had said that hook PR was to be used as inspiration rather than a perfect finished product, which is why it was a draft, so this is something I'd overlooked).
I think in terms of UI it's probably better to have an explicit start calendar picker and an end calendar picker: if a start isn't selected, data will be gathered from the start of the index, and if an end isn't selected, data will be collected indefinitely. So they're both optional - see the sketch below.
@jgowdyelastic beat me to it 😅
(We could of course just use the defaults - start of index and continuously collecting data - and not add these parts to the UI to keep it even simpler. But I'm thinking this might put people off using the feature altogether if they have large indexes, or just want to glean information from a very specific point in time).
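To make that concrete, here's a hedged sketch of how the hook's request body could carry the optional time range. The field names start and end follow the endpoint behaviour described above; everything else (names, types) is a placeholder until #43050 is actually extended.

```typescript
// Sketch of the optional time range extension discussed above. Both values
// are epoch timestamps in milliseconds; leaving them undefined falls back to
// the endpoint's defaults (start of the index / collect indefinitely).
interface AnalysisSetupTimeRange {
  startTime?: number; // undefined => gather data from the start of the index
  endTime?: number; // undefined => keep collecting data indefinitely
}

const extendSetupBody = (
  baseBody: Record<string, unknown>,
  { startTime, endTime }: AnalysisSetupTimeRange
) => ({
  ...baseBody,
  // Only send the keys the user actually set, so the endpoint applies its
  // own defaults for the missing ones.
  ...(startTime !== undefined ? { start: startTime } : {}),
  ...(endTime !== undefined ? { end: endTime } : {}),
});
```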
So to clarify:
Is that about right?
Actually I'm re-reading this and realizing we don't actually need any input fields. Should I just replace the Get Started button with one that says Create ML job and ignore the second config screen?
@Zacqary
Use the Figma designs, but more for inspiration than anything. We don't need to adhere to the designs 100% at the moment. And utilise EUI components as much as possible.
The screen should allow you to configure an optional start and end. The hook in #43050 needs a slight extension to accept start and end parameters; the endpoint itself is already set up to accept them, as explained above and sketched below.
We should have some explanatory text about what is happening, i.e. you're setting up machine learning jobs to analyse your logs for anomalies.
The button should be disabled whilst this setup takes place.
Yeah, one screen will work for now.
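Pulling the above together, a rough sketch of that single screen could look like this. Hedged: setupModule stands in for the extended hook from #43050 (a hypothetical name), and the copy and labels are placeholders, not final design.

```tsx
import React, { useState } from 'react';
import { Moment } from 'moment';
import { EuiButton, EuiDatePicker, EuiFormRow, EuiText } from '@elastic/eui';

interface AnalysisSetupScreenProps {
  // stand-in for the setup function exposed by the hook in #43050
  setupModule: (start?: number, end?: number) => Promise<void>;
}

export const AnalysisSetupScreen: React.FC<AnalysisSetupScreenProps> = ({
  setupModule,
}) => {
  const [startTime, setStartTime] = useState<Moment | null>(null);
  const [endTime, setEndTime] = useState<Moment | null>(null);
  const [isSettingUp, setIsSettingUp] = useState(false);

  const createMlJob = async () => {
    setIsSettingUp(true);
    try {
      // .valueOf() converts the picker's moment object to epoch milliseconds
      await setupModule(startTime?.valueOf(), endTime?.valueOf());
    } finally {
      setIsSettingUp(false);
    }
  };

  return (
    <>
      <EuiText>
        <p>
          This will create machine learning jobs that analyse your logs for
          anomalies. Leave the dates empty to analyse data from the start of
          the index and to keep collecting indefinitely.
        </p>
      </EuiText>
      <EuiFormRow label="Start time (optional)">
        <EuiDatePicker selected={startTime} onChange={(date) => setStartTime(date)} />
      </EuiFormRow>
      <EuiFormRow label="End time (optional)">
        <EuiDatePicker selected={endTime} onChange={(date) => setEndTime(date)} />
      </EuiFormRow>
      <EuiButton fill isLoading={isSettingUp} isDisabled={isSettingUp} onClick={createMlJob}>
        Create ML job
      </EuiButton>
    </>
  );
};
```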
Are start and end in the form of seconds or milliseconds? Or an ISO string?
@Zacqary milliseconds
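In other words (a tiny hedged example; selectedStart here is a hypothetical moment object handed back by a date picker):

```typescript
// Convert a picked date to the epoch-millisecond value the endpoint expects.
const start: number = selectedStart.valueOf(); // e.g. 1565913600000 for 2019-08-16T00:00:00Z
```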