Kibana: Make alert params and action config searchable

Created on 11 Nov 2019  ·  25 Comments  ·  Source: elastic/kibana

Alerting R&D Alerting Services discuss enhancement research

All 25 comments

Pinging @elastic/kibana-stack-services (Team:Stack Services)

Relates to "Make rules sortable, filterable, and aggregatable" in #50222.

This would be good. I've been discussing with users how they could report on their rules, e.g. in a dashboard or Canvas, which would need to aggregate on alert parameters. You can just about do something with scripted fields at the moment, but it's not ideal.

Also see this issue: Request for alerting internal tags structure. The gist is to add a new "tags" property to alerts, but would be separate from the existing tags property, which is used by customers. Let's call this new tags property internalTags. It's meant to be used by alert implementors to add their own tags, for whatever purpose they want, without conflicting with the tags customers use or having the customers see them in the UI.

Would having this be good enough to solve the requirements? I think it depends on how/what you want to search on.

I don't really want to go down the path of adding mappings for alertType-specific data, seems like the migration problem would be ... messy. And not sure what the other options are.

For some other efforts outside of alerting where we use tags both external and internal within a saved object structure we decided on using a leading underscore to designate that the internal tags should remain internal.

tags
_tags

fwiw. That Saved Object has nothing to do with alerting but figured I should mention it.

I've only just begun looking into this, but I want to summarise what we currently know, as it doesn't look like this will be easy to support and I want to make sure we have a good understanding of the context.

Current State

Before we talk about the need, let's just describe what we currently have.
Backing each alert we create a Saved Object with the following shape:

interface RawAlert extends SavedObjectAttributes {
  enabled: boolean;
  name: string;
  tags: string[];
  alertTypeId: string;
  consumer: string;
  schedule: SavedObjectAttributes;
  actions: RawAlertAction[];
  params: SavedObjectAttributes;
  scheduledTaskId?: string;
  createdBy: string | null;
  updatedBy: string | null;
  createdAt: string;
  apiKey: string | null;
  apiKeyOwner: string | null;
  throttle: string | null;
  muteAll: boolean;
  mutedInstanceIds: string[];
}

The fields we want to make searchable as part of this issue are params and actions, both of which implement the SavedObjectAttributes type, which means they are string-keyed key-value records that can be deeply nested.

On the face of it, querying by these internal objects is straightforward, but as this Saved Object needs to support all types of alerts and actions, we have a challenge: each alert type can give these fields a different shape. To support these multiple shapes we tell Elasticsearch not to create a _mapping_ for these objects, which means that querying by their shape isn't actually possible.
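To make the limitation concrete, here is a minimal sketch of what such a mapping could look like. The field list is abbreviated and is an assumption rather than the exact mapping Kibana ships; the relevant part is `enabled: false`, which tells Elasticsearch to keep the value in `_source` without parsing or indexing it. The `isSearchable` helper is hypothetical and only exists to illustrate the consequence.

```typescript
// Hypothetical, abbreviated mapping for the alert Saved Object type.
// The real mapping has more fields; only the shape matters here.
interface FieldMapping {
  type?: string;
  enabled?: boolean;
}

const alertMappings: Record<string, FieldMapping> = {
  enabled: { type: 'boolean' },
  name: { type: 'text' },
  tags: { type: 'keyword' },
  alertTypeId: { type: 'keyword' },
  // Kept in _source but never parsed or indexed, so nothing under
  // params can be used in a query, sort, or aggregation.
  params: { type: 'object', enabled: false },
};

// Returns true when a top-level field is mapped and indexed.
function isSearchable(field: string): boolean {
  const mapping: FieldMapping | undefined = alertMappings[field];
  return mapping !== undefined && mapping.enabled !== false;
}
```

With this sketch, `isSearchable('tags')` is true while `isSearchable('params')` is false, which is the gap this issue is about.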

The Need

That said, we'd like to be able to query for specific Alerts based on values that are stored in these fields, so that we can:

  1. Sort by them
  2. Filter by them
  3. Aggregate over alerts using these fields in some manner

Possible Solutions

So, the reason we can't currently support querying against these fields is clear, but there are a few approaches we could take to make these requirements possible. None of them is straightforward, so some discussion is needed to understand the cost-value ratio.

One assumption that I'm making for all of these is that we want to rely on ES for this and not do any of these operations in memory as that would make it inefficient and hard to support pagination.

Create a Saved Object type for each AlertType

This approach would mean that whenever a new AlertType is created we generate a brand new type of SavedObject with its own mapping.

There are a few challenges with this approach:

  1. We would have to handle an unknown number of SavedObject types, all of which need to be administered via the Alerting Framework.
  2. We will likely have to make changes to the SavedObjects Client APIs as they currently assume that you're only ever operating at the level of a single SO type, not multiple types.

There's also a clear limitation to this approach:
This will still not allow us to query based on the actions field, as actions would still have different shapes across the different SavedObject types and would have to remain with a disabled mapping.

@mikecote has already told me that there is a danger here of a _mapping explosion_ which I need to investigate further.

Enable dynamic mapping + create a deep object

Instead of splitting the SavedObject types between the AlertTypes, we can take an approach similar to what SavedObjects itself does.
While we would have one single SavedObject type of alert (as we do now) we can store each alert type's params and each action type's config under a corresponding key based on their unique ID and keep the mapping set to dynamic so that we can query against the internal shapes.

For example, this would mean that given an alert of type "example.always-firing" with an action of type ".index" you would store the data like so:

{
          "alert" : {
            "consumer" : "alerting",
            "alertTypeId" : "example.always-firing",
            "params" : {
              "example.always-firing" : {
                "numOfInstances" : 5
              }
            },
            "actions" : [
              {
                "actionTypeId" : ".index",
                "params" : {
                  ".index" : {
                    "documents" : [
                      {
                        "val" : "{{alertInstanceId}}"
                      }
                    ]
                  }
                },
                "actionRef" : "action_0",
                "group" : "default"
              }
            ],
            /// ... other fields
          },
          "type" : "alert",
          "references" : [
            {
              "name" : "action_0",
              "id" : "9aab14cd-87e1-43ca-98f3-1caa9723ce98",
              "type" : "action"
            }
          ],
          "updated_at" : "2020-05-11T14:48:15.017Z"
        }

Instead of what we currently do which is this:

{
          "alert" : {
            "consumer" : "alerting",
            "alertTypeId" : "example.always-firing",
            "params" : { 
              "numOfInstances" : 5
            },
            "actions" : [
              {
                "actionTypeId" : ".index",
                "params" : {
                  "documents" : [
                    {
                      "val" : "{{alertInstanceId}}"
                    }
                  ]
                },
                /// ... other fields
              }
            ],
            /// ... other fields
          },
          "type" : "alert",
          "references" : [
            {
              "name" : "action_0",
              "id" : "9aab14cd-87e1-43ca-98f3-1caa9723ce98",
              "type" : "action"
            }
          ],
          "updated_at" : "2020-05-11T14:48:15.017Z"
        }

You may note the addition of the "example.always-firing" and ".index" keys, as appropriate, under the params fields. This would allow us to enable _Dynamic Mapping_ in these fields, as their shapes will no longer clash across types.
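The change from the current shape to the proposed one is mechanical, so it can be sketched as a small transform. The types below are trimmed-down, hypothetical stand-ins for RawAlert, and `namespaceParams` is an illustrative name, not an existing framework function.

```typescript
// Hypothetical, trimmed-down shapes; the real RawAlert has more fields.
type Params = Record<string, unknown>;

interface AlertAction {
  actionTypeId: string;
  params: Params;
}

interface AlertDoc {
  alertTypeId: string;
  params: Params;
  actions: AlertAction[];
}

// Nest each params object under the ID of the type that owns it, so two
// types can never disagree about the mapping of the same field path.
function namespaceParams(alert: AlertDoc): AlertDoc {
  return {
    ...alert,
    params: { [alert.alertTypeId]: alert.params },
    actions: alert.actions.map((action) => ({
      ...action,
      params: { [action.actionTypeId]: action.params },
    })),
  };
}

const namespaced = namespaceParams({
  alertTypeId: 'example.always-firing',
  params: { numOfInstances: 5 },
  actions: [
    { actionTypeId: '.index', params: { documents: [{ val: '{{alertInstanceId}}' }] } },
  ],
});
```

One caveat worth checking: Elasticsearch treats dots in field names as object paths, so a key such as example.always-firing would be mapped as always-firing nested under example, which may or may not matter for this approach.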

But this too has challenges:

  1. Dynamic mapping uses the first type it encounters, so supporting fields that might change type between instances will be very tricky and might require AlertTypes to preemptively provide us with mappings of their own at setup time which we would then need to manually merge.
  2. It's not yet clear how _migrations_ might work in such a model and needs further investigation.
  3. Each AlertType will likely have to provide us with some mapping constraints of its own, as it might have some fields in its params that it does not want mapped because they change often (such as the documents field in the .index action, which will be different on every single instance!).
  4. Here too there's a danger of a mappings explosion that needs to be investigated.

Static mapping + flattened object

Another option is to standardise the shape of params across all AlertTypes such that each AlertType will specify the exact shape and types of their params and we'll merge these shapes together into one static shape which will be used to define the mappings of these params.

This is the simplest solution in terms of the mapping, but introduces a whole set of challenges in the framework:

  1. What do we do when two AlertTypes use the same name for a field?
  2. How do we handle migrations within a specific AlertType? And what about across all types?
  3. What happens if the shape is wrong? Do we validate ourselves? Rely on plugins?
  4. If this means dynamically merging mappings and types on the "way into the framework" and then exposing it as "portions" of this type on the "way out of the framework" back to the solution - are we introducing a lot of complexity that will be hard to maintain in the future?
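The first question in the list above can be made concrete with a small sketch. It assumes each AlertType declares the Elasticsearch type of every param field; the alert type IDs, field types, and the `mergeParamTypes` helper are all hypothetical.

```typescript
// Hypothetical: each AlertType declares the ES type of each param field.
type ParamTypes = Record<string, string>;

interface MergeResult {
  merged: ParamTypes;
  conflicts: string[];
}

// Merge the declarations of all alert types into one flat shape and
// record every field that two types declare with different ES types.
function mergeParamTypes(byAlertType: Record<string, ParamTypes>): MergeResult {
  const merged: ParamTypes = {};
  const conflicts: string[] = [];
  for (const [alertTypeId, fields] of Object.entries(byAlertType)) {
    for (const [field, esType] of Object.entries(fields)) {
      const existing: string | undefined = merged[field];
      if (existing === undefined) {
        merged[field] = esType;
      } else if (existing !== esType) {
        conflicts.push(`${field}: ${existing} vs ${esType} (${alertTypeId})`);
      }
    }
  }
  return { merged, conflicts };
}

const mergeOutcome = mergeParamTypes({
  'example.type-a': { index: 'keyword', risk_score: 'integer' },
  'example.type-b': { index: 'keyword', threshold: 'float' },
  'example.type-c': { threshold: 'keyword' },
});
```

Here `index` merges cleanly because both types agree on it, while `threshold` is reported as a conflict. The framework would then have to decide whether to reject the later registration, rename the field, or fail at startup.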

Challenges across the board

All of the above options will require changes in the SavedObjectsClient, as you can only sort/filter by a root field at the moment. Supporting "deep" fields inside of "objects" isn't currently supported, and doing so will require some further research.

Next steps

As you can see, none of these options is straightforward and there isn't a clear winner.

This all requires a lot more investigation. Having played around with the code locally, I think that the second option (Enable dynamic mapping + create a deep object) produces the most maintainable result, but it has potential issues that still need investigation, which is what I'll likely be looking into next.

If anyone has thoughts or concerns on these options (or perhaps a 4th option we can investigate) I'm all ears. :)

I'm worried about any solution that introduces new mappings for alertType-specific data, due to all the challenges pointed out ^^^.

It's not completely clear that we need an ES solution here; from the SIEM issue https://github.com/elastic/kibana/issues/50222:

Either that or a plain API (even if slower like a table scan) to abstract us away so we can natively to the actions/alerting objects would make it to where we don't have write our own hand rolled solutions.

I read "table scan" as "do as much of a query as you can with ES, then do the remaining filtering/mapping/aggs on the results in JS". That will work if the number of alerts is "reasonable", which I don't know if it is.
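As a rough sketch of the "remaining work in JS" half of that reading: assume Elasticsearch has already narrowed the result set using mapped fields (for example alertTypeId), and Kibana then sorts and counts on params in memory. The `AlertHit` shape and both helpers are hypothetical.

```typescript
// Hypothetical minimal alert shape for illustration.
interface AlertHit {
  id: string;
  params: Record<string, unknown>;
}

// Sort hits by a numeric param, highest first; hits without it go last.
function sortByNumericParam(hits: AlertHit[], key: string): AlertHit[] {
  const valueOf = (hit: AlertHit): number => {
    const raw = hit.params[key];
    return typeof raw === 'number' ? raw : Number.NEGATIVE_INFINITY;
  };
  return [...hits].sort((a, b) => {
    const va = valueOf(a);
    const vb = valueOf(b);
    return va === vb ? 0 : vb > va ? 1 : -1;
  });
}

// A terms-aggregation stand-in: count hits per distinct param value.
function countByParam(hits: AlertHit[], key: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const hit of hits) {
    const raw = hit.params[key];
    if (raw === undefined) continue;
    const bucket = String(raw);
    counts.set(bucket, (counts.get(bucket) || 0) + 1);
  }
  return counts;
}
```

Both helpers need every matching alert in memory before they can return anything, which is why this only works while the number of alerts stays "reasonable" and why pagination cannot be pushed down to Elasticsearch.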

I think we should also find out if having an "internal tags" via https://github.com/elastic/kibana/issues/58417 would be good enough for now. This would presumably be a parallel of the current tags structure, but not editable (or probably viewable) in the UI, only programmatically. Not nearly the same as ES fields, but may be useful enough for common needs.

There was also mention of scripted fields in the discussion above, but I'm not sure how we might use them.

@FrankHassanabad Would being able to query/filter/sort by a set of internal tags be enough for you?
It looks like making the params/config actually searchable would be a significant piece of work that we'd need to be cautious about before picking up.

FYI, scripted fields seems to work and maybe at least better than doing it in js.

--

Matthew Adams


FYI, scripted fields seems to work and maybe at least better than doing it in js.

Sorry but could you be more specific?
Given the limitations we have on making these fields searchable, how would you go about using scripted fields to achieve what SIEM are looking for?

table scan

I lifted that term from relational DBs, where the definition is:

(also known as a sequential scan) is a scan made on a database where each row of the table is read in a sequential (serial) order and the columns encountered are checked for the validity of a condition. Full table scans are usually the slowest method of scanning a table due to the heavy amount of I/O reads required from the disk which consists of multiple seeks as well as costly disk to memory transfers.[1]

[1] Ref: https://en.wikipedia.org/wiki/Full_table_scan

I don't know if there is an ES-adapted term, but basically any time I have to read all of the Saved Objects into memory, either through buffering or streaming, I count that as a table scan, and since it's done through network calls from ES -> Kibana it adds to the expense.

"do as much of a query as you can with ES, then do the remaining filtering/mapping/aggs on the results in JS"

Yeah that's our ultimate goal. We are hoping to avoid any operation that causes us to iterate over all the alerts in memory from ES -> Kibana due to:

  • Ugly boiler plate code we have to write
  • Increases odds of race conditions
  • More network bandwidth issues and more network error conditions to manage
  • performance issues
  • memory usage increase within Kibana

Would being able to query/filter/sort by a set of internal tags be enough for you?

Can we get aggs as well with arrays in Elasticsearch? The use case is that sometimes we need to do unique counts of items and display them. If this PR is merged we might have it?
https://github.com/elastic/kibana/pull/64002

If we were able to query/filter/sort/aggs, I think that covers all of our use cases for avoiding table-like scans.

As an aside:

If you still are going to allow mapping I would suggest combining two parts of your approaches above.

This part:

While we would have one single SavedObject type of alert (as we do now) we can store each alert type's params and each action type's config under a corresponding key based on their unique ID and keep the mapping set to dynamic so that we can query against the internal shapes.

and this part:

Another option is to standardise the shape of params across all AlertTypes such that each AlertType will specify the exact shape and types of their params and we'll merge these shapes together into one static shape which will be used to define the mappings of these params.

So that it becomes this:

While we would have one single SavedObject type of alert (as we do now) we can store each alert type's params and each action type's config under a corresponding key based on their unique ID. Each AlertType will specify the exact shape and types of their params and we'll merge these shapes together into one static shape which will be used to define the mappings of these params.

Then you have exact mappings and avoid mapping explosions and conflicts and the Saved Objects migration system takes over when we update our mappings. Then we are responsible as a team for migrating our mappings altogether.
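A minimal sketch of that combination, assuming each AlertType hands the framework an explicit field-to-type declaration at setup time. The `buildParamsMapping` helper, the alert type IDs, and the choice of `dynamic: false` are all assumptions made for illustration.

```typescript
// Hypothetical declaration format: field name -> ES field definition.
interface EsField {
  type: string;
}
type DeclaredParams = Record<string, EsField>;

interface ParamsMapping {
  dynamic: boolean;
  properties: Record<string, { properties: DeclaredParams }>;
}

// Build one static mapping for params in which every alert type owns its
// own sub-object, so identical field names cannot collide across types.
function buildParamsMapping(declared: Record<string, DeclaredParams>): ParamsMapping {
  const properties: Record<string, { properties: DeclaredParams }> = {};
  for (const [alertTypeId, fields] of Object.entries(declared)) {
    properties[alertTypeId] = { properties: fields };
  }
  // dynamic: false means undeclared fields are kept in _source but not mapped.
  return { dynamic: false, properties };
}

const combinedMapping = buildParamsMapping({
  'example.type-a': { threshold: { type: 'float' } },
  'example.type-b': { threshold: { type: 'keyword' } },
});
```

Both types declare `threshold` with a different type, yet neither overwrites the other because each declaration lives under its own key.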

For what it's worth, on the community forums and community Slack, users are opening up their .kibana saved objects by granting permissions to it and then creating dashboards of our rules, which include the params:
https://elasticstack.slack.com/archives/CNRTGB9A4/p1589221644238900

I know that might not be exactly what we may want this quickly but it is something we have to keep in mind that users are granting each other privileges to their saved objects index so they can write their own dashboards against static SIEM rules.

Since things are "beta" I think they would be OK with updates, but they might become frustrated if their dashboards are no longer possible. On the flip side, if this brings features they couldn't have before, such as sorting/filtering/querying/aggs, then they will be delighted even if we have to advise them on how to update their existing dashboards.

This might be an unsupported or discouraged thing that users are opening up saved object indexes? However I want to point it out as it's already happened.

FYI, scripted fields seems to work and maybe at least better than doing it in js.

Sorry but could you be more specific?
Given the limitations we have on making these fields searchable, how would you go about using scripted fields to achieve what SIEM are looking for?

If, for example, you create a filtered alias to pick just the rules from .kibana, then add an index pattern in Kibana with this scripted field using Painless:

def mitre = [];
for (value in params['_source']['alert']['params']['threat']) {
  mitre.add(value['tactic']['id']);
}
return mitre;

You can then aggregate on the tactic ID for reporting.
Matthew

Are sorting and aggregations absolutely necessary? If not, https://github.com/elastic/elasticsearch/issues/33003 is a potential solution worth evaluating.

New possible solution mentioned in option 4 of #67290. That issue explores options to solve Elasticsearch merging objects on update when the mapping has enabled: false.

This may not be a good approach, but it's worth mentioning. It probably doesn't solve sorting or aggregations, which would make https://github.com/elastic/elasticsearch/issues/33003 more of a solution worth evaluating, as @kobelb mentioned.

Change object structure to array

I noticed the alert's actions[x].params attribute doesn't have this problem. There is a possibility that storing alert params and action configs into an array structure would solve this problem and possibly also make them searchable.

The params value structure could be something like the following:

[
 {
   "name": "index",
   "value": "foo"
 },
 {
   "name": "timeField",
   "value": "@timestamp"
 }
]

This would allow a consistent mapping of name and value where value can be enabled: false but won't be impacted by this issue (due to being within an array). This would also require a saved object migration.
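The saved object migration mentioned above would essentially be a conversion between the two shapes. A minimal sketch, with hypothetical helper names, could look like this:

```typescript
// One entry of the proposed array structure.
interface ParamEntry {
  name: string;
  value: unknown;
}

// Convert the current object shape into the proposed array of entries.
function paramsToEntries(params: Record<string, unknown>): ParamEntry[] {
  return Object.entries(params).map(([name, value]) => ({ name, value }));
}

// Convert back, so alert types can keep working with a plain object.
function entriesToParams(entries: ParamEntry[]): Record<string, unknown> {
  const params: Record<string, unknown> = {};
  for (const entry of entries) {
    params[entry.name] = entry.value;
  }
  return params;
}

const paramEntries = paramsToEntries({ index: 'foo', timeField: '@timestamp' });
```

In this sketch only top-level keys become entries; nested values are carried over unchanged inside `value`. The reverse conversion would let the framework hide the storage shape from alert types.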

The further step that would be required to make the values searchable would be to split the values into different mapped fields. This would require orchestration between the field name and the value field.

[Screenshot: "Screen Shot 2020-05-25 at 1 47 15 PM"]

To query this data, it would look something like this:
[Screenshot: "Screen Shot 2020-05-25 at 2 00 19 PM"]

Are sorting and aggregations absolutely necessary?

Yes. We have params such as risk_score and a lot of users internal and external to our organization are asking why we cannot sort our tables or queries based on these important items.

This would be an awesome thing for uptime alerts as well, we will be able to determine which alerts are enabled or not.

we will be able to determine which alerts are enabled or not.

You can filter on that today, no? Using KQL

we will be able to determine which alerts are enabled or not.

You can filter on that today, no? Using KQL

@mikecote first we need to determine how we can find an alert via params. If params aren't searchable, we can't find an alert.
What are the alternative ways to find an alert? Can we add a custom tag to an alert while creating it? Or maybe specify an alert name in the flyout and not let the user change it?

@mikecote @gmmorris any update on this? Any plans for 7.10 or 7.11 for this?

@shahzad31 we haven't agreed on a path forward for this yet so we don't have plans to work on this yet. We are planning in the meantime to add support for internal tags in 7.11 (https://github.com/elastic/kibana/issues/58417).

@mikecote we are beginning to get a lot of feedback from internal Elastic rule creators as well as external users that our 200+ prebuilt detection engine rules are painful to sort and find in the Rules page. As we are expecting to continue to ship more prebuilt rules in each release, I'd like to discuss the possibility of putting this issue on the 7.11 priority list of the Kibana Alerting team.

cc: @arisonl @peluja1012 @spong @MikePaquette

@dontcallmesherryli We put it in our 7.12 tentative plan for now. We will keep the implementation discussion going until then to have an idea what approach to take.

This came up again when I talked to the o11y folks this week. I keep hoping that our "system tags" issue may end up solving this. @sqren noted that this would then end up being painful, trying to keep params AND tags up-to-date. It's at least more boilerplate-y code that alert clients would have to manage.

One thought on that would be to provide some kind of "helper" function instead of code, which given a params object would return a set of "field tags" or such. Here's an example for index threshold alert type, imagining you might want to eventually be able to search on the index used in the alert:

function fieldTags(params: IndexThresholdParams) {
  return {
    'index': params.index
  }
}

We would call this function (defined in the alert type, but it's optional of course), and then add these to the "system" tags with the resulting data.
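To illustrate the framework side of that idea, here is a sketch of how the helper's result might be turned into tags. The reduced `IndexThresholdParams` shape, the `toSystemTags` function, and the "key:value" encoding are assumptions; nothing like this has been agreed on in this thread.

```typescript
// Hypothetical params shape, reduced to the one field used below.
interface IndexThresholdParams {
  index: string;
}

// The optional helper an alert type would provide.
function fieldTags(params: IndexThresholdParams): Record<string, string> {
  return { index: params.index };
}

// Hypothetical framework-side step: flatten the helper's result into
// "key:value" strings that can live in a keyword-mapped tags array.
function toSystemTags(tags: Record<string, string>): string[] {
  return Object.entries(tags).map(([key, value]) => `${key}:${value}`);
}

const systemTags = toSystemTags(fieldTags({ index: 'metrics-*' }));
```

Because the framework would recompute the tags from params on every create and update, alert clients would not need to keep the two in sync by hand.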

Another issue with this is that my thinking on the "system tags" is that they certainly would not be updatable via HTTP, only via the in-process alert client, and it's possible we may not want them readable either. But I think we'd want these "field tags" to be readable/searchable via HTTP, so we can make use of them in Kibana.

cc @arisonl

We should do an investigation to see if https://www.elastic.co/guide/en/elasticsearch/reference/7.x/flattened.html can be used as a solution. Adding the research label.
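As a starting point for that investigation, here is a sketch of what a flattened mapping and a query against it might look like. The mapping fragment and the `paramTermQuery` helper are hypothetical; the `alert.params.` prefix assumes the Saved Object document layout shown earlier in this thread.

```typescript
// Hypothetical mapping fragment: params stored as one flattened field,
// so new keys never add entries to the index mapping.
const flattenedParamsMapping = {
  params: { type: 'flattened' },
};

// Build a term query on a single key inside the flattened params object,
// addressed with dot notation.
function paramTermQuery(key: string, value: string): { term: Record<string, string> } {
  return { term: { [`alert.params.${key}`]: value } };
}

const queryByIndex = paramTermQuery('index', 'foo');
```

One limitation to evaluate: leaf values in a flattened field are indexed as keywords, so sorting on something like risk_score would compare values lexicographically rather than numerically.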
