Describe the feature:
As originally mentioned in #74840, it might be interesting to provide the new ES client and a scoped Saved Objects client in addition to the current callCluster we provide to the fetch method.
@chrisronline recently suggested it would be very useful for them to have access to the req: KibanaRequest when their collector is triggered, so they can properly scope other APIs like the Alerting one.
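For illustration, the extended fetch context could look roughly like this (a TypeScript sketch; the interface name and the optional kibanaRequest field are assumptions, not a committed API):

```ts
import type {
  ElasticsearchClient,
  KibanaRequest,
  LegacyAPICaller,
  SavedObjectsClientContract,
} from 'kibana/server';

// Illustrative only: what each collector's fetch(context) could receive.
interface CollectorFetchContext {
  callCluster: LegacyAPICaller; // the current legacy API
  esClient: ElasticsearchClient; // the new ES client
  soClient: SavedObjectsClientContract; // scoped Saved Objects client
  kibanaRequest?: KibanaRequest; // only set when collection is request-driven
}
```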
+1 for adding the ES client and SO client helpers to the collector.
We trigger the collector from the server side, so in most cases we are not bound to a requesting user.
Generally I'd say it's not a good design to send user-scoped data back to telemetry. I'd love to hear more about the use case you have in mind for this @chrisronline
We'd love to collect telemetry on what monitoring-specific alerts exist for users. We are interested in an overall count, but in order to get this data from the alerting APIs, we need a valid user-scoped request object. While I understand that this type of data is user-specific (and could be different from user to user), we still think it's valuable.
Right now, the alerting team collects some telemetry data that looks like:
"alerts" : {
"count_total" : 7,
"count_by_type" : {
"monitoring_alert_cluster_health" : 1,
"monitoring_alert_logstash_version_mismatch" : 1,
"monitoring_alert_license_expiration" : 1,
"monitoring_alert_kibana_version_mismatch" : 1,
"monitoring_alert_nodes_changed" : 1,
"monitoring_alert_elasticsearch_version_mismatch" : 1,
"monitoring_alert_cpu_usage" : 1
},
"schedule_time" : {
"avg" : "60s",
"min" : "60s",
"max" : "60s"
},
"connectors_per_alert" : {
"min" : 0,
"avg" : 1,
"max" : 1
},
"count_active_total" : 7,
"count_active_by_type" : {
"monitoring_alert_cluster_health" : 1,
"monitoring_alert_logstash_version_mismatch" : 1,
"monitoring_alert_kibana_version_mismatch" : 1,
"monitoring_alert_license_expiration" : 1,
"monitoring_alert_nodes_changed" : 1,
"monitoring_alert_elasticsearch_version_mismatch" : 1,
"monitoring_alert_cpu_usage" : 1
},
"throttle_time" : {
"min" : "86400s",
"avg" : "86400s",
"max" : "86400s"
},
"runs" : 8,
"count_disabled_total" : 0
}
It does offer a total count but, as you can see, the monitoring alerts span multiple types, so we'd need a scripted field in the telemetry cluster's Kibana to properly sum them up. However, that scripted field would also need to know the names of all the monitoring alert types, a list that will grow over time.
We already maintain the list of alert types explicitly, so for us it'd be nice to be able to call the alerting APIs from our usage collector, provide the array of alert types, and get back a count of which ones are enabled.
In the future, we'd also like to get configuration details from each one (like if the user changed the default threshold to something else).
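For example, a collector could do something along these lines (a sketch; the find options and the returned fields are what I'd expect from the alerts client, but treat the exact shapes as assumptions):

```ts
// The list of monitoring alert types we already maintain explicitly.
const MONITORING_ALERT_TYPES = [
  'monitoring_alert_cluster_health',
  'monitoring_alert_cpu_usage',
  // ...and the rest of the list
];

// Minimal structural type for the alerts client; the real contract lives in
// the alerting plugin and may differ.
interface AlertsClientLike {
  find(args: {
    options: { perPage: number };
  }): Promise<{ data: Array<{ alertTypeId: string; enabled: boolean }> }>;
}

async function fetchMonitoringAlertUsage(alertsClient: AlertsClientLike) {
  const { data } = await alertsClient.find({ options: { perPage: 10000 } });
  const monitoringAlerts = data.filter((alert) =>
    MONITORING_ALERT_TYPES.includes(alert.alertTypeId)
  );
  return {
    count_total: monitoringAlerts.length,
    count_enabled: monitoringAlerts.filter((alert) => alert.enabled).length,
  };
}
```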
@Bamieh from the technical POV, even on the server side, the Alerting API requires the request context in the methods they expose on their Start contract.
I wonder if we can create a _fake_ KibanaRequest when the collection is server-driven (maybe the @elastic/kibana-platform team can drive us in the right direction here?)
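For reference, a fake request would presumably look something like what the alerting task runner already builds internally: a minimal object cast to the type (a sketch only; the property choices are assumptions based on what scoped clients typically read, and the cast bypasses type safety, which is exactly what a platform-sanctioned factory should replace):

```ts
import type { KibanaRequest } from 'kibana/server';

// Sketch of a minimal fake request for server-driven collection.
function createFakeKibanaRequest(): KibanaRequest {
  return ({
    headers: {}, // no user credentials attached
    getBasePath: () => '',
    path: '/',
    route: { settings: {} },
    url: { href: '/' },
    raw: { req: { url: '/' } },
  } as unknown) as KibanaRequest;
}
```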
FWIW, the alerting team themselves have a need for something like:
I wonder if we can create a fake KibanaRequest when the collection is server-driven
in https://github.com/elastic/kibana/issues/59813, so it might be worth collaborating on a common solution.
Thanks for explaining. I agree; I think we'd need to collaborate with @elastic/kibana-alerting-services on a solution.
I added a note here: https://github.com/elastic/kibana/issues/59813#issuecomment-689708389 so we consider or revisit this issue at the same time.
Related issue: https://github.com/elastic/kibana/issues/74840
In https://github.com/elastic/kibana/pull/82638#discussion_r518644017 we've identified another possible use case that needs the KibanaRequest. In this case, ideally, it would be provided only when it's available (i.e., when an actual unencrypted API request triggers the collection).
This doesn't change the situation with the Alerting APIs (since they still require it for every request and KibanaRequest does not always exist).
I wonder if we can create a fake KibanaRequest when the collection is server-driven (maybe the @elastic/kibana-platform team can drive us in the right direction here?)
That's part of https://github.com/elastic/kibana/issues/39430. It has been opened for quite a long time now, but we are supposed to tackle this during 7.12.
I'm not a fan of exposing the KibanaRequest to the usage collectors, but it's possible that I'm lacking some context here.
Do we _really_ need to execute these requests with the credentials of the KibanaRequest, or do we merely need the request in order to properly create Kibana's various service clients?
If the former, then I need some help understanding the benefits. It seems like you'd end up with wildly inconsistent usage data if it's all scoped to the current user's privileges, and so I'm having a hard time seeing the benefit to capturing that at all. Is bad data better than no data?
If the latter (we want service clients created), then my preference is to wait for #39430, where we can solve this properly.
Inconsistent data/data quality issues aside, my other primary concern is that consumers will not consider the end user's credentials when building their usage collectors. It's very possible that the user "chosen" to transmit telemetry won't have access to much of anything themselves. As an example, you can create a user who only has read access to Advanced Settings, and _nothing else_. If you accidentally or intentionally used their KibanaRequest to calculate usage data, then you might very well end up with authorization errors, and/or an empty data set.
Thank you @legrego for your comments. I feel like we are missing a bit of context in this issue to fully describe the current behaviour and the reason why we are considering exposing the KibanaRequest:
At the moment, the fetch method in each collector receives some common clients: the ES client (both the new one and the deprecated legacy one) and the SO client. All of them are properly scoped.
In our efforts to be transparent about the data that is sent to our remote telemetry service, we offer the users a way to retrieve the telemetry sample that we would send via the Advanced Settings page.
There is the risk that some users with limited access might gain access to some information they shouldn't see (i.e., if the user doesn't have access to Stack Monitoring, they shouldn't be able to see any Monitoring-related stats). For that purpose, when requesting the sample in this way, we _scope_ the response to what that user can actually see.
However, when requesting the telemetry payload that will be sent to the remote telemetry service, the clients are scoped to the Kibana Internal user. This payload is encrypted on the server-side and decrypted on the remote telemetry service. This way we _mask_ the data when using the browser sending mechanism for those users with limited access.
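In other words, the usage collection service picks the scoping itself, conceptually like this (a minimal sketch; createInternalRepository returns a repository rather than a client, but it illustrates the split):

```ts
import type { CoreStart, KibanaRequest } from 'kibana/server';

// Conceptual sketch of how the service scopes the clients it hands to fetch.
function getFetchClients(core: CoreStart, request?: KibanaRequest) {
  if (request) {
    // Browser-requested, unencrypted sample: scope to the requesting user.
    return {
      esClient: core.elasticsearch.client.asScoped(request).asCurrentUser,
      soClient: core.savedObjects.getScopedClient(request),
    };
  }
  // Encrypted payload for the remote service: Kibana internal user.
  return {
    esClient: core.elasticsearch.client.asInternalUser,
    soClient: core.savedObjects.createInternalRepository(),
  };
}
```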
Why KibanaRequest?
You might be wondering: if fetch is already receiving the scoped clients, why should we provide KibanaRequest as well?
The answer to that: we've noticed a couple of situations where collectors need custom clients (e.g., using the alerting APIs or, as recently discovered, some Monitoring collectors using their own custom ES client with access to the .monitoring index).
my other primary concern is that consumers will not consider the end user's credentials when building their usage collectors.
That statement totally resonates with us, and we are considering the best way to discourage its use so that it's only used when "the dev knows what they're doing".
Either via documentation (README and TSDocs), or even by extending the API to require an additional config parameter in order to include it in the context (plus still documenting why devs should only use it with caution).
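The opt-in could look something like this (extendFetchContext is a hypothetical parameter name, and countAlertsFor is a made-up helper for the sketch):

```ts
import type { KibanaRequest } from 'kibana/server';

// Stand-ins so the sketch is self-contained; the real contract is
// UsageCollectionSetup from the usage_collection plugin.
declare const usageCollection: {
  makeUsageCollector<T>(config: {
    type: string;
    isReady: () => boolean;
    extendFetchContext?: { kibanaRequest: boolean };
    fetch: (context: { kibanaRequest?: KibanaRequest }) => Promise<T>;
  }): unknown;
};
declare function countAlertsFor(request: KibanaRequest): Promise<number>;

interface MyUsage {
  alert_count: number;
}

usageCollection.makeUsageCollector<MyUsage>({
  type: 'my_plugin',
  isReady: () => true,
  // Hypothetical opt-in: the collector must explicitly ask for the request,
  // so the risk is visible in code review instead of buried in docs.
  extendFetchContext: { kibanaRequest: true },
  fetch: async ({ kibanaRequest }) => {
    // kibanaRequest is undefined when the collection is server-driven,
    // so collectors must handle both cases.
    if (!kibanaRequest) {
      return { alert_count: 0 };
    }
    return { alert_count: await countAlertsFor(kibanaRequest) };
  },
});
```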
I hope this additional context helps understand the whys. The question still remains though: Should we take that risk? Or are we OK with exposing that data to the users that might not be allowed to see that data?
@afharo I appreciate the additional context, thanks for taking the time to explain this to me.
There is the risk that some users with limited access might gain access to some information they shouldn't see (i.e., if the user doesn't have access to Stack Monitoring, they shouldn't be able to see any Monitoring-related stats). For that purpose, when requesting the sample in this way, we scope the response to what that user can actually see.
That makes sense, and I'm happy to see that we're taking this into consideration. Based on my limited exposure to the usage collectors that plugins register, I assume that this is handled transparently by the usage collection service? In other words, the individual collectors don't need to know when they're retrieving data for the unencrypted sample payload vs the actual encrypted payload?
That statement totally resonates with us, and we are considering the best way to discourage its use so that it's only used when "the dev knows what they're doing".
Either via documentation (README and TSDocs), or even by extending the API to require an additional config parameter in order to include it in the context (plus still documenting why devs should only use it with caution).
❤️ Happy to hear this. I'm a fan of forcing consumers to explicitly request the KibanaRequest via an additional config parameter (and documenting when it's safe to use). Docs are immensely helpful if you seek them out, but they don't provide any safeguards.
The question still remains though: Should we take that risk? Or are we OK with exposing that data to the users that might not be allowed to see that data?
My understanding is that this wouldn't expose data to unauthorized users. If a usage collector creates a scoped client using the KibanaRequest, then it shouldn't be possible to return data that the user can't see. I think the larger fear is that an engineer will "test" this by looking at the unencrypted payload, and assume that it'll also work for the encrypted payload. To be fair, we run that risk today, but the risk is contained to a single location since you're providing the scoped clients to all consumers, instead of asking each consumer to create their own.
If my understanding is correct (that we don't expose data to unauthorized users), then I don't have a strong objection from a security standpoint.
I'm certainly not one to dictate the design, and you may have already considered this, but this sounds like a problem that the ContextContainer was designed to solve. (cc @elastic/kibana-platform). This would allow plugins like Alerting to provide scoped clients to all usage collectors, so they wouldn't have to create their own.
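A rough sketch of what that could look like (treat the names and signatures as illustrative; the real contract is core's IContextContainer, and the declared stand-ins below would come from the plugins' lifecycles):

```ts
import type { CoreSetup, KibanaRequest } from 'kibana/server';

// Stand-ins so the sketch is self-contained.
declare const coreSetup: CoreSetup;
declare const alertingOpaqueId: symbol;
declare const usageCollectionOpaqueId: symbol;
declare const alertingStart: {
  getAlertsClientWithRequest(request: KibanaRequest): unknown;
};
declare const collector: { fetch(context: unknown): Promise<unknown> };

// Core sets up a container that plugins can contribute context to.
const container = coreSetup.context.createContextContainer();

// Alerting registers a provider once, so every collector gets a
// request-scoped alerts client without building its own:
container.registerContext(alertingOpaqueId, 'alerting', (context, request) => ({
  getAlertsClient: () => alertingStart.getAlertsClientWithRequest(request),
}));

// The usage collection service wraps each collector's fetch in a handler
// that receives the assembled context:
const fetchWithContext = container.createHandler(
  usageCollectionOpaqueId,
  (context, request) => collector.fetch(context)
);
```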
For Stack Monitoring, I think the root cause of our issue is that the data we want to report on isn't really user-specific, but it lives in a system that can only be accessed in a user-specific way. I know the alerting team is working on making alerts that exist outside the scope of a user, so once that is better supported, we might have a better way of getting this data without needing a request object.