I'm opening this feature request as a follow-up to a conversation with @ruflin. Today users typically use numeric types (e.g. `long`, `float`, `scaled_float`) with a convention regarding units (sometimes made explicit in the name of the field, e.g. `transferred_bytes` or `duration_ms`) in order to store durations or byte sizes, but we could make the experience better by having native support for these fields in Elasticsearch:
+bytes_transferred:[1MB TO 1GB] +duration:[1s TO 1d]
One risk is that we end up with lots of feature requests to support distances, weights, etc. Where do we draw the line? It's been suggested that we only have one field type that we configure with what it is going to store, but that might not be practical given that some units have their own specificities, e.g. `k` means 1024 for byte sizes and 1000 for weights, and some durations are not fixed (months, years, etc.). At first sight it looks cleaner to have one type per unit, which doesn't mean they can't share code internally.
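To make the "one type per unit, shared code internally" idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the unit tables, the function names and the choice of milliseconds as the base duration unit are hypothetical and do not describe how Elasticsearch would implement these fields.

```python
import re

# Hypothetical unit tables, for illustration only.
# Byte sizes use powers of 1024, as noted in the comment above.
BYTE_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}

# Only fixed-length duration units are listed; months and years are
# deliberately absent because their length is not fixed.
DURATION_UNITS_MS = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}

_VALUE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*$")


def parse_with_units(text, units):
    """Shared parsing code: split '<number><unit>' and apply the unit factor."""
    match = _VALUE.match(text)
    if match is None:
        raise ValueError(f"cannot parse {text!r}")
    number, unit = match.groups()
    if unit not in units:
        raise ValueError(f"unknown unit {unit!r} in {text!r}")
    return float(number) * units[unit]


def parse_bytes(text):
    """Per-type entry point for byte sizes; returns a number of bytes."""
    return parse_with_units(text, BYTE_UNITS)


def parse_duration_ms(text):
    """Per-type entry point for durations; returns milliseconds."""
    return parse_with_units(text, DURATION_UNITS_MS)
```

Each type keeps its own unit table, so byte sizes and durations can disagree about what a suffix means while still reusing the same parsing function.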
Pinging @elastic/es-search-aggs
Hmm, no other database or search engine has this type of field, correct?
Good question. I don't know of any, but since I have limited knowledge of what field types other datastores provide, I could easily be missing one, even a major one.
Discussed in FixitFriday: we want to do it. We will start with durations and byte sizes, which are common data stored in Elasticsearch. There might be asks for distances and temperatures coming next; we will handle such requests as they come, depending on how much usage we expect from them.
@elastic/kibana-visualizations This will mean changes in supported types (e.g. do they actually return numeric values we can use in charts? will they include scaled extensions?), and might also require some changes in how values need to be handled or how we can show them.
@elastic/kibana-discovery This might mean changes to the filtering UI. Also this might mean changes to KQL to query for those fields.
@elastic/kibana-management This means new field types (if that affects index patterns somehow), and it might also mean some changes to field formatters for those types.
@jpountz Please mention the above teams in case you are creating a PR or further tickets related to this feature.
/cc @epixa
/cc @alexfrancoeur (I think you know about that topic already, but it looks like you are not [yet] following that issue)
These seem similar to the range types, which afaik we don't do anything special for in Kibana. Is there something different about these that would imply we need to support them at launch, or is it a similar level of priority as other field types that we don't currently support?
I think we should at least be involved from the very beginning to highlight potential issues. For example, I talked to Adrien yesterday, and right now the API is planned to return strings for those units, which would make it impossible to use any of those values inside charts as metrics (like drawing the traffic usage over time, or the duration an API call took per endpoint). Since imho people want to visualize those metric values quickly in Kibana, we should at least stay involved in that, and not start thinking about it only after ES has built the feature and possibly can't easily change any API around it anymore. At what point we actually want to put this on our roadmap is, I think, a different discussion we need to have :-)
@Bargs as more people begin to use autocomplete, is there anything we'll have to do to support this in KQL/Kuery?
KQL queries get turned into regular query DSL queries like `range`, `match`, `exists` and `query_string`, so assuming these field types don't need any special treatment in order to be used in those queries we should be fine.
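As a rough illustration of what that translation could produce, here is a sketch that builds a `range` query body in Python. The `range_query` helper is hypothetical, and the unit-suffixed bounds (`"1MB"`, `"1GB"`) assume the proposed field type would accept such strings; nothing in this thread confirms that yet.

```python
import json


def range_query(field, gte, lte):
    """Build a query DSL body with a single range clause (hypothetical helper)."""
    return {"query": {"range": {field: {"gte": gte, "lte": lte}}}}


# Assumption: a native byte-size field would accept unit-suffixed bounds.
body = range_query("bytes_transferred", "1MB", "1GB")
print(json.dumps(body, indent=2))
```

If the bounds had to be plain numbers instead, only the arguments would change; the shape of the `range` clause would stay the same.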
I want to bump this thread as I still see quite a few use cases especially for the duration type.
Note that there's an ISO standard for durations and time intervals, including syntax and semantics. We should respect those standards for maximum reuse and the principle of least astonishment.
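For reference, the standard in question is ISO 8601, which writes durations as strings such as `PT1H30M` or `P1D`. The sketch below parses a simplified subset of that syntax in Python. It is an illustration only: the `parse_iso_duration` name is hypothetical, and years and months are rejected on purpose because their length is not fixed.

```python
import re
from datetime import timedelta

# Simplified subset of ISO 8601 durations: weeks, days, hours, minutes, seconds.
# Year (Y) and month (M before the T) designators are intentionally not accepted.
_ISO_DURATION = re.compile(
    r"^P(?:(\d+)W)?(?:(\d+)D)?"
    r"(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?)?$"
)


def parse_iso_duration(text):
    """Parse a fixed-length ISO 8601 duration into a timedelta."""
    match = _ISO_DURATION.match(text)
    # Reject non-matching input, a bare 'P'/'PT', and a dangling 'T'.
    if match is None or not any(match.groups()) or text.endswith("T"):
        raise ValueError(f"unsupported ISO 8601 duration: {text!r}")
    weeks, days, hours, minutes, seconds = match.groups()
    return timedelta(
        weeks=int(weeks or 0),
        days=int(days or 0),
        hours=int(hours or 0),
        minutes=int(minutes or 0),
        seconds=float(seconds or 0),
    )
```

For example, `parse_iso_duration("PT1H30M")` yields a `timedelta` of one and a half hours, while `parse_iso_duration("P1M")` raises `ValueError`.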
@jasontedor I wanted to bump this issue here as we started to discuss again around bytes and duration fields in Elasticsearch in the context of ECS and adding metrics: https://github.com/elastic/ecs/pull/480