Kibana: Allow logarithmic scale on a chart with empty buckets (buckets with the value 0)

Created on 11 May 2017 · 12 comments · Source: elastic/kibana

Kibana version: 5.2.2

Describe the feature:

As per the discussion in the community forum, it is currently impossible to create bar charts (and possibly line charts and others too?) with a log scale if there are buckets which contain no values. However, this feature would be useful when you receive different types of messages whose counts per type are orders of magnitude apart.

For example, consider a date-histogram showing success and error message counts per time-slice. The use-case for that graph is both to see that something happened (which is why the success messages are included) and to see if something broke. In a normal scenario, errors happen sporadically and are usually far fewer in number than success messages. See the following screenshot for an example:

[screenshot_9: date-histogram of success and error message counts per time-slice]

The two arrows point at buckets which contain error messages. Unfortunately they are hard to spot.

Using a logarithmic scale would help in this case. Unfortunately, when selecting this option, the whole graph fails with an error message complaining: "Values less than 1 cannot be displayed on a log scale".

In this particular instance, since the values are message counts, it is impossible for a value to lie strictly between 0 and 1.

Labels: ElasticCharts, Visualizations, KibanaApp, bug

All 12 comments

I urgently need this feature, too. I previously used Grafana and switched to Kibana because it's generally better... well, until I discovered this nuisance (Grafana supports logarithmic scales just fine, mind you).

I would also like to see this addressed.

Started to investigate this issue and wanted to leave some notes.

Data set validation occurs in _point_series.js:

    validateDataCompliesWithScalingMethod(data) {
      const invalidLogScale = data.values && data.values.some(d => d.y < 1);
      if (this.getValueAxis().axisConfig.isLogScale() && invalidLogScale) {
        throw new InvalidLogScaleValues();
      }
    }

The value scaling is done with d3.scale.log. The d3 scale is created in axis_scale:

    getD3Scale(scaleTypeArg) {
      let scaleType = scaleTypeArg || 'linear';
      if (scaleType === 'square root') scaleType = 'sqrt';

      if (this.axisConfig.isTimeDomain()) return d3.time.scale.utc(); // allow time scale
      if (this.axisConfig.isOrdinal()) return d3.scale.ordinal();
      if (typeof d3.scale[scaleType] !== 'function') {
        return this.throwCustomError(`Axis.getScaleType: ${scaleType} is not a function`);
      }

      return d3.scale[scaleType]();
    }

Here is a sample of the output generated by d3.scale.log:

    const logScale = d3.scale.log();
    console.log("logScale(10)", logScale(10));     // 1
    console.log("logScale(1000)", logScale(1000)); // 2.9999999999999996
    console.log("logScale(0)", logScale(0));       // NaN
    console.log("logScale(0.3)", logScale(0.3));   // -0.5228787452803376

Open questions

  • What is the desired solution? Should rows with values less than one be discarded so the graph can still be generated? (See the sketch after this list.)
  • Should such filtering only be allowed for log scales?
  • Would a more generic solution be useful, where any metric can specify a range for dropping rows, so that visualizations that need log scales could use this feature to drop rows with values less than one?
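
For illustration, one possible shape of that filtering, sketched as a hypothetical pre-processing step next to the validation above (filterInvalidLogScaleValues is a made-up name, not existing Kibana code):

    // Hypothetical pre-processing step: instead of throwing, drop the
    // points a log scale cannot represent (y < 1) and keep the rest.
    function filterInvalidLogScaleValues(data) {
      return Object.assign({}, data, {
        values: data.values.filter(d => d.y >= 1),
      });
    }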

cc @elastic/kibana-visualizations

At first thought, filtering the values to make this possible in any graph sounds interesting. But I'm not sure other graphs would really benefit from it. Then again, you never know what kind of visualisations a user comes up with, so it may come in handy in some cases.

But that filtering sounds to me like another thing altogether.

The problem that I encountered, and showed in this issue, is that a single erroneous value for the selected visualisation and options makes the whole graph unusable. That is kind of extreme. If a faulty value sneaks in for some reason, you will never be able to generate a visualisation containing that faulty data point.

I've seen the concept of NaN in some other graphing tools; munin and pandas come to mind (even though pandas is not only for graphing). In munin, these values simply lead to gaps in the graph, which I find maybe a bit too subtle. But I could imagine that, for the example in the screenshot above, values which are impossible to draw could be shown as a very faint bar with diagonal lines, or some other unobtrusive visual indication that the bucket contains invalid data.

But again, I think the "filtering" option you talked about would solve the issue as well, but personally, I don't quite like it too much because:

  • Invalid data simply "disappears" from the visualisation. Remember that a mouse-over, or showing the data table, could still reveal those values if they are not filtered out. This may or may not be what the user wants.
  • It feels like fixing a problem by adding a new feature, which feels wrong to the purist in me.

I think that is just a typo in the code ...
log(0) or log(0.5) are not problematic .... only the log of negative numbers is (it would produce imaginary numbers)

Oh sorry, my math is a bit rusty :) log(0) is not defined .... however, I guess we should treat it as 0?
At least that's what D3 does ....

Or maybe add +1 to every value? As log(1) == 0 ...

I'm not quite a fan of "faking" data just to make the graph show up. In that case mouse-over popups could be misleading.

Then again, it might be a "quick and dirty" fix for this issue to just treat them as 0, which could easily be done with max(real_value, 0).
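
For illustration only, here is how the two quick fixes floated above (the +1 shift and the clamp) could look at render time; the helper names are made up for this sketch:

    const logScale = d3.scale.log(); // default domain [1, 10], range [0, 1]

    // Option A: shift every value by +1, so 0 maps to log(1) == 0.
    const shiftedLog = y => logScale(y + 1);

    // Option B: clamp negatives first, i.e. max(real_value, 0). Note that
    // logScale(0) is still problematic, so this would additionally need a
    // floor of 1, or .clamp(true) on the scale.
    const clampedLog = y => logScale(Math.max(y, 0));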

But I still strongly believe that the concept of a NaN would greatly benefit Kibana. These things happen. It makes sense to distinguish between NaN and missing values.

If someone looking at the graph hovers over a data point, then 0, NaN and missing are three completely different readings and may be caused by different things:

  • 0: The value was actually retrieved and calculated; 0 is the exact value at that data point.
  • missing: No data is available at that point, for example because no data was sent to the stack due to network failures, application crashes, or the like.
  • NaN: The value was calculated but simply does not make sense, like log(-1). This may hint at an error in the way the value is calculated, or at incorrect source data.

While on that topic (although this is getting a bit tangential), there are two more concepts for which it may be useful to have special values:

  • positive-infinity
  • negative-infinity

Being able to identify these special concepts also gives authors of visualisations a way to draw them in a special way, in turn providing valuable information to whoever is looking at the visualisation.
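
To make the distinction concrete, here is a small sketch of how a renderer or tooltip could tell these cases apart (classifyValue is a hypothetical helper, not part of Kibana):

    // Hypothetical helper: map a raw y value to one of the special
    // concepts above, so a renderer or tooltip can present each one
    // differently instead of silently dropping it.
    function classifyValue(y) {
      if (y === null || y === undefined) return 'missing'; // no data at that point
      if (Number.isNaN(y)) return 'NaN';                   // e.g. log(-1)
      if (y === Infinity) return 'positive-infinity';
      if (y === -Infinity) return 'negative-infinity';
      return 'value';                                      // a real number, including 0
    }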

Looking into this again today .... and the idea I came up with is that we are not interested in log values ... we just want a log scale to better see our values.

This is already reflected in the tooltips .... when you mouse over a certain data point, you see the actual value, not the log of that value ... which makes sense.

So the only thing we actually want to do is adjust the scale (so the distance between 1 and 2 is the same as the distance between 10 and 20, and not the same as between 10 and 11, as it would be with a linear scale); see the sketch after this comment.

then:

  • Negative values should not be a problem; we could calculate -log(-x) if x is a negative value.
  • 0 should not be a problem; it should be represented as 0.

Not sure if this is really a good way, so some feedback would be welcome ... also not sure how happy D3 will be about this :)

Regarding the NaN discussed above, I think that's a separate problem, nothing to do with log scales, as the same applies to other scales as well. At the moment Kibana will filter out all the values where y is NaN.
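
A minimal sketch of that idea, assuming a hand-rolled transform feeding a plain linear d3 scale (d3 v3 has no built-in symlog scale; symlog is just a name for the proposal above):

    // Sketch of the transform described above: keep the sign, take the
    // log of the magnitude, and pin 0 to 0.
    function symlog(x) {
      if (x === 0) return 0;              // 0 is represented as 0
      if (x < 0) return -Math.log10(-x);  // -log(-x) for negative values
      return Math.log10(x);
    }

    // One way to wire it into d3 v3: transform the values, then position
    // them with a linear scale over the transformed domain.
    const linear = d3.scale.linear()
      .domain([symlog(-1000), symlog(1000)])
      .range([0, 1]);
    const position = y => linear(symlog(y));

    // Caveat: values strictly between -1 and 1 (other than 0) jump away
    // from zero; a production version would need log1p-style smoothing.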

My use case is very basic. I only care about error counts.

  • If there are no errors, I don't want to see anything, so we can filter out count == 0.
  • Counts are integers, so we don't need to worry about fractions < 1.

Solution: just display values >= 1.

To make sure that a value of 1 is actually visible, we can still use a baseline at 0.1.
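
In d3 v3 terms, that suggestion could look like the following sketch (data and height are assumed inputs; this is not what Kibana ships):

    // Keep only counts >= 1, and start the log domain at 0.1 so that a
    // bar with value 1 still has visible height above the baseline.
    const visible = data.values.filter(d => d.y >= 1);
    const yScale = d3.scale.log()
      .domain([0.1, d3.max(visible, d => d.y)])
      .range([height, 0]) // height: chart height in pixels (assumed)
      .clamp(true);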

+1 for this feature

Adding this under Y-Axis / Advanced / JSON Input works around the problem:

    {
      "script": {
        "source": "_value < 1 ? 1 : _value"
      }
    }

But this creates another issue: for aggregation types like sum, every individual value that was < 1 gets bumped up to 1 before being summed. For example, a bucket whose real sum is 0.2 + 0.3 = 0.5 would be displayed as 1 + 1 = 2.
