Chart.js: [FEATURE] Histograms

Created on 13 Feb 2017 · 11 Comments · Source: chartjs/Chart.js

A nice feature would be to support histogram charts or, alternatively, to update the bar chart so that a histogram can be created manually given the right data. At the moment it seems that a bar chart can only represent value-category pairs, while a histogram would require the bars to be drawn at the right positions on a numerical/linear axis.

help wanted enhancement

All 11 comments

You're correct about the current limitations of the bar graph. If you pre-process your data into bins then you can easily create the histogram using the current bar chart.

Some thoughts on bar sizing in this case:

  1. Bars have a fixed width, and gaps may appear between the bars.
  2. Bar width is determined by the distance between bars and calculated so that the bars touch.

I think option 2 is better here since it produces a better result.

In terms of data processing, I think this would be best as a new chart type that extends from the bar chart. The calculateBarX and calculateBarWidth methods would need to be overridden:
https://github.com/chartjs/Chart.js/blob/master/src/controllers/controller.bar.js#L196
https://github.com/chartjs/Chart.js/blob/master/src/controllers/controller.bar.js#L165

Thanks for the tips! I will try that out.

As long as one can get away with:

  1. bins of equal width, e.g., [0, 2), [2, 4), [4, 6), [6, 8], in contrast to bins of variable width,
  2. labels under bars that denote the bar's range, e.g., "[0, 2)", "[2, 4)", "[4, 6)", "[6, 8]", or "0-2", "2-4", "4-6", "6-8", in contrast to having a numerical axis (whose ticks may or may not align with the bars' vertical sides),
  3. manually calculating each bar's height, that is counting the value for each bin,

then the current bar chart may be used, just making sure that:

barPercentage : 1.0,
categoryPercentage : 1.0,

are used as axis options, so that the bars touch. I have used the bar chart this way myself where a histogram was needed and it was nice.
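As a minimal sketch of the setup described above, assuming the Chart.js v2 option layout (barPercentage and categoryPercentage set on the x axis, as this comment describes), the configuration could be built as follows. The helper name makeHistogramConfig and the sample labels and counts are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper: builds a Chart.js v2 bar-chart config whose bars touch.
// Bin labels and counts must be computed beforehand; the chart only draws them.
function makeHistogramConfig(labels, counts) {
  return {
    type: 'bar',
    data: {
      labels: labels,
      datasets: [{ label: 'Count', data: counts }]
    },
    options: {
      scales: {
        xAxes: [{
          barPercentage: 1.0,      // each bar fills its whole category slot
          categoryPercentage: 1.0  // each category slot fills its whole share of the axis
        }],
        yAxes: [{ ticks: { beginAtZero: true } }]
      }
    }
  };
}

const config = makeHistogramConfig(['0-2', '2-4', '4-6', '6-8'], [15, 29, 32, 15]);
// In a page with Chart.js loaded: new Chart(canvasContext, config);
```

The config is a plain object, so it can be inspected or adjusted before it is handed to Chart.js.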

But if the histogram was to be fully supported, then points (1) and (2) should be dealt with, as shown in:

I am not sure about point (3). I think that this cannot and should not be avoided. If thousands of values fell into one bin, should the chart "know" all those values? I guess not. In many cases the dev would not "know" all those values either.

So, one way or another, one should define something like the following as data for the histogram:

[
  [0, 2, 15],
  [2, 4, 29],
  [4, 6, 32],
  [6, 8, 15],
]

or (irregular histogram):

[
  [0, 1, 15],
  [1, 4, 29],
  [4, 7, 32],
  [7, 8, 15],
]

I am not familiar with the source code, so feedback is needed. My question is this: Is extending the bar chart the way to go? Because the numerical axis is definitely needed.

EDIT:

The scatter chart builds on the line chart, but uses a numerical scale as well as a different data representation. Following this approach, maybe the histogram could build on the bar chart, and a representation such as the following could be used (here width is added as a parameter; another option would be to have xLeft and xRight, depending on library conventions):

data: [
  {
    x: 0,
    width: 1,
    y: 15,
  }, {
    x: 1,
    width: 3,
    y: 29,
  }, {
    x: 4,
    width: 3,
    y: 32,
  }, {
    x: 7,
    width: 1,
    y: 15,
  },
]
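To show how the two proposed representations relate, here is a small sketch that converts [left, right, count] triples into { x, width, y } objects. Both shapes are proposals from this comment, not an existing Chart.js data format, and the function name triplesToPoints is hypothetical.

```javascript
// Convert [left, right, count] triples into { x, width, y } objects.
// Both shapes are proposals from this thread, not a Chart.js data format.
function triplesToPoints(triples) {
  return triples.map(([left, right, count]) => {
    if (right <= left) {
      throw new RangeError('bin right edge must be greater than its left edge');
    }
    return { x: left, width: right - left, y: count };
  });
}

// The irregular histogram from above.
const points = triplesToPoints([
  [0, 1, 15],
  [1, 4, 29],
  [4, 7, 32],
  [7, 8, 15]
]);
```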

I feel like extending the bar graph is the way to go. Bins should be equal-sized, at least as a first cut. You can easily fit data into buckets by doing something along these lines with integer values, assuming a bucket size of 10.

In [20]: [(i // 10) * 10 for i in [1023, 1025, 1021, 1033, 1039, 1037, 1033]]
Out[20]: [1020, 1020, 1020, 1030, 1030, 1030, 1030]
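Since Chart.js data is usually prepared in JavaScript, the same flooring trick might look like the sketch below. The variable names and the per-bucket counting step are assumptions added for illustration. They are not part of the snippet above.

```javascript
// Floor each value to the start of its bucket, then count values per bucket.
const bucketSize = 10; // assumed bucket size, as in the Python example
const samples = [1023, 1025, 1021, 1033, 1039, 1037, 1033];

const bucketStarts = samples.map((v) => Math.floor(v / bucketSize) * bucketSize);
// bucketStarts is [1020, 1020, 1020, 1030, 1030, 1030, 1030]

const countsByBucket = new Map();
for (const start of bucketStarts) {
  countsByBucket.set(start, (countsByBucket.get(start) || 0) + 1);
}

// Labels and counts in ascending bucket order, ready for a bar chart.
const bucketLabels = [...countsByBucket.keys()].sort((a, b) => a - b);
const bucketCounts = bucketLabels.map((start) => countsByBucket.get(start));
```

Note that buckets with no values do not appear in the result, so gaps in the data would have to be filled with zero counts before plotting.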

This is simple enough to implement to group values into containers. The only tricky part is determining the proper number of bins. Some people prefer fairly blocky histograms, some like the more hairy ones, where bins are smaller and you get more of them. Ideally one should be able to pick the number of bins to display, but you could probably keep this value fixed at first, just to see how it works out with a bar graph. You could, say, stick with 30 to 50 bins and then divide the range of the data by that number to get a rough size for each bucket, perhaps.
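The "divide the range by the number of bins" idea could be sketched roughly as follows. The function name computeHistogram, the half-open bin convention, and the clamping of the maximum into the last bin are all assumptions made for this example, not something Chart.js provides.

```javascript
// Hypothetical helper: counts values into `binCount` equal-width bins.
// Bins are half-open [left, right), except the last, which also includes the maximum.
function computeHistogram(values, binCount) {
  if (values.length === 0 || binCount < 1) {
    return { binWidth: 0, bins: [] };
  }
  let min = Infinity;
  let max = -Infinity;
  for (const v of values) {
    if (v < min) min = v;
    if (v > max) max = v;
  }
  // Fall back to a width of 1 when all values are identical.
  const binWidth = (max - min) / binCount || 1;
  const bins = [];
  for (let i = 0; i < binCount; i++) {
    bins.push({ left: min + i * binWidth, right: min + (i + 1) * binWidth, count: 0 });
  }
  for (const v of values) {
    // Clamp so that v === max lands in the last bin instead of overflowing.
    const index = Math.min(Math.floor((v - min) / binWidth), binCount - 1);
    bins[index].count += 1;
  }
  return { binWidth, bins };
}

const { binWidth, bins } = computeHistogram([1023, 1025, 1021, 1033, 1039, 1037, 1033], 2);
```

With a fixed bin count in the 30 to 50 range, the same call would only change its second argument.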

I wish I had more free time to help with this, but have been slammed as of late. I am happy to test and offer constructive feedback. :)

I'm trying to plot histograms, but I have variable bin sizes (semi-log), so I'd really like the option to set a [start, end[ range for each bar and to overlay a cumulative % at the mid-point.

For now I'm trying to do this by hand using scatter, but being completely new to chartjs I still haven't found how to draw lines between points (an X,Y line graph instead of an X,Y scatter point graph). A hint or pointer for that would be welcome.

Edit: I found that I can just use "line" as the chart type and put X,Y pairs in the data, as long as the xAxes also use type: 'linear'.
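A rough sketch of that approach, assuming the Chart.js v2 config layout: cumulative percentages are computed at each bin's mid-point and passed as { x, y } points to a 'line' chart with a linear x axis. The helper name cumulativePercentPoints and the sample bins are hypothetical.

```javascript
// Hypothetical helper: turns [start, end, count] bins into { x, y } points, where
// x is the bin mid-point and y is the cumulative percentage up to and including that bin.
function cumulativePercentPoints(bins) {
  const total = bins.reduce((sum, [, , count]) => sum + count, 0);
  let running = 0;
  return bins.map(([start, end, count]) => {
    running += count;
    return { x: (start + end) / 2, y: total === 0 ? 0 : (100 * running) / total };
  });
}

// Illustrative variable-width (semi-log style) bins.
const cumulative = cumulativePercentPoints([[1, 10, 20], [10, 100, 50], [100, 1000, 30]]);

// Chart.js v2-style config: 'line' type with { x, y } data on a linear x axis.
const lineConfig = {
  type: 'line',
  data: {
    datasets: [{ label: 'Cumulative %', data: cumulative, fill: false }]
  },
  options: {
    scales: {
      xAxes: [{ type: 'linear', position: 'bottom' }]
    }
  }
};
```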

Here is a jsfiddle that I cobbled together over lunch using a bar chart to make a histogram. Just a rough draft to help people get started.

https://jsfiddle.net/s8qas3km/17/

If you want, you can see fortio's line-based implementation at
https://istio.fortio.org/

source code: https://github.com/istio/fortio/blob/master/ui/static/js/fortio_chart.js

Any way to position the xAxis labels to align 'with' the gridlines instead of below the bar? So as to give more of that "bucket" feel...

I'm using bar charts with fair results to plot histograms. I would like to see a feature to label edges rather than bars, and may look into implementing it. I've implemented an NPM module to calculate the bin sizes here: https://www.npmjs.com/package/compute-histogram.

Here's an example of it in use: https://yield.io

Testing this in v2.9.3, setting gridLines.offsetGridLines to false on the x axis gets a bit closer.

v2.1.3 https://jsfiddle.net/z136gkL4/
v2.9.3 https://jsfiddle.net/3ehp4L58/1/
v3.0.0-alpha https://jsfiddle.net/8cothjzs/
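For reference, the v2-style option mentioned above would sit in the chart options roughly like this. This is only a sketch of where the setting goes. The barPercentage and categoryPercentage values are carried over from earlier in the thread as an assumption about the setup being tested.

```javascript
// Chart.js v2-style options fragment: draw grid lines at the ticks
// rather than shifted between labels.
const histogramOptions = {
  scales: {
    xAxes: [{
      barPercentage: 1.0,
      categoryPercentage: 1.0,
      gridLines: { offsetGridLines: false }
    }]
  }
};
```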

@benmccann @kurkle one thought on how we could support this:

  • Keep offsetGridLines for the bar location in its current form.
  • Introduce a new option (name TBD) for controlling the tick label alignment. Values: 'start' and 'center'. We'd default to 'center' for bar charts, and this would mimic the current v2 behaviour. If the user overrides it to 'start', we place the label on the grid line.

If I do a Google Images search for histogram, none of the results have vertical gridlines. I think I'd just turn off the gridlines in the fiddle and call it a day :smile:
