Influxdb: Hierarchical Data in InfluxDB

Created on 26 Jun 2014 · 20 Comments · Source: influxdata/influxdb

I was looking at Grafana's "play" dashboard before I set up my own Grafana and noticed while editing one of the charts that there were nested tables/series in the tool for selecting data.
Is this supported in InfluxDB?
If so, how is it done?

An example of the selection tool can be reached by selecting the title of one of the charts, selecting edit, and then editing the query bar beneath the chart.

Thanks!

All 20 comments

+1 for documentation on the nuances of getting data out of Grafana when you use a series list with multiple columns to retrieve the data.

https://github.com/novaquark/sysinfo_influxdb/issues/5

Technically I've done this; but it's not as pretty as I would like it.

Is it possible to make series of series?

Rather, if I had a building with rooms with sensors, how would I organise the building to have multiple rooms, each of which have multiple sensors, each of which have multiple columns of RGB and other data?

Did you finagle that sort of functionality?

@calben I struggled to make it work well. With Grafana it could be done, but you ended up making a bunch of queries to grab all the different columns.

It works, but I actually found it easier to use http://influxdb.com/docs/v0.7/api/continuous_queries.html to get them into an easier-to-query format.

Using influxdb-cli it was just beautiful to query and a real joy; it just did not really scale well in Grafana.

Thanks @damm !
Can you explain to me briefly why this is so much simpler using a Graphite server rather than InfluxDB for Grafana? (I'm just coming into this and still figuring out what's what and what needs to be addressed)

@calben Grafana was really written for Graphite. OpenTSDB and InfluxDB came after its release.

I started filing issues so that torkelo would be aware of it. I don't think he uses InfluxDB primarily, and I would say Grafana could still use some polishing when it comes to InfluxDB.

I wish I could select multiple series in one list, use regexps (wildcards), and a few other things. I have created issues for these.

https://github.com/grafana/grafana/issues/477

If we need to help we just need to know.

[Screenshot: screen shot 2014-06-25 at 7 09 59 pm]

@calben that screenshot there kinda shows my pain of grafana.

  • Finding the metric to query: it produces a whole list that overlaps everything (it autocompletes), so that is both good and bad.
  • Using mean(value) versus just specifying the value needs to be clearer; grouping by time is confusing for new users as well, since they don't realize they can be reducing their dataset.

This is what I refer to as nested, fwiw. Just drawing the user stats for each cpu was 4 queries; then you add other bits :(

┌───────────────┬─────────────────┬──────┬──────┬──────┬─────┬───────┬──────┬──────┐
│ time          │ sequence_number │ id   │ idle │ nice │ sys │ total │ user │ wait │
├───────────────┼─────────────────┼──────┼──────┼──────┼─────┼───────┼──────┼──────┤
│ 1403417121512 │ 188490001       │ cpu3 │ 991  │ 0    │ 2   │ 1002  │ 3    │ 6    │
│ 1403417121512 │ 188480001       │ cpu2 │ 987  │ 0    │ 3   │ 999   │ 9    │ 0    │
│ 1403417121512 │ 188470001       │ cpu1 │ 995  │ 0    │ 1   │ 999   │ 3    │ 0    │
│ 1403417121512 │ 188460001       │ cpu0 │ 965  │ 0    │ 3   │ 999   │ 12   │ 18   │
│ 1403417121512 │ 188450001       │ cpu  │ 3938 │ 0    │ 10  │ 4001  │ 27   │ 24   │
└───────────────┴─────────────────┴──────┴──────┴──────┴─────┴───────┴──────┴──────┘

@damm Grafana does support regex in the series clause. There was a bug in 1.6.0 that made it stop working, fixed in 1.6.1.

It is true that I have mainly used Graphite, and that is a problem because I do not use InfluxDB daily and do not experience the same pain you do. So I rely more heavily on user feedback. Keep in mind that Grafana is still a hobby project for me (I work on it in my free time), so I have to prioritize issues according to my own judgement and the number of +1 votes they get. In just one week I will leave my current client and start working on Grafana full time. I plan to spend more time on improving usage for InfluxDB users.

The problem with InfluxDB is also that it is much more low level than Graphite and requires a lot more work to get the same ease of use. Things like the group by time clause in the InfluxDB query are needed because, without downsampling the data, you run a heavy risk of hanging your browser with millions of rows returned from InfluxDB. This is all taken care of with Graphite (you do not have to think about it when you query). The same goes for series aliasing and group by: these require a lot more of the client, whereas with Graphite they are built in.
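
To make the downsampling point concrete, here is a minimal Python sketch of what a bucketed mean (the effect of grouping by time) does to a raw series. The function name `downsample_mean` and the sample points are made up for illustration; InfluxDB does this on the server, and this is not its implementation.

```python
from collections import defaultdict


def downsample_mean(points, bucket_ms):
    """Average (timestamp_ms, value) pairs into fixed-width time buckets."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of the bucket it falls into.
        buckets[ts - ts % bucket_ms].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


# Illustrative data only: five raw points collapse into three buckets of 1 s.
raw = [(0, 1.0), (400, 3.0), (1000, 10.0), (1900, 20.0), (2500, 5.0)]
reduced = downsample_mean(raw, bucket_ms=1000)
print(reduced)
```

The client receives one row per bucket rather than one row per raw point, which is what keeps the browser from being flooded.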

When you, for example, group by a property in InfluxDB, you then have to do another grouping on the receiving end, as you only get a single series back with a bunch of columns. With Graphite you get grouped series that are ready for display. I am not saying InfluxDB is bad; it is in fact great and in many cases more powerful than Graphite, but at the same time it is more low level and requires a lot more work on the client to be easy to use. Things get more complicated as different InfluxDB users use InfluxDB differently (some use one series per metric, others use a single series plus other columns to make it unique).
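
The client-side regrouping described here can be sketched roughly as follows. This is an illustration in Python, not Grafana's code: the helper `split_by_column`, the series name `cpu`, and the second timestamp are assumptions, and the result shape (a `name`, a `columns` list, and `points` rows) follows the 0.8-era JSON format.

```python
from collections import defaultdict


def split_by_column(series, key):
    """Split one multi-column series into one series per distinct value of `key`."""
    key_idx = series["columns"].index(key)
    grouped = defaultdict(list)
    for point in series["points"]:
        grouped[point[key_idx]].append(point)
    return {
        f'{series["name"]}.{value}': {"columns": series["columns"], "points": pts}
        for value, pts in grouped.items()
    }


# A single series comes back with an `id` column mixing several CPUs.
result = {
    "name": "cpu",  # hypothetical series name
    "columns": ["time", "id", "user"],
    "points": [
        [1403417121512, "cpu1", 3],
        [1403417121512, "cpu0", 12],
        [1403417181512, "cpu1", 4],  # made-up second sample
        [1403417181512, "cpu0", 9],  # made-up second sample
    ],
}
per_cpu = split_by_column(result, "id")
print(sorted(per_cpu))
```

Each entry in `per_cpu` can then be drawn as its own line, which is the step Graphite performs before the data reaches the client.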

The query language in InfluxDB is also a little hard to write good tooling around. Might have been easier with a json object query api as well (like Elasticsearch has).

I will fix your issue https://github.com/grafana/grafana/issues/477 around math in column select clause asap. Did try to fix it for 1.6.1 but got stuck on how to handle series aliasing for influxdb.

Feedback on this issue is much welcomed (on enhancing series aliasing):
https://github.com/grafana/grafana/issues/525

@torkelo yeah I need to upgrade to 1.6.1 I just have not gotten around to it yet.

P.S. Please don't take my previous comments as a negative critique of Grafana; I found my own experience of leaving graphite and just using Influxdb and learning how to get what I want changed a lot over the past 4 months.

So selecting nested data from Grafana is going to be problematic for now, which is fine.

Ignoring the reading of the data, how do I _write_ this sort of data?
The following json is rejected, so I'm not sure how to do it:

{
   "name":"building_1",
   "columns":"room1,room2",
   "points":[
      {
         "columns":"sensor1,sensor2",
         "points":[
            1,
            2
         ]
      },
      {
         "columns":"sensor3,sensor4",
         "points":[
            4,
            3
         ]
      }
   ]
}
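
For comparison, the 0.8-era write format, as documented at the time, is flat: an array of series objects in which `columns` is a list of strings and `points` is a list of rows. A minimal Python sketch of flattening the nested structure above, assuming dotted series names carry the hierarchy (a naming convention only, not something InfluxDB interprets; the `flatten` helper is made up for illustration):

```python
import json


def flatten(building, rooms):
    """Turn {room: {sensor: value}} into flat series objects, one per room."""
    payload = []
    for room, sensors in rooms.items():
        names = sorted(sensors)
        payload.append({
            "name": f"{building}.{room}",  # hierarchy encoded in the series name
            "columns": names,
            "points": [[sensors[n] for n in names]],  # one row of values
        })
    return payload


rooms = {
    "room1": {"sensor1": 1, "sensor2": 2},
    "room2": {"sensor3": 4, "sensor4": 3},
}
payload = flatten("building_1", rooms)
print(json.dumps(payload, indent=2))
```

An alternative under the same assumptions is a single series with `room` and `sensor` as ordinary columns, which trades more rows for simpler series names.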

+1 for @torkelo's comments on InfluxDB being low level. Sorry in advance if my comments / rant may seem a little off-topic to the ticket btw. I should share my findings to save people headaches, that or https://github.com/vimeo/graphite-influxdb.git :)

I think the missing component here for a lot of users would be graphite-web. I feel like Grafana will be expected to include a lot of features to replace its functionality, though on the client side. While this may be achievable, I'm not sure it's the right way to proceed.

I am one of those people using collectd + InfluxDB's graphite plugin, and I don't think it's the most efficient way to work. It has nothing to do with Grafana or Influx; both of those are amazing tools and are doing their job just fine. In my opinion, it's the way collectors create information in Influx that needs to change.

One of the problems I've seen when piping metrics from, say, collectd to Influx is the lack of "column" usage. Very few collectors take advantage yet of the extra columns made possible in Influx. So while graphite-web would do things for you like allow you to select a host calculated from a series name, Grafana (with an InfluxDB source) is going to have a pretty tough job of working that out for you. It's not yet aware (and probably never should be) of the naming conventions presently employed by Graphite users, that is, using series names as a means of identifying things.
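
As a rough illustration of the difference, this Python sketch pulls the identifying parts out of a Graphite-style series name and turns them into named columns. The name layout (`host.plugin.type`) and the helper `name_to_columns` are assumptions for the example; real collectd names vary by plugin and configuration.

```python
def name_to_columns(series_name, fields):
    """Split a dotted Graphite-style name into named columns."""
    parts = series_name.split(".")
    if len(parts) != len(fields):
        raise ValueError(
            f"expected {len(fields)} parts, got {len(parts)}: {series_name!r}"
        )
    return dict(zip(fields, parts))


# Hypothetical collectd-style series name.
row = name_to_columns("web01.cpu-0.cpu-user", ["host", "plugin", "type"])
print(row)
```

A collector that wrote `host` as a column in the first place would spare every client from having to know this convention.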

I might switch back to carbon + graphite-web and spend my time on making collectd play more friendly with Influx (for example)

As far as I can tell, Influx is not a replacement for Graphite + Carbon; it's a whole new way of looking at metrics aggregation and presentation. I believe some of those changes will need to be reflected in Grafana.

Cheers,

Pauly

I'm still surprised there isn't an obvious way to store hierarchical data in InfluxDB (at the same time, I know it's difficult to implement efficiently, so there could well be a good reason for that).
I could also implement the feature using "tags" with a tag for each parent in the hierarchy, as @damm did.
Is that what you guys generally do?

@pmyjavec , thanks for the link!
I'm looking through that codebase now.

It sounds like my best bet might be to switch to Graphite + Carbon for now, but that'd be a damn shame.

Or maybe I'm approaching the model from the wrong perspective.
If I was trying to store hierarchical data, how would it be done with InfluxDB?
Should I make a separate database such as a MongoDB to keep hierarchical relationships and then build a proxy between Grafana and InfluxDB?

+1 for nested data. Not using Grafana, but seeing the same case for event sourcing. There are some events which have objects within the event itself.

@calben it can be done; it just requires a little work on the SELECTs to pull it off. For anything that requires I started using influga.

But we need to identify the problems and identify the solutions in order for us to drop Graphite + Carbon; as it provides a whole lot (and likely most people only use 10% of the features it offers). What it takes for me to convert is different than other people; so as we build more of a consensus on what we need the solutions can come with that.

@calben,

The thing I will say about running with graphite + carbon is that you can easily have InfluxDB running alongside it all as a "carbon relay", which is what I'm planning on doing. As soon as things mature a little more you can flick the switch :)

Cross-link to conversation on grafana and hierarchical series:
https://github.com/grafana/grafana/issues/347#issuecomment-55474746

This is hurting.

In my instance, a separate solution has been commissioned. :frowning:

Has anything come of the carbon relay yet, @pmyjavec ?

Frankly, I've always thought that the idea of series and columns might enforce too specific (too structured) a paradigm.
Ultimately, what it all comes down to is that your records correspond to certain properties (which can be modeled as simple strings and/or key-value pairs), and when you want to query or visualize, it's a matter of selecting those records that match certain criteria/properties, and potentially aggregating across certain properties.
But as far as modeling these properties as columns and/or pieces of string in the series name goes,
I have the impression that from the user's perspective this separation is way too harsh and cumbersome for no reason.

Whether you want to model the properties that separate datapoints from each other, as part of the series name, or a differentiating column, can be thought of as an implementation detail instead of the enforced datamodel. (the implementation of column indexes reinforces this idea). To model data inside influxdb, I would think more in the direction of having properties to records instead of series names and columns. (implementation wise, the data could be stored in a leveldb with a key being the hash of the record's properties). I think this could be very powerful in a bunch of other contexts (selectors in queries and configuration). I think this would be much more powerful without becoming more difficult to implement or work with.
The discussion about pros/cons of both approaches on a lower level (effects on storage, sharding, etc) is necessary, but I don't think it will reveal big problems.
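
A minimal Python sketch of the properties-as-key idea, under loud assumptions: a plain dict stands in for the leveldb store, SHA-1 over canonical JSON stands in for "the hash of the record's properties", and the names `record_key` and `select` are invented for the example.

```python
import hashlib
import json


def record_key(properties):
    """Stable key for a set of properties, independent of insertion order."""
    canonical = json.dumps(properties, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def select(store, **criteria):
    """Return the properties of every record matching all given criteria."""
    return [
        rec["properties"]
        for rec in store.values()
        if all(rec["properties"].get(k) == v for k, v in criteria.items())
    ]


store = {}  # stand-in for a key-value store such as leveldb
for props in (
    {"building": "b1", "room": "room1", "sensor": "sensor1"},
    {"building": "b1", "room": "room2", "sensor": "sensor3"},
):
    store[record_key(props)] = {"properties": props, "points": []}

matches = select(store, room="room1")
print(matches)
```

Whether a property ends up in a series name or a column becomes invisible to the query; the linear scan in `select` is only for brevity and says nothing about how indexing would work.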

(BTW these ideas, and a bunch more, are in my side project metrics 2.0).

Too structured?

We are heading in quite the opposite direction in favour of a database with type, strict structure, and immutability.
This allows for easier manipulation using LAPACK, easier compression, and easier rolling storage of old, less used data as mirrored HDF5.

More flexibility is definitely valuable, but what would a user want to do that is made easier by having a less structured system?
I haven't encountered the use case.

3+ years later, any plans to implement structured data storage in InfluxDB?

@dandv This is a very old issue based on the 0.8 version of Influx, which had a different data model than Influx 0.9 and beyond. This is a request to support a data model like Graphite's.

We chose to go with a tag based approach as opposed to a hierarchy based one. The widespread adoption of Prometheus and the coming changes to Graphite to add tagging support bear this out as a good decision.
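
As a rough illustration of how the tag-based model can carry the building/room/sensor question from earlier in the thread, here is a Python sketch that formats one point in a simplified form of line protocol. The measurement name `rgb`, the tag keys, and the helper names are assumptions; the escaping is deliberately minimal (it ignores backslashes and quoted string fields), and all field values are written as floats.

```python
def escape(text, special):
    """Backslash-escape each character in `special` (simplified rules)."""
    for ch in special:
        text = text.replace(ch, "\\" + ch)
    return text


def to_line(measurement, tags, fields, timestamp_ns):
    """Format one point as: measurement,tag=value,... field=value,... timestamp."""
    tag_part = "".join(
        f",{escape(k, ',= ')}={escape(v, ',= ')}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(f"{escape(k, ',= ')}={float(v)}" for k, v in fields.items())
    return f"{escape(measurement, ', ')}{tag_part} {field_part} {timestamp_ns}"


# Each level of the hierarchy becomes a tag instead of a nesting level.
line = to_line(
    "rgb",
    {"building": "building_1", "room": "room1", "sensor": "sensor1"},
    {"r": 255, "g": 128, "b": 0},
    1403417121512000000,
)
print(line)
```

Selecting "all sensors in room1" is then a filter on the `room` tag rather than a walk down a tree.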

Changing the data model to support hierarchies would be incredibly challenging from both an implementation and an API/documentation/consistency of approach perspective. I think we're unlikely to ever add this, unless it's something we consider for 3.0, which is at least three years away.
