InfluxDB: [Feature request] Support dropping conflicting types on writes

Created on 20 Nov 2015 · 10 comments · Source: influxdata/influxdb

Started with https://github.com/influxdb/influxdb/issues/3460

Should we support a shard-level option to simply drop (and count) points that conflict by type? If the option is set and a batch contains thousands of points but one point has the wrong type, should the system drop that point and return success?
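A minimal sketch of the proposed semantics, assuming a toy in-memory "schema" of field types per series; the names and data structures here are illustrative, not InfluxDB's actual internals:

```go
package main

import "fmt"

// Point is a minimal stand-in for a parsed point; the real storage
// structures are richer. Field types are modeled as plain strings.
type Point struct {
	Series string
	Field  string
	Type   string
}

// writeDroppingConflicts sketches the proposed shard-level option:
// points whose field type conflicts with the type already recorded for
// that series/field are dropped and counted, while the rest succeed.
func writeDroppingConflicts(schema map[string]string, batch []Point) (written, dropped int) {
	for _, p := range batch {
		key := p.Series + "." + p.Field
		if existing, ok := schema[key]; ok && existing != p.Type {
			dropped++ // conflicting type: drop and count, don't fail the batch
			continue
		}
		schema[key] = p.Type
		written++
	}
	return written, dropped
}

func main() {
	schema := map[string]string{}
	batch := []Point{
		{"cpu", "value", "float"},
		{"cpu", "value", "integer"}, // conflicts with the float above
		{"mem", "used", "integer"},
	}
	w, d := writeDroppingConflicts(schema, batch)
	fmt.Printf("written=%d dropped=%d\n", w, d) // written=2 dropped=1
}
```

The key design question is whether the counter (and which points were dropped) should be surfaced to the client, or only tracked as a shard statistic.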

Labels: RFC, area/writes, kind/enhancement

All 10 comments

PM call required @pauldix

Other time-series and indexing systems that determine value types on the fly support similar options.

That said, InfluxDB does not have a single source-of-truth for field types, so even with this in place, one could still have differences between shards. The only way to truly lock down types is through https://github.com/influxdb/influxdb/issues/3006

This one actually isn't an issue anymore now that the line protocol requires a trailing i to explicitly mark an integer. I think it's fine to throw an error.
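To illustrate the point about the trailing i, here is a simplified sketch of how a line-protocol field value's type is made explicit by its syntax (booleans and escaping details are glossed over; this is not InfluxDB's actual parser):

```go
package main

import (
	"fmt"
	"strings"
)

// inferFieldType sketches how line protocol makes field types explicit:
// a trailing "i" marks an integer, a double-quoted value is a string,
// true/false are booleans, and a bare number is a float.
func inferFieldType(v string) string {
	switch {
	case strings.HasPrefix(v, `"`):
		return "string"
	case v == "true" || v == "false":
		return "boolean"
	case strings.HasSuffix(v, "i"):
		return "integer"
	default:
		return "float"
	}
}

func main() {
	// "cpu value=42i" and "cpu value=42" write different types to the
	// same field, so the conflict is explicit rather than silent.
	fmt.Println(inferFieldType("42i")) // integer
	fmt.Println(inferFieldType("42"))  // float
}
```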

I was thinking about the case, @pauldix, where one builds and runs a service for other users (multi-tenant SaaS, an in-house monitoring system, etc.), and InfluxDB is not directly accessible to those users. The designers and operators may wish to offer this flexibility to their users.

For high-volume systems, the code that accepts the data for ingestion may be decoupled from InfluxDB (a Kafka-based pipeline is the canonical example these days). The system returns OK to the client once the pipeline accepts the data. By the time it is detected that 1 point in the batch of 10K points is bad, the system components upstream may have responded OK to the client. The client sees the entire batch of 10K points missing, when they might have preferred losing just 1 point.

I do need to check the latest code to determine how much of the batch would actually be affected (assuming it's all destined for a single shard -- perhaps all good points are actually written).

The other endpoints and the PointsWriter already support partial write semantics. I think it would be good to have the lower-level shard writes also support this. So, if you write 1000 points, but 2 fail because of type conflicts, 998 should succeed and we return a partial write error (with details of what failed) back up the stack.

In the SQL world, this is analogous to auto-commit mode.

We do already have the partial write semantics since writes can hit multiple shards. So we'd just need to push it down to the shard level.
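The partial-write semantics described above could be modeled with an error type that carries counts and details back up the stack; the type and field names below are hypothetical, not InfluxDB's actual API:

```go
package main

import "fmt"

// PartialWriteError is an illustrative error type for partial write
// semantics: some points in a batch fail (e.g. on type conflict) while
// the rest succeed, and the details travel up the call stack.
type PartialWriteError struct {
	Written int    // points successfully written
	Dropped int    // points rejected
	Reason  string // why the rejected points failed
}

func (e *PartialWriteError) Error() string {
	return fmt.Sprintf("partial write: %d written, %d dropped (%s)",
		e.Written, e.Dropped, e.Reason)
}

func main() {
	// The scenario from the comment: 1000 points, 2 type conflicts.
	var err error = &PartialWriteError{
		Written: 998,
		Dropped: 2,
		Reason:  "field type conflict",
	}
	fmt.Println(err) // partial write: 998 written, 2 dropped (field type conflict)
}
```

Because it implements the standard error interface, callers can detect it with errors.As and decide whether a partial success should be reported to the client or retried.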

Just got bit by this. Some hosts were writing one field of a single measurement as a different type than other hosts. Because of the way Telegraf works, this caused the majority of our metrics to go missing (even measurements completely different from the one that was erroring): it does a batch write, and only the metrics before the bad one were written (and the bad one was near the top of the batch).
I think this is a bad user experience. If the metrics before the bad one get inserted, then the ones after it should be inserted too. And if the fix is instead that metrics before the failing one don't get inserted, a type failure in one measurement still shouldn't affect other measurements.

Any status on this? We're writing in batches of 1 point to avoid potentially losing data because of this.

closing this because https://github.com/influxdata/influxdb/issues/7814 is a dupe of it

should have closed #7814 but we started discussing there so :man_shrugging:

