Vector: All values of "-" are converted into "nil" by type conversion

Created on 11 Jul 2019 · 9 comments · Source: timberio/vector

Pull #580 included a patch to the type conversion framework whereby any value of exactly "-" is converted into the string "nil" rather than into a nil value. I am pretty sure that is not what was desired in issue #523.

Doing this properly probably requires a new nil type in enum ValueKind, since it is distinct from any other kind of value. A quick patch might be to convert the "-" fields to an empty string. Either way, the fix should be scoped to the tokenizer transform, as that is what was requested in the issue. As it stands now, the behavior affects the regex transform as well; I found it while implementing tests for the grok transform.
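The "quick patch" option above can be sketched as a small coercion helper scoped to the tokenizer. This is a minimal illustration, not Vector's actual code: the function name and signature are hypothetical, and the real fix would live in the tokenizer transform's field conversion rather than in the shared src/types.rs framework.

```rust
// Hypothetical sketch: handle the "-" placeholder inside the tokenizer
// transform instead of the shared type conversion framework, so the regex
// and grok transforms are unaffected. Names here are illustrative only.
fn tokenizer_coerce(raw: &str) -> String {
    // Treat the tokenizer's "-" placeholder as an empty string rather than
    // producing the literal string "nil".
    if raw == "-" {
        String::new()
    } else {
        raw.to_string()
    }
}

fn main() {
    assert_eq!(tokenizer_coerce("-"), "");
    assert_eq!(tokenizer_coerce("200"), "200");
    println!("ok");
}
```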

Labels: good first issue, tokenizer, bug

All 9 comments

Nice find. That is definitely not what we wanted. I believe we want to avoid introducing a nil type and just drop the field. @lukesteensen is that correct?

@bruceg so I think @lukesteensen actually wanted to go ahead with an empty string instead of an actual nil variant on ValueKind as I think that will become harder to encode down the line. (Though not too sure).

I was just trying to think which option would be the least surprising and cause the fewest issues for users. A string "nil" is very surprising and weird. An occasionally missing field _can_ lead to issues, depending on how it's handled (I think we usually drop the event if we fail to find an expected field), and is effectively an inconsistent schema. Leaving the field as an empty string could lead to some unexpected output where it gets interpolated, but it's less likely to cause explicit errors and being "empty" is kind of correct?

Alternatively, we could just leave it as "-" :smile:. It's maybe somewhat less convenient for people who want to operate on that field, but it shouldn't be very surprising or cause issues.

What if we let the user choose what it results as?

We could provide an option if we think that'd be valuable, but we still need to pick a default behavior.
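The "let the user choose" idea amounts to an enum over the behaviors discussed above: drop the field, keep it as an empty string, or keep the raw "-". The sketch below is hypothetical; `DashBehavior` and `apply` are illustrative names, not an actual Vector option.

```rust
// Hypothetical sketch of a user-facing option for how the tokenizer treats
// "-" fields. Variant and function names are illustrative only.
enum DashBehavior {
    Drop,        // remove the field from the event entirely
    EmptyString, // keep the field with value ""
    Literal,     // keep the raw "-" value as-is
}

// Returns None when the field should be dropped from the event.
fn apply(behavior: &DashBehavior, field: &str) -> Option<String> {
    if field != "-" {
        return Some(field.to_string());
    }
    match behavior {
        DashBehavior::Drop => None,
        DashBehavior::EmptyString => Some(String::new()),
        DashBehavior::Literal => Some("-".to_string()),
    }
}

fn main() {
    assert_eq!(apply(&DashBehavior::Drop, "-"), None);
    assert_eq!(apply(&DashBehavior::EmptyString, "-"), Some(String::new()));
    assert_eq!(apply(&DashBehavior::Literal, "-"), Some("-".to_string()));
    assert_eq!(apply(&DashBehavior::Drop, "GET"), Some("GET".to_string()));
    println!("ok");
}
```

Whichever variant is chosen as the default, the option keeps the behavior scoped to the tokenizer and out of the shared conversion path.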

The part of this bug that affects all type conversions (and therefore the regex parser) is blocking the fix for issue #477, because it causes existing tests to fail. If I back out the change to src/types.rs to let the grok tests pass, the tokenizer tests fail, and I see no reason to have to modify the existing grok tests to compensate for this behavior.

Since there isn't consensus on what the exact behavior should actually be, and the current behavior is clearly surprising, that would seem to point to reverting the previous change until the problems can be ironed out.

@bruceg I just pushed a commit that backs out most of the change, leaving some of the tests that seemed worthwhile. That should let us move forward for now and then we can adjust those tests to enforce whatever tokenizer behavior we decide on.

I think this can be closed then, since the immediate issue is handled?

Yep!

