For Helm, users are instructed to provide a values.yaml file. You'll notice that in the example we mix YAML and TOML, which is awkward. Given that I don't have a lot of experience with Kubernetes and Helm, I have two questions:
If Helm is strictly YAML, and we think users will have a much better experience configuring Vector with YAML, we should consider expediting #3174.
- What do other tools do with non-YAML configurations?
They do pretty much the same. I borrowed our implementation from somewhere.
- What's common in the Helm ecosystem?
Helm used to support TOML for its values file (values.toml), but switched to YAML in 2016: https://github.com/helm/helm/issues/768
Nowadays, Helm is configured exclusively via YAML.
I inspected the Helm source code and discovered an undocumented toToml template function. This should make it possible to take input as JSON, rather than raw TOML literals, and convert it to TOML before passing it to the ConfigMap under the hood.
Surprisingly, the super-common toYaml isn't documented either. It's just nowhere in the docs, while Helm itself uses it by default a lot in the helm create generator. The documentation of Helm needs some work.
Although this looks promising, I presume it will be confusing as well. All our docs are TOML-oriented, and it will be awkward for users to convert from TOML to YAML. That said, it is somewhat of an improvement for the immediate problem at hand, as raw TOML literals are even worse.
I posted an update on #1327 yesterday: https://github.com/timberio/vector/issues/1327#issuecomment-705712305. I think #3174 is a dup, so I didn't comment there.
With or without Helm, in the end we need to construct a ConfigMap object. It's typically represented as a YAML document, but this doesn't really matter, because its data is represented as key/value pairs, with the values being strings.
This means that, as an alternative to passing values to the Helm-managed ConfigMap via the odd mix of YAML and TOML, you can upload a ConfigMap to the kube-apiserver yourself and reference it through the config.
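To make the "values are just strings" point concrete, here is a minimal sketch that builds such a ConfigMap manifest by hand. The ConfigMap name (vector-config), the data key (vector.toml), and the tiny Vector config are assumptions for illustration, not what the chart actually expects.

```python
import json

# Hypothetical Vector config; component names are illustrative only.
VECTOR_TOML = """\
[sources.in]
type = "stdin"

[sinks.out]
type = "console"
inputs = ["in"]
"""


def build_config_map(name: str, files: dict) -> dict:
    """Return a ConfigMap manifest; every data value must be a plain string."""
    for key, value in files.items():
        if not isinstance(value, str):
            raise TypeError(f"ConfigMap data value for {key!r} must be a string")
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": dict(files),
    }


# The TOML document is carried verbatim as one string value.
manifest = build_config_map("vector-config", {"vector.toml": VECTOR_TOML})
print(json.dumps(manifest, indent=2))
```

Since JSON is a subset of YAML, the printed manifest can be piped to kubectl apply -f - as-is, and the TOML stays untouched inside the string.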
We support this via existingConfigMap (which I, apparently, forgot to document in values.yaml, shame on me), and, for more advanced use cases, extraVolumes and extraVolumeMounts.
There are downsides to using externally-managed resources with Helm charts: typically it's just extra work. The primary reason for Helm's existence in the first place is to reduce the required manual work to a minimum, and every bit of doing things externally rather than via Helm is a UX downer (for many reasons).
While some users will prefer this way of configuring Vector over the Helm chart due to the specifics of their configuration pipeline, I expect that right now most will choose the Helm-managed config, despite all the awkwardness of the TOML/YAML mixing.
@MOZGIII I have no idea what you're proposing here. What is the path forward? It's one of:
Let's switch Vector config format to YAML (as proposed at https://github.com/timberio/vector/issues/1327#issuecomment-705712305) by 0.12.
For 0.11, let's ship as-is.
This is potentially a naive question and it may not totally apply here, but I'm wondering about this issue because this pain is something we hit in hab as well. If the Vector config format is TOML but we're using serde for serialization and deserialization, shouldn't it be relatively trivial to support parsing config in any format that serde (and the serde libs) can handle? (This obviously depends on how we're handling config on our side, which is what I'm hoping to learn here.)
Our issue specifically was that while our runtime config values were commonly written into a .toml config file, we also provided envvars as a valid primary configuration API. Writing TOML inside an envvar is also (probably obviously) pretty ugly. That envvar API was kind of important in the context of 12-factor app standards, and it meant that Docker users were having to write TOML inside envvars (and then frequently inside YAML).
We ultimately just allowed TOML and JSON, but if I recall correctly, it basically meant doing a quick check of the data on read to determine the format so we could route it to the appropriate serde lib.
I think the main concern here is that we want to make sure that our documentation and user configuration are all in one format. This is kind of important for providing support, etc. Although, I'm not sure how important that is to us.
I don't see why we wouldn't be able to allow both JSON and YAML as configuration languages. In addition, I've been experimenting with Helm templates, and it seems like we can define portions of the config as YAML in values.yaml and convert them to TOML on-the-fly when we inject them into the ConfigMap. However, while this is possible, doing it this way and retaining TOML as the only accepted config format would mean that the documentation is all in TOML, but in the Helm chart we only accept YAML. I see this as very problematic for users, as they'd have to know how to convert arbitrary TOML to YAML.
So, essentially, I'm not for dropping TOML support, but to have a really good UX, I guess we need to:
values.yaml (either by switching the whole config to YAML and accepting YAML config files in Vector, or by translating YAML to TOML internally in the Helm template)
The key point here is (1), as without it things will be very confusing.
I think the easiest way to get (1) would be to transition to YAML entirely; however, it might also be possible to render both YAML and TOML, cc @binarylogic. But then the concern that I wrote at the very top of the message remains.
We are planning to support JSON as input via #3174, which will allow users to use JSONNet, etc.
I think your concerns are 100% spot on. It's difficult to find a balance between flexibility and prescription. Having native YAML for our k8s users feels like a no-brainer to me. From a (sort-of-outside) perspective, I think the ideal change would be one where we can provide that interface without breaking backwards compatibility or obligating existing users to rewrite all their TOML configs in YAML.
I imagine we could always provide YAML and TOML and JSON examples in the docs with a button to toggle between which version a user might want to see.
> We are planning to support JSON as input via #3174, which will allow users to use JSONNet, etc.
If we aim to support JSON, I guess supporting YAML is a no-brainer, since JSON is a subset of YAML.
> Having native YAML for our k8s users feels like a no-brainer to me.
😄 Yep!
> I imagine we could always provide YAML and TOML and JSON examples in the docs with a button to toggle between which version a user might want to see.
That's what I was thinking.
To further improve the user experience, we could also offer a neat UI for generating/documenting Helm's values.yaml. I assume it's not something we have to focus on right away (one of the reasons is that we're still experimenting with the config format, and it's better to get that out of the door, and also add more configuration knobs first), but it would be nice to have eventually.
We've settled on supporting TOML, YAML and JSON as configuration file formats in Vector natively.
In the documentation, we'll be using TOML, and mention YAML and JSON as advanced alternatives.
I've also created a follow-up issue: https://github.com/timberio/vector/issues/4932
With the aggregator and internal_metrics implementations, I feel like we have enough context to start addressing it.