Config errors are extremely common, and as Vector grows so will a typical user's config file. Linting/validation of configs is a great first step. However, some config errors will still result in valid Vector configurations. A very common example is fumbled deployments where either the wrong (often older) config file ends up in production, or the service is never refreshed with the new config.
Importing configs (https://github.com/timberio/vector/issues/832) will also open up opportunities for users to become confused and unaware of what the resulting Vector configuration actually looks like.
It would be great in these circumstances if Vector were able to expose in some way what its internals look like. This would enable users to self-diagnose problems and also allow them to provide more context to others when asking for help.
Some services provide an HTTP /config endpoint, which echoes back whatever config it was provided. In Vector's case this might still be difficult for a human to parse.
Alternatively (or in addition) it could display a graph of its topology, which would be a visual representation of what it's doing internally. This could be plain text to start with but then get expanded later.
I'm sure that a command line flag would also get heavy usage. Imagine having a large config file, built up from a directory of snippets, and being able to run vector -c ./foo.toml --export-graph ./foo.png.
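To make the graph idea a bit more concrete, here is a minimal sketch of how a plain-text topology could be derived from a config. It assumes the config has already been parsed into a dict that mirrors the TOML layout (`sources`, `transforms`, `sinks`, with `inputs` on transforms and sinks). The function names and the example component names are made up for illustration, and Graphviz DOT is just one possible output format, not something Vector provides.

```python
def topology_edges(config: dict) -> list:
    """Return (upstream, downstream) pairs derived from each component's `inputs`."""
    edges = []
    # Only transforms and sinks declare `inputs`; sources are roots of the graph.
    for section in ("transforms", "sinks"):
        for name, component in config.get(section, {}).items():
            for upstream in component.get("inputs", []):
                edges.append((upstream, name))
    return edges


def to_dot(config: dict) -> str:
    """Render the topology as Graphviz DOT text (hypothetical output format)."""
    lines = ["digraph vector {"]
    for upstream, downstream in topology_edges(config):
        lines.append(f'  "{upstream}" -> "{downstream}";')
    lines.append("}")
    return "\n".join(lines)


# Illustrative config only; component names are placeholders.
example = {
    "sources": {"in": {"type": "stdin"}},
    "transforms": {"parse": {"type": "json_parser", "inputs": ["in"]}},
    "sinks": {"out": {"type": "console", "inputs": ["parse"]}},
}

print(to_dot(example))
```

The DOT text could then be fed to an external renderer to produce an image, which is roughly what an export-graph style flag would need to do.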
I like this and agree.
> Imagine having a large config file, built up from a directory of snippets, and being able to run vector -c ./foo.toml --export-graph ./foo.png.
We've talked about a CLI that would expose the topology graph as well, and use some fancy ASCII art to visually represent it.
> Some services provide an HTTP /config endpoint, which echoes back whatever config it was provided. In Vector's case this might still be difficult for a human to parse.
I can imagine some use cases where a machine-readable representation would be useful too. For example, automated checks that Vector is reading from a particular source or writing to the correct host in a sink.
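As a rough illustration of such an automated check, the sketch below assumes the machine-readable form is JSON that mirrors the config file's structure. That shape, the helper names, and the example host are all assumptions for the sake of the example; nothing like this exists yet.

```python
import json


def has_source_of_type(config: dict, source_type: str) -> bool:
    """True if any configured source has the given `type`."""
    return any(
        source.get("type") == source_type
        for source in config.get("sources", {}).values()
    )


def sink_option(config: dict, sink_name: str, option: str):
    """Return one option of a named sink, or None if the sink or option is absent."""
    return config.get("sinks", {}).get(sink_name, {}).get(option)


# Hypothetical payload, standing in for whatever Vector would report.
payload = """
{
  "sources": {"in": {"type": "file"}},
  "sinks": {
    "out": {
      "type": "elasticsearch",
      "inputs": ["in"],
      "host": "http://es.example.internal:9200"
    }
  }
}
"""
config = json.loads(payload)

if not has_source_of_type(config, "file"):
    raise SystemExit("expected a file source")
if sink_option(config, "out", "host") != "http://es.example.internal:9200":
    raise SystemExit("sink 'out' is writing to the wrong host")
```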
Reminded me of Vizceral.
I could see our topology graph represented as an SVG being VERY useful!
Another thing about config: unlike in kube, where /config works, users here don't consume the config but merely observe it. I could also see a SHA hash of the config being useful to indicate whether a reload has been successful.
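A minimal sketch of the hash idea, assuming Vector would report a SHA-256 hex digest of the raw config bytes it loaded. The choice of SHA-256, hashing the raw bytes rather than a normalized form, and the function names are all assumptions here.

```python
import hashlib


def config_digest(raw_config: bytes) -> str:
    """SHA-256 hex digest of the raw config bytes."""
    return hashlib.sha256(raw_config).hexdigest()


def reload_took_effect(deployed_config: bytes, reported_digest: str) -> bool:
    """Compare the digest of the file we deployed with the digest the service reports."""
    return config_digest(deployed_config) == reported_digest.strip().lower()


# Illustrative config contents only.
old = b'[sources.in]\ntype = "stdin"\n'
new = b'[sources.in]\ntype = "file"\n'

# If the service still reports the old digest, the reload did not happen.
print(reload_took_effect(new, config_digest(old)))  # False
print(reload_took_effect(new, config_digest(new)))  # True
```

One caveat with hashing raw bytes is that whitespace-only edits change the digest, so a normalized representation may be a better input if that matters.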
@Jeffail this looks great! Before we begin work we need to answer the following questions:
1. Do we want to introduce an HTTP API? (ref #541 (comment)). If so, does it make more sense to represent the first step as a separate issue?
As we discussed, I think this will mostly involve generalizing the existing metrics server. An initial push to clean up and generalize that component would make sense to tackle before actually adding anything specific to exposing configuration.
2. What exactly do we want to do here? We have a few suggestions, so I would like to agree on the specifics. (ex: echoing the config, a config sha hash, a graph topology, etc).
I think a simple first step could be an endpoint that returns a JSON-encoded version of the currently loaded config.
Okay let's define this issue to add an endpoint to export a JSON representation of the running config. I've added https://github.com/timberio/vector/issues/1075 which is now a dependency of this work.
Since we're probably going to expand to other formats I think it makes sense to make the endpoint for this /config/json. I've mentioned in #1075 that we should consider separating regular endpoints from debug endpoints. If we go down that route then I think this endpoint should be considered debug since configuration files might contain secrets.
I feel it may be more common to see /config?format=json?
👍 , that or an Accept: application/json header.
Sounds good, but in that case are we happy with still defaulting to json when no Accept header is present? Otherwise I can return a 501 until we have some sort of plain text representation, which ideally I think we should have eventually.
I don't think we should rely on a header. There should just be a default format, so that if I end up just sending a curl it'll work.
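To pin down how the options discussed here could fit together, here is a small sketch of one possible resolution order: an explicit `?format=` wins, then the Accept header, and otherwise the default format is used. The function name and the rule that an unsupported request yields `None` (which a server could map to an error status) are assumptions, not a settled design.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit

# Only JSON is supported in this sketch; other formats could be added later.
SUPPORTED = {"json": "application/json"}
DEFAULT_FORMAT = "json"


def pick_format(url: str, accept: Optional[str] = None) -> Optional[str]:
    """Return the format name to serve, or None if the request asks for an unsupported one."""
    query = parse_qs(urlsplit(url).query)
    if "format" in query:
        requested = query["format"][0].lower()
        return requested if requested in SUPPORTED else None
    if accept:
        # Simplification: media types are tried in order and q-values are ignored.
        for item in accept.split(","):
            media_type = item.split(";")[0].strip().lower()
            if media_type in ("*/*", "application/*"):
                return DEFAULT_FORMAT
            for name, mime in SUPPORTED.items():
                if media_type == mime:
                    return name
        return None
    return DEFAULT_FORMAT


print(pick_format("/config"))              # json
print(pick_format("/config?format=json"))  # json
print(pick_format("/config", "*/*"))       # json
```

A plain curl request sends `Accept: */*` by default, so it falls through to the default format under this scheme.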