Vector: Wire ShutdownSignal into all sources

Created on 20 Mar 2020  路  9Comments  路  Source: timberio/vector

Issue 1091 created a SourceShutdownManager and added a ShutdownSignal object to all all source's build() method. Currently, however, the only sources that check for the ShutdownSignal are the TCP and UDP modes of the 'socket' source. We should update all sources to gracefully shutdown when the ShutdownSignal future resolves.

  • [x] docker source
  • [x] file source
  • [x] http source
  • [x] journald source
  • [x] kafka source
  • [x] kubernetes source
  • [x] logplex source
  • [x] prometheus source
  • [x] socket source
  • [x] splunk_hec source
  • [x] statsd source
  • [x] stdin source
  • [x] syslog source
  • [x] vector source
topology must

Most helpful comment

vector source was already wired with socket source.

kubernetes source is backed by file source and the shutdown bubbles up to kubernetes source through channels closing so it's also done.

All 9 comments

Noting, this is somewhat related to the new futures work: https://github.com/timberio/vector/issues/1142. We'll want to figure out the best order of execution in the context of that work.

Some sources are trickier for adding the shutdown, so I'll address them in separate PRs.

@stbrody @LucioFranco Is the relationship between lifetimes of out sink, returned Future, and ShutdownSIgnal documented/defined somewhere?

Is the relationship between lifetimes of out sink, returned Future, and ShutdownSIgnal documented/defined somewhere?

Besides the docs in here https://github.com/timberio/vector/blob/master/src/shutdown.rs there may not be. But the idea is that when the root of the topology chain closes that should be bubbled all the way down to the sink that will then get dropped via the Forward future, when the stream ends.

Ok, so I think we should define it. Something simple. Otherwise shutdown logic in the sources would get to defensive in covering every possible way that the source can be shutdown. file source is a good example, it has 3 parts each running in it's own thread/task:

  • Returned Source future
  • file server
  • event creator

Shutdown scenarios:

  1. ShutdownSignal is triggered.

    • file server should poll ShutdownSignal.

    • Source future should poll ShutdownSignal, or maybe should wait for event creator to end which would require additional tripwire/trigger pair.

    • event creator will end once file server drops it's end of the channel and processes all messages, or if the sink is closed.

  2. Source future is dropped

    • Source future should drop trigger A.

    • file server should poll tripwire A.

    • event creator will end once file server drops it's end of the channel and processes all messages, or if the sink is closed.

  3. Sink is closed, event creator panicked, or file source panicked

    • event creator will end.

    • file source will end and should drop trigger B.

    • Source future should poll tripwire B.

To cover all of them, besides ShutdownSignal, two additional trigger/tripwire pair's are needed.

For some of these cases it's reasonable to assume they won't happen, or reason about what can and can't happen by looking at shutdown/topology code, which I would like to avoid.

This all could be simplified if we define/document/add a few rules on what can and can't happen.
For example:

  • If sending side of out channel is dropped or if Source future returned error, proper shutdown procedure of the source should still be done. This is mostly to trigger ShutdownSignal.
  • Shutdown by force, by dropping Source future, should also trigger ShutdownSignal.
  • Source future can end before closing out sink.

Basiclly, all of the above would guarantee that ShutdownSignal is always triggered. With that, ShutdownSignal could be used instead of additional trigger/tripwires or even completely eliminate the need for it.

With that, shutdown scenarios for file source would be:

  1. ShutdownSignal is triggered.

    • file server should poll ShutdownSignal.

    • Source future should poll ShutdownSignal.

    • event creator will end once file server drops it's end of the channel and processes all messages, or if the sink is closed.

  2. Event creator panicked

    • event creator will close out sink and by extension trigger ShutdownSignal.

  3. file source panicked

    • file source will close channel to event creator.

    • event creator will end once it processes all messages and then closing the out sink and by extension trigger ShutdownSignal.

@LucioFranco I'm maybe just complicating, but I would really like to hear what do you think? Everyone else is of course also welcome.

Event creator panicked

We should treat this as fatal I believe. So this shouldn't happen and if it does its a bug.

I believe the file server should be able to attempt to send an item down the channel and that should fail if the rx end is dropped. So if we just cancel the source future via shutdownsignal, it will drop the rx end and should then cascade the rest of the shutdown process. So I don't think its necessary for the file server to have a shutdown signal but we should ensure it closes properly if the rx end is dropped.

So I would try to keep it as simple as possible and lean on futures as much as possible.

@LucioFranco thanks, it's clearer now how this should behave.

vector source was already wired with socket source.

kubernetes source is backed by file source and the shutdown bubbles up to kubernetes source through channels closing so it's also done.

Nice work @ktff

Was this page helpful?
0 / 5 - 0 ratings

Related issues

LucioFranco picture LucioFranco  路  3Comments

LucioFranco picture LucioFranco  路  3Comments

a-rodin picture a-rodin  路  3Comments

jamtur01 picture jamtur01  路  3Comments

binarylogic picture binarylogic  路  3Comments