Vector: Unify component build methods

Created on 22 Jul 2020  路  7Comments  路  Source: timberio/vector

Currently, we have both fn build and async fn build_async for each of sources, transforms, and sinks. This was a minimally invasive way to achieve the needed functionality, but the resulting API is not the clearest. Once the dust has settled a bit, we should circle back and clean this up.

For context, the existing build method is used into two place. First, the default implementation of build_async simply delegates to it. Second, and more involved, is the various component tests that invoke it. Since the tests are the vast majority of the code, we'll focus mostly on those.

There are (at least) two ways we could go about this. The simplest would be to remove fn build and use async fn build_async (renamed to build) everywhere. This is probably preferred due to the simplicity, but it does have the downside of requiring async stuff where it may not be needed (e.g. almost no transforms need it). Much of this downside could be alleviated by #[tokio::test] when we get to the point where that is usable and we don't require tokio-compat.

The other route would be to split out a better-named Sync{Source,Transform,Sink}Config trait and have the fn build live there. This would get rid of the confusion around which method to implement but complicates our model a bit by introducing a new type of thing. On the upside, we could easily impl SourceConfig for SyncSourceConfig and avoid having to change all the tests.

sinks sources transforms tech debt

Most helpful comment

We can not do this because all methods are public in traits...

I was thinking of adding build_sync to the struct, so something like this:

struct SomeSourceConfig{
...
}

impl SomeSourceConfig{
...
  fn build_sync(
        &self,
        name: &str,
        globals: &GlobalOptions,
        shutdown: ShutdownSignal,
        out: Pipeline,
    ) -> crate::Result<super::Source> {
    ...
    }
}

impl SourceConfig for SomeSourceConfig {

     async fn build(
        &self,
        name: &str,
        globals: &GlobalOptions,
        shutdown: ShutdownSignal,
        out: Pipeline,
    ) -> crate::Result<super::Source> {
       self.build_sync(name,globals,shutdown)
   }

}

All 7 comments

Going the route of completely replacing build with build_async seems like win win. #[tokio::test] is now usable, and while there are ~100 test call sites of build, most of them would benefit from being under #[tokio::test] since runtime() and block_on calls could be eliminated, while some of them already have #[tokio::test].

Regarding transforms and other components whose tests don't require async, each of those components could have private build_sync method which would be called in build_async(at that point just build) which the tests can use. This can be applied per component if neither its build method nor it's tests need async. That way, we won't complicate the model and wouldn't introduce async where it isn't needed. This is of course flexible so if adding async to test is just simpler, we can skip this.

@lukesteensen what do you think? Also

we don't require tokio-compat

why is not requiring tokio-compat needed? Does it have something to do with #[tokio::test]?

Regarding transforms and other components whose tests don't require async, each of those components could have private build_sync method which would be called in build_async(at that point just build) which the tests can use. This can be applied per component if neither its build method nor it's tests need async. That way, we won't complicate the model and wouldn't introduce async where it isn't needed.

I like this idea! That's an easy way to avoid requiring async for things simple enough not to need it.

why is not requiring tokio-compat needed? Does it have something to do with #[tokio::test]?

Yes, tokio::test uses a standard tokio 0.2 runtime instead of our own tokio-compat-enabled Runtime. This means that things relying on tokio 0.1 types will fail when run under tokio::test.

Then we should wait for #3478.

@ktff Is this now unblocked?

@jamtur01 yes, with #3478 done, we don't rely on tokio 0.1 anymore.

Regarding transforms and other components whose tests don't require async, each of those components could have private build_sync method which would be called in build_async(at that point just build) which the tests can use. This can be applied per component if neither its build method nor it's tests need async. That way, we won't complicate the model and wouldn't introduce async where it isn't needed.

I like this idea! That's an easy way to avoid requiring async for things simple enough not to need it.

We can not do this because all methods are public in traits...

The other route would be to split out a better-named Sync{Source,Transform,Sink}Config trait and have the fn build live there.

I do not think that additional trait for async/sync method required, for me name like SyncTransformConfig looks like transform itself sync, not build method.


I propose just use async build everywhere, add #[tokio::test] is not hard.

We can not do this because all methods are public in traits...

I was thinking of adding build_sync to the struct, so something like this:

struct SomeSourceConfig{
...
}

impl SomeSourceConfig{
...
  fn build_sync(
        &self,
        name: &str,
        globals: &GlobalOptions,
        shutdown: ShutdownSignal,
        out: Pipeline,
    ) -> crate::Result<super::Source> {
    ...
    }
}

impl SourceConfig for SomeSourceConfig {

     async fn build(
        &self,
        name: &str,
        globals: &GlobalOptions,
        shutdown: ShutdownSignal,
        out: Pipeline,
    ) -> crate::Result<super::Source> {
       self.build_sync(name,globals,shutdown)
   }

}

Was this page helpful?
0 / 5 - 0 ratings

Related issues

MOZGIII picture MOZGIII  路  3Comments

LucioFranco picture LucioFranco  路  3Comments

binarylogic picture binarylogic  路  3Comments

trK54Ylmz picture trK54Ylmz  路  3Comments

LucioFranco picture LucioFranco  路  3Comments