Tokio: Consider exposing simulation's types in Tokio

Created on 28 Nov 2019 · 3Comments · Source: tokio-rs/tokio

Despite the relatively limited use that tokio-rs/simulation has seen so far, @gardnervickers has been able to reproduce several complex bugs in H2 and Tonic. To make tokio-rs/simulation more accessible to more people, I think it's worth opening a discussion on re-exporting simulation's networking types from Tokio. I believe (and I might be misquoting @gardnervickers!) that the desired end state for this feature would enable a library like Hyper to be deterministically simulated just by changing which feature flags are enabled within Tokio.

To enable this, I think that Tokio will need to make the following changes:

A generalization of the removed tokio::runtime::Builder::clock to allow for setting of other hooks in Tokio as to enable deterministic simulation.
The exposure of simulation's types such as FaultyTcpStream through a rename and re-export under the name of the non-simulated type (In this case, tokio::net::TcpStream) if a feature flag enabling simulation is toggled.
Enforce that the single-threaded runtime wakes tasks in a deterministic order. I believe the current task waking order is accidental.

A discussion between @gardnervickers, @carllerche, and I covering how this would be implemented. It's slightly messy, I apologize about that.

gvickers: I've walked back a bit lately on having the Environment handle. I think eventually we'll just want to have a way to swap TcpStream/TcpListener implementation in Tokio, like how we do with tokio_timer::Now/Delay/Timeout today.
[4:15 PM] davidbarsky: yeah, i see why it's clone right now—spawn requires 'static, so it has to be an owned value
[4:15 PM] davidbarsky: oh, can you say more about that?
[4:20 PM] gvickers: Tokio today allows you to use
tokio_timer::timer::set_default and tokio_timer::clock::with_default to setup
the internal clock used by tokio_timer. The DeterministicRuntime sets these to
the mock clock, so you don't actually need to use the Environment handle to
create Delay's or query simulation time with tokio::clock::now(). I want to have
something similar for TcpStream/TcpListener, where we can use
tokio::reactor::with_default() to set a global which can be used to create
TcpStream/TcpListener instances backed by the simulation library's network.
[4:21 PM] davidbarsky: oh, neat, yeah.
[4:21 PM] gvickers: This should let us use existing libs without too many
changes, much in the same way that existing libs which create timeouts with
tokio::timer::Timeout can work with Simulation today.
[4:25 PM] gvickers: This will probably require something like task locals though in order to scope TcpListener binds to a virtual net::SocketAddr.
[4:25 PM] gvickers: Also, to give some sort of virtual filesystem/namespacing to files
[4:26 PM] gvickers: Luckily it doesen't have to be fast or anything, so just installing a thread local at the start of every task poll should be sufficient if I understand correctly.
[4:28 PM] davidbarsky: makes sense. this would require hooks into tokio, correct?
[4:29 PM] gvickers: Yes, I think something like tokio::reactor::with_default
should be sufficient. Carl mentioned a while back that the timer hooks are going
away temporarily for the new Tokio release though, so things might change around
that.
[4:39 PM] davidbarsky: @gvickers small PR that makes it slightly more "idiomatic": https://github.com/tokio-rs/simulation/pull/9
GitHub
slightly more idiomatic rust by davidbarsky · Pull Request #9 · ...

[4:39 PM] gvickers: Nice, thank you!
[4:40 PM] davidbarsky: (take it or leave it! no need to accept it)
[4:42 PM] gvickers: I see this used a bit, type Err = Box<dyn std::error::Error + Send + Sync + 'static>;
Is the pattern to have that as the top level error type for rust applications?
[4:55 PM] davidbarsky: for examples, I like to place a boxed error up there.
[4:56 PM] davidbarsky: it's kinda like a RuntimeException; don't really care too much about what the specific error is.
[4:56 PM] davidbarsky: so to answer for your question—I think it is, for smaller applications.
[4:56 PM] davidbarsky: for bigger ones, a proper enum variant.
[4:56 PM] davidbarsky: and I see what you mean by relying on Tokio's plumbing: https://tokio-rs.github.io/tokio/doc/src/tokio/runtime/handle.rs.html#11-25
handle.rs.html -- source
Source to the Rust file `tokio/src/runtime/handle.rs`.
[5:00 PM] gvickers: Yea, I have no idea how to do it but I think it should be possible to have TcpStream/TcpListener creation dispatch to alternate internal implementations.
[5:03 PM] davidbarsky: I think some additional builder methods on
https://tokio-rs.github.io/tokio/doc/tokio/runtime/struct.Builder.html that let
you specify with_timer, with_clock (maybe last two can be collapsed?),
with_io_handle, with_spawner, etc. should do the trick.
tokio::runtime::Builder - Rust
API documentation for the Rust `Builder` struct in crate `tokio`.
[5:04 PM] gvickers: Yea that would be ideal
[5:05 PM] davidbarsky: not sure how FaultyTcpStream can be exposed, though
[5:05 PM] gvickers: The DeterministicRuntime in simulation pretty much just copies the CurrentThread runtime
[5:06 PM] gvickers: I think FaultyTcpStream could just be made to provide the backing implementation for TcpStream.
[5:06 PM] davidbarsky: (cc: @Carl Lerche, if you're online)
[5:06 PM] davidbarsky: hmm, yeah. i think see what you mean.
[5:07 PM] gvickers: My main concern is if there would be a performance impact,
but I guess each TcpStream/TcpListener now has to lookup and see if there is a
backing reactor to register with anyway, so maybe not too bad?
[5:09 PM] davidbarsky: in test or production?
[5:09 PM] davidbarsky: but yeah, i think this is something that can be benchmarked
[5:12 PM] gvickers: I guess it would be in both test and production. I think at the least it would incur a vtable lookup on poll for both TcpStream and TcpListener.
[5:19 PM] Carl Lerche: @gvickers there is a feature flag to enable this kind of stuff
[5:19 PM] Carl Lerche: test-util
[5:19 PM] Carl Lerche: Look at how time freezing is done on master now
[5:28 PM] gvickers: Right, but in order to support simulation the mock time
model on master will need to be changed a bit (or hooks introduced into the
runtime to drive it). I suppose the test-util feature flag probably couldn't be
overloaded to support both simulation and normal unit tests, since the time
models are different.
[6:05 PM] Carl Lerche: @gvickers you can add hooks in the runtime for testing that are enabled w/ test-util
[6:05 PM] Carl Lerche: @gvickers my main point is that it is ok to have "overhead" (minimal) to support testing, we just enable it w/ the feature flag
[6:05 PM] davidbarsky: would you accept a PR soonish, or only after 0.2 is out?
[6:05 PM] Carl Lerche: not until after 0.2
[6:06 PM] davidbarsky: makes sense.
[6:06 PM] Carl Lerche: basically, make a proposal
[6:09 PM] davidbarsky: sounds good. @gvickers do you have a sense as to how you'd like this API to look like?
[6:18 PM] gvickers: I think for the timer implementation, to replace Park we
need a runtime hook which fires when no task can make progress. The timer should
also allow accessing the Instant at which the next future can progress. These
two hooks can then be used to advance the clock with advance.
[6:21 PM] gvickers: As for the TcpStream/TcpListener API, I'm not sure. All the simulation really needs there is to provide alternate AsyncRead + AsyncWrite implementations and peer_addr()/local_addr().
[6:41 PM] davidbarsky: makes sense. I think it'll be possible to have the "same" TCPStream/TCPListener re-exported under the same path/name but activated under a different feature flag
[6:42 PM] davidbarsky: so, [cfg[not("simulated")]]—or whatever the syntax is—would export the FaultyVariant that's in simuation
[6:42 PM] davidbarsky: I don't know if that's what you have in mind.
[6:45 PM] gvickers: I'm not entirely sure what the limitations of compile flags
are, but the goal would be for libraries like Hyper which call TcpListener::bind
to work under simulation.

A-tokio A-tokio-test C-feature-request

Source

davidbarsky

Most helpful comment

Thanks @davidbarsky for capturing this, I have written out a few more details on how we could integrate tokio-rs/simulation with Tokio.

Simulation Testing for Tokio

tokio-rs/simulation is an initial POC bringing deterministic simulation
testing support to tokio 0.2.0-alpha.6.

The way it currently works is by providing an Environment trait to users.
Applications are made generic over the Environment trait.

    #[async_trait]
    pub trait Environment: Unpin + Sized + Clone + Send + 'static {
        type TcpStream: TcpStream + Send + 'static + Unpin;
        type TcpListener: TcpListener + Send + 'static + Unpin;

        /// Spawn a task on the runtime provided by this [`Environment`].
        fn spawn<F>(&self, future: F)
        where
            F: Future<Output = ()> + Send + 'static;

        /// Return the time now according to the executor.
        fn now(&self) -> time::Instant;

        /// Returns a delay future which completes after the provided instant.
        fn delay(&self, deadline: time::Instant) -> tokio_timer::Delay;

        /// Returns a delay future which completes at some time from now.
        fn delay_from(&self, from_now: time::Duration) -> tokio_timer::Delay {
            let now = self.now();
            self.delay(now + from_now)
        }

        /// Creates a timeout future which which will execute T until the timeout elapses.
        fn timeout<T>(&self, value: T, timeout: time::Duration) -> tokio_timer::Timeout<T>;

        /// Binds and returns a listener which can be used to listen for new connections.
        async fn bind<A>(&self, addr: A) -> io::Result<Self::TcpListener>
        where
            A: Into<net::SocketAddr> + Send + Sync;

        /// Connects to the specified addr, returning a [`TcpStream`] which can be
        /// used to send and receive bytes.
        ///
        /// [`TcpStream`]:`TcpStream`
        async fn connect<A>(&self, addr: A) -> io::Result<Self::TcpStream>
        where
            A: Into<net::SocketAddr> + Send + Sync;
    }

The desired outcome of this is an application which is generic over sources of
nondeterminism, such as parallel scheduling, network and time. This works in
practice for simple applications, and existing libraries such as Hyper which
allow parameterization of the executor and TcpListener/TcpStream types can be
adapted to use the Environment trait.

However, while this can be made to work in some cases, there are 3rd party
libraries for which this is not possible. It is also cumbersome to have
to pass the Environment trait through entire application. Lastly, changes to
the public API in tokio 0.2.0 have removed some hooks which
tokio-rs/simulation relied upon, motivating this proposal.

Proposal

We propose adding simulation support directly to Tokio via a feature flag. This
proposal assumes the feature flag will be the existing test-util flag, but
that can be decided separately.

There are types which will need to be mocked when the feature flag is enabled.

Mocks

Scheduling

When the feature flag is enabled, all runtime creation will use the
basic_scheduler. The remote queue of the basic scheduler will be effectively
disabled as scheduling tasks from remote threads would introduce nondeterminism.
We could either log a warning or panic if scheduling from a remote thread was
attempted.

By disabling the remote queue, it appears that the basic_scheduler will
execute tasks in a deterministic order between program runs.

Each spawned task will be wrapped in a top-level task which will provide
metadata about the task, and allow for manipulating the work done by the task.
Metadata includes things like a logical location (datacenter/rack/machine), as
well as hostname and IP address. On each poll invocation, the task metadata
will be set on a well-known thread-local for other components to pickup. The
task metadata can be used by fault injection, networking and disk IO to uniquely
identify the task.

Tasks which are spawned via `tokio::spawn` will automatically inherit the
parents task metadata.

The task metadata will be optional to begin with, but in the future we would
like to support users specifying it explicitly when setting up a simulation run.

    #[tokio::simulation]
    fn simulation(seed: u64) {
      tokio::simulation::spawn_process(
        "us-east-1a", "rack-1", "server-1.svc.cluster.local", spawn_server()
      );
      tokio::simulation::spawn_process(
        "us-east-1b", "rack-2", "server-2.svc.cluster.local", spawn_server()
      );
      tokio::simulation::spawn_process(
        "us-east-1c", "rack-3", "server-3.svc.cluster.local", spawn_server()
      );
      tokio::spawn(tokio::simulation::clog_connections("us-east-1c", "us-east-1b"));

      verify_linearizability().await;
    }

Time

The mock clock will be advanced automatically by the executor whenever
Park::park_timeout is called. It will be advanced for exactly the
std::time::Duration passed to Park::park_timeout. This provides two
desirable properties for simulation testing.

Time is only advanced once there are no tasks which can make forward
progress. This couples time to task execution, allowing for users to
pinpoint the timestep where a regression occurs for gating things like
logging or bug fixes on a timestamp.
Ordering/prioritization can be naturally expressed through the use of
Timeout/Delay. It's guaranteed that all tasks waiting on later timeouts
execute before tasks which are waiting on earlier timeouts.

Networking

When the feature flag is enabled, the TcpStream and TcpListener types will
be swapped out for types which register themselves with a process-global
in-memory networking implementation. The existing TcpStream will wrap
something akin to this SimulatedTcpStream type, and the existing TcpListener
will return TcpStream's which wrap the SimulatedTcpStream type.
The ToSocketAddrs implementations which are gated behind the DNS feature flag
will be swapped out to use the global in-memory networking implementation for
DNS lookups. DNS will be derived from the task metadata.

    pub struct SimulatedTcpStream {
        tx: mpsc::Sender<Bytes>,
        rx: mpsc::Receiver<Bytes>,
        staged: Option<Bytes>,
        shutdown: bool,
        local_addr: net::SocketAddr,
        peer_addr: net::SocketAddr,
    }

TcpStream::bind will register a new connection queue with the global
networking object, allowing subsequent TcpStream::connect calls to enqueue
connection objects. TcpStream::bind will use the task metadata when binding,
allowing for DNS resolution to occur.

Disk IO

Like networking, disk IO will be done using an in-memory mock filesystem. The
mock filesystem will be namespaced by task metadata.

This is not needed initially, and has not been explored yet with
tokio-rs/simulation. However, I'm including it in this proposal for
completeness.

Faults

Beyond providing a deterministic environment for executing simulations, it's
also desirable to expose an API for fault injection. Scheduling, networking and
disk IO all depend on a process-global state store to back the simulated types
they expose.

We can take advantage of this global state to also support an API for fault
injection. Targeted faults can be injected based on the spawned task metadata by
user tasks. This allows for users to write fault injectors as normal async
tasks.

    async fn swizzle_clog_fault() {
      loop {
        let connections: Vec<_> = tokio::simulation::get_connections();

        for connection in connections {
          connection.clog();
          warn!("clogging connection {:?}", connection);
          tokio::timer::delay_from(time::Duration::from_secs(10)).await;
        }

        warn!("waiting 30s before unclogging connections {:?}", connection);
        tokio::timer::delay_from(time::Duration::from_secs(30)).await;

        for connection in connections {
          connection.unclog();
          warn!("unclogging connection {:?}", connection);
          tokio::timer::delay_from(time::Duration::from_secs(10)).await;
        }
      }
    }

Some of the faults that will be supported:

Network latency on the send and receive side for server and client sockets.
Network disconnects for client and server sockets.
Network partitions between machines, racks and data centers.
DNS resolution failures and staleness.
IpAddr manipulation, such as swapping the IP address on two logical machines.
Scheduling delays (like injecting several calls to yield() at the root of a
task). This can be used to simulate slow machines or processes.

gardnervickers on 30 Nov 2019

👍3 ❤1

All 3 comments

Thanks @davidbarsky for capturing this, I have written out a few more details on how we could integrate tokio-rs/simulation with Tokio.

Simulation Testing for Tokio

tokio-rs/simulation is an initial POC bringing deterministic simulation
testing support to tokio 0.2.0-alpha.6.

The way it currently works is by providing an Environment trait to users.
Applications are made generic over the Environment trait.

    #[async_trait]
    pub trait Environment: Unpin + Sized + Clone + Send + 'static {
        type TcpStream: TcpStream + Send + 'static + Unpin;
        type TcpListener: TcpListener + Send + 'static + Unpin;

        /// Spawn a task on the runtime provided by this [`Environment`].
        fn spawn<F>(&self, future: F)
        where
            F: Future<Output = ()> + Send + 'static;

        /// Return the time now according to the executor.
        fn now(&self) -> time::Instant;

        /// Returns a delay future which completes after the provided instant.
        fn delay(&self, deadline: time::Instant) -> tokio_timer::Delay;

        /// Returns a delay future which completes at some time from now.
        fn delay_from(&self, from_now: time::Duration) -> tokio_timer::Delay {
            let now = self.now();
            self.delay(now + from_now)
        }

        /// Creates a timeout future which which will execute T until the timeout elapses.
        fn timeout<T>(&self, value: T, timeout: time::Duration) -> tokio_timer::Timeout<T>;

        /// Binds and returns a listener which can be used to listen for new connections.
        async fn bind<A>(&self, addr: A) -> io::Result<Self::TcpListener>
        where
            A: Into<net::SocketAddr> + Send + Sync;

        /// Connects to the specified addr, returning a [`TcpStream`] which can be
        /// used to send and receive bytes.
        ///
        /// [`TcpStream`]:`TcpStream`
        async fn connect<A>(&self, addr: A) -> io::Result<Self::TcpStream>
        where
            A: Into<net::SocketAddr> + Send + Sync;
    }

Proposal

We propose adding simulation support directly to Tokio via a feature flag. This
proposal assumes the feature flag will be the existing test-util flag, but
that can be decided separately.

There are types which will need to be mocked when the feature flag is enabled.

Mocks

Scheduling

By disabling the remote queue, it appears that the basic_scheduler will
execute tasks in a deterministic order between program runs.

Tasks which are spawned via `tokio::spawn` will automatically inherit the
parents task metadata.

The task metadata will be optional to begin with, but in the future we would
like to support users specifying it explicitly when setting up a simulation run.

    #[tokio::simulation]
    fn simulation(seed: u64) {
      tokio::simulation::spawn_process(
        "us-east-1a", "rack-1", "server-1.svc.cluster.local", spawn_server()
      );
      tokio::simulation::spawn_process(
        "us-east-1b", "rack-2", "server-2.svc.cluster.local", spawn_server()
      );
      tokio::simulation::spawn_process(
        "us-east-1c", "rack-3", "server-3.svc.cluster.local", spawn_server()
      );
      tokio::spawn(tokio::simulation::clog_connections("us-east-1c", "us-east-1b"));

      verify_linearizability().await;
    }

Time

Time is only advanced once there are no tasks which can make forward
progress. This couples time to task execution, allowing for users to
pinpoint the timestep where a regression occurs for gating things like
logging or bug fixes on a timestamp.
Ordering/prioritization can be naturally expressed through the use of
Timeout/Delay. It's guaranteed that all tasks waiting on later timeouts
execute before tasks which are waiting on earlier timeouts.

Networking

    pub struct SimulatedTcpStream {
        tx: mpsc::Sender<Bytes>,
        rx: mpsc::Receiver<Bytes>,
        staged: Option<Bytes>,
        shutdown: bool,
        local_addr: net::SocketAddr,
        peer_addr: net::SocketAddr,
    }

Disk IO

Like networking, disk IO will be done using an in-memory mock filesystem. The
mock filesystem will be namespaced by task metadata.

This is not needed initially, and has not been explored yet with
tokio-rs/simulation. However, I'm including it in this proposal for
completeness.

Faults

    async fn swizzle_clog_fault() {
      loop {
        let connections: Vec<_> = tokio::simulation::get_connections();

        for connection in connections {
          connection.clog();
          warn!("clogging connection {:?}", connection);
          tokio::timer::delay_from(time::Duration::from_secs(10)).await;
        }

        warn!("waiting 30s before unclogging connections {:?}", connection);
        tokio::timer::delay_from(time::Duration::from_secs(30)).await;

        for connection in connections {
          connection.unclog();
          warn!("unclogging connection {:?}", connection);
          tokio::timer::delay_from(time::Duration::from_secs(10)).await;
        }
      }
    }

Some of the faults that will be supported:

Network latency on the send and receive side for server and client sockets.
Network disconnects for client and server sockets.
Network partitions between machines, racks and data centers.
DNS resolution failures and staleness.
IpAddr manipulation, such as swapping the IP address on two logical machines.
Scheduling delays (like injecting several calls to yield() at the root of a
task). This can be used to simulate slow machines or processes.

gardnervickers on 30 Nov 2019

👍3 ❤1

This is a fantastic write-up @gardnervickers! I am super excited to start seeing some of this stuff being used.

I think overall the proposal is very very solid. The one area I am a bit fuzzy on is how we integrate with tokio directly. It may make sense to explore providing an enumed TcpStream that matches the tokio api 1-1 that lives within the simulation crate. This way we could move away from the giant Environment trait and explore what it might look like to have some of this code live within tokio and enabled via a feature flag.

LucioFranco on 30 Nov 2019

I can't speak to the details of the tokio integration, but in general this seems like a great direction. It certainly would be very valuable for tokio to have good support for in-memory testing (or every developer or protocol implementor would need to separately implement their own).

Will try to follow along!