Prisma1: Stream subscription events

Created on 25 Sep 2017  路  8Comments  路  Source: prisma/prisma1

Additionally to invoking functions via subscription events, it would be great if subscription events could be streamed directly to different event targets such as AWS Kinesis, Kafka etc.

Until this feature is directly supported via Graphcool, an easy workaround is to use functions and forward events programatically.

kinfeature aresubscriptions kindiscussion

Most helpful comment

After having an awesome call with @schickling yesterday, I would like to take the chance to elaborate on this a little bit further.

When it comes to designing a microservice architecture, you often build it in a "synchronous" or "asynchronous" kind of way. To some extend you even mix both approaches when the use case demands it:

Synchronous

This is something where Prisma is already shining. You build dedicated Prisma services and stitch them together in one or multiple gateways. Nice!

Asynchronous

In this scenario, your Prisma service ships with the possibility to subscribe to "CRUD events", which will be generated whenever the data, the Prisma service is responsible for, has been manipulated. You could create a service which consumes those events and holds an own (materialized) view of the data. IMHO, this area could be extended. Prisma is already pretty good when it comes to pushing real-time events, but imagine a scenario where a service needs not just the events after it has been booted, but also the events before that point in time. To illustrate that a little bit: Take an imaginary UserAccountCreationMetricsCollector. This service is responsible for storing a data point in a time series database whenever a new user account has been created. When using the current "real-time based subscription approach", the service engineer has to:

  1. Fetch all users via a query from the Prisma service in order to create a snapshot AND (at the same time) create a subscription to consume the events (will be buffered).
  2. Handle the snapshot (write each user data point to the time series database)
  3. When the snapshot has been processed switch to the events stream

The downsides with this approach are:

  • Complex data processing: Process snapshot first and then switch to (buffered) real-time events
  • Creating the subscription and creating the snapshot needs to be performed concurrently
  • The need for buffering the events can lead to an extensive memory consumption (when a lot of users will be created while processing the snapshot, etc.)

An append-only log to the rescue!

It would be awesome when Prisma could act as an event store as well. One approach could be to store each mutation event in an append-only log, so that the engineer can decide if she/he needs all events or just events from a given offset onwards. The engineer could create a subscription like ...

subscription {
  user(where: {
    after: <an-event-id>
    mutation_in: [UPDATED]
  }) {
    eventId
    node {
      name
      email
    }
  }
}

Where eventId is an unique event identifier and after defines that I want all events after the event with the given id. When the engineer omits after, the subscription will return _all_ ever recorded events and keeps the subscription "open" to push upcoming events when the old events has been replayed.

This approach would lead to a very flexible and therefore powerful data replication mechanism.

References

All 8 comments

We have decided to remove the top-level functions field. See #591

If sendConfirmEmail is a subscription that should put an item on a kinesis queue it should look like this:

subscriptions:
  sendConfirmEmail:
    handler:
      kinesis:
        stream: my-kinesis-stream
    query: ./src/sendConfirmEmail.graphql

update: we decided to not implement #591

See also #589 for discussion on custom events

This issue has been moved to graphcool/graphcool-framework.

This is still highly relevant for Prisma. Reopening.

@schickling Definitely! Would be great to push Prisma into the "Change Data Capture" space as well. I could imagine a Debezium-like scenario. Prisms ships with Server-side subscriptions which already handles the first part of the problem: the possibility of pushing change events to a message broker. Imagine a scenario where you want to replicate all change events. In this case you also need a mechanism for creating a snapshot of the current state and emit those objects as "create" events. When this snapshot has been rolled out, the Server-side subscription(s) could kick in and stream the recent change events.

The Debezium model might be a good fit here. See this part of the documentation for deeper insights how the actual MySQL connector works.

In relation to the Kafka integration, it would be interesting to evaluate whether Apache Avro should also be supported.

After having an awesome call with @schickling yesterday, I would like to take the chance to elaborate on this a little bit further.

When it comes to designing a microservice architecture, you often build it in a "synchronous" or "asynchronous" kind of way. To some extend you even mix both approaches when the use case demands it:

Synchronous

This is something where Prisma is already shining. You build dedicated Prisma services and stitch them together in one or multiple gateways. Nice!

Asynchronous

In this scenario, your Prisma service ships with the possibility to subscribe to "CRUD events", which will be generated whenever the data, the Prisma service is responsible for, has been manipulated. You could create a service which consumes those events and holds an own (materialized) view of the data. IMHO, this area could be extended. Prisma is already pretty good when it comes to pushing real-time events, but imagine a scenario where a service needs not just the events after it has been booted, but also the events before that point in time. To illustrate that a little bit: Take an imaginary UserAccountCreationMetricsCollector. This service is responsible for storing a data point in a time series database whenever a new user account has been created. When using the current "real-time based subscription approach", the service engineer has to:

  1. Fetch all users via a query from the Prisma service in order to create a snapshot AND (at the same time) create a subscription to consume the events (will be buffered).
  2. Handle the snapshot (write each user data point to the time series database)
  3. When the snapshot has been processed switch to the events stream

The downsides with this approach are:

  • Complex data processing: Process snapshot first and then switch to (buffered) real-time events
  • Creating the subscription and creating the snapshot needs to be performed concurrently
  • The need for buffering the events can lead to an extensive memory consumption (when a lot of users will be created while processing the snapshot, etc.)

An append-only log to the rescue!

It would be awesome when Prisma could act as an event store as well. One approach could be to store each mutation event in an append-only log, so that the engineer can decide if she/he needs all events or just events from a given offset onwards. The engineer could create a subscription like ...

subscription {
  user(where: {
    after: <an-event-id>
    mutation_in: [UPDATED]
  }) {
    eventId
    node {
      name
      email
    }
  }
}

Where eventId is an unique event identifier and after defines that I want all events after the event with the given id. When the engineer omits after, the subscription will return _all_ ever recorded events and keeps the subscription "open" to push upcoming events when the old events has been replayed.

This approach would lead to a very flexible and therefore powerful data replication mechanism.

References

Yes, @akoenig I am running into this issue currently.
Rewriting an app that I wrote in Meteor, which featured a chat like interface with realtime messages and notifications.

I looked into rethinkdb briefly before coming over to prisma.

Now that I am in the process of refactoring that application using Prisma, and Vue, I'm finding that the subscriptions system does not maintain existing data as you've mentioned above.

Southpaw might also be interesting when looking into this topic more closely.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

notrab picture notrab  路  3Comments

schickling picture schickling  路  3Comments

Fi1osof picture Fi1osof  路  3Comments

jannone picture jannone  路  3Comments

sorenbs picture sorenbs  路  3Comments