Athens: Endpoint for push synchronization

Created on 9 Apr 2018 · 14 Comments · Source: gomods/athens

With push synchronization, an Olympus deployment notifies the other deployments about a change.

When Olympus deployment OA receives a push notification from deployment OM, OA does not advance its pull sync pointer for OM.
The next pull synchronization therefore starts from the latest pull sync pointer, and any changes already received through push notifications are skipped by the deduplication checks.

Example:

T0 - OM has MxV1
T1 - OA joins and pulls from OM
    OA[MxV1], OM[MxV1]
T2 - OM receives MyV1 miss 
    OA[MxV1], OM[MxV1, MyV1]
T3 - OM pushes MyV1 change -- timeout
T4 - OM receives MzV1 miss
T5 - OM pushes MzV1 change
    OA[MxV1, MzV1], OM[MxV1, MyV1, MzV1]
T6 - OA pulls from pointer T1
        pulls OM[MyV1, MzV1] =dedup=> [MyV1] 
T7 - OA[MxV1, MzV1, MyV1], OM[MxV1, MyV1, MzV1]

The order of log entries does not matter.

All 14 comments

What should happen with the push notification? Should it be treated just like a cache miss from the proxy, stored in a separate log, or processed right away?

It shouldn't work exactly like a cache miss from the proxy. Instead, it should be processed just as if it were a pull operation, except that the log pointer shouldn't be updated (the entry will be reported as a duplicate on the next pull).

Thanks for the clarification, @arschles!
I have a few more questions:

  1. I guess the handler should just create a background job which does all the downloading, storing, preparing of CDN links, etc. Am I right?
  2. Let's say OM gets a push notification from OA. Should OM pull it from the VCS or download the module from OA? If the latter is the case, how can we guarantee that the push really came from OA? Use a secret, as with a GitHub push?
  3. The push can happen at the same moment as the pull sync. So theoretically, the same module can be stored twice. Would this be an issue?

I would prefer having one worker or a set of workers (with a set, the workers need to coordinate on what is being processed, so that the same item is not processed several times at once). A background job per request might turn out badly and decrease performance.
We did not think this through, but I imagine there could be a worker that is fed periodically with data pulled from other Os and is decoupled from them (a background job). Maybe this very same worker could be fed by push notifications as well?
@arschles

Let me think this through some more and get back on Monday.

Right, I didn't express myself clearly - by "creating a background job" I meant putting a job on a queue.

Perfect. After thinking about this some more, I came to realize that's what you meant. In a production environment, we'll have workers running to consume from the job queue, and we can scale them up if the queue grows. I believe that a queue push per request is fine given that.

@michalpristas talked about this earlier today as well.

Also, @marpio, I have answers to your questions (2) and (3).

Let's say OM gets a push notification from OA. Should OM pull it from the VCS or download the module from OA? If the latter is the case, how can we guarantee that the push really came from OA? Use a secret like in the case of Github push?

OM should download it from OA, but not while it is deduplicating and saving module _metadata_ (revision info, name, etc.) to its own event log. In more detail: OM should first deduplicate and save the module metadata in its own log. That metadata will point to OA's copy of the source code (in OA's CDN). Then, in a subsequent background job, OM should download the source code from OA's CDN. Once that operation is done, OM should point to its own CDN in its own log.

The push can happen at the same moment as the pull sync. So theoretically, the same module can be stored twice. Would this be an issue?

All deployments must append to the event log sequentially, but that wasn't clearly specified in the document (sorry about that - I'll clear it up). Anyway, if we have sequential appends, then we can do deduplication to avoid the store-twice problem you mentioned.

Thanks, @arschles.
I guess implementing all of this would extend the scope of this issue. Would it be OK if I implemented just the handler and a worker that accepts the push and does nothing for the moment?
Or is there something else that would make more sense to do now?

@marpio yes, absolutely. In your PR that does part of this work, just reference this issue so we can keep track of how far along the work is.

Did #181 solve this, @marpio?

@robjloranger I think so. @michalpristas correct me if I'm wrong please.

Yes, #181 solves this. Closing.
