Documentation: Publish entity events to Fedora queue

Created on 23 Nov 2016 · 17 Comments · Source: Islandora/documentation

Write operations need to be published to the Fedora indexing queue. Use STOMP to publish a message to the Fedora indexing queue on post-insert, post-update, and post-delete.

You need to specify the type of operation (create, update, delete) as a message header so that the indexer will know what to do. The body of the message should be expanded jsonld retrieved from the serializer service.
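To make the message shape concrete, here is a minimal sketch in plain Python (no STOMP client library) of the raw SEND frame such a publish would put on the wire. The queue name `/queue/islandora-indexing`, the header name `operation`, and the sample node are assumptions for illustration only; nothing in this ticket fixes those names.

```python
import json

ALLOWED_OPERATIONS = ("create", "update", "delete")


def escape_header(value: str) -> str:
    """Escape a header name or value per the STOMP 1.2 rules."""
    return (
        value.replace("\\", "\\\\")
        .replace("\r", "\\r")
        .replace("\n", "\\n")
        .replace(":", "\\c")
    )


def build_send_frame(destination: str, operation: str, expanded_jsonld: list) -> bytes:
    """Build a STOMP SEND frame carrying expanded JSON-LD as the body."""
    if operation not in ALLOWED_OPERATIONS:
        raise ValueError(f"unknown operation: {operation!r}")

    body = json.dumps(expanded_jsonld).encode("utf-8")
    headers = {
        "destination": destination,
        "content-type": "application/ld+json",
        "content-length": str(len(body)),
        # Hypothetical header name; the indexer would read it to pick an action.
        "operation": operation,
    }
    header_lines = "".join(
        f"{escape_header(name)}:{escape_header(value)}\n"
        for name, value in headers.items()
    )
    # Frame layout: command line, header lines, blank line, body, NUL terminator.
    return b"SEND\n" + header_lines.encode("utf-8") + b"\n" + body + b"\x00"


# Hypothetical expanded JSON-LD for a single Drupal entity.
sample = [{"@id": "http://localhost:8000/node/1", "@type": ["http://schema.org/Thing"]}]
frame = build_send_frame("/queue/islandora-indexing", "update", sample)
```

In Drupal itself the frame would be produced by a STOMP client such as stomp-php rather than built by hand; the sketch only shows where the operation header and the JSON-LD body sit in the message.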

drupal

All 17 comments

@dannylamb

  • How will the STOMP package (https://github.com/stomp-php/stomp-php) be included? (Do we use Composer to install it?) Which version?

  • What is the Fedora indexing queue? What is the STOMP connector (url:port) to it? Does this need to be enabled/configured? How can this be tested?

  • By "serializer service", are you referring to the jsonld REST response to a fedora resource url?

  • Do we publish all fedora_resource related entity events or does this need to be configured? (A first pass can be all events.)

@Natkeeran

  1. Probably composer is the best way, but you figure what works. Same for version.
  2. https://github.com/Islandora-CLAW/islandora/blob/8.x-1.x/islandora/config/install/islandora.settings.yml#L4
  3. I think @dannylamb just wants a service that acts on a CRUD operation to add the event to a external queue.
  4. I think all to start and trim down via configuration later, but you might have a brainstorm in the creation...go with it.

Good to remember: whatever message queue, protocol, etc. we use, the message structure should not contain triples or data at all. As in the fedora4 one, just rdf:types to be able to filter and the URL - path of the resource, in this case the drupal one, right?

Just to clarify, the Fedora API Spec will be moving toward using W3C Activity Streams 2.0 for messaging. You can see examples of those messages here: http://fcrepo.github.io/fcrepo-specification/. Those messages _will_ contain data, serialized as JSON-LD, conforming to AS2. So whatever transport protocol that is used, you'll want to make sure that consumers can make use of the AS2 compact JSON-LD syntax.
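For illustration, a rough sketch of what a small AS2-style event in compact JSON-LD could look like, built in Python. The helper name `build_as2_event`, the resource URL, and the RDF type are hypothetical, and the exact shape Fedora will emit is defined by the spec linked above, not by this sketch.

```python
import json

AS2_CONTEXT = "https://www.w3.org/ns/activitystreams"


def build_as2_event(activity_type: str, resource_url: str, rdf_types: list) -> dict:
    """Build a compact JSON-LD activity that identifies a resource and its types."""
    return {
        "@context": AS2_CONTEXT,
        # "Create", "Update" and "Delete" are activity types defined by AS2.
        "type": activity_type,
        "object": {
            "id": resource_url,
            "type": rdf_types,
        },
    }


# Hypothetical Drupal resource URL and RDF type.
event = build_as2_event(
    "Update",
    "http://localhost:8000/node/1",
    ["http://pcdm.org/models#Object"],
)
message_body = json.dumps(event, indent=2)
```

A consumer can filter on `event["object"]["type"]` and dereference `event["object"]["id"]` for current state, which keeps the URL-plus-types approach usable even when the message also carries data.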

Ok, this changes a lot. So what state of the data will we have? Maybe this is not the place, but I wonder why; I kind of like the read-before-write strategy. And if compacted, as the current JSON-LD serializer I wrote does, then should we keep it that way, @dannylamb, instead of expanding?

@Natkeeran @whikloj STOMP is already installed using apt: https://github.com/Islandora-CLAW/CLAW/blob/master/install/scripts/drupal.sh#L14

@Natkeeran For your other questions:

  • What i'm referring to with the Fedora indexing queue will be a program that reads from that queue to insert content into Drupal. It's half of sync/salmon, basically. This is not completed yet, but some config for this got pushed up with https://github.com/Islandora-CLAW/islandora/commit/8df3bbfd99136f3b8df7373d648e719984bf2225
  • The serializer service is the claw-jsonld module by @DiegoPino. We can inject it into any class we want to serialize jsonld for any purpose. It's not just limited to drupal's REST module.
  • I would suggest using Rules for the last part, but that's just me. We've yet to decide upon more as a group.

@DiegoPino I do plan on serializing state at first, but that's only because there's no vector clock yet. We need it to verify if we're late / versions don't align as state may have changed in the time it takes for us to process the message.

See https://github.com/Islandora-CLAW/CLAW/issues/429. It leads to the rabbit hole of maintaining an ontology, to which there's been no response.

@dannylamb i'm fine with that rabbit hole, i do feel we need an ontology, it's just i'm not sure if we should have other islandora stuff and THE vector info in the same ontology, because it leads to the craziness we have now, like even flags inside RELS-EXT. So at least having different namespaces for not-even-technical metadata (control metadata) versus UUIDs etc. would let me sleep better.

@DiegoPino However that plays out is fine by me. I would consider the vclock to be more like a system property than anything else. You're not really supposed to mess with it. However we can sanely model that is fine by me.

@acoburn @DiegoPino Check out #445. I'm writing tests for the default impl now and will push code shortly. Please feel free to comment/review once that's up.

Thank you all for the feedback.

@dannylamb, I'll work on this ticket. I will need assistance with some missing pieces.

  • Which queuing middleware are we planning to use? Is it ActiveMQ? Is it installed? I am not able to connect to it (tcp://localhost:61612).

  • I'll directly hook into the entity hooks (https://api.drupal.org/api/drupal/core!lib!Drupal!Core!Entity!entity.api.php/group/entity_crud/8.2.x) for now.

@Natkeeran Hold up on this, as it's turning into a pretty overarching ticket. I'm about to push up a PR for https://github.com/Islandora-CLAW/CLAW/issues/450 (again, finishing up tests) and you'll have a bit more in place to work with. Just trying to keep the scope of this down for you.

And if you just _can't wait_ to do something: https://github.com/Islandora-CLAW/CLAW/issues/448 would be a great intro to Java/Camel/OSGi land if you're interested. I'd be more than happy to show you the ropes.

@Natkeeran Also, to actually answer your question, message broker is completely configurable. Default is activemq, but you can use anything that works with STOMP.

@dannylamb Thanks. Seems like a generic solution is needed. I'll hold off. Is the default activemq currently enabled/working in the vagrant?

@Natkeeran Should be. We're using the embedded broker that comes with Fedora.
