Documentation: Create a BundleResolver service

Created on 11 Jan 2017 · 34 Comments · Source: Islandora/documentation

We need a service that receives a JSON-LD document as input and returns the Drupal bundle to create for it. There are many ways to do this, so create an interface like

public function resolveBundle($jsonld);

The function should return the bundle id (a string).

You should provide a default implementation that does something simple, such as checking whether each rdf:type in the JSON-LD is present in a bundle's list of classes.
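A minimal sketch of that default check, written in Python only to show the logic (the real service would be PHP inside Drupal). The `BUNDLE_CLASSES` mapping, the bundle ids and the prefixed type names are all hypothetical; a real implementation would read the classes from each bundle's RDF mapping and might need to compare expanded IRIs instead of prefixed names.

```python
def rdf_types(jsonld):
    """Collect the @type values of a JSON-LD node as a set of strings."""
    types = jsonld.get("@type", [])
    # JSON-LD allows @type to be a single string or a list of strings.
    if isinstance(types, str):
        types = [types]
    return set(types)


def matching_bundles(jsonld, bundle_classes):
    """Return ids of bundles whose class list contains every @type in the document."""
    types = rdf_types(jsonld)
    if not types:
        return []
    return [
        bundle_id
        for bundle_id, classes in bundle_classes.items()
        if types <= set(classes)
    ]


# Hypothetical bundle -> RDF classes mapping, for illustration only.
BUNDLE_CLASSES = {
    "rdf_source": ["ldp:RDFSource"],
    "image": ["ldp:RDFSource", "schema:ImageObject"],
}

doc = {
    "@id": "http://example.org/1",
    "@type": ["ldp:RDFSource", "schema:ImageObject"],
}
candidates = matching_bundles(doc, BUNDLE_CLASSES)  # ['image']
```

Note that a document typed only as `ldp:RDFSource` matches both hypothetical bundles here, so some rule for ambiguous matches is still needed on top of this check.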

This should be registered as a service within Drupal for the API to use when trying to handle a POST request, so create an entry in islandora.services.yml.


All 34 comments

It might be nice if the default implementation stored the mapping between type and bundle in the repository. This would allow for introspection. The LDP arrangement for this could be pretty simple.

It would make this service trivial.

Is this similar to the much maligned predicate in Hydra that says which Rails model to instantiate? How bad does that smell to you? Maybe you could elaborate on the LDP arrangement?

I don't know much about how Hydra works internally-- I lost interest once it became clear that Hydra uses Fedora like an RDBMS. But I think perhaps that that's the problem at which you are pointing? There's a big difference between a mapping between RDF type and partial behavior that gets stored somewhere outside the resource and a predicate to select a full behavior that gets mushed into the resource.

I don't have an immediate plan in mind for how to arrange a mapping like this in LDP-- the first thing I would want to do would be to look at what API-X is doing for similar purposes. Might be some common ground?

@ajs6f I see. You're saying store the mapping separately. I mistakenly read "in the repository" as "on the resource".

Yeah, it could be repo-wide or, better, it could work like authz by respecting LDP containment.

I'll take this on, but I'm still not clear on what context refers to here.

If I send it a json-ld document, based on some mapping (possibly stored in the repo) I return the drupal bundle?

So would we also need a tool for setting those mappings?

After reading #493, perhaps I'll hold off and see what @DiegoPino does there first.

@whikloj go for it. We can sync stuff later. I wonder if "returns the drupal bundle" means actually returning the bundle object, or a string, a representation, a query; not sure.

@DiegoPino, quoting the issue: "The function should return the bundle id (a string)."

Thanks for the clarification. So $bundle_id it is.

And now I have 3 more questions @whikloj and @dannylamb

  1. What would be the default response in case of no match? (see also question 2)
  2. What happens if the JSON-LD received matches our islandora provided default bundle? e.g.
    https://github.com/Islandora-CLAW/islandora/blob/8.x-1.x/config/install/rdf.mapping.fedora_resource_type.rdf_source.yml, do we go and create generic rdf_sources?
  3. What happens if the JSON-LD received matches multiple bundles?

@DiegoPino

  1. Throw an exception
  2. Return that bundle
  3. Return the first you find... and don't make ambiguous rdf mappings. People who want to do more sophisticated things can always provide their own service.
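Those three answers can be sketched as follows. This is Python used purely to illustrate the policy, not Drupal code; `NoMatchingBundleError`, `resolve_bundle` and the `BUNDLE_CLASSES` mapping are hypothetical names, and "first" here simply means first in the mapping's insertion order.

```python
class NoMatchingBundleError(Exception):
    """Raised when no bundle's classes cover the document's types (answer 1)."""


def resolve_bundle(jsonld, bundle_classes):
    """Return the id of the first bundle whose classes include every @type."""
    types = jsonld.get("@type", [])
    if isinstance(types, str):
        types = [types]
    types = set(types)
    for bundle_id, classes in bundle_classes.items():
        # Answers 2 and 3: the default bundle is returned like any other,
        # and the first match wins when several bundles would qualify.
        if types and types <= set(classes):
            return bundle_id
    raise NoMatchingBundleError(f"no bundle matches types {sorted(types)}")


# Hypothetical mapping; the default bundle is listed first.
BUNDLE_CLASSES = {
    "rdf_source": ["ldp:RDFSource"],
    "image": ["ldp:RDFSource", "schema:ImageObject"],
}

default_hit = resolve_bundle({"@type": "ldp:RDFSource"}, BUNDLE_CLASSES)
image_hit = resolve_bundle(
    {"@type": ["ldp:RDFSource", "schema:ImageObject"]}, BUNDLE_CLASSES
)
```

Because the first match wins, the order of the mapping becomes part of the behaviour, which is one more reason to avoid ambiguous RDF mappings in the first place.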

@whikloj so here is a hint http://json-ld.org/spec/latest/json-ld-framing/#framing-algorithm for this particular ticket:

As @ajs6f said earlier in this thread and confirmed in IRC (http://irclogs.islandora.ca/2017-01-20.html), JSON-LD framing is something we should think about, implement and enforce.
It would also help a lot with our bundle-map-jsonld thing and make the notion of our resource typing/content typing more universal.

A discussion @DiegoPino and I had in IRC resulted in the following suggested workflow:

  1. The BundleResolver creates a frame out of each entity bundle (and caches it).
  2. The ContextResolver sends the entity using a matching framing algorithm.
  3. The BundleResolver then does the comparison of the incoming entity frame to all the cached bundle frames to determine the correct entity bundle and outputs that string.

Or is that crazy?

Also and this is not super important (can be changed later). Should this be part of the islandora module, the json-ld module or a brand new module?

My understanding of framing is that it un-flattens a representation consistently, resulting in a guaranteed shape. What would associating frames with bundles bring to the table over querying the already existing rdf mapping configuration? It seems like we're conflating two very different operations here.

They're not the same thing-- what's being claimed is that if you have a consistent set of mappings but without any consistency in the framing, external non-semantic clients are not going to be able to make any use of the results. Also, there is a problem to be solved in selecting or more likely merging frames that are configured on a per-bundle basis that is not unconnected (to my eye) with the problem in this issue and could therefore profitably be discussed in the same conversation.

@dannylamb I'm not sure I understand the idea of framing in the more advanced way @ajs6f is describing.

For me, I was thinking simply that if we take an entity bundle, which is a set of fields mapped to RDF, and frame it into a guaranteed shape, it would be easier to compare with the incoming JSON-LD (framed in the same shape).

But I could be missing whole swaths of important information.

I am not talking about anything other than http://json-ld.org/spec/latest/json-ld-framing/ .

Ok, when you use words like "non-semantic clients", "merging frames" and "conversation" I start to get confused.

@ajs6f and @whikloj are both in sync: it is nothing more than having a better baseline against which JSON-LD comparison can be made, with the holes etc. filled in by a standard shape.
I guess the non-semantic client statement is related to the idea of "if the shape/framing matches and is expected and consistent", you are basically comparing arrays, which is good, no need to be semantic, ontology, etc aware.

The problem this ticket is tackling is "We have RDF and we want to publish JSON." There are two pieces to that problem: what predicates to use and how to take a general graph and make it into a tree. The JSON-LD context is the first part of that, and framing is the second. This ticket can (probably should) be just about the first part, but I'm saying "Don't lose sight of the bigger picture because you'll end up making the second part harder."

Non-semantic client means a client that does JSON. Not JSON-LD. JSON. To such a client, JSON-LD should behave like any other JSON. That is the _entire_ point of JSON-LD. (Other than that, it's got no great advantages.) But such a client cannot take JSON-LD and reparse the graph out of it. It has to be given a predictable shape of tree to work. That's what framing does.
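To make that concrete, here is a small sketch using only Python's `json` module. The two documents are invented for illustration; they describe the same triples, once nested and once flattened under `@graph`. The `author_name` function stands in for a hypothetical plain-JSON client that follows fixed keys and knows nothing about graphs.

```python
import json

# Same data, nested: the author node is embedded in the book node.
nested = json.loads("""
{
  "@context": {"name": "http://schema.org/name",
               "author": "http://schema.org/author"},
  "@id": "http://example.org/book/1",
  "name": "A Book",
  "author": {"@id": "http://example.org/person/1", "name": "Jane"}
}
""")

# Same data, flattened: every node sits at the top level of @graph.
flattened = json.loads("""
{
  "@context": {"name": "http://schema.org/name",
               "author": "http://schema.org/author"},
  "@graph": [
    {"@id": "http://example.org/book/1", "name": "A Book",
     "author": {"@id": "http://example.org/person/1"}},
    {"@id": "http://example.org/person/1", "name": "Jane"}
  ]
}
""")


def author_name(doc):
    """A plain JSON client: fixed key path, no graph processing."""
    return doc["author"]["name"]


def try_author_name(doc):
    """Return the author name, or None when the tree has a different shape."""
    try:
        return author_name(doc)
    except KeyError:
        return None


from_nested = try_author_name(nested)        # 'Jane'
from_flattened = try_author_name(flattened)  # None
```

The client only works when the tree arrives in the shape it expects, and pinning that shape down is the job framing is meant to do.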

Merging: if your contexts occur on a per-bundle basis, that's fine. If you have three bundles in play, do all three mappings, get the triples, and just throw 'em all in. But this doesn't work for framing. If you have three different frames, and they overlap, how do you work out the right thing to do? You have to have some deterministic scheme. It doesn't have to be brilliant or perfect. I'm just saying you have to have some way to decide and everyone has to know what it is. That's all.

Ok, when you use words like ... "conversation" I start to get confused.

How about "semantic salmagundi"? :)

Thanks for that @ajs6f, however I don't think this ticket is around publishing JSON from RDF. I think it is around determining which Drupal bundle we want to create depending on the JSON-LD passed in.

I guess I don't see how you do the two directions independently?

@ajs6f ok..yeah. This is why I had asked about

Should this be part of the islandora module, the json-ld module or a brand new module?

The "json-ld" module provides the serializer (https://github.com/Islandora-CLAW/claw-jsonld) that @DiegoPino wrote to turn Drupal entities into JSON-LD. So perhaps this BundleResolver should be part of that interaction.

I don't know enough to say anything useful about that decision, but I can say something (sort of) useful about how you make it: don't end up with two different ways to associate JSON-LD with bundles (and vice versa). That will give you no end of heartache.

So maybe this should be 2 services in the https://github.com/Islandora-CLAW/claw-jsonld module.

  1. that will take a json-ld document and a frame and return the framed json-ld.
  2. that will take 2 json-ld documents and compare them.

Then each bundle can have a frame definition as part of its configuration/code.

To be clear, when I wrote "We have RDF and we want to publish JSON." that was a (not very clear) way of saying "We have RDF in a repo and the world around that repo wants JSON." I wasn't thinking of "publish" as particularly meaning "go from data in the repo to JSON to send to clients outside". It could just as well be "clients outside send JSON which has to become RDF inside the repo". In either case, it seems like Drupal bundles are the way CLAW has decided to keep management of the mappings sane, which seems completely sensible to me. So there's this multiway association between kinds of JSON, kinds of RDF, and bundles, and most of what I'm saying comes to two things:

  1. Framing is part of one direction of that association.
  2. Don't make that association more than once. (Which is pretty obvious, I know, but it's easy to lose track of this stuff, especially with lots of modules flying around.)

I don't know what the impl details here are, but at least some PHP libraries support contexts and framing themselves.

I certainly think framing is an issue worthy of its own ticket: #503

After some more thought, since we're already planning on generating and caching contexts per bundle, and the context URI is provided in the representation, can we not use that URI to look up a bundle? Just a simple association, one per bundle, with no conflicts.

If there is but one context per bundle (if there's not going to be an idea of "different ways to write down the same properties") and if it's not going to be possible for two different bundles to use the same context, then that seems simple and sufficient.
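A rough sketch of that lookup, again in Python just to show the idea. The context URIs, bundle ids and function names are hypothetical, and it assumes the document carries its context as a single URI string. Building the index fails loudly if two bundles share a context, which is exactly the condition that has to hold for this approach to work.

```python
def build_context_index(bundle_contexts):
    """Invert bundle -> context URI into context URI -> bundle.

    Raises ValueError if two bundles use the same context URI.
    """
    index = {}
    for bundle_id, context_uri in bundle_contexts.items():
        if context_uri in index:
            raise ValueError(
                f"context {context_uri} is used by both "
                f"{index[context_uri]} and {bundle_id}"
            )
        index[context_uri] = bundle_id
    return index


def resolve_bundle_by_context(jsonld, index):
    """Look up the bundle id from the document's @context URI."""
    context = jsonld.get("@context")
    if not isinstance(context, str):
        raise ValueError("expected @context to be a single URI string")
    if context not in index:
        raise LookupError(f"no bundle registered for context {context}")
    return index[context]


# Hypothetical one-context-per-bundle configuration.
BUNDLE_CONTEXTS = {
    "rdf_source": "http://example.org/context/rdf_source",
    "image": "http://example.org/context/image",
}

index = build_context_index(BUNDLE_CONTEXTS)
bundle = resolve_bundle_by_context(
    {"@context": "http://example.org/context/image", "@id": "http://example.org/1"},
    index,
)
```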

Closing this for now. Moving towards explicitly providing bundle as a header instead of relying on an implicit convention.
