Documentation: Create a file identification/characterization service

Created on 24 Jul 2016 · 6 Comments · Source: Islandora/documentation

This service would do file identification and characterization on NonRDFSources.

This microservice would:

  1. Extract file identification/characterization using FITS

    • FITS could be run locally on the server

    • FITS could be run as a web service on another server

  2. Selected output from FITS would be saved as properties on the resource. We will use this Technical Metadata Profile.
  3. FITS XML output would be saved as a NonRDFSource, and would be related to the NonRDFSource FITS was run on with an iana:describes predicate
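
Steps 2 and 3 above could be sketched roughly as follows. This is only an illustration, not a proposed implementation: the function names, the set of selected properties, and the sample document are hypothetical, and the FITS output namespace and the expansion of the `iana:` prefix are assumptions that should be checked against the FITS version and the Technical Metadata Profile actually in use.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of FITS output documents; verify against your FITS version.
FITS_NS = {"fits": "http://hul.harvard.edu/ois/xml/ns/fits/fits_output"}

# Assumed expansion of the iana:describes predicate mentioned in step 3.
IANA_DESCRIBES = "http://www.iana.org/assignments/relation/describes"

# Hypothetical, heavily trimmed FITS output used only for illustration.
SAMPLE_FITS_XML = """<fits xmlns="http://hul.harvard.edu/ois/xml/ns/fits/fits_output">
  <identification>
    <identity format="Portable Network Graphics" mimetype="image/png"/>
  </identification>
  <fileinfo>
    <size>1024</size>
  </fileinfo>
</fits>
"""


def select_properties(fits_xml: str) -> dict:
    """Step 2: pull a few selected values out of the FITS XML."""
    root = ET.fromstring(fits_xml)
    props = {}
    identity = root.find("fits:identification/fits:identity", FITS_NS)
    if identity is not None:
        props["format"] = identity.get("format")
        props["mimetype"] = identity.get("mimetype")
    size = root.find("fits:fileinfo/fits:size", FITS_NS)
    if size is not None and size.text:
        props["size"] = int(size.text)
    return props


def describes_triple(fits_doc_uri: str, binary_uri: str) -> str:
    """Step 3: N-Triples statement linking the FITS document to the binary."""
    return f"<{fits_doc_uri}> <{IANA_DESCRIBES}> <{binary_uri}> ."


if __name__ == "__main__":
    print(select_properties(SAMPLE_FITS_XML))
    print(describes_triple(
        "http://repository.example.org/fcrepo/rest/object/foo/fits",
        "http://repository.example.org/fcrepo/rest/object/foo",
    ))
```

Which values get copied onto the resource would be driven by the profile rather than hard-coded as they are here.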

The information gathered by this microservice would inform PDX and derivative creation microservices.

@Islandora-CLAW/committers @Islandora-CLAW/sprinters let me know what you think.

Crayfish architecture enhancement

All 6 comments

Feel free to make use of this Camel-based service if it would be useful: https://gitlab.amherst.edu/acdc/repository-extension-services/tree/master/acrepo-exts-fits

Given a URL prefix (e.g. http://example.org/technical/metadata), the path is mapped to a Fedora object (e.g. http://example.org/technical/metadata/object/foo -> http://repository.example.org/fcrepo/rest/object/foo). The binary is then POSTed to a FITS Servlet (running somewhere), and the XML metadata is returned. It runs in OSGi and is intended to be compatible with API-X.
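
For readers unfamiliar with the mapping, here is a minimal sketch of the path translation described above. It is not the acrepo-exts-fits code (which is Camel-based); the function name `map_to_fedora` and its parameters are hypothetical, and the POST to the FITS Servlet is deliberately left out.

```python
def map_to_fedora(request_url: str, service_prefix: str, fedora_base: str) -> str:
    """Translate a service URL into the URL of the Fedora resource it refers to.

    Hypothetical helper: strips the service prefix and appends the remaining
    path to the Fedora base URL.
    """
    prefix = service_prefix.rstrip("/")
    # Accept the prefix itself or anything below it, but not look-alike prefixes.
    if request_url != prefix and not request_url.startswith(prefix + "/"):
        raise ValueError(f"{request_url!r} is not under {service_prefix!r}")
    path = request_url[len(prefix):]
    return fedora_base.rstrip("/") + path


if __name__ == "__main__":
    print(map_to_fedora(
        "http://example.org/technical/metadata/object/foo",
        "http://example.org/technical/metadata",
        "http://repository.example.org/fcrepo/rest",
    ))
```

Using the example URLs from the comment, this maps `.../technical/metadata/object/foo` to `http://repository.example.org/fcrepo/rest/object/foo`.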

Note: the acrepo-exts-fits service only performs step 1 from above. I have not yet written steps 2 or 3, but I will need them eventually. I was anticipating that steps 2 and 3 would be written in some scripting language (personally, I'd choose Python, but if you all plan to write it in PHP, I'd be inclined to use your code).

@acoburn browsing through y'all's code, I'm not 100% certain, but are y'all using the FITS webservice?

@ruebot yes, I'm using the webservice. The main reason for that is to avoid needing to save the fedora:Binary to disk before invoking FITS. This way, it also gives me the flexibility to run the FITS webservice wherever I want (i.e. on a separate system).

@acoburn EXCELLENT. I think this will make @DiegoPino happy, and possibly Danny in 7 days :smile:

_note to self, create a CLAW Call agenda item to talk about the future of Alpaca, and how Amherst's work might fit into it_

https://github.com/Islandora-CLAW/CLAW/wiki/August-10,-2016#agenda -- Give Danny until the second week :smile:
