Documentation: Investigate Views Bulk Operations for 'Re-Indexing'

Created on 19 Jul 2017 · 18 Comments · Source: Islandora/documentation

Now that everything's been pushed async, we need a way to rebroadcast an event to trigger a 're-index' of things in Fedora, the triple store, or to re-generate derivatives.

Install the Views Bulk Operation module, and see if you can create a bulk operation based on a Rules action. Eventually, we'll have a view for resources that failed during processing, but for now, just mess with the collection view or create your own to test it out.

We may need to change how we're doing our Rules actions to broadcast events into the system, so if you run into that, don't fret. The scope is just to see if VBO has Rules integration and works out of the box with any arbitrary Rules action.

drupal newbie

All 18 comments

@dannylamb These are the actions that you can perform with VBO:
[Screenshot, 2017-07-21: list of actions available in VBO]

Are you hoping that VBO would integrate with Rules so that you could have a list of nodes, select them, then run a rule such as Broadcast Content Update Event?

@kimpham54++ Yes, I was hoping to see our 'Rules Actions' on that list instead of plain old 'Actions'. It's a shame it's not to that level of integration yet :( Thanks so much for looking into this for me.

@kimpham54 @dannylamb This should be simple to integrate. We can follow the basic example at \core\modules\node\src\Plugin\Action\ or, for our more advanced case, work from this guide: https://www.drupal.org/docs/8/modules/views-bulk-operations-vbo/getting-started
Combined with our existing RulesActions, it could be little more than an annotation-based wrapper: extend the ViewsBulkOperationsActionBase class and implement the ::execute and ::access methods,
doing the actual heavy lifting by wrapping things like
https://github.com/Islandora-CLAW/islandora/blob/8.x-1.x/src/Plugin/RulesAction/UpdateEventGenerator.php

Ideas?

@DiegoPino I can look into this!

@kimpham54 @DiegoPino This was a D7 feature that hasn't been implemented yet in D8 that I honestly planned on waiting for. Wrapping a RulesAction with an Action is a good temporary measure, but I don't want you to sink a ton of time into something that will eventually be moot. So don't be afraid to peel away if it turns into a lot of work.

There's also the possibility of seeing what it would take to port this sort of integration to D8, but that's probably even more work and would require interacting with the Drupal VBO maintainers.

Either way we're pretty well out of scope for this issue.

@DiegoPino @dannylamb Creating a custom action was not difficult using the template provided here: https://www.drupal.org/docs/8/modules/views-bulk-operations-vbo/advanced. I'll spend some time (not too much) to see how to execute the RulesAction

@kimpham54 excellent!!

@DiegoPino @dannylamb It looks like you can execute a custom rule using rules_invoke_SOMETHING(), see http://www.drupalcontrib.org/api/drupal/contributions%21rules%21rules.module/function/rules_invoke_event/8. Now I'm wondering whether you invoke the event "After updating content (rules_entity_update:node)" or just pass the ID of a custom rule, such as https://github.com/Islandora-CLAW/islandora/blob/8.x-1.x/src/Plugin/RulesAction/UpdateEventGenerator.php#L13.

Not sure if I'm really on the right track given my limited knowledge of Drupal 8... and programming in general.

Anyways, before I try a few things out, I realized I'm not actually sure what to expect when that rule is triggered. @dannylamb can you tell me how to actually test that rule to confirm that it's working?

@kimpham54 Yeah, our custom rules actions are problematic to test because they're only meant to generate the JSON-LD and publish it to a queue. If rules_invoke_event triggers all the actions we have, then you can tail the logs in Karaf to see if the message gets published and consumed. If you vagrant ssh into the box, you can see the logs by popping into the Karaf console and issuing the log:tail command.

$ /opt/karaf/bin/client
> log:tail

If a bunch of gibberish flies by on the screen when you trigger the operation, then it's working.

If that doesn't work out, maybe you can try with an action that ships with the rules modules, like sending an email? Then at least you'll have something a bit more tangible to see if it works.

We've done as @DiegoPino suggested and made everything as actions that then get wrapped downstream, just with the context module instead of rules. Now we can bulk re-index through the Drupal UI! It's like having bookmark baked into core.

Maybe regenerate derivatives via VBO?

Anything written as an Action can be applied VBO style, so yeah, totally! Derivatives will be done this way too.

I've been exploring using VBO on Islandora and have run into a problem (this is on a playbook VM built a month or so ago, but pretty current otherwise). Also, using VBO 8.x-2.6 because 3.6 doesn't yet honour views filters.

I get different behaviour when indexing an object via the standard Admin > Content view versus a VBO view: the first indexes into fcrepo, the VBO approach doesn't.

  • Add item, with Content context disabled.
  • has uuid=8bb5fcc4-32fb-4b09-a5d5-58807ed80550
  • so fcrepo should be http://127.0.0.1:8080/fcrepo/rest/8b/b5/fc/c4/8bb5fcc4-32fb-4b09-a5d5-58807ed80550
  • 404 not found, as expected.

In VBO view:

  • set Action=Emit a node event to a queue/topic
  • Check checkbox for "@3 Test record without context"
  • Click "Apply to selected items" button.
  • Form renders with fields for Queue and Event type, set to islandora-indexing-fcrepo-content and Update respectively.
  • Click "Apply" button.
  • VBO batch progress bar renders
  • EmitEvent execute() fires, with AFAICT the correct entity, token, user id etc.
  • http://127.0.0.1:8080/fcrepo/rest/8b/b5/fc/c4/8bb5fcc4-32fb-4b09-a5d5-58807ed80550 still 404
  • Try with Event type=Create, still 404 in fcrepo

Now try via Admin > Content

  • set Action=Index Node in Fedora
  • click "Apply to selected items" button
  • EmitEvent execute() fires, with queue=islandora-indexing-fcrepo-content, event=Update
  • Now http://127.0.0.1:8080/fcrepo/rest/8b/b5/fc/c4/8bb5fcc4-32fb-4b09-a5d5-58807ed80550 shows the expected fcrepo content

So, what is the difference in these two approaches?
Any hints as to where I should be looking are welcome.

This is particularly relevant because the standard Admin > Content view only shows 50 items, whereas VBO can use a batch to process 100s or 1000s of items.
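For anyone reproducing the steps above, the fcrepo check can be scripted rather than pasted into a browser. This is only a sketch: it assumes the path layout seen in the example URL (the first four two-character pairs of the UUID, then the full UUID) and the base URL from this VM. `FCREPO_BASE`, `fcrepo_url` and `fcrepo_status` are hypothetical names, not part of Islandora.

```shell
#!/bin/sh
# Assumption: base URL copied from the example in this thread.
FCREPO_BASE="${FCREPO_BASE:-http://127.0.0.1:8080/fcrepo/rest}"

# Build the Fedora URL for a Drupal UUID, assuming the layout
# <base>/<chars 1-2>/<chars 3-4>/<chars 5-6>/<chars 7-8>/<uuid>.
fcrepo_url() {
  uuid=$1
  p1=$(printf '%s' "$uuid" | cut -c1-2)
  p2=$(printf '%s' "$uuid" | cut -c3-4)
  p3=$(printf '%s' "$uuid" | cut -c5-6)
  p4=$(printf '%s' "$uuid" | cut -c7-8)
  printf '%s/%s/%s/%s/%s/%s\n' "$FCREPO_BASE" "$p1" "$p2" "$p3" "$p4" "$uuid"
}

# Print only the HTTP status code for a UUID (needs a running Fedora,
# so it is defined here but not called).
fcrepo_status() {
  curl -s -o /dev/null -w '%{http_code}\n' "$(fcrepo_url "$1")"
}

# Prints the URL used in the steps above.
fcrepo_url 8bb5fcc4-32fb-4b09-a5d5-58807ed80550
```

Running `fcrepo_status <uuid>` before and after each bulk operation would show whether the status moves from 404 to 200.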

"EmitEvent execute() fires, with AFAICT the correct entity, token, user id etc."

It is this bit that gives me pause. Can you verify by reviewing the Milliner and Gemini logs? That index event should show up in both along with the JWT used. Throw that JWT into a debugger to ensure all the appropriate parts are there. If you aren't seeing the indexing actions trigger events in there I would step back to the Karaf logs or even the ActiveMQ admin interface to make sure those messages are queuing/dequeuing appropriately.
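If an online debugger isn't handy, the payload of a JWT copied from the logs can be decoded locally. This is a rough sketch: it only base64url-decodes the middle segment so the claims can be read by eye, and it does not verify the signature. `jwt_payload` is a hypothetical helper name, and `base64 -d` assumes a GNU-style `base64`.

```shell
#!/bin/sh
# Decode the payload (second dot-separated segment) of a JWT.
# This does NOT verify the signature; it only makes the claims readable.
jwt_payload() {
  # Convert the base64url alphabet to the standard base64 alphabet.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the '=' padding that JWTs strip.
  case $(( ${#seg} % 4 )) in
    2) seg="${seg}==" ;;
    3) seg="${seg}=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}

# Usage: sh jwt_payload.sh '<token copied from the log>'
if [ $# -gt 0 ]; then
  jwt_payload "$1"
  echo
fi
```

The output is the raw JSON claims, which can then be checked for the user and role information discussed here.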

Flying blind (I haven't spun up a test to reproduce the issue as described), my guesses would be: 1) something odd is causing the current user (and thus the associated roles) to be dropped from the event, which should show up as errors in the Milliner and Gemini logs (we've seen things like that happen before, most recently to @ajstanley with SOLR, as far as I remember); OR 2) the message isn't getting to the queue, so Karaf doesn't process the events at all.

@seth-shaw-unlv Thanks for the pointers. I'll dig deeper...

I've successfully pulled this off in ISLE and am working through the particulars. I can offer examples/documentation shortly.

....so....

Looks like if we want to reindex in Fedora, we've got to ditch Gemini. Delving into reindexing exposes all kinds of spots where this use case just wasn't taken into consideration, and the index itself gets messy _real_ fast. I'm pushing through that work, but for now, I can at least drop this nugget as an example of how to reindex stuff with VBO:

# Re-index RDF in Fedora
drush --root /var/www/drupal/web -l localhost:8000 vbo-exec non_fedora_files emit_file_event --configuration="queue=islandora-indexing-fcrepo-external&event=Update"
drush --root /var/www/drupal/web -l localhost:8000 vbo-exec all_taxonomy_terms emit_term_event --configuration="queue=islandora-indexing-fcrepo-content&event=Update"
drush --root /var/www/drupal/web -l localhost:8000 vbo-exec content emit_node_event --configuration="queue=islandora-indexing-fcrepo-content&event=Update"
drush --root /var/www/drupal/web -l localhost:8000 vbo-exec media emit_media_event --configuration="queue=islandora-indexing-fcrepo-media&event=Update"
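
The four commands above differ only in view ID, action ID and queue, so they could be generated from one small table. This is a sketch that reuses only the values shown above; `DRUSH_ROOT`, `DRUSH_URI`, `reindex_cmd` and `reindex_plan` are hypothetical names. It prints the commands for review instead of running them.

```shell
#!/bin/sh
# Hypothetical variables; defaults copied from the commands above.
DRUSH_ROOT="${DRUSH_ROOT:-/var/www/drupal/web}"
DRUSH_URI="${DRUSH_URI:-localhost:8000}"

# Print one vbo-exec command line.
# Arguments: view ID, action ID, queue name.
reindex_cmd() {
  printf 'drush --root %s -l %s vbo-exec %s %s --configuration="queue=%s&event=Update"\n' \
    "$DRUSH_ROOT" "$DRUSH_URI" "$1" "$2" "$3"
}

# Print the full re-index plan, one command per line.
reindex_plan() {
  reindex_cmd non_fedora_files   emit_file_event  islandora-indexing-fcrepo-external
  reindex_cmd all_taxonomy_terms emit_term_event  islandora-indexing-fcrepo-content
  reindex_cmd content            emit_node_event  islandora-indexing-fcrepo-content
  reindex_cmd media              emit_media_event islandora-indexing-fcrepo-media
}

reindex_plan
```

Once the printed plan looks right, piping the script's output to `sh` would execute the commands in order.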

@dannylamb if we document this in the docs, can we close this? :D
