Exist: Reindexing collection causes xpath in xsl transform to fail.

Created on 16 Jan 2018 · 13 Comments · Source: eXist-db/exist

What is the problem

Reindexing a collection causes XSL transformations run by the transform:transform function to fail. XPath expressions and keys evaluated against a document that the stylesheet loads with the doc() function return different results after a reindex requested from the Java client.

What did you expect

The attached example which counts the nodes in a document illustrates the problem. Prior to the reindex, the node count reported is 1, after the reindex, the count is 0. Editing and saving the XML document restores the count to 1. Removing either or both of the two comments at the start of the XML document appears to eliminate the problem as well.

Describe how to reproduce or add a test

Using the Java client:

  1. Create a test collection, /db/test.
  2. Upload the test.xql, xml.xml and xsl.xsl documents attached to this collection using the Java client.
  3. Open test.xql and choose submit. The result should be 1. (<param-count expr="count($xml//*)">1</param-count> specifically.)
  4. Choose the /db/test collection and reindex.
  5. Submit test.xql again. The result is 0.
  6. Open xml.xml and choose save.
  7. Submit test.xql again. The result is 1. Repeating steps 4 and 5 will cause the result to become 0 again.
  8. Remove either comment at the beginning of xml.xml and save the file.
  9. Submit test.xql. The result will be 1.
  10. Reindex test again, and resubmit test.xql. With the comment removed, the result will be 1 after the reindex.
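
As a baseline for the steps above, here is a minimal sketch of what the count should be regardless of the leading comments. It runs outside eXist-db with Python's standard library. The sample documents are hypothetical stand-ins for the attached xml.xml (assumed to be two comments followed by a single element, since the reported count is 1), and `count_elements` is an illustrative helper, not part of the attached test.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for xml.xml; the real attachment may differ.
WITH_COMMENTS = "<!-- first --><!-- second --><root/>"
WITHOUT_COMMENTS = "<root/>"


def count_elements(xml_text):
    """Count every element in the document, like XPath count(//*)."""
    root = ET.fromstring(xml_text)
    # iter() walks the root element and all of its descendants.
    # The isinstance check skips any comment or processing-instruction
    # nodes, whose tag is not a string.
    return sum(1 for node in root.iter() if isinstance(node.tag, str))


if __name__ == "__main__":
    print(count_elements(WITH_COMMENTS))
    print(count_elements(WITHOUT_COMMENTS))
```

Comments are not elements, so both documents should give the same element count. That is why a result that changes when a comment is removed points at the stored tree rather than at the expression.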

Although this test doesn't demonstrate it, I also found this behavior: if the XML document consists of two nodes, <a><b/></a>, then after reindexing, the expression count(doc('xml.xml')//*) returns 0, but count(doc('xml.xml')/a//*) returns 1 when evaluated inside a transformation. If these expressions are evaluated in the XQuery itself, they return the expected results before and after the reindex. Only expressions evaluated inside a transformation are affected.
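
For reference, a small sketch of the values those two expressions should produce for <a><b/></a>. It is computed outside eXist-db with Python's standard library, and ElementTree's limited path support stands in for the XPath expressions. `expected_counts` is a hypothetical helper introduced only for this illustration.

```python
import xml.etree.ElementTree as ET


def expected_counts(xml_text):
    """Return (count(//*), count(/a//*)) for a document whose root is <a>."""
    root = ET.fromstring(xml_text)  # root is the <a> element
    # count(//*): every element in the document, including <a> itself.
    all_elements = sum(1 for node in root.iter() if isinstance(node.tag, str))
    # count(/a//*): only the element descendants of <a>.
    below_root = len(root.findall(".//*"))
    return all_elements, below_root


if __name__ == "__main__":
    print(expected_counts("<a><b/></a>"))
```

Every element selected by /a//* is also selected by //*, which additionally selects <a> itself, so the first count can never be smaller than the second on a consistent tree. The reported 0 and 1 after reindexing therefore cannot both come from an intact document.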

test.zip

Context information

Please always add the following information

  • eXist-db version: 3.6.1
  • Java version: Java(TM) SE Runtime Environment (build 1.8.0_151-b12)
  • Operating system: MacOS & Linux
  • 64 bit
  • Any custom changes in e.g. conf.xml: N/A (problem exists on customized and uncustomized instances)
bug investigate

All 13 comments

Very interesting, Paul. Nice job whittling down the test case.

@wolfgangmm A long shot here, but do you think this bears any relation to https://github.com/eXist-db/exist/issues/1648? No range index is involved here, but reindexing is.

This bug still exists in the 4.0.0 release.

@paulmer Did you ever test with 4.0.1, which, as @joewiz suggested (via #1648), had some range index issues fixed in https://github.com/eXist-db/exist/pull/1763?

Perhaps you could now test your procedure above against both eXist-db versions:

  1. 4.2.0
  2. 5.0.0-RC1

Sorry, it somehow didn't sink in that it was suggested I should try 4.0.1. I wasn't tracking the other bug report. Anyway, I did test 4.2.0 a couple of days ago and the problem is still present and reproducible 100% of the time on a clean install. I'll try 5.0.0 RC 1 as soon as I can.

4.3.1 and 5.0.0RC2 both exhibit this behavior. I highly recommend that anyone pursuing this issue download the very simple test files I supplied. It's trivial to reproduce and you'll have much faster turnaround.

So it is about re-indexing, but there is no collection.xconf configured for the collection?

For the test case I've given, that's correct, no special collection.xconf. I just installed eXist-5.0.0RC2, ran the Java Admin Client, loaded the test at /db/test using "Store files/directories", then opened and submitted the test.xql script to get the results.

I've tested it under other scenarios with the same results, such as placing the test collection under existing collections, some with xconf configurations, but the results have been the same.

@paulmer Thanks for your excellent reproducible test case.
I can reproduce the problem just as you describe. I will convert this into a series of JUnit tests and start looking into what the underlying problem is.

@paulmer okay I have created tests for your issue - https://github.com/adamretter/exist/commit/a1c7dfb8aed10992e59e8e26851f1f5e32198db9

I am still digging into the cause of it...

@paulmer Okay, so this one took some time to solve; each time I fixed one problem it uncovered another. I am now able to reproduce the problem from XSLT as you describe here: https://github.com/adamretter/exist/commit/66a4d5728555cf7f96a6a9e0e832934891f64cac, and also directly in eXist-db without XSLT here: https://github.com/adamretter/exist/commit/f171771bddeac9f8f0032e9b8d141e13d3b83984

I can confirm that reindexing causes eXist-db to lose some data :-(

So far this seems to only happen to nodes which are direct children of the document-node. The issue I found seems to imply that this should happen at other tree levels too, but the tests that I have created don't show that happening. I have a fix for this, but I need to discuss it with @wolfgangmm first.

@adamretter Fascinating and extensive research with far-ranging consequences, such as the preceding-sibling issues you noted. Thank you for digging in!!

@adamretter I agree, fascinating. Thanks for spending the time looking into this!

Okay my PR to fix this is here: https://github.com/eXist-db/exist/pull/2113
