Exist: Namespace serialization bug over WebDAV or XML-RPC after XQuery Update

Created on 8 Feb 2018 · 9 Comments · Source: eXist-db/exist

What is the problem

After using XQuery Update to insert an element (a) in the empty namespace and (b) with an xml:id attribute into a document with a non-empty namespace, the empty namespace of the inserted element is not shown when reading the resulting document over WebDAV or XML-RPC. However, when reading the document via the REST interface, the doc() function, or eXide (via its load.xql module), the inserted empty namespace is preserved.

This phenomenon is only evident when both (a) and (b) are in place. In other words, the problem does not appear if you insert an element in a non-empty namespace, or if you insert an element without the xml:id attribute. It also does not appear if you store a document with the same structure directly, without using XQuery Update.

What did you expect

I expected node identity to be preserved regardless of the interface through which the node is serialized or the method by which it was stored in the database.

Describe how to reproduce or add a test

Run the following query in eXide or the Java admin client:

xquery version "3.1";

declare namespace foo="foo";

let $in-memory := 
    <root xmlns="foo">
        <x/>
    </root> 
let $store := xmldb:store("/db", "test.xml", $in-memory)
let $on-disk := doc("/db/test.xml")
let $new-node := <x xml:id="aargh"/>
let $update := update insert $new-node into $on-disk/foo:root
return
    (
        $in-memory,
        $on-disk
    )

The 2nd item returned is as follows:

<root xmlns="foo">
    <x/>
    <x xmlns="" xml:id="aargh"/>
</root>

Then retrieve the file via the REST interface at http://localhost:8080/exist/rest/db/test.xml; the result should be identical.

Then retrieve the file via the XML-RPC interface (e.g., Java admin client or oXygen data source explorer) or WebDAV (e.g., Transmit or oXygen WebDAV data source), and the inserted element has lost its empty namespace:

<root xmlns="foo">
    <x/>
    <x xml:id="aargh"/>
</root>

This phenomenon is not evident if the @xml:id attribute is removed from the inserted element:

let $new-node := <x a="b"/>

or if the inserted element is in a non-empty namespace:

let $new-node := <x xmlns="bar"/>

or if the document is stored whole via the following query:

xmldb:store(
    '/db', 
    'test.xml', 
    <root xmlns="foo">
        <x/>
        <x xmlns="" xml:id="aargh"/>
    </root>
)

Context information

  • eXist-db version + Git Revision hash: eXist 3.7.0-SNAPSHOT + 45c3aa506
  • Java version: 1.8.0_152
  • Operating system: macOS 10.13.3
  • 32 or 64 bit: 64 bit
  • Any custom changes in e.g. conf.xml: none
bug

All 9 comments

Prompted by @duncdrum's question during yesterday's community call, the following query returns false() in eXist 3.7.0-SNAPSHOT:

deep-equal(
    <root xmlns="foo">
        <x/>
        <x xmlns="" xml:id="aargh"/>
    </root>, 
    <root xmlns="foo">
        <x/>
        <x xml:id="aargh"/>
    </root>
)

Yes, I think this is the same bug that got me going under eXist 2.2, back in #1237.
false() seems the correct answer to me, which means that, depending on the retrieval or storage method, the same document is not deep-equal to itself… snap
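
To see why false() is the expected answer, it may help to look at how a namespace-aware parser treats the two documents. The following is a minimal Java sketch that uses only the JDK's JAXP DOM parser and nothing from eXist; the class and method names (EmptyNamespaceDemo, namespaceOfInserted) are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class EmptyNamespaceDemo {

    static final String WITH_UNDECLARATION =
        "<root xmlns=\"foo\"><x/><x xmlns=\"\" xml:id=\"aargh\"/></root>";
    static final String WITHOUT_UNDECLARATION =
        "<root xmlns=\"foo\"><x/><x xml:id=\"aargh\"/></root>";

    /** Returns the namespace URI of the root's child element that carries xml:id="aargh". */
    static String namespaceOfInserted(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // JAXP parsers are not namespace-aware by default
            Document doc = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(xml)));
            for (Node child = doc.getDocumentElement().getFirstChild();
                 child != null; child = child.getNextSibling()) {
                if (child instanceof Element
                        && "aargh".equals(((Element) child).getAttribute("xml:id"))) {
                    return child.getNamespaceURI();
                }
            }
            throw new IllegalArgumentException("no child element with xml:id=\"aargh\"");
        } catch (IllegalArgumentException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(namespaceOfInserted(WITH_UNDECLARATION));    // null (no namespace)
        System.out.println(namespaceOfInserted(WITHOUT_UNDECLARATION)); // foo
    }
}
```

Without xmlns="", the second x element inherits the default namespace "foo" from root, so the two serializations describe different elements, not the same element written two ways.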

@joewiz if I run your example in the Java Admin Client I don't get the results you suggested; rather, I get:

<root xmlns="foo">
    <x/>
</root>
<root xmlns="foo">
    <x/>
    <x xml:id="aargh"/>
</root>

Note the missing xmlns="" on the second x child element.

If I run this in eXide 2.1.3, I get the same as above. However, if I run it in eXide 2.4.3, I do get the correct output, i.e.:

<root xmlns="foo">
    <x/>
</root>
<root xmlns="foo">
    <x/>
    <x xmlns="" xml:id="aargh"/>
</root>

Okay, so the problem seems to be the difference between using either:

  1. org.exist.storage.serializers.NativeSerializer by itself, or
  2. org.exist.storage.serializers.NativeSerializer together with org.exist.util.serializer.SAXSerializer.

The RESTServer (and hence eXide) seems to use the additional SAXSerializer, which returns the correct result, whereas NativeSerializer by itself, as used by XML-RPC and WebDAV, does not seem to return the correct result.

I am not sure whether SAXSerializer should always be employed. @wolfgangmm, should we always use the SAXSerializer approach? If not, when should we use it?
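
For readers less familiar with SAX: in that event model an xmlns="" undeclaration is not an attribute but a prefix-mapping event with an empty prefix and an empty URI. The sketch below shows this with the JDK's own SAX parser. It does not use eXist's NativeSerializer or SAXSerializer, and the names PrefixMappingEvents and prefixMappings are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class PrefixMappingEvents {

    /** Returns one "'prefix' -> 'uri'" entry per startPrefixMapping event, in document order. */
    static List<String> prefixMappings(String xml) {
        List<String> events = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // required for prefix-mapping events
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startPrefixMapping(String prefix, String uri) {
                    events.add("'" + prefix + "' -> '" + uri + "'");
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        // Document as returned via REST: the undeclaration is present.
        System.out.println(prefixMappings(
            "<root xmlns=\"foo\"><x/><x xmlns=\"\" xml:id=\"aargh\"/></root>"));
        // Document as returned via XML-RPC or WebDAV: the undeclaration is missing.
        System.out.println(prefixMappings(
            "<root xmlns=\"foo\"><x/><x xml:id=\"aargh\"/></root>"));
    }
}
```

A serializer that never produces the second kind of event for an element in no namespace has no way to write xmlns="" to its output, which matches the symptom seen over XML-RPC and WebDAV.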

An alternative to using SAXSerializer everywhere is to try to fix NativeSerializer. I attempted this patch, but it causes a small problem elsewhere:

diff --git a/src/org/exist/storage/serializers/NativeSerializer.java b/src/org/exist/storage/serializers/NativeSerializer.java
index 27c69da..e8fbc32 100644
--- a/src/org/exist/storage/serializers/NativeSerializer.java
+++ b/src/org/exist/storage/serializers/NativeSerializer.java
@@ -174,7 +174,13 @@ public class NativeSerializer extends Serializer {
                        }
                }
                final String ns = defaultNS == null ? node.getNamespaceURI() : defaultNS;
-               if (ns != null && ns.length() > 0 && (!namespaces.contains(ns))) {
+            if(ns == null || ns.isEmpty()) {
+                String prefix = node.getPrefix();
+                if(prefix == null) {
+                    prefix = XMLConstants.DEFAULT_NS_PREFIX;
+                }
+                receiver.startPrefixMapping(prefix, XMLConstants.NULL_NS_URI);
+            } else if(!namespaces.contains(ns)) {
                 String prefix = node.getPrefix();
                 if(prefix == null) {
                     prefix = XMLConstants.DEFAULT_NS_PREFIX;

Comments from @wolfgangmm would be appreciated...
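
As a rough illustration of the condition involved, here is a simplified, standalone Java model. It is not eXist code: declaresOriginal mirrors only the boolean test on the line the patch removes, and declaresWithTracking is a hypothetical alternative that compares the element's namespace with the default namespace currently in scope. How NativeSerializer actually tracks in-scope namespaces is an assumption here.

```java
import java.util.HashSet;
import java.util.Set;

public class UndeclarationCheck {

    /** Same test as the removed line: an empty or null namespace is never declared. */
    static boolean declaresOriginal(String ns, Set<String> declared) {
        return ns != null && ns.length() > 0 && !declared.contains(ns);
    }

    /**
     * Hypothetical alternative: emit a default-namespace declaration (or
     * undeclaration) only when the element's namespace differs from the
     * default namespace currently in scope.
     */
    static boolean declaresWithTracking(String ns, String inScopeDefault) {
        String effective = (ns == null) ? "" : ns;
        return !effective.equals(inScopeDefault);
    }

    public static void main(String[] args) {
        Set<String> declared = new HashSet<>();
        declared.add("foo"); // <root xmlns="foo"> has already been written

        // Inserted <x xml:id="aargh"/> in no namespace, below root:
        System.out.println(declaresOriginal("", declared));   // false: xmlns="" is never written
        System.out.println(declaresWithTracking("", "foo"));  // true: undeclaration needed

        // An element in no namespace where no default namespace is in scope:
        System.out.println(declaresWithTracking("", ""));     // false: nothing to undeclare
    }
}
```

One thing visible in the diff is that its new branch fires for every element without a namespace, regardless of which default namespace is in scope. Whether that is related to the "small problem elsewhere" is not stated in the thread.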

I refactored @joewiz's example into a test: empty-ns.xql

But I made a silly mistake. The new test is still WIP until I can figure out how to simulate the XML-RPC call.

xquery version "3.1";

(:~
 : Test for erroneously stripped empty namespaces
 :)
module namespace nss="http://exist-db.org/xquery/test/namespace-serialization";

declare namespace foo="foo";
declare namespace test="http://exist-db.org/xquery/xqsuite";

declare variable $nss:in-memory := 
    <root xmlns="foo">
        <x/>
        <x xml:id="aargh"/>
    </root>;

declare variable $nss:on-disk := doc("/db/test.xml");

declare 
    %test:setUp 
function nss:setup() {
    xmldb:store("/db", "test.xml", $nss:in-memory)
};    

declare 
    %test:tearDown
function nss:cleanup() {
    xmldb:remove ("/db", "test.xml")
};

declare
    %test:assertFalse
function nss:insert() {
let $new-node := <x xml:id="aargh"/>
let $update := update replace $nss:on-disk//foo:x[2] with $new-node
return
    deep-equal(
        $nss:in-memory,
        $nss:on-disk
    )
};

@adam Great, it sounds like you've found the root of the problem. Thanks in advance, @wolfgangmm, for your comments on whether to adopt SAXSerializer everywhere or to fix NativeSerializer.

@joewiz I am still waiting for comments from @wolfgangmm on this one

So after discussion with @wolfgangmm it seems that it should be safe for me to switch to using SAXSerializer.
