Jsdom: Parsed XML doc's XPath queries fail with upper case attributes

Created on 12 Mar 2019 · 3 Comments · Source: jsdom/jsdom

Basic info:

  • Node.js version: 8.9.1
  • jsdom version: 14.0.0

When using XPath to query/evaluate an XML document generated by JSDOM's DOMParser, the attribute part of the query seems to be forced to lower case, making it impossible to query for attributes that contain upper cased characters.

For example, given the following document:

<?xml version="1.0" encoding="utf-8"?><example Foo="bar"></example>
                                               ^-- capital F

A query for //*[@Foo="bar"] (or //*[@foo="bar"], for that matter) returns no matches. However, given this document with a lower-case attribute instead:

<?xml version="1.0" encoding="utf-8"?><example foo="bar"></example>
                                               ^-- lower case F

With this document, both //*[@foo="bar"] and the upper-case equivalent //*[@Foo="bar"] find the match.

I found a similar issue from a long time ago (https://github.com/jsdom/jsdom/issues/651), which was fixed by introducing Saxes to parse XML documents separately from HTML documents. Currently, however, everything seems to be parsed correctly at the parsing level (e.g. the attributes retain their case). It is at the XPath evaluation level that the query seems to be lower-cased.

I have tried narrowing down where the error might be occurring, and have reached https://github.com/jsdom/jsdom/blob/b83783da63deeb7c5602b024a92e214df423a412/lib/jsdom/level3/xpath.js#L1659

Setting that shouldLowerCase value to false fixes the problem for my use case, but I do not know what implications that has for the rest of the XPath implementation.
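
To illustrate the mechanism I suspect, here is a minimal sketch. It is not the code in xpath.js: findAttribute, upperDoc and lowerDoc are hypothetical names made up for this example, and only the shouldLowerCase flag comes from the linked source. It assumes the name in the query is lower-cased while the parsed attribute names keep their case, which would produce exactly the results described above.

```javascript
// Hypothetical sketch; names and structure are illustrative, not jsdom's code.

// Find an attribute by name, optionally lower-casing the name taken from the query.
function findAttribute(attributes, queryName, shouldLowerCase) {
  const wanted = shouldLowerCase ? queryName.toLowerCase() : queryName;
  // The comparison is exact, so a case-preserved name like "Foo" never equals "foo".
  return attributes.find((attr) => attr.name === wanted) || null;
}

// Attribute lists as an XML parser would produce them (case preserved).
const upperDoc = [{ name: "Foo", value: "bar" }];
const lowerDoc = [{ name: "foo", value: "bar" }];

// With lower-casing on, the outcomes mirror the report above.
console.log(findAttribute(upperDoc, "Foo", true)); // null
console.log(findAttribute(upperDoc, "foo", true)); // null
console.log(findAttribute(lowerDoc, "Foo", true)); // finds the foo attribute

// With lower-casing off, the query name is compared as written.
console.log(findAttribute(upperDoc, "Foo", false)); // finds the Foo attribute
console.log(findAttribute(lowerDoc, "Foo", false)); // null
```

If this is what happens, lower-casing is presumably only appropriate for HTML documents, so the flag would need to depend on the document type rather than being switched off everywhere.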

Minimal reproduction case

const { JSDOM } = require("jsdom");

const dom = new JSDOM();

const domParser = new dom.window.DOMParser();
const doc = domParser.parseFromString('<?xml version="1.0" encoding="utf-8"?><example Foo="bar"></example>', 'text/xml');
// XPathResult is not a Node.js global, so take it from the jsdom window.
const result = doc.evaluate('//*[@Foo="bar"]', doc, null, dom.window.XPathResult.ANY_TYPE, null);
const exampleNode = result.iterateNext();
console.log('Result:', exampleNode); // exampleNode is null

How does similar code behave in browsers?

https://jsbin.com/qegifaqumi/2/edit?js,console

In browsers, the XPath query is case sensitive and matches nodes whose attribute names contain upper-case characters.

All 3 comments

We've had a todo for a long time to replace our ancient hand-rolled xpath implementation with a maintained third-party one. That's probably the best route to fixing this, although a spot-fix PR with a test would also be acceptable.

Hi, any updates?
