Exist: eXist-4.6.0 excessive quoting in util:log* functions

Created on 26 Feb 2019  路  11Comments  路  Source: eXist-db/exist

What is the problem

The logging functions in util extension add unnecessary quote characters around the message. The behaviour is different in eXist-db 4.6 compared to eXist-db 2.2.

The following simple XQuery: util:log('info', 'some message that should not be quoted')
results in the following log message

2019-02-26 11:28:44,313 [qtp2043047179-41] INFO  (LogFunction.java [writeLog]:207) - (Line: 46 String) "some message that should not be quoted" 

The issue appears for both util:log and util:log-app functions. The issue might seem to be minor but we parse logs in Amazon CloudWatch to perform monitoring and it's sensitive to quotes so those extra quote characters break the monitoring. Other log monitoring solutions like ELK stack might have similar issues. Also in general it seems pointless to surround text with quotes as user can do it explicitly when required.

What I have found out myself is that LogFunction uses "adaptive" writer that is hardcoded here:
https://github.com/eXist-db/exist/blob/develop/exist-core/src/main/java/org/exist/xquery/functions/util/LogFunction.java
And adaptive writer explicitly adds those excessive quotes here:
https://github.com/eXist-db/exist/blob/449e06c19a3cd3412d08af60b67142a6cd05c1a3/exist-core/src/main/java/org/exist/util/serializer/AdaptiveWriter.java#L115
Moreover, it also escapes quotes that can be inside the log message which is also undesired.

I'm sure it should be fixed but not sure about the proper way. AdaptiveWriter might be used elsewhere and XQuerySerializer has only 3 options - adaptive, json, xml. Should another option be added like "log" or adaptive one modified, I don't know.

What did you expect

I expect no extra quote characters in log files as it worked in eXist-db 2.2, i.e.:

2019-02-26 11:28:44,313 [qtp2043047179-41] INFO  (LogFunction.java [writeLog]:207) - (Line: 46 String) some message that should not be quoted

Describe how to reproduce or add a test

Run simple XQuery and check log files:
util:log('info', 'some message that should not be quoted')

You might need to add the following logger to log4j2.xml to see the log message:

        <Logger name="org.exist.xquery.functions.util.LogFunction" additivity="false" level="info">
            <AppenderRef ref="STDOUT"/>
            <AppenderRef ref="exist.core"/>
        </Logger>

Context information

  • eXist Version : 4.6.0 / Git commit : 5b5ba69e6
  • Java 1.8.0_191
  • Operating System : Mac OS X 10.14.3 x86_64
  • Installed using JAR installer
  • Fresh installation of eXist-db + change in log4j2.xml
triage

Most helpful comment

I think it is better to opt for a change around https://github.com/eXist-db/exist/blob/develop/exist-core/src/main/java/org/exist/xquery/functions/util/LogFunction.java#L149 : if Item is of type StringValue, we can boldly invoke Item#getStringValue() to get the value.

This way we avoid the more costly serialization, avoid the escaping and double quotes.

For the other changes I made in my previous PR (fn:trace): I'd like to keep it as is.

All 11 comments

It can be solved by replacing lines 148-159 in LogFunction:
https://github.com/eXist-db/exist/blob/develop/exist-core/src/main/java/org/exist/xquery/functions/util/LogFunction.java#L148-L159

with the following code (which is taken from eXist-db 2.2 revision):

        while (i.hasNext()) {
            final Item next = i.nextItem();
            if (Type.subTypeOf(next.getType(), Type.NODE)) {
                final Serializer serializer = context.getBroker().getSerializer();
                serializer.reset();
                try {
                    buf.append(serializer.serialize((NodeValue) next));
                } catch (final SAXException e) {
                    throw (new XPathException(this, "An exception occurred while serializing node to log: " + e.getMessage(), e));
                }
            } else {
                buf.append(next.getStringValue());
            }
        }

I can send the PR if this approach is approved. For now I'm duplicating log functions in my own extension.

@dizzzz I think you made some changes here recently, what is your take on this?

@adamretterm

  • for the logger part: I am not so concerned, but.... as the logger is capable is to serialize more than one item in one go, it will be not possible for an enduser to see the split of to items. I would find that confusing too.
  • for the adaptive serializer (here is the actual change?), I think we have a potential challenge here with eXide and other users of the adaptive serializer?

@lagivan your change would break the adaptive serializer for the logs functions completely? T would not agree with that, as the adaptive part was introduced recently on purpose. Use case is json/map/array serialization.

reverting would give us back the old situation.

see https://github.com/eXist-db/exist/pull/2357/files

@adamretter @joewiz maybe we should discuss this at our community telco. I am really fond with the adaptive serializer usage here as it solved some issues with maps/json.

Could we somehow make the serialization method used in the logger configurable? Then a user like @lagivan who has an application that expects "xml" serialization (the old default) could configure his eXist to use this, whereas the default could remain (as it has been since #2357) "adaptive".

In essence it is only about xs:string serialization no?

I think it is better to opt for a change around https://github.com/eXist-db/exist/blob/develop/exist-core/src/main/java/org/exist/xquery/functions/util/LogFunction.java#L149 : if Item is of type StringValue, we can boldly invoke Item#getStringValue() to get the value.

This way we avoid the more costly serialization, avoid the escaping and double quotes.

For the other changes I made in my previous PR (fn:trace): I'd like to keep it as is.

@dizzzz I like that idea - serializing strings as raw strings but any other item types with adaptive serialization. Users who want to force a particular serialization method can just pre-serialize the content and pass it as a string.

I also like this idea, it will solve the issue and give enough flexibility to end users.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

joewiz picture joewiz  路  4Comments

lguariento picture lguariento  路  5Comments

Bpolitycki picture Bpolitycki  路  4Comments

jonjhallettuob picture jonjhallettuob  路  3Comments

opax picture opax  路  3Comments