User:Mjb/Oracle XSLT processor bugs

From Offset

Oracle's Java-based XSLT processor, commonly referred to as "Oracle XSL" or by the name of its command-line wrapper and Java class, "oraxsl", has a number of problems which render it incompatible with other processors, and more or less nonconformant. I observed these issues while using the processor in late 2001 and early 2005.

I was never able to find a good way to report bugs to Oracle. They have the worst bug reporting and tech support system ever. Metalink is poorly designed and hard to use, Oracle's message boards don't accept bug reports, and the company doesn't want to hear from the general public; you have to be a paying customer to even be acknowledged, and even then, you can only file a general trouble ticket, not a bug report, and then you have to wait for someone in India to first try to resolve it by arguing with you about whether it's really a bug. So it is basically impossible to file a simple bug report, and it's no wonder things never get fixed.

Here are the bugs:

Empty-string parameter sometimes tests true

This bug may only manifest in certain contexts, or with certain versions of the processor.

Example needed
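Until a confirmed reproduction turns up, a probe along the following lines might help narrow the behavior down. It is only a sketch: the parameter name 'p' and the output strings are made up for illustration, and it has not been run against oraxsl. It relies on the XPath 1.0 rule that the empty string converts to boolean false.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical top-level parameter whose default value is the empty string. -->
  <xsl:param name="p" select="''"/>

  <xsl:template match="/">
    <xsl:choose>
      <!-- XPath 1.0: boolean('') is false, so this branch should not be taken. -->
      <xsl:when test="$p">param tests true (unexpected)</xsl:when>
      <xsl:otherwise>param tests false (expected)</xsl:otherwise>
    </xsl:choose>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Applied to any source document without overriding the parameter, a conformant processor should print "param tests false (expected)". Variations worth trying include passing the empty string from the command line or through xsl:with-param, since the bug may depend on how the parameter is supplied.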

Empty-string parameter sometimes causes crash

Example needed

xml-stylesheet PIs are omitted from source tree

Example needed
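A probe like the following could be used to check for this. It is an untested sketch, and the source document shown in the comment is hypothetical. It counts the xml-stylesheet processing instructions that are children of the root node.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!--
    Hypothetical source document to run this against:

      <?xml version="1.0"?>
      <?xml-stylesheet type="text/xsl" href="pi-probe.xsl"?>
      <doc/>
  -->

  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:text>xml-stylesheet PIs seen: </xsl:text>
    <!-- The XML declaration is not a PI, but xml-stylesheet is, and it belongs in the source tree. -->
    <xsl:value-of select="count(/processing-instruction('xml-stylesheet'))"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

With that source document, a conformant processor should report a count of 1. A count of 0 would be consistent with the PI having been dropped.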

xsl:message output can corrupt XML or HTML result tree serialization

The following stylesheet, applied to any source document, should produce an XML declaration and an empty <result/> element, along with a 'hello' message from the processor:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <result>
      <xsl:message terminate="no">hello</xsl:message>
    </result>
    <xsl:text>
</xsl:text>
  </xsl:template>

</xsl:stylesheet>

In these command-line examples, 'oraxsl' is an alias for 'java -classpath /home/mike/xml/oracle/lib/xmlparserv2.jar:/home/mike/xml/oracle/lib/xmlmesg.jar oracle.xml.parser.v2.oraxsl'.

The main problem is that an extraneous <result> start tag is serialized and never closed, so the output is not well-formed:

$ oraxsl dummy.xml oraxsl-bug.xsl out
file:/portnoy/home/mike/xml/test/oraxsl-bug.xsl: 
Message: hello

$ cat out
<?xml version = '1.0'?>
<result>
   <result/>

A secondary problem is that if no explicit output target is given, both the serialized result tree and the processor message are sent to stdout, so the message ends up in the middle of the serialized document. It would make more sense for the message to go to stderr, where it would not interfere with the stdout stream.

$ oraxsl dummy.xml oraxsl-bug.xsl > out

$ cat out
<?xml version = '1.0'?>
file:/portnoy/home/mike/xml/test/oraxsl-bug.xsl: 
Message: hello
<result>
   <result/>

The expected results are more like this, as produced by Saxon:

$ saxon -o out dummy.xml oraxsl-bug.xsl
hello

$ cat out
<?xml version="1.0" encoding="utf-8"?>
<result/>

The effect of these issues is that xsl:message cannot be safely used in this processor at all. How is it that this bug has lasted so long?

Last-modified date/time of importing stylesheet affects loading of imported stylesheets

Imported stylesheets are not reloaded when they're changed, unless the last-modified timestamp of the importing stylesheet is also updated.

This is another one with a baffling longevity.

Example needed

xsl:when without a 'test' attribute is allowed

This bug may only manifest in certain contexts, or with certain versions of the processor.

Example needed
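A deliberately invalid stylesheet such as the following could serve as a probe. It is an untested sketch with made-up output strings. The 'test' attribute is required on xsl:when in XSLT 1.0, so a conformant processor is expected to reject this stylesheet rather than run it.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:choose>
      <!-- Deliberately invalid: the required 'test' attribute is missing. -->
      <xsl:when>when branch taken (unexpected)</xsl:when>
      <xsl:otherwise>otherwise branch taken (also unexpected)</xsl:otherwise>
    </xsl:choose>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

If the processor produces either line of output instead of reporting an error, that would be consistent with the bug described here.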

xsl:param in imported stylesheets not handled correctly

Example needed

xsl:key does not set keys for converted result tree fragments

Keys are not set on node-sets constructed with exsl:node-set().

Stylesheet:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:my="urn:uuid:53d7fd99-c665-4686-9b47-ba904a8270f7"
  xmlns:exsl="http://exslt.org/common"
  version="1.0">

  <xsl:output method="text"/>

  <xsl:key name="items-by-stringvalue" match="my:item" use="."/>

  <my:items>
    <my:item xml:id="i1">apple</my:item>
    <my:item xml:id="i2">apple</my:item>
    <my:item xml:id="i3">apple</my:item>
    <my:item xml:id="i4">orange</my:item>
    <my:item xml:id="i5">apple</my:item>
    <my:item xml:id="i6">orange</my:item>
    <my:item xml:id="i7">orange</my:item>
    <my:item xml:id="i8">apple</my:item>
  </my:items>

  <xsl:template match="/">
    <xsl:text>First instance of each item name:
</xsl:text>
    <xsl:for-each select="document('')/*/my:items/my:item[count(.|key('items-by-stringvalue',.)[1])=1]">
      <xsl:value-of select="concat(., '#', @xml:id, '&#10;')"/>
    </xsl:for-each>
    <xsl:text>
Same thing:
</xsl:text>
    <xsl:variable name="items">
      <xsl:copy-of select="document('')/*/my:items/my:item"/>
    </xsl:variable>
    <xsl:for-each select="exsl:node-set($items)/my:item[count(.|key('items-by-stringvalue',.)[1])=1]">
      <xsl:value-of select="concat(., '#', @xml:id, '&#10;')"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>

Expected:

First instance of each item name:
apple#i1
orange#i4

Same thing:
apple#i1
orange#i4

Actual:

First instance of each item name:
apple#i1
orange#i4

Same thing:
apple#i1
apple#i2
apple#i3
orange#i4
apple#i5
orange#i6
orange#i7
apple#i8
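A possible workaround, assuming the problem is limited to key() and that ordinary axis steps still work on the converted node-set, is to pick the first instance of each value without xsl:key at all. The sketch below has not been verified against oraxsl, and the shortened item list is only for illustration. It keeps an item only when no preceding sibling has the same string value.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:my="urn:uuid:53d7fd99-c665-4686-9b47-ba904a8270f7"
  xmlns:exsl="http://exslt.org/common"
  version="1.0">

  <xsl:output method="text"/>

  <!-- Shortened, illustrative version of the item list above. -->
  <my:items>
    <my:item xml:id="i1">apple</my:item>
    <my:item xml:id="i2">apple</my:item>
    <my:item xml:id="i4">orange</my:item>
    <my:item xml:id="i6">orange</my:item>
  </my:items>

  <xsl:template match="/">
    <xsl:variable name="items">
      <xsl:copy-of select="document('')/*/my:items/my:item"/>
    </xsl:variable>
    <!-- The copied items are siblings under the fragment's root node, so
         preceding-sibling can stand in for the key lookup. -->
    <xsl:for-each select="exsl:node-set($items)/my:item[not(. = preceding-sibling::my:item)]">
      <xsl:value-of select="concat(., '#', @xml:id, '&#10;')"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

On a conformant processor that supports exsl:node-set(), this prints apple#i1 and orange#i4, one per line. The sibling comparison is quadratic in the number of items, so it is only a reasonable substitute for keys on small node-sets.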

Oracle XML DOM quirks

Document is a subclass of Element

In Oracle's XML DOM implementation, Document is a subclass of Element. I'm not entirely sure this is forbidden by W3C DOM, but it is counterintuitive.

It prevents Saxon from being used to process an Oracle XML DOM Document. This is partly a problem with Saxon, which doesn't use the DOM's nodeType attribute (getNodeType() in Java) to figure out what it is processing. It instead uses "instanceof" to see if the object it was given is an Element (i.e., a fragment) or a Document (i.e., a complete doc), so it gets confused by Oracle's strange Documents-that-are-Elements. It is therefore impossible to feed an Oracle DOM to Saxon directly; you have to serialize it first.