This file describes the JAXP (XML processing) implementation of GNU Classpath.
GNU Classpath includes interfaces and implementations for basic XML processing
in the Java programming language, some general purpose SAX2 utilities, and
transformation.
These classes used to be maintained as part of an external project, GNU JAXP,
but are now integrated with the rest of the core class library provided by
GNU Classpath.
PACKAGES
. javax.xml.* ... JAXP 1.3 interfaces
. gnu.xml.aelfred2.* ... SAX2 parser + validator
. gnu.xml.dom.* ... DOM Level 3 Core, Traversal, XPath implementation
. gnu.xml.dom.ls.* ... DOM Level 3 Load & Save implementation
. gnu.xml.xpath.* ... JAXP XPath implementation
. gnu.xml.transform.* ... JAXP XSL transformer implementation
. gnu.xml.pipeline.* ... SAX2 event pipeline support
. gnu.xml.stream.* ... StAX pull parser and SAX-over-StAX driver
. gnu.xml.util.* ... various XML utility classes
. gnu.xml.libxmlj.dom.* ... libxmlj DOM Level 3 Core and XPath
. gnu.xml.libxmlj.sax.* ... libxmlj SAX parser
. gnu.xml.libxmlj.transform.* ... libxmlj XSL transformer
. gnu.xml.libxmlj.util.* ... libxmlj utility classes
In the external directory you can find the following packages.
They are not maintained as part of GNU Classpath, but are used by the
classes in the above packages.
. org.xml.sax.* ... SAX2 interfaces
. org.w3c.dom.* ... DOM Level 3 interfaces
. org.relaxng.datatype.* ... RELAX NG pluggable datatypes API
CONFORMANCE
The primary test resources are at http://xmlconf.sourceforge.net
and include:
SAX2/XML conformance tests
The "xml.testing.Driver" addresses the core XML 1.0
specification requirements, which closely correspond to the
functionality SAX1 provides. The driver uses SAX2 APIs to
test that functionality. It is used with a bugfixed version of
the NIST/OASIS XML conformance test cases.
The AElfred2 parser is highly conformant, though it still takes
a few implementation shortcuts. See its package documentation
for information about known XML conformance issues in AElfred2.
The primary issue is using Unicode character tables, rather than
those in the XML specification, for determining what names are
valid. Most applications won't notice the difference, and this
solution is smaller and faster than the alternative.
For validation, a secondary limitation is that constraints relating to
entity modularity are not checked, because they can't all be cleanly
layered. For example, validity constraints related to standalone
declarations and PE nesting are not checked.
The current implementation has also been tested against Elliotte
Rusty Harold's SAXTest test suite (http://www.cafeconleche.org/SAXTest)
and achieves approximately 93% conformance to the SAX specification
according to these tests, higher than any other current Java parser.
SAX2
SAX2 API conformance currently has a minimal JUNIT (0.2) test suite,
which can be accessed at the xmlconf site listed above. It does
not cover namespaces or the LexicalHandler and DeclHandler extensions
anywhere near as exhaustively as the SAX1 level functionality is
tested by the "xml.testing.Driver". However:
- Applying the DOM unit tests to this implementation gives
the LexicalHandler (comments, and boundaries of DTDs,
CDATA sections, and general entities) a workout, and
does the same for DeclHandler entity declarations.
- The pipeline package's layered validator demands that
element and attribute declarations are reported correctly.
By those metrics, SAX2 conformance for AElfred2 is also strong.
DOM Level 3 Core Tests
The DOM implementation has been tested against the W3C DOM Level 3
Core conformance test suite (http://www.w3.org/DOM/Test/). Current
conformance according to these tests is 72.3%. Many of the test
failures are due to the fact that GNU JAXP does not currently
provide any W3C XML Schema support.
XSL transformation
The transformer and XPath implementation have been tested against
the OASIS XSLT and XPath TC test suite. Conformance against the
Xalan tests is currently 77%.
libxmlj
========================================================================
libxmlj is an effort to create a 100% JAXP-compatible Java wrapper for
libxml2 and libxslt. JAXP is the Java API for XML processing, libxml2
is the XML C library for Gnome, and libxslt is the XSLT C library for
Gnome.
libxmlj currently supports most of the DOM Level 3 Core, Traversal, and
XPath APIs, SAX2, and XSLT transformations. There is no W3C XML Schema
support yet.
libxmlj can parse and transform XML documents extremely quickly in
comparison to Java-based JAXP implementations. DOM manipulations, however,
involve JNI overhead, so the speed of DOM tree construction and traversal
can be slower than the Java implementation.
libxmlj is highly experimental, doesn't always conform to the DOM
specification correctly, and may leak memory. Production use is not advised.
The implementation can be found in gnu/xml/libxmlj and native/jni/xmlj.
See the INSTALL file for the required versions of libxml2 and libxslt.
Running configure with --enable-xmlj will build it.
Usage
------------------------------------------------------------------------
To enable the various GNU JAXP factories, set the following system properties
(command-line version shown, but they can equally be set programmatically):
AElfred2:
-Djavax.xml.parsers.SAXParserFactory=gnu.xml.aelfred2.JAXPFactory
GNU DOM (using DOM Level 3 Load & Save):
-Djavax.xml.parsers.DocumentBuilderFactory=gnu.xml.dom.DomDocumentBuilderFactory
GNU DOM (using AElfred-only pipeline classes):
-Djavax.xml.parsers.DocumentBuilderFactory=gnu.xml.dom.JAXPFactory
GNU XSL transformer:
-Djavax.xml.transform.TransformerFactory=gnu.xml.transform.TransformerFactoryImpl
GNU StAX:
-Djavax.xml.stream.XMLEventFactory=gnu.xml.stream.XMLEventFactoryImpl
-Djavax.xml.stream.XMLInputFactory=gnu.xml.stream.XMLInputFactoryImpl
-Djavax.xml.stream.XMLOutputFactory=gnu.xml.stream.XMLOutputFactoryImpl
GNU SAX-over-StAX:
-Djavax.xml.parsers.SAXParserFactory=gnu.xml.stream.SAXParserFactory
libxmlj SAX:
-Djavax.xml.parsers.SAXParserFactory=gnu.xml.libxmlj.sax.GnomeSAXParserFactory
libxmlj DOM:
-Djavax.xml.parsers.DocumentBuilderFactory=gnu.xml.libxmlj.dom.GnomeDocumentBuilderFactory
libxmlj XSL transformer:
-Djavax.xml.transform.TransformerFactory=gnu.xml.libxmlj.transform.GnomeTransformerFactory
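The same selections can be made from code. The following is a minimal
sketch, not part of GNU Classpath: the class and method names
(GnuJaxpProperties, gnuDefaults, apply) are hypothetical, and only the
property names and factory class names are taken from the list above.
It assumes the properties are set before the corresponding newInstance()
call, and it catches the error raised when the named class is not
available on the running platform.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.FactoryConfigurationError;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical helper; not shipped with GNU Classpath.
final class GnuJaxpProperties {

    /** Property/class pairs copied from the list above (AElfred2, GNU DOM, GNU XSL). */
    static Map<String, String> gnuDefaults() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("javax.xml.parsers.SAXParserFactory",
              "gnu.xml.aelfred2.JAXPFactory");
        m.put("javax.xml.parsers.DocumentBuilderFactory",
              "gnu.xml.dom.DomDocumentBuilderFactory");
        m.put("javax.xml.transform.TransformerFactory",
              "gnu.xml.transform.TransformerFactoryImpl");
        return m;
    }

    /** Sets each system property; call this before newInstance(). */
    static void apply(Map<String, String> selections) {
        for (Map.Entry<String, String> e : selections.entrySet()) {
            System.setProperty(e.getKey(), e.getValue());
        }
    }

    public static void main(String[] args) {
        apply(gnuDefaults());
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            System.out.println("SAX factory: " + factory.getClass().getName());
        } catch (FactoryConfigurationError e) {
            // Raised when the named class cannot be loaded, for example on
            // a runtime that does not include the GNU implementation.
            System.out.println("GNU factory unavailable: " + e.getMessage());
        }
    }
}
```

Swapping an entry for one of the libxmlj or StAX class names listed above
should work the same way, provided those classes are present.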
When using libxmlj, the libxmlj shared library must be available.
In general it is picked up by the runtime using GNU Classpath. If not, try
adding the directory where libxmlj.so is installed (by default
${prefix}/lib/classpath/) to the ldconfig search path, or list it in the
LD_LIBRARY_PATH environment variable. Additionally, you may need to specify
the location of your shared libraries to the runtime environment using the
java.library.path system property.
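When the library is not being found, it can help to check what the
runtime is actually searching. The sketch below is a hypothetical
diagnostic (NativeLibraryLocator is not part of GNU Classpath). It
assumes the library's base name is "xmlj", as implied by the file name
libxmlj.so, and it only inspects java.library.path; it does not consult
ldconfig or LD_LIBRARY_PATH directly.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical diagnostic helper; not shipped with GNU Classpath.
final class NativeLibraryLocator {

    /**
     * Looks for the platform-specific file name of baseName
     * (e.g. "xmlj" maps to libxmlj.so on Linux) in each directory
     * of searchPath, returning the first regular file found.
     */
    static Optional<Path> locate(String baseName, String searchPath) {
        if (searchPath == null) {
            return Optional.empty();
        }
        String fileName = System.mapLibraryName(baseName);
        for (String dir : searchPath.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            Path candidate = Paths.get(dir, fileName);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String searchPath = System.getProperty("java.library.path");
        Optional<Path> hit = locate("xmlj", searchPath);
        System.out.println(hit.isPresent()
            ? "found " + hit.get()
            : "libxmlj not found on java.library.path: " + searchPath);
    }
}
```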
Missing (libxmlj) Features
------------------------------------------------------------------------
See BUGS in native/jni/xmlj for known bugs in the libxmlj native bindings.
This implementation should be thread-safe, but currently all
transformation requests are queued via Java synchronization, which
means that it is effectively single-threaded. Long story short,
neither libxml2 nor libxslt is fully reentrant.
Update: it may be possible to make libxmlj thread-safe nonetheless
using thread context variables.
Update: thread context variables have been introduced. This is largely
untested, though, so libxmlj still has the single-thread
bottleneck.
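For callers, the usual JAXP pattern for concurrent transformations is to
compile a stylesheet once into a Templates object and create a separate
Transformer per task. The sketch below illustrates that pattern with
whichever TransformerFactory is configured; the class name, the
stylesheet, and the thread pool are illustrative assumptions. With
libxmlj selected, the transform calls would be expected to run one at a
time because of the queueing described above, so the pattern stays
correct but gains no parallelism.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class; not shipped with GNU Classpath.
final class SharedTemplatesDemo {

    // Illustrative stylesheet: outputs the text content of the root element.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'><xsl:value-of select='/*'/></xsl:template>"
        + "</xsl:stylesheet>";

    /** Transforms each document on a pool thread; results keep input order. */
    static List<String> transformAll(List<String> documents, int threads)
            throws Exception {
        // Compile once; the Templates object is shared by all tasks.
        Templates templates = TransformerFactory.newInstance()
            .newTemplates(new StreamSource(new StringReader(XSL)));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String doc : documents) {
                pending.add(pool.submit(() -> {
                    // A Transformer is not shared between threads.
                    Transformer t = templates.newTransformer();
                    StringWriter out = new StringWriter();
                    t.transform(new StreamSource(new StringReader(doc)),
                                new StreamResult(out));
                    return out.toString();
                }));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : pending) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```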
Validation
===================================================
Pluggable datatypes
---------------------------------------------------
Validators should use the RELAX NG pluggable datatypes API to retrieve
datatype (XML Schema simple type) implementations in a schema-neutral
fashion. The following code demonstrates looking up a W3C XML Schema
nonNegativeInteger datatype:
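  // DatatypeLibraryLoader lives in org.relaxng.datatype.helpers;
  // createDatatypeLibrary is an instance method, so a loader is created.
  DatatypeLibrary xsd = new DatatypeLibraryLoader()
      .createDatatypeLibrary(XMLConstants.W3C_XML_SCHEMA_NS_URI);
  Datatype nonNegativeInteger = xsd.createDatatype("nonNegativeInteger");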
It is also possible to create new types by derivation. For instance,
to create a datatype that will match a US ZIP code:
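  DatatypeBuilder b = xsd.createDatatypeBuilder("string");
  // addParameter also takes a ValidationContext; passing null here
  // assumes the pattern facet does not need one.
  b.addParameter("pattern", "(^[0-9]{5}$)|(^[0-9]{5}-[0-9]{4}$)", null);
  Datatype zipCode = b.createDatatype();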
A datatype library implementation for XML Schema is provided; other
library implementations may be added.
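To get a feel for which literals the ZIP code pattern shown earlier is
meant to accept, the same expression can be tried with java.util.regex.
This is only a stand-in: it does not use the datatype API, the class name
is hypothetical, and XML Schema pattern facets do not share exactly the
same syntax as Java regular expressions (XML Schema patterns are
implicitly anchored, for instance), so the results approximate rather
than define what the derived datatype would accept.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in using java.util.regex, not the datatype library.
final class ZipCodePatternCheck {

    // Same expression as the "pattern" parameter in the example above.
    static final Pattern ZIP =
        Pattern.compile("(^[0-9]{5}$)|(^[0-9]{5}-[0-9]{4}$)");

    /** True when the whole literal is five digits, or five digits, a hyphen and four digits. */
    static boolean looksLikeZip(String literal) {
        return ZIP.matcher(literal).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"12345", "12345-6789", "1234", "12345-678"};
        for (String s : samples) {
            System.out.println(s + " -> " + looksLikeZip(s));
        }
    }
}
```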