This file describes the JAXP (XML processing) implementation of GNU Classpath.
GNU Classpath includes interfaces and implementations for basic XML processing
in the Java programming language, some general-purpose SAX2 utilities, and
transformation.

These classes used to be maintained as part of an external project, GNU JAXP,
but are now integrated with the rest of the core class library provided by
GNU Classpath.

PACKAGES

. javax.xml.*                   ... JAXP 1.3 interfaces

. gnu.xml.aelfred2.*            ... SAX2 parser + validator
. gnu.xml.dom.*                 ... DOM Level 3 Core, Traversal, XPath implementation
. gnu.xml.dom.ls.*              ... DOM Level 3 Load & Save implementation
. gnu.xml.xpath.*               ... JAXP XPath implementation
. gnu.xml.transform.*           ... JAXP XSL transformer implementation
. gnu.xml.pipeline.*            ... SAX2 event pipeline support
. gnu.xml.stream.*              ... StAX pull parser and SAX-over-StAX driver
. gnu.xml.util.*                ... various XML utility classes
. gnu.xml.libxmlj.dom.*         ... libxmlj DOM Level 3 Core and XPath
. gnu.xml.libxmlj.sax.*         ... libxmlj SAX parser
. gnu.xml.libxmlj.transform.*   ... libxmlj XSL transformer
. gnu.xml.libxmlj.util.*        ... libxmlj utility classes

In the external directory you can find the following packages.
They are not maintained as part of GNU Classpath, but are used by the
classes in the above packages.

. org.xml.sax.*                 ... SAX2 interfaces
. org.w3c.dom.*                 ... DOM Level 3 interfaces
. org.relaxng.datatype.*        ... RELAX NG pluggable datatypes API

CONFORMANCE

The primary test resources are at http://xmlconf.sourceforge.net
and include:

SAX2/XML conformance tests
    The "xml.testing.Driver" addresses the core XML 1.0
    specification requirements, which closely correspond to the
    functionality SAX1 provides. The driver uses SAX2 APIs to
    test that functionality. It is used with a bugfixed version of
    the NIST/OASIS XML conformance test cases.

    The AElfred2 parser is highly conformant, though it still takes
    a few implementation shortcuts. See its package documentation
    for information about known XML conformance issues in AElfred2.

    The primary issue is using Unicode character tables, rather than
    those in the XML specification, for determining what names are
    valid. Most applications won't notice the difference, and this
    solution is smaller and faster than the alternative.

    For validation, a secondary issue is that constraints relating to
    entity modularity are not checked; they can't all be cleanly
    layered. For example, validity constraints related to standalone
    declarations and PE nesting are not checked.

    The current implementation has also been tested against Elliotte
    Rusty Harold's SAXTest test suite (http://www.cafeconleche.org/SAXTest)
    and achieves approximately 93% conformance to the SAX specification
    according to these tests, higher than any other current Java parser.

SAX2
    SAX2 API conformance currently has a minimal JUnit (0.2) test suite,
    which can be accessed at the xmlconf site listed above. It does
    not cover namespaces or the LexicalHandler and DeclHandler extensions
    anywhere near as exhaustively as the SAX1-level functionality is
    tested by the "xml.testing.Driver". However:

    - Applying the DOM unit tests to this implementation gives
      the LexicalHandler (comments, and boundaries of DTDs,
      CDATA sections, and general entities) a workout, and
      does the same for DeclHandler entity declarations.

    - The pipeline package's layered validator demands that
      element and attribute declarations are reported correctly.

    By those metrics, SAX2 conformance for AElfred2 is also strong.

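To make the LexicalHandler and DeclHandler events mentioned above concrete,
here is a minimal sketch that registers both extension handlers through the
standard SAX2 property URIs and records what the parser reports. It is not
part of any test suite named in this file. The class name LexicalEvents and
the sample document are made up for illustration, and the sketch uses
whichever SAX parser the running platform selects by default.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper: collects SAX2 extension events as strings.
class LexicalEvents {

    static List<String> collect(String xml) throws Exception {
        final List<String> events = new ArrayList<String>();

        // DefaultHandler2 implements both LexicalHandler and DeclHandler.
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override public void startDTD(String name, String publicId, String systemId) {
                events.add("startDTD:" + name);
            }
            @Override public void endDTD() {
                events.add("endDTD");
            }
            @Override public void comment(char[] ch, int start, int length) {
                events.add("comment:" + new String(ch, start, length));
            }
            @Override public void startCDATA() {
                events.add("startCDATA");
            }
            @Override public void endCDATA() {
                events.add("endCDATA");
            }
            @Override public void elementDecl(String name, String model) {
                events.add("elementDecl:" + name);
            }
        };

        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        // Standard SAX2 property names for the two extension handlers.
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        reader.setProperty("http://xml.org/sax/properties/declaration-handler", handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE a [<!ELEMENT a (#PCDATA)>]>"
                   + "<a><!--hi--><![CDATA[x]]></a>";
        for (String event : collect(xml)) {
            System.out.println(event);
        }
    }
}
```

A parser that reports these events faithfully should list the DTD boundaries,
the element declaration, the comment, and the CDATA boundaries for the sample
document.
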
DOM Level 3 Core Tests
    The DOM implementation has been tested against the W3C DOM Level 3
    Core conformance test suite (http://www.w3.org/DOM/Test/). Current
    conformance according to these tests is 72.3%. Many of the test
    failures are due to the fact that GNU JAXP does not currently
    provide any W3C XML Schema support.

XSL transformation
    The transformer and XPath implementation have been tested against
    the OASIS XSLT and XPath TC test suite. Conformance against the
    Xalan tests is currently 77%.


libxmlj
========================================================================

libxmlj is an effort to create a 100% JAXP-compatible Java wrapper for
libxml2 and libxslt. JAXP is the Java API for XML processing, libxml2
is the XML C library for Gnome, and libxslt is the XSLT C library for
Gnome.

libxmlj currently supports most of the DOM Level 3 Core, Traversal, and
XPath APIs, SAX2, and XSLT transformations. There is no W3C XML Schema
support yet.

libxmlj can parse and transform XML documents extremely quickly in
comparison to Java-based JAXP implementations. DOM manipulations, however,
involve JNI overhead, so the speed of DOM tree construction and traversal
can be slower than the Java implementation.

libxmlj is highly experimental, doesn't always conform to the DOM
specification correctly, and may leak memory. Production use is not advised.

The implementation can be found in gnu/xml/libxmlj and native/jni/xmlj.
See the INSTALL file for the required versions of libxml2 and libxslt.
configure --enable-xmlj will build it.

Usage
------------------------------------------------------------------------

To enable the various GNU JAXP factories, set the following system properties
(command-line version shown, but they can equally be set programmatically):

AElfred2:
  -Djavax.xml.parsers.SAXParserFactory=gnu.xml.aelfred2.JAXPFactory

GNU DOM (using DOM Level 3 Load & Save):
  -Djavax.xml.parsers.DocumentBuilderFactory=gnu.xml.dom.DomDocumentBuilderFactory

GNU DOM (using AElfred-only pipeline classes):
  -Djavax.xml.parsers.DocumentBuilderFactory=gnu.xml.dom.JAXPFactory

GNU XSL transformer:
  -Djavax.xml.transform.TransformerFactory=gnu.xml.transform.TransformerFactoryImpl

GNU StAX:
  -Djavax.xml.stream.XMLEventFactory=gnu.xml.stream.XMLEventFactoryImpl
  -Djavax.xml.stream.XMLInputFactory=gnu.xml.stream.XMLInputFactoryImpl
  -Djavax.xml.stream.XMLOutputFactory=gnu.xml.stream.XMLOutputFactoryImpl

GNU SAX-over-StAX:
  -Djavax.xml.parsers.SAXParserFactory=gnu.xml.stream.SAXParserFactory

libxmlj SAX:
  -Djavax.xml.parsers.SAXParserFactory=gnu.xml.libxmlj.sax.GnomeSAXParserFactory

libxmlj DOM:
  -Djavax.xml.parsers.DocumentBuilderFactory=gnu.xml.libxmlj.dom.GnomeDocumentBuilderFactory

libxmlj XSL transformer:
  -Djavax.xml.transform.TransformerFactory=gnu.xml.libxmlj.transform.GnomeTransformerFactory

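The factory properties listed above can also be set from code. The following
is a minimal sketch of one way to do that for the SAX parser factory. The
class FactorySelection and its fall-back behaviour are illustrative
assumptions, not something GNU Classpath provides. If the requested class
cannot be loaded (for example on a runtime that does not ship the gnu.xml
classes), the sketch restores the previous setting and uses the platform
default instead.

```java
import javax.xml.parsers.FactoryConfigurationError;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical helper: selects a SAXParserFactory by class name at run time.
class FactorySelection {

    static final String KEY = "javax.xml.parsers.SAXParserFactory";

    static SAXParserFactory select(String className) {
        String previous = System.getProperty(KEY);
        // Same effect as passing -Djavax.xml.parsers.SAXParserFactory=...
        System.setProperty(KEY, className);
        try {
            return SAXParserFactory.newInstance();
        } catch (FactoryConfigurationError e) {
            // The named class is not available: undo the change and
            // fall back to whatever the platform would pick by default.
            if (previous == null) {
                System.clearProperty(KEY);
            } else {
                System.setProperty(KEY, previous);
            }
            return SAXParserFactory.newInstance();
        }
    }

    public static void main(String[] args) {
        SAXParserFactory factory = select("gnu.xml.aelfred2.JAXPFactory");
        System.out.println("Using " + factory.getClass().getName());
    }
}
```

The property must be set before the first call to newInstance() that should
honour it. The other factories in the list can be selected the same way by
using their own property names.
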
When using libxmlj, the libxmlj shared library must be available.
In general it is picked up by the runtime using GNU Classpath. If it is
not, try adding the directory where libxmlj.so is installed (by default
${prefix}/lib/classpath/) with ldconfig, or specify that directory in the
LD_LIBRARY_PATH environment variable. Additionally, you may need to specify
the location of your shared libraries to the runtime environment using the
java.library.path system property.

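When the shared library is not found, it can help to check which directories
the runtime actually searches. The sketch below is a small diagnostic, not
part of GNU Classpath. It assumes the library is loaded under the name "xmlj"
(inferred from the file name libxmlj.so), and the class FindNativeLibrary is
hypothetical.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical diagnostic: looks for a native library file on a search path.
class FindNativeLibrary {

    static List<File> locate(String libraryName, String searchPath) {
        // Maps "xmlj" to the platform file name, e.g. libxmlj.so on GNU/Linux.
        String fileName = System.mapLibraryName(libraryName);
        List<File> hits = new ArrayList<File>();
        if (searchPath == null) {
            return hits;
        }
        for (String dir : searchPath.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue;
            }
            File candidate = new File(dir, fileName);
            if (candidate.isFile()) {
                hits.add(candidate);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String path = System.getProperty("java.library.path");
        List<File> hits = locate("xmlj", path);
        if (hits.isEmpty()) {
            System.out.println("No " + System.mapLibraryName("xmlj")
                               + " found on java.library.path: " + path);
        } else {
            for (File hit : hits) {
                System.out.println("Found " + hit);
            }
        }
    }
}
```

This only inspects java.library.path. Directories configured through ldconfig
or LD_LIBRARY_PATH may be searched by the system loader without appearing
there.
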
Missing (libxmlj) Features
------------------------------------------------------------------------

See BUGS in native/jni/xmlj for known bugs in the libxmlj native bindings.

This implementation should be thread-safe, but currently all
transformation requests are queued via Java synchronization, which
means that it effectively runs single-threaded. Long story short,
neither libxml2 nor libxslt is fully reentrant.

Update: it may be possible to make libxmlj thread-safe nonetheless
using thread context variables.

Update: thread context variables have been introduced. This is largely
untested, though, so libxmlj still has the single-thread
bottleneck.


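The effect of queueing requests via Java synchronization can be illustrated
with a short sketch. This is not the libxmlj code. The class SerializedCalls
and its methods are hypothetical, and a sleeping task stands in for a native
transformation. Every task runs while holding one shared lock, so the number
of tasks executing at once never exceeds one, however many threads submit
work.

```java
import java.util.concurrent.Callable;

// Hypothetical illustration of serializing calls through one shared lock.
class SerializedCalls {

    private static final Object LOCK = new Object();
    private static int active = 0;   // tasks currently running (guarded by LOCK)
    private static int peak = 0;     // highest value of 'active' seen (guarded by LOCK)

    /** Runs the task while holding the shared lock. */
    static <T> T run(Callable<T> task) throws Exception {
        synchronized (LOCK) {
            active++;
            peak = Math.max(peak, active);
            try {
                return task.call();
            } finally {
                active--;
            }
        }
    }

    /** Starts the given number of threads and returns the peak concurrency. */
    static int demo(int threads) throws InterruptedException {
        Thread[] pool = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            pool[i] = new Thread(() -> {
                try {
                    SerializedCalls.run(() -> {
                        Thread.sleep(20);   // stands in for a native call
                        return null;
                    });
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            });
            pool[i].start();
        }
        for (Thread t : pool) {
            t.join();
        }
        synchronized (LOCK) {
            return peak;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Peak concurrency: " + demo(4));
    }
}
```

The lock keeps non-reentrant native code safe at the cost of throughput:
total run time grows with the number of queued requests rather than being
shared across threads.
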
Validation
===================================================

Pluggable datatypes
---------------------------------------------------
Validators should use the RELAX NG pluggable datatypes API to retrieve
datatype (XML Schema simple type) implementations in a schema-neutral
fashion. The following code demonstrates looking up a W3C XML Schema
nonNegativeInteger datatype:

  // Requires javax.xml.XMLConstants, org.relaxng.datatype.* and
  // org.relaxng.datatype.helpers.DatatypeLibraryLoader.
  // createDatatypeLibrary is an instance method of the loader.
  DatatypeLibrary xsd = new DatatypeLibraryLoader()
    .createDatatypeLibrary(XMLConstants.W3C_XML_SCHEMA_NS_URI);
  Datatype nonNegativeInteger = xsd.createDatatype("nonNegativeInteger");

It is also possible to create new types by derivation. For instance,
to create a datatype that will match a US ZIP code:

  // Derive from the built-in string type and restrict it with a pattern.
  DatatypeBuilder b = xsd.createDatatypeBuilder("string");
  b.addParameter("pattern", "(^[0-9]{5}$)|(^[0-9]{5}-[0-9]{4}$)");
  Datatype zipCode = b.createDatatype();

A datatype library implementation for XML Schema is provided; other
library implementations may be added.
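
To show which strings the ZIP code pattern above is meant to accept, here is
a small stand-alone check. It uses java.util.regex as a stand-in for the
datatype library, which is an assumption: the regular expression dialect
applied by a given datatype library may differ. The class ZipPatternCheck is
hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical check of the ZIP code pattern using java.util.regex.
class ZipPatternCheck {

    // Same pattern string as passed to addParameter("pattern", ...) above.
    static final Pattern ZIP =
        Pattern.compile("(^[0-9]{5}$)|(^[0-9]{5}-[0-9]{4}$)");

    /** Returns true if the whole string is a 5-digit or ZIP+4 code. */
    static boolean isZip(String value) {
        return ZIP.matcher(value).matches();
    }

    public static void main(String[] args) {
        String[] samples = { "12345", "12345-6789", "1234", "12345-678" };
        for (String sample : samples) {
            System.out.println(sample + " -> " + isZip(sample));
        }
    }
}
```

Under these assumptions the five-digit and ZIP+4 forms are accepted, and
shorter or partial forms are rejected.
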