ESPX - an ECMAScript Parser for (almost) XML, with namespaces

Version 20020313

TinyXSL - XML transform in-Script mini-Language

Version -0.73

(both for down-level web user agents without built-in XML support)

March 13, 2002 release

Here's the download.

See the file copying.txt for copying permission.

TinyXSL demo (uses the ESPX parser and TinyXSL processor)

Abstract

As its unimaginative name suggests, "ESPX" is an ECMAScript-coded parser for a subset of XML 1.0: there is no DTD support yet, for either the external or the internal subset.

However, since version 20010206 it fully supports the syntax and the name-scoping semantics that XML namespaces add to XML. Also, as a main implementation goal, ESPX was written with strict ECMAScript compliance in mind; see "Tested user agents" below.

As far as performance is concerned, please see "The performance issue" below.

Anyway, this should be considered a beta release.

For the impatient, here's the source code as well as a simple demo. See also: the FAQ, which is a TinyXSL demo (as you know, Small is beautiful ;^)

Basic testing
Frequently Asked Questions
Changes from previous releases
ESPX / TinyXSL files supported
Description
The performance issue
Things to know for a proper use
Examples
From here ...
Pending ...
Wish list
Supported HTML 4.0 entities
Limitations
Tested user agents
Reporting bugs

Basic Testing

Here are the results of running ESPX against some of James Clark's [xmltests].

For convenience, there is also an all-in-one ZIP file.

Also, for comparison, here are the results of running the following parsers against these tests:

MSXML (in its version 3, Service Pack #1): msxml-test.txt
XML for <SCRIPT> (in its version 1.1): xml4script-test.txt

Frequently Asked Questions

Here.

Changes from previous releases

Note that, since ESPX now needs at least some user feedback, the pace of revisions should slow down (or even stop for a while). However, see "From here ..." below.

Changes in this release

This version fixes/adds support for the following (bug fixes and/or design changes first):

Fixed bug in XMLParser._lookForInvalidCharacters() (internal) utility function;
TinyXSLProcessor.getVersion() returns -0.73;
XMLParser.getVersion() now returns 20020313

Changes in version 20020212.075

Fixed bug in XMLDocument.getElementsByTagName() (thanx to Gaurav Pal ;o)
Related to the latter, changed the way ESPX keeps track of internal node IDs (see XMLNode.uniqueID(), XMLDocument.createWhatever());
TinyXSLProcessor.getVersion() returns -0.75;
XMLParser.getVersion() now returns 20020212

Changes in version 20020112.076

Bug fix: ]]> is no longer allowed in element content (per http://www.w3.org/TR/REC-xml#NT-CharData);
Root element tag name now properly checked against <!DOCTYPE ...>;
TinyXSLProcessor.getVersion() returns -0.76;
XMLParser.getVersion() now returns 20020112

Changes in version 20020110.077

<!DOCTYPE ...> is now silently ignored (no more "unsupported document type declaration" error);
TinyXSLProcessor.getVersion() returns -0.77;
XMLParser.getVersion() now returns 20020110

Changes in version 20020109.078

Fixed bug related to the white space character class (per http://www.w3.org/TR/REC-xml#NT-S);
Added "Basic Testing" section;
TinyXSLProcessor.getVersion() returns -0.78;
XMLParser.getVersion() now returns 20020109

Changes in version 20011228.079

Bug fix : (yet another) namespace handling-related bug (affected xml:space, lang, etc.) :^(
TinyXSLProcessor.getVersion() returns -0.79;
XMLParser.getVersion() now returns 20011228

Changes in version 20011205.080

Added support for xml:base;
related to the latter, the new XMLParser.xmlBase property, which you can use in a fashion similar to that of XMLParser.xmlLang;
TinyXSLProcessor.getVersion() returns -0.80;
XMLParser.getVersion() now returns 20011205

Changes in version 20011116.081

Namespace URIs, URLs cleanup for ESPX / TinyXSL : obsolete ones (e.g. /me/works/xml/espx/) are now marked / documented as such, on access;
TinyXSLProcessor.getVersion() returns -0.81;
XMLParser.getVersion() now returns 20011116;
Updated this page's "Tested user agents" section;
Updated "FAQ" using the version-independent, namespace URI

Changes in version 20010411.082

Major namespace changes for ESPX / TinyXSL:
- version-independent, namespace URI: http://www.cjandia.com/2001/espx-tinyxsl
- version-dependent URL: http://www.cjandia.com/2001/espx-tinyxsl/20010411.082/
TinyXSLProcessor.getVersion() returns -0.82;
XMLParser.getVersion() now returns 20010411

Changes in version 20010101

Initial release.

ESPX / TinyXSL files supported

Description

ESPX is not a validating parser. It does not read any form of internal DTD subset either. At a minimum, it checks the document for basic well-formedness: proper element nesting, attribute assignments, and the use of character and predefined entity references (e.g., &#160;, &amp;...). In the end, it builds an unoptimized tree data structure in memory to represent the parsed document.
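The nesting part of such a well-formedness check can be pictured with a stack of open element names. The sketch below is not ESPX's code: `checkNesting` is a hypothetical helper, and it deliberately ignores comments, CDATA sections, processing instructions and attribute values containing ">".

```javascript
// Hypothetical helper, not part of ESPX: checks that start tags and end
// tags are properly nested. Simplified on purpose (see the note above).
function checkNesting(xml) {
  var open = [];                        // stack of currently open element names
  var tagPattern = /<(\/?)([A-Za-z_:][\w:.\-]*)[^>]*?(\/?)>/g;
  var m, name;
  while ((m = tagPattern.exec(xml)) !== null) {
    name = m[2];
    if (m[1] === "/") {                 // end tag: must match the top of the stack
      if (open.length === 0 || open[open.length - 1] !== name) {
        return "unexpected end tag </" + name + ">";
      }
      open.length = open.length - 1;    // pop
    } else if (m[3] !== "/") {          // start tag (empty-element tags are skipped)
      open[open.length] = name;         // push
    }
  }
  return open.length === 0 ? "" : "unclosed element <" + open[open.length - 1] + ">";
}
```

An empty string means the tags nest properly; anything else is a short description of the first problem found.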

Since there is no form of DTD declarations support, ESPX has no other choice than to treat attribute values as CDATA (all whitespace is kept).

Also, CR/LF sequences and lone CR characters are normalized to LF once and for all on input, just before parsing.
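As a rough illustration of that normalization step (a sketch only; `normalizeLineEnds` is a hypothetical name, not a function exported by ESPX):

```javascript
// Hypothetical helper, not ESPX's actual code: turns each CR/LF pair and
// each lone CR into a single LF, so later line counting only sees LF.
function normalizeLineEnds(text) {
  return text.replace(/\r\n?/g, "\n");
}
```

Doing this once up front keeps the rest of a parser free of special cases for the three line-end conventions.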

What is built in memory

As it parses the document, ESPX's XMLParser object tries to build some kind of DOM-like tree data structure. Note that the latter is not compliant with the official DOM (see DOM Level 1). At most, you can rely on the more or less universal semantics of features like nodeName, nodeValue, nodeType, parentNode and so on. But you won't find any equivalent for insertBefore() and the like.

The parse result tree is given by the XMLParser.document property. The same parser object may be reused multiple times to parse different documents. See the <script> tag at the top of simple.htm to know how all this is put to work.
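To give an idea of what walking such a tree looks like, here is a small stand-in. It is only a sketch: `SketchNode`, `append` and `elementNames` are hypothetical names invented for this illustration, and the real ESPX node objects may differ. It assumes children stored under numeric indexes on the node itself, together with a childCount property.

```javascript
// Stand-in for the kind of tree described above; NOT ESPX's real classes.
function SketchNode(nodeType, nodeName, nodeValue) {
  this.nodeType = nodeType;      // 1 = element, 3 = text (DOM Level 1 codes)
  this.nodeName = nodeName;
  this.nodeValue = nodeValue;
  this.parentNode = null;
  this.childCount = 0;
}

// Stores the child under the next numeric index and links it to its parent.
SketchNode.prototype.append = function (child) {
  child.parentNode = this;
  this[this.childCount] = child;
  this.childCount = this.childCount + 1;
  return child;
};

// Walks the tree depth-first and collects the names of all element nodes.
function elementNames(node, names) {
  var i;
  if (node.nodeType === 1) {
    names[names.length] = node.nodeName;
  }
  for (i = 0; i < node.childCount; i++) {
    elementNames(node[i], names);
  }
  return names;
}

var root = new SketchNode(1, "faq", null);
var entry = root.append(new SketchNode(1, "entry", null));
entry.append(new SketchNode(3, "#text", "Small is beautiful"));
var names = elementNames(root, []);
```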

The performance issue

On a 3-year-old Pentium II (350 MHz, 64 MB) running NS 4.7 on Win98, ESPX parses a 12 KB document and builds the DOM-like tree in less than 0.6 second, and it takes less than 1.5 seconds for a document twice as big (perf1.xml and perf2.xml, where markup represents roughly 33% of document size).

However, for documents above 36 KB, you must be aware that the parsing/tree-building times currently experienced are simply not acceptable (more than 2 seconds). So there is considerable room for improvement in this area. FYI: as an order of magnitude, and for the same small documents (< 50 KB), ESPX appears on average to be between fifty and one hundred times slower than Microsoft's C++-coded MSXML.

Note that if markup is sparse, say less than 5% of document size, then ESPX performs better (for example, 1.3 seconds under IE for the 90 KB spars90k.xml).
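If you want to take comparable measurements on your own documents, a simple timing wrapper is enough. The following is only a sketch: `averageMillis` is a hypothetical helper, and the workload shown is a stand-in for a call to the parser.

```javascript
// Hypothetical timing helper (not shipped with ESPX). It runs a function
// several times and returns the average wall-clock duration in milliseconds.
function averageMillis(work, runs) {
  var start = new Date().getTime();
  var i;
  for (i = 0; i < runs; i++) {
    work();
  }
  return (new Date().getTime() - start) / runs;
}

// Stand-in workload; in practice `work` would parse the document under test.
var elapsed = averageMillis(function () {
  var s = "", j;
  for (j = 0; j < 1000; j++) {
    s += "<x/>";
  }
  return s.length;
}, 5);
```

Averaging over a few runs smooths out the coarse timer resolution of older user agents.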

Things to know for a proper use

Design choices

<?xml ... encoding="..." ... ?> is not supported (ECMAScript uses Unicode);
neither is <?xml ... standalone="..." ... ?> (ESPX always assumes standalone="yes");
also, neither is <!DOCTYPE ... >, in any form;
hence, something like the non-breaking space (&nbsp;) has to be written &#160; or &#xA0; (however, see "Supported HTML 4.0 entities" below);
for that, of course, character references are properly handled, both in decimal and hexadecimal (and leading zeroes do no harm);
the only general entities currently recognized are the 5 most important, predefined ones in XML: &amp;, &apos;, &gt;, &lt; and &quot; (however, see "Supported HTML 4.0 entities" below);
there is no equivalent of DOM's NodeList: to access a node's children you have to do it ECMAScript's preferred way, that is, theNode[theIndex] (of course, there is a theNode.childCount property giving the number of children);
there is no NamedNodeMap for elements' attributes either; instead use theElementNode.attributes[nameOfAttribute] or, preferably, theElementNode.get/setAttribute(nameOfAttribute); since attributes is an ECMAScript array, you can also enumerate all attributes the usual way in ECMAScript:
```
  var attr;
  for(attr in theElement.attributes) {
    // do something with theElement.attributes[attr]
  }
```
xmlText() is an attempt to provide something similar to Microsoft's MSXML DOMDocument's xml property (implemented in ESPX as a simple recursive function returning the XML source text recomposed from the tree data structure) - this one helps to debug;
note that the standard xml:space and xml:lang pre-declared attributes are properly honored on a per-element basis - also, in ESPX the meaning of xml:space's "default" value is controlled by the XMLParser.preserveWhiteSpace boolean property (for which false is to strip insignificant white spaces, while true is to keep them all); as far as xml:lang's default is concerned, it is given by XMLParser.xmlLang (which is a string property);
also, XMLDocumentFactory plays a role similar to that of the DOM's DOMImplementation.
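The handling of character references and predefined entities mentioned in the list above can be sketched as follows. This is an illustration under stated assumptions, not ESPX's implementation: `expandReferences` and `PREDEFINED` are hypothetical names, unknown references are simply left untouched, and only code points up to 0xFFFF are handled.

```javascript
// Hypothetical helper, not ESPX's implementation: expands decimal and
// hexadecimal character references plus the five predefined XML entities.
var PREDEFINED = { amp: "&", apos: "'", gt: ">", lt: "<", quot: "\"" };

function expandReferences(text) {
  return text.replace(/&(#x[0-9A-Fa-f]+|#[0-9]+|[A-Za-z]+);/g, function (whole, body) {
    var code;
    if (body.charAt(0) === "#") {
      // parseInt ignores leading zeroes when the radix is given explicitly
      code = body.charAt(1) === "x"
        ? parseInt(body.substring(2), 16)
        : parseInt(body.substring(1), 10);
      // String.fromCharCode works on 16-bit units, so stay within the BMP
      return code <= 0xFFFF ? String.fromCharCode(code) : whole;
    }
    return PREDEFINED.hasOwnProperty(body) ? PREDEFINED[body] : whole;
  });
}
```

Because the replacement is done in a single pass, text such as `&amp;lt;` expands to `&lt;` and is not expanded a second time.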

Implementation-related

to cope with down-level ECMAScript user agents which are still around, ESPX does not take advantage of throw/try ... catch-based error detection/handling - it seems even something like NS 4.7 doesn't know what to do with a try ... catch statement;
methods and properties prefixed with an underscore are supposed to be what is often qualified as private or protected in more serious (read: strongly-typed) OO languages; please do not use them unless you can't do your business another way;
I did my best to provide meaningful error messages and accurate XML source line/column info; see the XMLParser() constructor function for the currently recognized error cases, and test this part carefully;
more generally, please do test again and again before considering putting this into a production environment;
by the way, you should have read the copying/permission notice mentioned above; please read it now if you have not done so yet;
that should be it - enjoy.
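As an illustration of the first and third points of this list, errors can be reported through a result object rather than through exceptions, with line/column positions computed from a character offset. The sketch below makes no claim about how ESPX does it internally; `lineColumnAt` and `findStrayLessThan` are hypothetical names, and the check itself is deliberately trivial.

```javascript
// Hypothetical sketch, not ESPX's code. Positions are 1-based and assume
// line ends were already normalized to LF.
function lineColumnAt(text, offset) {
  var line = 1, column = 1, i;
  for (i = 0; i < offset && i < text.length; i++) {
    if (text.charAt(i) === "\n") {
      line = line + 1;
      column = 1;
    } else {
      column = column + 1;
    }
  }
  return { line: line, column: column };
}

// Finds the first "<" that is not followed by a name start, "/", "!" or "?".
function findStrayLessThan(text) {
  var result = { ok: true, message: "", line: 0, column: 0 };
  var m = /<(?![A-Za-z_:\/!?])/.exec(text);
  var pos;
  if (m !== null) {
    pos = lineColumnAt(text, m.index);
    result.ok = false;
    result.message = "stray '<'";
    result.line = pos.line;
    result.column = pos.column;
  }
  return result;   // the caller tests result.ok instead of catching an exception
}
```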

Examples

For now, only simple.htm, databind.htm and the FAQ as TinyXSL sample. Others may follow.

From here ...

As far as future work directions are concerned, the most urgent ones, in no particular order, are likely to include:

start optimizing seriously;
provide a SAX-like version of XMLParser, if only for use by TinyXSL;
for the latter also, implement more usable <txsl:template match="..." ...>, <txsl:apply-templates select="..." ...>, etc., pattern-matching semantics, obviously modelled after a subset of XPath;
provide documentation worthy of the name;

see current limitations below.

Implementation note for an XPath subset: ideally, I guess, a smart implementation would take advantage of ECMAScript being a reflective language and thus would compile expressions written in such a subset into ECMAScript functions.
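To make that idea concrete, here is a minimal sketch that compiles a path made only of child steps (such as "faq/entry") into an ECMAScript function through the Function constructor. Everything in it is an assumption made for illustration: `compilePath` is a hypothetical name, the nodes are plain objects with nodeName and children properties rather than ESPX nodes, and no real XPath syntax beyond plain names separated by "/" is accepted.

```javascript
// Hypothetical sketch: generates ECMAScript source for a child-steps-only
// path and turns it into a function. Returns null for unsupported syntax.
function compilePath(path) {
  var steps = path.split("/");
  var src = "var current = [root], next, i, j, kids;\n";
  var s;
  for (s = 0; s < steps.length; s++) {
    if (!/^[A-Za-z_][\w.\-]*$/.test(steps[s])) {
      return null;   // only plain names are accepted, which also keeps src safe
    }
    src += "next = [];\n" +
           "for (i = 0; i < current.length; i++) {\n" +
           "  kids = current[i].children;\n" +
           "  for (j = 0; j < kids.length; j++) {\n" +
           "    if (kids[j].nodeName === \"" + steps[s] + "\") { next[next.length] = kids[j]; }\n" +
           "  }\n" +
           "}\n" +
           "current = next;\n";
  }
  src += "return current;";
  return new Function("root", src);
}

// Usage sketch on a tiny hand-built tree.
var doc = { nodeName: "#document", children: [
  { nodeName: "faq", children: [
    { nodeName: "entry", children: [] },
    { nodeName: "note", children: [] },
    { nodeName: "entry", children: [] }
  ] }
] };
var selectEntries = compilePath("faq/entry");
var entries = selectEntries(doc);
```

The compiled function can be kept and reused for every document, so the cost of interpreting the path is paid only once.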

Of course, if you find yourself able to devise an interesting use of ESPX or, better yet, to implement any of the preceding, I can only invite you to join in the effort.

Pending ...

Tutorials !

Wish list

For ESPX

See "From here ..." above.

For TinyXSL

<txsl:attribute-set name="..." ...>;
<txsl:copy-of select="..." deep="no|yes" ...>;
anything else ?

Supported HTML 4.0 entities

Most of them, including:

From Latin-1 Entities:

Character	Entity	Decimal	Hex	Rendering (entity)	Rendering (decimal)
no-break space = non-breaking space	&nbsp;	&#160;	&#xA0;	 	 
inverted exclamation mark	&iexcl;	&#161;	&#xA1;	¡	¡
cent sign	&cent;	&#162;	&#xA2;	¢	¢
pound sign	&pound;	&#163;	&#xA3;	£	£
currency sign	&curren;	&#164;	&#xA4;	¤	¤
yen sign = yuan sign	&yen;	&#165;	&#xA5;	¥	¥
broken bar = broken vertical bar	&brvbar;	&#166;	&#xA6;	¦	¦
section sign	&sect;	&#167;	&#xA7;	§	§
diaeresis = spacing diaeresis	&uml;	&#168;	&#xA8;	¨	¨
copyright sign	&copy;	&#169;	&#xA9;	©	©
feminine ordinal indicator	&ordf;	&#170;	&#xAA;	ª	ª
left-pointing double angle quotation mark = left pointing guillemet	&laquo;	&#171;	&#xAB;	«	«
not sign	&not;	&#172;	&#xAC;	¬	¬
soft hyphen = discretionary hyphen	&shy;	&#173;	&#xAD;	­	­
registered sign = registered trade mark sign	&reg;	&#174;	&#xAE;	®	®
macron = spacing macron = overline = APL overbar	&macr;	&#175;	&#xAF;	¯	¯
degree sign	&deg;	&#176;	&#xB0;	°	°
plus-minus sign = plus-or-minus sign	&plusmn;	&#177;	&#xB1;	±	±
superscript two = superscript digit two = squared	&sup2;	&#178;	&#xB2;	²	²
superscript three = superscript digit three = cubed	&sup3;	&#179;	&#xB3;	³	³
acute accent = spacing acute	&acute;	&#180;	&#xB4;	´	´
micro sign	&micro;	&#181;	&#xB5;	µ	µ
pilcrow sign = paragraph sign	&para;	&#182;	&#xB6;	¶	¶
middle dot = Georgian comma = Greek middle dot	&middot;	&#183;	&#xB7;	·	·
cedilla = spacing cedilla	&cedil;	&#184;	&#xB8;	¸	¸
superscript one = superscript digit one	&sup1;	&#185;	&#xB9;	¹	¹
masculine ordinal indicator	&ordm;	&#186;	&#xBA;	º	º
right-pointing double angle quotation mark = right pointing guillemet	&raquo;	&#187;	&#xBB;	»	»
vulgar fraction one quarter = fraction one quarter	&frac14;	&#188;	&#xBC;	¼	¼
vulgar fraction one half = fraction one half	&frac12;	&#189;	&#xBD;	½	½
vulgar fraction three quarters = fraction three quarters	&frac34;	&#190;	&#xBE;	¾	¾
inverted question mark = turned question mark	&iquest;	&#191;	&#xBF;	¿	¿
Latin capital letter A with grave = Latin capital letter A grave	&Agrave;	&#192;	&#xC0;	À	À
Latin capital letter A with acute	&Aacute;	&#193;	&#xC1;	Á	Á
Latin capital letter A with circumflex	&Acirc;	&#194;	&#xC2;	Â	Â
Latin capital letter A with tilde	&Atilde;	&#195;	&#xC3;	Ã	Ã
Latin capital letter A with diaeresis	&Auml;	&#196;	&#xC4;	Ä	Ä
Latin capital letter A with ring above = Latin capital letter A ring	&Aring;	&#197;	&#xC5;	Å	Å
Latin capital letter AE = Latin capital ligature AE	&AElig;	&#198;	&#xC6;	Æ	Æ
Latin capital letter C with cedilla	&Ccedil;	&#199;	&#xC7;	Ç	Ç
Latin capital letter E with grave	&Egrave;	&#200;	&#xC8;	È	È
Latin capital letter E with acute	&Eacute;	&#201;	&#xC9;	É	É
Latin capital letter E with circumflex	&Ecirc;	&#202;	&#xCA;	Ê	Ê
Latin capital letter E with diaeresis	&Euml;	&#203;	&#xCB;	Ë	Ë
Latin capital letter I with grave	&Igrave;	&#204;	&#xCC;	Ì	Ì
Latin capital letter I with acute	&Iacute;	&#205;	&#xCD;	Í	Í
Latin capital letter I with circumflex	&Icirc;	&#206;	&#xCE;	Î	Î
Latin capital letter I with diaeresis	&Iuml;	&#207;	&#xCF;	Ï	Ï
Latin capital letter ETH	&ETH;	&#208;	&#xD0;	Ð	Ð
Latin capital letter N with tilde	&Ntilde;	&#209;	&#xD1;	Ñ	Ñ
Latin capital letter O with grave	&Ograve;	&#210;	&#xD2;	Ò	Ò
Latin capital letter O with acute	&Oacute;	&#211;	&#xD3;	Ó	Ó
Latin capital letter O with circumflex	&Ocirc;	&#212;	&#xD4;	Ô	Ô
Latin capital letter O with tilde	&Otilde;	&#213;	&#xD5;	Õ	Õ
Latin capital letter O with diaeresis	&Ouml;	&#214;	&#xD6;	Ö	Ö
multiplication sign	&times;	&#215;	&#xD7;	×	×
Latin capital letter O with stroke = Latin capital letter O slash	&Oslash;	&#216;	&#xD8;	Ø	Ø
Latin capital letter U with grave	&Ugrave;	&#217;	&#xD9;	Ù	Ù
Latin capital letter U with acute	&Uacute;	&#218;	&#xDA;	Ú	Ú
Latin capital letter U with circumflex	&Ucirc;	&#219;	&#xDB;	Û	Û
Latin capital letter U with diaeresis	&Uuml;	&#220;	&#xDC;	Ü	Ü
Latin capital letter Y with acute	&Yacute;	&#221;	&#xDD;	Ý	Ý
Latin capital letter THORN	&THORN;	&#222;	&#xDE;	Þ	Þ
Latin small letter sharp s = ess-zed	&szlig;	&#223;	&#xDF;	ß	ß
Latin small letter a with grave = Latin small letter a grave	&agrave;	&#224;	&#xE0;	à	à
Latin small letter a with acute	&aacute;	&#225;	&#xE1;	á	á
Latin small letter a with circumflex	&acirc;	&#226;	&#xE2;	â	â
Latin small letter a with tilde	&atilde;	&#227;	&#xE3;	ã	ã
Latin small letter a with diaeresis	&auml;	&#228;	&#xE4;	ä	ä
Latin small letter a with ring above = Latin small letter a ring	&aring;	&#229;	&#xE5;	å	å
Latin small letter ae = Latin small ligature ae	&aelig;	&#230;	&#xE6;	æ	æ
Latin small letter c with cedilla	&ccedil;	&#231;	&#xE7;	ç	ç
Latin small letter e with grave	&egrave;	&#232;	&#xE8;	è	è
Latin small letter e with acute	&eacute;	&#233;	&#xE9;	é	é
Latin small letter e with circumflex	&ecirc;	&#234;	&#xEA;	ê	ê
Latin small letter e with diaeresis	&euml;	&#235;	&#xEB;	ë	ë
Latin small letter i with grave	&igrave;	&#236;	&#xEC;	ì	ì
Latin small letter i with acute	&iacute;	&#237;	&#xED;	í	í
Latin small letter i with circumflex	&icirc;	&#238;	&#xEE;	î	î
Latin small letter i with diaeresis	&iuml;	&#239;	&#xEF;	ï	ï
Latin small letter eth	&eth;	&#240;	&#xF0;	ð	ð
Latin small letter n with tilde	&ntilde;	&#241;	&#xF1;	ñ	ñ
Latin small letter o with grave	&ograve;	&#242;	&#xF2;	ò	ò
Latin small letter o with acute	&oacute;	&#243;	&#xF3;	ó	ó
Latin small letter o with circumflex	&ocirc;	&#244;	&#xF4;	ô	ô
Latin small letter o with tilde	&otilde;	&#245;	&#xF5;	õ	õ
Latin small letter o with diaeresis	&ouml;	&#246;	&#xF6;	ö	ö
division sign	&divide;	&#247;	&#xF7;	÷	÷
Latin small letter o with stroke = Latin small letter o slash	&oslash;	&#248;	&#xF8;	ø	ø
Latin small letter u with grave	&ugrave;	&#249;	&#xF9;	ù	ù
Latin small letter u with acute	&uacute;	&#250;	&#xFA;	ú	ú
Latin small letter u with circumflex	&ucirc;	&#251;	&#xFB;	û	û
Latin small letter u with diaeresis	&uuml;	&#252;	&#xFC;	ü	ü
Latin small letter y with acute	&yacute;	&#253;	&#xFD;	ý	ý
Latin small letter thorn	&thorn;	&#254;	&#xFE;	þ	þ
Latin small letter y with diaeresis	&yuml;	&#255;	&#xFF;	ÿ	ÿ

From Entities for Symbols and Greek Letters:

Character	Entity	Decimal	Hex	Rendering (entity)	Rendering (decimal)
Latin small f with hook = function = florin	&fnof;	&#402;	&#x192;	ƒ	ƒ
bullet = black small circle	&bull;	&#8226;	&#x2022;	•	•
horizontal ellipsis = three dot leader	&hellip;	&#8230;	&#x2026;	…	…
trade mark sign	&trade;	&#8482;	&#x2122;	™	™

From Special Entities:

Character	Entity	Decimal	Hex	Rendering (entity)	Rendering (decimal)
quotation mark = APL quote	&quot;	&#34;	&#x22;	"	"
ampersand	&amp;	&#38;	&#x26;	&	&
less-than sign	&lt;	&#60;	&#x3C;	<	<
greater-than sign	&gt;	&#62;	&#x3E;	>	>
Latin capital ligature OE	&OElig;	&#338;	&#x152;	Œ	Œ
Latin small ligature oe	&oelig;	&#339;	&#x153;	œ	œ
Latin capital letter S with caron	&Scaron;	&#352;	&#x160;	Š	Š
Latin small letter s with caron	&scaron;	&#353;	&#x161;	š	š
Latin capital letter Y with diaeresis	&Yuml;	&#376;	&#x178;	Ÿ	Ÿ
modifier letter circumflex accent	&circ;	&#710;	&#x2C6;	ˆ	ˆ
small tilde	&tilde;	&#732;	&#x2DC;	˜	˜
en space	&ensp;	&#8194;	&#x2002;	 	 
em space	&emsp;	&#8195;	&#x2003;	 	 
thin space	&thinsp;	&#8201;	&#x2009;	 	 
zero width non-joiner	&zwnj;	&#8204;	&#x200C;	‌	‌
zero width joiner	&zwj;	&#8205;	&#x200D;	‍	‍
left-to-right mark	&lrm;	&#8206;	&#x200E;	‎	‎
right-to-left mark	&rlm;	&#8207;	&#x200F;	‏	‏
en dash	&ndash;	&#8211;	&#x2013;	–	–
em dash	&mdash;	&#8212;	&#x2014;	—	—
left single quotation mark	&lsquo;	&#8216;	&#x2018;	‘	‘
right single quotation mark	&rsquo;	&#8217;	&#x2019;	’	’
single low-9 quotation mark	&sbquo;	&#8218;	&#x201A;	‚	‚
left double quotation mark	&ldquo;	&#8220;	&#x201C;	“	“
right double quotation mark	&rdquo;	&#8221;	&#x201D;	”	”
double low-9 quotation mark	&bdquo;	&#8222;	&#x201E;	„	„
dagger	&dagger;	&#8224;	&#x2020;	†	†
double dagger	&Dagger;	&#8225;	&#x2021;	‡	‡
per mille sign	&permil;	&#8240;	&#x2030;	‰	‰
single left-pointing angle quotation mark	&lsaquo;	&#8249;	&#x2039;	‹	‹
single right-pointing angle quotation mark	&rsaquo;	&#8250;	&#x203A;	›	›
euro sign	&euro;	&#8364;	&#x20AC;	€	€

Limitations

There is no documentation except this document and comments in the source code.

Also, apart from bugs yet to be discovered, the implementation needs improvement in several areas, including:

there is no error recovery after an error is reported;
no effort has yet been devoted to optimization;
a few method/property names are inconsistent;
by not using throw/try ... catch at all, the code is sometimes too tricky and/or tedious.

Tested user agents (to be updated regularly)

These have been successfully tested with ESPX / TinyXSL:

Platform	Product name	Version(s)	Built-in XML support versions ?	ECMAScript implementation level used for ESPX
Mac	Microsoft Internet Explorer	5.0	5.x and above	???
Windows	Microsoft Internet Explorer	4.x, 5.x	5.x and above	JScript 3 in 4.x browsers (latest is JScript 5.5 (?))
Windows, Linux	Netscape Navigator	4.x, 6.0	6.0 and above	JavaScript1.2 in 4.x browsers (latest is JavaScript1.5 (?))
Windows	Opera	5.0	???	JavaScript1.2 (?)

Reporting bugs

Please report bugs to me. When reporting bugs, please be sure to include easy-to-reproduce test cases for either IE 4.x or 5.x, or NS 4.x or 6.0. I'm also interested in feedback from testing on the Linux platform and with WMLScript, if applicable. Create a zip file containing all the necessary files, and attach it to your email.

Ideas, comments, suggestions for improvements, especially bug fixes, are always welcome, as usual. Thanks in advance.

March 13, 2002

Cyril Jandia