WAP Binary XML differs from XML mainly in its effort to compress the data in several ways. Repetitive strings can be moved into a string table and are then referenced only by their table index. Numbers also take as few bytes as possible. XML tags are represented by tokens, which are simply predefined numbers that depend on the actual application. Sometimes there is a mixture of string table indexes, tokens, and actual string fragments, which makes WAP Binary XML quite hard to parse.
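To illustrate how numbers can take as few bytes as possible, here is a minimal sketch of the multi-byte integer format described in the WBXML specification: 7 payload bits per byte, most significant group first, with the high bit set on every byte except the last. The function names encodeMbUint32 and decodeMbUint32 are made up for this example and are not part of ulxmlrpcpp. The decoder does not guard against over-long input.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Encode a value as a WBXML-style multi-byte integer.
// Hypothetical helper, not an ulxmlrpcpp function.
std::vector<std::uint8_t> encodeMbUint32(std::uint32_t value)
{
    std::vector<std::uint8_t> out;
    // The last byte carries the lowest 7 bits and has no continuation bit.
    out.push_back(static_cast<std::uint8_t>(value & 0x7F));
    value >>= 7;
    // Every preceding byte carries 7 more bits plus the continuation bit 0x80.
    while (value != 0) {
        out.insert(out.begin(),
                   static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    return out;
}

// Decode a multi-byte integer starting at in[pos]; pos is advanced past it.
std::uint32_t decodeMbUint32(const std::vector<std::uint8_t> &in,
                             std::size_t &pos)
{
    std::uint32_t value = 0;
    std::uint8_t byte;
    do {
        assert(pos < in.size());   // input must not end inside a number
        byte = in[pos++];
        value = (value << 7) | (byte & 0x7F);
    } while (byte & 0x80);         // continuation bit set: more bytes follow
    return value;
}
```

With this scheme every value below 128 occupies a single byte, while 0xA0 becomes the two bytes 0x81 0x20.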
For that reason ulxmlrpcpp uses only a very small subset of this definition. Since no XML attributes are involved and no repetitive strings are in use, it is mainly a conversion of the XML-RPC tags into tokens. The big benefit is higher processing speed, because parsing merely means reading a short number instead of searching for opening and closing tags as in XML. Another positive effect is the smaller size of the packets transferred with XML-RPC requests.
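The conversion of tags into tokens might be sketched as follows. The global tokens END (0x01) and STR_I (0x03) and the "has content" flag bit (0x40) come from the WBXML specification. The tag token values and all function names are assumptions chosen for this example; the values ulxmlrpcpp really uses may differ. The WBXML document header is omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Global tokens and flag bit as defined by the WBXML specification.
const std::uint8_t WBXML_END         = 0x01;  // closes the current element
const std::uint8_t WBXML_STR_I       = 0x03;  // inline NUL-terminated string
const std::uint8_t WBXML_HAS_CONTENT = 0x40;  // element has content

// Hypothetical tag tokens, NOT the values used by ulxmlrpcpp.
enum RpcToken : std::uint8_t {
    TOK_VALUE = 0x05,
    TOK_INT   = 0x06
};

typedef std::vector<std::uint8_t> Bytes;

// A start tag is a single byte: the token plus the content flag.
void startTag(Bytes &out, RpcToken tok)
{
    assert(tok >= 0x05 && tok <= 0x3F);  // application tag code space
    out.push_back(static_cast<std::uint8_t>(tok | WBXML_HAS_CONTENT));
}

// Every end tag is the same single END byte.
void endTag(Bytes &out)
{
    out.push_back(WBXML_END);
}

// Character data is written as an inline string.
void text(Bytes &out, const std::string &s)
{
    out.push_back(WBXML_STR_I);
    out.insert(out.end(), s.begin(), s.end());
    out.push_back(0x00);
}

// Encodes the equivalent of <value><int>N</int></value>.
Bytes encodeIntValue(int n)
{
    Bytes out;
    startTag(out, TOK_VALUE);
    startTag(out, TOK_INT);
    text(out, std::to_string(n));
    endTag(out);
    endTag(out);
    return out;
}
```

Under these assumptions, encodeIntValue(42) yields 8 bytes, compared with 28 characters for the textual form <value><int>42</int></value>.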
ulxmlrpcpp uses a SAX-like approach to parse the WAP Binary XML structure, similar to the built-in XML parser, which is based on expat. This also involves a simple state machine which models nested elements by deriving parser classes: "outer" elements derive from "inner" elements.
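The idea of deriving parser classes can be sketched roughly as below. This is only an illustration of the general pattern, not the actual ulxmlrpcpp class hierarchy: the class names ValueParser and ParamParser, the token values, and the state names are all hypothetical. The "outer" parser handles its own token and passes everything it does not know to the "inner" parser it derives from.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stack>
#include <string>

// Hypothetical tokens and states, chosen for this example only.
enum Token : std::uint8_t { TOK_VALUE = 0x05, TOK_INT = 0x06, TOK_PARAM = 0x07 };
enum State { ST_VALUE, ST_INT, ST_PARAM };

// "Inner" parser: knows only the elements that make up a value.
class ValueParser {
public:
    virtual ~ValueParser() {}

    // Returns true if the token was recognised and a state was pushed.
    virtual bool startElement(std::uint8_t token)
    {
        switch (token) {
        case TOK_VALUE: states.push(ST_VALUE); return true;
        case TOK_INT:   states.push(ST_INT);   return true;
        default:        return false;
        }
    }

    // An end token pops the innermost open element.
    virtual bool endElement()
    {
        if (states.empty())
            return false;
        states.pop();
        return true;
    }

    // Character data is collected for later use.
    void charData(const std::string &s) { chars += s; }

    State currentState() const
    {
        assert(!states.empty());
        return states.top();
    }

    std::size_t depth() const { return states.size(); }

protected:
    std::stack<State> states;  // the state machine: one entry per open element
    std::string chars;
};

// "Outer" parser: adds its own element and delegates the rest.
class ParamParser : public ValueParser {
public:
    bool startElement(std::uint8_t token) override
    {
        if (token == TOK_PARAM) {
            states.push(ST_PARAM);
            return true;
        }
        return ValueParser::startElement(token);  // inner elements
    }
};
```

A ParamParser accepts the token sequence param, value, int, whereas a plain ValueParser rejects the param token because only the derived class knows it.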
Parsing of a start tag is done in the following steps:
Character data between the XML tags is stored for later use.
An ending token is handled similarly: