1.14. Using the WAP Binary XML Parser

1.14.1. What is WAP Binary XML?

WAP Binary XML (WBXML) differs from XML mainly in that it compresses the data in several ways. Repetitive strings can be moved into a string table and are then referenced only by their table index. Numbers take as few bytes as possible. XML tags are represented by tokens, which are simply predefined numbers that depend on the actual application. A document may mix string table indexes, tokens and literal string fragments, which makes WAP Binary XML quite hard to parse in the general case.
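As an illustration of how numbers are kept short, the WBXML specification stores integers in a multi-byte format: seven payload bits per byte, most significant group first, with the high bit set on every byte except the last. The following is a minimal sketch of that encoding. The function names encodeMbUint32 and decodeMbUint32 are made up for this example and are not part of ulxmlrpcpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Encode a value as a WBXML multi-byte integer.
// Each byte carries 7 payload bits; the high bit marks "more bytes follow".
std::vector<std::uint8_t> encodeMbUint32(std::uint32_t value)
{
    std::vector<std::uint8_t> out;
    // The last byte of the sequence has no continuation bit.
    out.push_back(static_cast<std::uint8_t>(value & 0x7F));
    value >>= 7;
    while (value != 0) {
        // Prepend the next higher 7-bit group with the continuation bit set.
        out.insert(out.begin(),
                   static_cast<std::uint8_t>(0x80 | (value & 0x7F)));
        value >>= 7;
    }
    assert(out.size() <= 5);  // 32 bits never need more than five groups
    return out;
}

// Decode a multi-byte integer starting at in[pos] and advance pos past it.
// Returns false if the input ends early or the sequence exceeds five bytes.
// Overflow of the 32-bit result is not detected in this sketch.
bool decodeMbUint32(const std::vector<std::uint8_t>& in,
                    std::size_t& pos, std::uint32_t& value)
{
    value = 0;
    for (int i = 0; i < 5; ++i) {
        if (pos >= in.size())
            return false;
        const std::uint8_t b = in[pos++];
        value = (value << 7) | (b & 0x7F);
        if ((b & 0x80) == 0)
            return true;  // continuation bit clear: this was the last byte
    }
    return false;
}
```

Values below 128 occupy a single byte, so small numbers such as string lengths or table indexes cost no more than one byte each.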

For that reason ulxmlrpcpp uses only a very small subset of this definition. Since no XML attributes are involved and no repetitive strings are in use, the encoding is mainly a conversion of the XML-RPC tags into tokens. The big benefit is higher processing speed, because parsing merely means reading a short number instead of searching for opening and closing tags as in XML. Another positive effect is the smaller size of the packets transferred with XML-RPC requests.
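To make the tag-to-token conversion concrete, here is a small sketch that encodes the body of a method call. The global tokens END (0x01) and STR_I (0x03) and the "has content" flag (0x40) come from the WBXML specification. The tag token values TOK_METHOD_CALL and TOK_METHOD_NAME are assumptions chosen for this example, and the actual values used by ulxmlrpcpp may differ. The WBXML document header (version, public identifier, character set, string table) is left out.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Global tokens and flag bit as defined by the WBXML specification.
const std::uint8_t WB_END         = 0x01;  // closes the current element
const std::uint8_t WB_STR_I       = 0x03;  // inline null-terminated string follows
const std::uint8_t WB_HAS_CONTENT = 0x40;  // flag: the element has content

// Hypothetical tag tokens for this sketch (application tokens start at 0x05).
const std::uint8_t TOK_METHOD_CALL = 0x05;
const std::uint8_t TOK_METHOD_NAME = 0x06;

// Emit a start tag as a single byte.
void openTag(std::vector<std::uint8_t>& out, std::uint8_t token)
{
    out.push_back(static_cast<std::uint8_t>(token | WB_HAS_CONTENT));
}

// Emit an end tag; one END token serves every element.
void closeTag(std::vector<std::uint8_t>& out)
{
    out.push_back(WB_END);
}

// Emit character data as an inline string.
void inlineString(std::vector<std::uint8_t>& out, const std::string& s)
{
    out.push_back(WB_STR_I);
    out.insert(out.end(), s.begin(), s.end());
    out.push_back(0x00);  // string terminator
}

// Encode <methodCall><methodName>NAME</methodName></methodCall> (body only).
std::vector<std::uint8_t> encodeMethodCall(const std::string& name)
{
    std::vector<std::uint8_t> out;
    openTag(out, TOK_METHOD_CALL);
    openTag(out, TOK_METHOD_NAME);
    inlineString(out, name);
    closeTag(out);  // </methodName>
    closeTag(out);  // </methodCall>
    return out;
}
```

With these assumed tokens, encodeMethodCall("echo") produces 10 bytes, while the equivalent XML text <methodCall><methodName>echo</methodName></methodCall> takes 54 characters. A parser reading the binary form only needs to compare single bytes instead of scanning for tag names.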

ulxmlrpcpp uses a SAX-like approach to parse the WAP Binary XML structure, similar to the built-in XML parser, which is based on expat. This also involves a simple state machine that models nested elements by deriving parser classes: "outer" elements derive from "inner" elements.

Parsing of a start tag is done in the following steps:

  1. Check if the current object is able to handle the current element
  2. Otherwise delegate to the parent (which may in turn delegate to its own parent)
  3. Process the current element and remember the state for later use on a stack

Character data between the XML tags is stored for later use.

An ending token is handled similarly:

  1. Check if the current object is able to handle the current element
  2. Otherwise delegate to the parent
  3. Process stored character data
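The start and end handling described above can be sketched with two small classes. This is not the actual ulxmlrpcpp class hierarchy: the names ValueParser and MethodCallParser, their members, and the use of element names instead of numeric tokens are assumptions made to keep the example readable. The "outer" parser derives from the "inner" one and delegates every element it does not know to its parent class.

```cpp
#include <string>
#include <vector>

// Hypothetical "inner" parser: handles only <value> and <string>.
class ValueParser {
public:
    virtual ~ValueParser() {}

    // Returns true if the element was handled somewhere in the chain.
    virtual bool startElement(const std::string& name)
    {
        if (name != "value" && name != "string")
            return false;          // no parent left to delegate to
        enter(name);
        return true;
    }

    // Character data between tags is only stored here.
    void charData(const std::string& s) { chars += s; }

    virtual bool endElement(const std::string& name)
    {
        if (name != "value" && name != "string")
            return false;
        if (states.empty() || states.back() != name)
            return false;          // end tag does not match the open element
        if (name == "string")
            lastString = chars;    // process the stored character data
        leave();
        return true;
    }

    std::string lastString;

protected:
    // Remember the state on a stack for later use by the end tag.
    void enter(const std::string& name) { states.push_back(name); chars.clear(); }
    void leave() { states.pop_back(); chars.clear(); }

    std::vector<std::string> states;
    std::string chars;
};

// Hypothetical "outer" parser: derives from the "inner" one.
class MethodCallParser : public ValueParser {
public:
    bool startElement(const std::string& name) override
    {
        if (name == "methodCall" || name == "methodName") {
            enter(name);
            return true;
        }
        return ValueParser::startElement(name);  // delegate to the parent
    }

    bool endElement(const std::string& name) override
    {
        if (name == "methodCall" || name == "methodName") {
            if (states.empty() || states.back() != name)
                return false;
            if (name == "methodName")
                methodName = chars;
            leave();
            return true;
        }
        return ValueParser::endElement(name);    // delegate to the parent
    }

    std::string methodName;
};
```

A driver would call startElement, charData and endElement as it reads tokens from the input. An element unknown to the whole chain makes the call return false, which the driver can treat as a parse error.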