Documentation

Building the Library

The library is released as an object file (.o or .obj) rather than as a dynamically linked library. It will increase the size of your application by about 20K. I'm sure that if you are working with XML, you won't worry about 20K :-)

The distribution already contains prebuilt binaries in the MS-COFF, ELF and Mach-O formats, so you shouldn't have to build the library from source unless you want to modify it.

Adding the Library to Your Project

  1. Include the "include/asm-xml.h" file in your source file.
  2. Link your project with the AsmXml object file.

Here are some tips to use it with various configurations:

Defining a Schema

Like DOM, the parser creates an in-memory tree structure from the source document. Attributes and text-only elements can be accessed directly: you don't need a lookup to find the value of an attribute or text element.

To decode attributes and elements, the parser needs a description of the structure of the document. AsmXml does not yet support DTD, XSD or RELAX NG. Instead, it uses a simple XML schema file.

The object that defines the possible attributes and child elements of a given element is referred to as a class.

Example:

  <schema>
    <document name="employees">
      <collection name="employee">
        <attribute name="id"/>
        <attribute name="managerId"/>
        <text name="firstName"/>
        <text name="lastName"/>
      </collection>
    </document>
  </schema>

This schema describes a document whose root element is <employees> and which contains a list of <employee> elements.

Example of document matching this schema:

  <?xml version="1.0" encoding="UTF-8"?>
  <!-- List of employees of the company -->
  <employees>
    <employee id="001">
      <firstName>Brian</firstName>
      <lastName>Williams</lastName>
    </employee>
    <employee managerId="001" id="123">
      <lastName>Smith</lastName>
      <firstName>John</firstName>
    </employee>
  </employees>

A parsed document gives indexed access to attributes and text elements. The index depends on the order in which the properties of an element are defined. For instance, in the previous example, the id attribute is at position 0, managerId at 1, firstName at 2 and lastName at 3.

The order of the text elements in the document does not matter to the parser, since each one is assigned to a fixed slot determined by the class definition.

Child elements are added to a linked list in their original order.
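
The slot layout described above can be made explicit in C. What follows is a minimal sketch, not the library's own code: the Attr and Elem structs are simplified stand-ins that mirror only the fields listed in the AXAttribute and AXElement tables of this documentation (the real declarations live in asm-xml.h and may contain more), and the enum names, set_slot and slot_equals are hypothetical helpers made up for illustration. Treating a null begin pointer as "value not present" is also an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for AXAttribute and AXElement (assumption:
   only the fields documented on this page are modelled). */
typedef struct Attr {
    const char *begin;  /* first character of the value */
    const char *limit;  /* one past the last character  */
} Attr;

typedef struct Elem {
    int          id;
    struct Elem *nextSibling;
    struct Elem *firstChild;
    Attr         attributes[4];  /* four slots are enough for <employee> */
} Elem;

/* Slot indices follow the order of definition in the schema. */
enum EmployeeSlot {
    EMPLOYEE_ID         = 0,
    EMPLOYEE_MANAGER_ID = 1,
    EMPLOYEE_FIRST_NAME = 2,
    EMPLOYEE_LAST_NAME  = 3
};

/* Points a slot at a string, standing in for what the parser would do. */
static void set_slot(Elem *e, int slot, const char *text)
{
    e->attributes[slot].begin = text;
    e->attributes[slot].limit = text + strlen(text);
}

/* Returns 1 if the slot holds exactly `expected`, 0 otherwise.
   A slot with a null begin pointer is treated as absent (assumption). */
static int slot_equals(const Elem *e, int slot, const char *expected)
{
    const Attr *a = &e->attributes[slot];
    size_t len;

    if (a->begin == NULL)
        return 0;
    assert(a->limit >= a->begin);
    len = (size_t)(a->limit - a->begin);
    return len == strlen(expected) && memcmp(a->begin, expected, len) == 0;
}
```

Whatever order the values are stored in, slot_equals(&employee, EMPLOYEE_FIRST_NAME, "John") always reads slot 2, which is the point of the fixed layout: the document order of <firstName> and <lastName> has no effect on where the values end up.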

The schema element

The root element is the <schema> element and it includes:

The collection element

A collection defines an element that can occur zero or more times in a parent element. These elements are accessible from the parent's linked list of child elements.

Attributes

Name  Comment
name  The name of the element.
type  The type of the element. It defines what may appear between the tags:
        • 'container' - (default) the element contains other child elements; text between tags is ignored.
        • 'mixed' - like container, but text found between child elements is added as elements whose attributes[0] is the text.
        • 'text' - the element contains only text. attributes[0] is this text; other attributes start at 1.
id    An integer that uniquely identifies the class of the element among its siblings. The id is unnecessary when an element can contain only one kind of child, but it becomes mandatory when there are several types of children.

Elements

Name        Comment
attribute   An attribute.
text        A text element.
collection  A list of child elements.
element     A single element.
reference   A reference to a collection defined under the <schema> element.
include     Includes the content of a group defined under the <schema> element.

The element element

An element is similar to a collection, except that it can occur at most once in the parent element and you have direct access to its value instead of enumerating the list of children.

In practice, the corresponding cell of the attributes[] array holds an AXElement* instead of an AXAttribute.

The attribute element

Adds an attribute to the class.

Attributes are accessed directly from an array. Their index corresponds to their order of definition, starting at 0, or at 1 if the element is of type 'text'.

Attributes

Name    Comment
name    The name of the attribute.
ignore  Forces the parser to ignore this attribute. The attribute is simply skipped, which improves performance and saves memory.
          • 'no' - (default) the attribute is not ignored.
          • 'yes' - the attribute is ignored.

The text element

Adds a child element without attribute and containing only text.

Text elements are accessed directly from an array. Their index corresponds to their order of definition (attributes included), starting at 0.

Attributes

Name    Comment
name    The name of the element.
ignore  Forces the parser to ignore this text element. The element is simply skipped, which improves performance and saves memory.
          • 'no' - (default) the element is not ignored.
          • 'yes' - the element is ignored.

The reference element

Includes a collection that is defined under the <schema> element.

The reference can appear before the target definition as well as afterward.

Attributes

Name  Comment
name  The name of the collection defined under the <schema> element.

The include element

Includes the content of a group that is defined under the <schema> element.

Unlike a reference, the target group must be defined before the include.

Attributes

Name  Comment
name  The name of the group defined under the <schema> element.

The group element

A group is just a container to be included in a collection or an element. It supports the same child elements as a collection and is identified by a name.

Exploring the Document

On success, the parse function returns a pointer to an AXElement object. The AXElement and AXAttribute structures are all you need to read the parsed document.

AXElement

Name         Type           Comment
id           int            The id (the type) of the element
nextSibling  AXElement*     The next sibling element
firstChild   AXElement*     The first child element
attributes   AXAttribute[]  The array of attributes and text elements

The first entry of the attributes array corresponds to the first <attribute>, <text> or <element> declared in the class definition, the second entry to the next <attribute>, <text> or <element>, and so on.
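
Child elements are reached through firstChild and then nextSibling. The sketch below shows that traversal under stated assumptions: Attr and Elem are simplified stand-ins for AXAttribute and AXElement that model only the fields in the table above, the list is assumed to end with a null nextSibling, and for_each_child, ChildVisitor and sum_ids are hypothetical names that are not part of the library.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for AXAttribute and AXElement (assumption). */
typedef struct Attr {
    const char *begin;
    const char *limit;
} Attr;

typedef struct Elem {
    int          id;
    struct Elem *nextSibling;
    struct Elem *firstChild;
    Attr         attributes[1];
} Elem;

/* Callback invoked once per direct child. */
typedef void (*ChildVisitor)(const Elem *child, void *userData);

/* Visits the direct children of `parent` in document order and
   returns how many there are. `visit` may be NULL to only count. */
static size_t for_each_child(const Elem *parent, ChildVisitor visit,
                             void *userData)
{
    size_t count = 0;
    const Elem *child;

    assert(parent != NULL);
    for (child = parent->firstChild; child != NULL;
         child = child->nextSibling) {
        if (visit != NULL)
            visit(child, userData);
        ++count;
    }
    return count;
}

/* Example visitor: adds each child's id to an int accumulator. */
static void sum_ids(const Elem *child, void *userData)
{
    *(int *)userData += child->id;
}
```

Calling for_each_child(root, NULL, NULL) on the <employees> root of the earlier example would simply count the <employee> children; passing a visitor lets the caller read each child's slots in turn.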

The id is an integer that identifies the class (the type) of the element. It is set by the id attribute of the <collection> element in the schema, and it is required when you need to tell one element type from another.

Example:

  <schema>
    <document name="body">
      <collection name="b" id="1" type="text"/>
      <collection name="i" id="2" type="text"/>
    </document>
  </schema>

The id '0' is reserved for text elements appearing in mixed content.
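
With ids assigned, children can be told apart with a switch. The following sketch assumes the <body> content is parsed as mixed content, so that plain text shows up as children with id 0, and that <b> and <i> are of type 'text', so that their text is in attributes[0]. Attr and Elem are simplified stand-ins for AXAttribute and AXElement, and the enum, append_range and render_body are hypothetical names used only for this illustration. It renders bold text between asterisks and italic text between underscores.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for AXAttribute and AXElement (assumption). */
typedef struct Attr {
    const char *begin;
    const char *limit;
} Attr;

typedef struct Elem {
    int          id;
    struct Elem *nextSibling;
    struct Elem *firstChild;
    Attr         attributes[1];
} Elem;

/* Ids from the schema above; 0 is reserved for mixed-content text. */
enum BodyChildId { BODY_TEXT = 0, BODY_B = 1, BODY_I = 2 };

/* Appends [begin, limit) to `out`, truncating if needed and keeping
   the buffer NUL-terminated. `*len` is the current string length. */
static void append_range(char *out, size_t cap, size_t *len,
                         const char *begin, const char *limit)
{
    size_t n, room;

    if (begin == NULL || limit <= begin)
        return;
    n = (size_t)(limit - begin);
    room = cap - 1 - *len;
    if (n > room)
        n = room;
    memcpy(out + *len, begin, n);
    *len += n;
    out[*len] = '\0';
}

/* Writes the children of `body` to `out` and returns the length written. */
static size_t render_body(const Elem *body, char *out, size_t cap)
{
    size_t len = 0;
    const Elem *child;

    assert(body != NULL && out != NULL && cap > 0);
    out[0] = '\0';
    for (child = body->firstChild; child != NULL;
         child = child->nextSibling) {
        const char *mark;

        switch (child->id) {
        case BODY_TEXT: mark = "";  break;
        case BODY_B:    mark = "*"; break;
        case BODY_I:    mark = "_"; break;
        default:        continue;   /* unknown id: skip this child */
        }
        append_range(out, cap, &len, mark, mark + strlen(mark));
        append_range(out, cap, &len, child->attributes[0].begin,
                     child->attributes[0].limit);
        append_range(out, cap, &len, mark, mark + strlen(mark));
    }
    return len;
}
```

Without distinct ids, the loop could not tell a <b> child from an <i> child, since both expose their text through the same slot.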

AXAttribute

Name   Type   Comment
begin  char*  Beginning of the value
limit  char*  Last char + 1
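
Because a value is described by a begin/limit pointer pair, it is safest not to assume that it is NUL-terminated: its length is limit - begin. The sketch below shows one way to turn such a range into an ordinary C string. Attr is a simplified stand-in for AXAttribute, attr_length and attr_to_cstring are hypothetical helpers that are not part of the library, and treating a null begin pointer as an empty value is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for AXAttribute (assumption). */
typedef struct Attr {
    const char *begin;  /* first character of the value */
    const char *limit;  /* one past the last character  */
} Attr;

/* Number of characters in the value; 0 if the value is absent. */
static size_t attr_length(const Attr *a)
{
    if (a->begin == NULL)
        return 0;
    assert(a->limit >= a->begin);
    return (size_t)(a->limit - a->begin);
}

/* Returns a heap-allocated, NUL-terminated copy of the value.
   The caller must free() it. Returns NULL if allocation fails. */
static char *attr_to_cstring(const Attr *a)
{
    size_t n = attr_length(a);
    char *copy = malloc(n + 1);

    if (copy == NULL)
        return NULL;
    if (n > 0)
        memcpy(copy, a->begin, n);
    copy[n] = '\0';
    return copy;
}
```

When the value only needs to be printed and is known to be present, printf("%.*s", (int)attr_length(&a), a.begin) avoids the copy altogether.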

Functions

Error Codes

When an error occurs, check the errorCode attribute of the AXClassContext or AXParseContext for more information on the type of error.

Name                         Comment
RC_OK                        Everything is ok
RC_MEMORY                    Out of memory
RC_EMPTY_NAME                Name empty or not defined
RC_ATTR_DEFINED              Attribute already defined
RC_INVALID_ENTITY_REFERENCE  Must be amp, quot, lt, gt, or apos
RC_UNEXPECTED_END            Found last char too early
RC_INVALID_CHAR              Wrong char
RC_OVERFLOW                  Number too big in char reference
RC_NO_START_TAG              XML does not start with a tag
RC_TAG_MISMATCH              Invalid close tag
RC_INVALID_TAG               Invalid root element
RC_INVALID_ATTRIBUTE         Attribute not defined in schema
RC_INVALID_PI                Invalid processing instruction (<?xml...)
RC_INVALID_DOCTYPE           Duplicate doctype or doctype after main element
RC_VERSION_EXPECTED          'version' is missing in xml declaration
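
For diagnostics, it can be handy to map an error code back to the descriptions in the table above. The sketch below is only an illustration: the local enum exists to keep the example self-contained and its numeric values are placeholders, so real code should include asm-xml.h and delete it. rc_description and report_error are hypothetical helpers, not library functions. The switch refers to the codes by name, so it does not depend on their actual values.

```c
#include <assert.h>
#include <stdio.h>

/* Placeholder values (assumption): use the definitions from
   asm-xml.h in real code and remove this enum. */
enum {
    RC_OK, RC_MEMORY, RC_EMPTY_NAME, RC_ATTR_DEFINED,
    RC_INVALID_ENTITY_REFERENCE, RC_UNEXPECTED_END, RC_INVALID_CHAR,
    RC_OVERFLOW, RC_NO_START_TAG, RC_TAG_MISMATCH, RC_INVALID_TAG,
    RC_INVALID_ATTRIBUTE, RC_INVALID_PI, RC_INVALID_DOCTYPE,
    RC_VERSION_EXPECTED
};

/* Returns the documented description for an error code. */
static const char *rc_description(int code)
{
    switch (code) {
    case RC_OK:                       return "Everything is ok";
    case RC_MEMORY:                   return "Out of memory";
    case RC_EMPTY_NAME:               return "Name empty or not defined";
    case RC_ATTR_DEFINED:             return "Attribute already defined";
    case RC_INVALID_ENTITY_REFERENCE: return "Must be amp, quot, lt, gt, or apos";
    case RC_UNEXPECTED_END:           return "Found last char too early";
    case RC_INVALID_CHAR:             return "Wrong char";
    case RC_OVERFLOW:                 return "Number too big in char reference";
    case RC_NO_START_TAG:             return "XML does not start with a tag";
    case RC_TAG_MISMATCH:             return "Invalid close tag";
    case RC_INVALID_TAG:              return "Invalid root element";
    case RC_INVALID_ATTRIBUTE:        return "Attribute not defined in schema";
    case RC_INVALID_PI:               return "Invalid processing instruction";
    case RC_INVALID_DOCTYPE:          return "Duplicate doctype or doctype after main element";
    case RC_VERSION_EXPECTED:         return "'version' is missing in xml declaration";
    default:                          return "Unknown error code";
    }
}

/* Writes one diagnostic line to `out`; returns fprintf's result. */
static int report_error(FILE *out, int code)
{
    assert(out != NULL);
    return fprintf(out, "AsmXml error %d: %s\n", code, rc_description(code));
}
```

A caller would typically pass the errorCode value read from the AXClassContext or AXParseContext after a failed call, for example report_error(stderr, code).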