Documentation

Building the Library

The library is released as an object file (.o or .obj) rather than as a dynamic-link library. It will increase the size of your application by about 20K. I'm sure that if you are working with XML, you won't worry about 20K :-)

The binaries are already available in the distribution in the MS-COFF, ELF and Mach-O formats, so you shouldn't have to build the library from sources unless you want to modify it.

Adding the Library to Your Project

Include the "include/asm-xml.h" file in your source file.
Link your project with the AsmXml object file.

Here are some tips for using it with various configurations:

MSVC 6: Add the obj\ms-coff\asm-xml.obj file to your project.
MinGW: Link your project with obj\ms-coff\asm-xml.obj.
Linux: Link your project with obj/elf/asm-xml.o.
Mac OS X: Link your project with obj/mach-o/asm-xml.o.

Defining a Schema

Like DOM, the parser creates an in-memory tree structure from the source document. Attributes and text-only elements can be accessed directly: you don't need a lookup to find the value of an attribute or text element.

To decode attributes and elements, the parser needs a description of the structure of the document. AsmXml does not yet support DTD, XSD or RELAX NG. Instead, it uses a simple XML schema file.

The object that defines the possible attributes and child elements of a given element is referred to as a class.

Example:

  <schema>
    <document name="employees">
      <collection name="employee">
        <attribute name="id"/>
        <attribute name="managerId"/>
        <text name="firstName"/>
        <text name="lastName"/>
      </collection>
    </document>
  </schema>

This schema describes a document whose root element is <employees> and which includes a list of <employee> elements.

Example of document matching this schema:

  <?xml version="1.0" encoding="UTF-8"?>
  <!-- List of employees of the company -->
  <employees>
    <employee id="001">
      <firstName>Brian</firstName>
      <lastName>Williams</lastName>
    </employee>
    <employee managerId="001" id="123">
      <lastName>Smith</lastName>
      <firstName>John</firstName>
    </employee>
  </employees>

A parsed document gives indexed access to attributes and text elements. The index depends on the order in which the properties of an element are defined. For instance, in the previous example, the id attribute is at position 0, managerId at 1, firstName at 2 and lastName at 3.

The parser ignores the order in which text elements appear in the document, since each one is assigned to a fixed slot determined by the class definition.

Child elements will be added in a linked list in their original order.
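
The slot order described above can be mirrored in C with named constants. The following is a minimal sketch, not the library's own code: Slot and Employee are hypothetical stand-ins defined here so that the example compiles without the AsmXml header, and the constant names and the slot_equals helper are illustrative. In real code the slots would be read from the attributes array of the parsed element.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Slot indices for <employee>, in schema definition order.
   The names are illustrative, not part of AsmXml. */
enum {
    EMPLOYEE_ID = 0,          /* <attribute name="id"/>        */
    EMPLOYEE_MANAGER_ID = 1,  /* <attribute name="managerId"/> */
    EMPLOYEE_FIRST_NAME = 2,  /* <text name="firstName"/>      */
    EMPLOYEE_LAST_NAME = 3,   /* <text name="lastName"/>       */
    EMPLOYEE_SLOT_COUNT = 4
};

/* Hypothetical stand-in for a value slot: a [begin, limit) character range. */
typedef struct {
    const char* begin;
    const char* limit;
} Slot;

/* Hypothetical stand-in for a parsed <employee> element. */
typedef struct {
    Slot slots[EMPLOYEE_SLOT_COUNT];
} Employee;

/* Returns 1 if the slot at `index` holds exactly `text`, 0 otherwise. */
static int slot_equals(const Employee* e, int index, const char* text)
{
    const Slot* s;
    size_t length;
    assert(index >= 0 && index < EMPLOYEE_SLOT_COUNT);
    s = &e->slots[index];
    length = (size_t)(s->limit - s->begin);
    return length == strlen(text) && memcmp(s->begin, text, length) == 0;
}
```

Naming the indices once, next to the schema, keeps the C code in step with the schema if properties are later added or reordered.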

The schema element

The root element is the <schema> element and it includes:

Zero or more <collection> elements that can be reused. The <schema> element itself cannot contain <attribute> or <element> items.
One <document> element. This element defines the root class, i.e. the definition of the root element. It supports the same attributes and elements as a <collection>, see below.

The collection element

A collection defines an element that can occur zero or more times in a parent element. These elements are accessible from the parent's linked list of child elements.

Attributes

Name	Comment
name	The name of the element.
type	The type of the element. It defines what may appear between the tags: `'container'` - (default) the element contains other child elements, and text between tags is ignored. `'mixed'` - like container, but text between child elements is added as an element whose attributes[0] is the text. `'text'` - the element contains only text; attributes[0] is this text, and other attributes start at 1.
id	An integer that uniquely identifies the class of the element among its siblings. The id is unnecessary when an element can contain only one kind of child, but it becomes mandatory when there are several kinds of children.

Elements

Name	Comment
attribute	An attribute.
text	A text element.
collection	A list of child elements.
element	A single element.
reference	A reference to a collection defined under the `<schema>` element.
include	Include the content of a group defined under the `<schema>` element.

The element element

An element is similar to a collection, except that it can occur at most once in the parent element and you have direct access to its value instead of enumerating the list of children.

In practice, the corresponding cell of the attributes[] array holds an AXElement* instead of an AXAttribute.

The attribute element

Adds an attribute to the class.

Attributes are accessed directly from an array. Their index corresponds to their order of definition, starting at 0, or at 1 if the element is of type 'text'.

Attributes

Name	Comment
name	The name of the attribute.
ignore	Forces the parser to ignore this attribute. The attribute is simply skipped, which improves performance and saves memory. `'no'` - (default) the attribute is not ignored. `'yes'` - the attribute is ignored.

The text element

Adds a child element without attribute and containing only text.

Text elements are accessed directly from an array. Their index corresponds to their order of definition (including attributes), starting at 0.

Attributes

Name	Comment
name	The name of the element.
ignore	Forces the parser to ignore this text element. The element is simply skipped, which improves performance and saves memory. `'no'` - (default) the element is not ignored. `'yes'` - the element is ignored.

The reference element

Includes a collection that is defined under the <schema> element.

The reference can appear before the target definition as well as afterward.

Attributes

Name	Comment
name	The name of the collection defined under the `<schema>` element.

The include element

Includes the content of a group that is defined under the <schema> element.

The target must be defined before the include.

Attributes

Name	Comment
name	The name of the group defined under the `<schema>` element.

The group element

A group is just a container to be included in a collection or an element. It supports the same child elements as a collection and is identified by a name.

Exploring the Document

If it succeeds, the parse function returns a pointer to an AXElement object. The AXElement and AXAttribute structures are all you need to read the parsed document.

AXElement

Name	Type	Comment
id	int	The id (the type) of the element
nextSibling	AXElement*	The next sibling element
firstChild	AXElement*	The first child element
attributes	AXAttribute[]	The array of attributes and text elements

The first attribute corresponds to the first <attribute>, <text> or <element> declared in the class definition, the second attribute corresponds to the next <attribute>, <text> or <element>, and so on.

The id is an integer that identifies the class of the element. It is defined in a <collection> element. The id is required when you need to discriminate between one element type and another.

Example:

  <schema>
    <document name="body">
      <collection name="b" id="1" type="text"/>
      <collection name="i" id="2" type="text"/>
    </document>
  </schema>

The id '0' is reserved for text elements appearing in mixed content.
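
Dispatching on the id while walking the child list might look like the sketch below. Node is a hypothetical stand-in that mirrors only the id, nextSibling and firstChild fields from the AXElement table, so that the example compiles without the AsmXml header. The ID_* constant names and the count_children helper are illustrative. The sketch also assumes that the sibling list ends with a NULL nextSibling, which this document does not state explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Ids as assigned in the schema above; 0 is the id reserved for text
   appearing in mixed content. The constant names are illustrative. */
enum { ID_TEXT = 0, ID_B = 1, ID_I = 2 };

/* Hypothetical stand-in with the three navigation fields of AXElement. */
typedef struct Node {
    int id;
    struct Node* nextSibling;
    struct Node* firstChild;
} Node;

/* Counts the direct children of `parent` whose id equals `id`.
   Assumes the sibling list is terminated by a NULL nextSibling. */
static int count_children(const Node* parent, int id)
{
    int count = 0;
    const Node* child;
    assert(parent != NULL);
    for (child = parent->firstChild; child != NULL; child = child->nextSibling) {
        if (child->id == id) {
            count++;
        }
    }
    return count;
}
```

The same loop shape works for any per-class processing: replace the comparison with a switch on child->id and handle each class in its own case.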

AXAttribute

Name	Type	Comment
begin	char*	Beginning of the value
limit	char*	Last char + 1
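
Since a value is described by a begin pointer and a limit pointer ("last char + 1"), it is presumably not NUL-terminated, so code that needs a C string has to copy the range. The following is a minimal sketch under that assumption. Range is a hypothetical stand-in with the same two fields as AXAttribute, and range_to_cstring is a hypothetical helper, not a library function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in mirroring the two fields of AXAttribute. */
typedef struct {
    const char* begin;  /* beginning of the value */
    const char* limit;  /* last char + 1          */
} Range;

/* Copies the value into dst as a NUL-terminated string, truncating it
   if dst is too small. Returns the number of characters copied,
   excluding the terminating NUL. */
static size_t range_to_cstring(const Range* r, char* dst, size_t dst_size)
{
    size_t length;
    assert(dst != NULL && dst_size > 0);
    length = (size_t)(r->limit - r->begin);
    if (length > dst_size - 1) {
        length = dst_size - 1;
    }
    memcpy(dst, r->begin, length);
    dst[length] = '\0';
    return length;
}
```

When only a comparison or a length is needed, working on the begin/limit range directly avoids the copy.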

Functions

ax_initialize

Initializes the library.
```
void ax_initialize(malloc, free)
```
Arguments

malloc The memory allocation function.

free The free memory function.

Description
This function initializes the library. It must be the first library function called.

ax_initializeParser

Initializes the parse context.
```
int  ax_initializeParser(AXParseContext* context, uint chunkSize)
```
Arguments

context The parse context.

chunkSize The default chunk size.

Return Value
The error code, or 0 on success.

ax_releaseParser

Releases the parse context.
```
void ax_releaseParser(AXParseContext* context)
```
Arguments

context The parse context.

Description
This function releases all memory resources allocated by this context, i.e. all documents parsed with this context will be deleted.

ax_parse

Parses and decodes an XML string.

AXElement* ax_parse(AXParseContext* context, 
                    const char* source, 
                    AXElementClass* type, 
                    int strict)

Arguments

context	The parse context.
source	The source of the document to parse.
type	The type of the document to parse.
strict	If this value is zero, attributes and elements not defined in the schema will be ignored without error. Otherwise, the function will stop parsing at the first unknown element or attribute.

Description

Returns the root element, or NULL if parsing fails. In case of failure, check the context->errorCode value.

ax_initializeClassParser

Initializes the class parser.
```
int ax_initializeClassParser(AXClassContext* context)
```
Arguments

context The class parser.

Return Value
Returns the error code, or 0 on success.

ax_releaseClassParser

Releases the class parser.
```
void ax_releaseClassParser(AXClassContext* context)
```
Arguments

context The class context.

Description
Releases all resources allocated by this context. All classes created with this context will be deleted.

ax_classFromElement

Creates a class from an element.
```
AXElementClass* ax_classFromElement(AXElement* element,
                                    AXClassContext* context)
```
Arguments

element The class definition.

context The class context.

Return Value
Returns the created class, or NULL if an error occurred.

ax_classFromString

Creates a class from a string.
```
AXElementClass* ax_classFromString(const char* string,
                                   AXClassContext* context)
```
Arguments

string The source of the class definition.

context The class context.

Return Value
Returns the created class, or NULL if an error occurred.

Error Codes

When an error occurs, check the errorCode field of the AXClassContext or AXParseContext for more information on the type of error.

Name	Comment
RC_OK	Everything is ok
RC_MEMORY	Out of memory
RC_EMPTY_NAME	Name empty or not defined
RC_ATTR_DEFINED	Attribute already defined
RC_INVALID_ENTITY_REFERENCE	Must be amp, quot, lt, gt, or apos
RC_UNEXPECTED_END	Found last char too early
RC_INVALID_CHAR	Wrong char
RC_OVERFLOW	Number too big in char reference
RC_NO_START_TAG	XML does not start with a tag
RC_TAG_MISMATCH	Invalid close tag
RC_INVALID_TAG	Invalid root element
RC_INVALID_ATTRIBUTE	Attribute not defined in schema
RC_INVALID_PI	Invalid processing instruction (<?xml...)
RC_INVALID_DOCTYPE	Duplicate doctype or doctype after main element
RC_VERSION_EXPECTED	'version' is missing in xml declaration
