This is a simple comparison of the parsing of a 10MB XML file between AsmXml, Expat and Xerces-C. The XML file used can be found in the download archive.
- The time includes only parsing, not file loading.
- Athlon XP 1800+ - 1533 MHz
- 512 MB RAM
- Windows XP SP1
- Expat 2.0.0
- Xerces-C 2.7.0
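
To make the "parsing only, not file loading" rule concrete, here is a minimal sketch of such a timing loop. It is not the benchmark shipped in the download archive: it uses Python's standard `xml.parsers.expat` binding to Expat, and the names `time_parse` and `benchmark_file` are made up for this illustration. The file is read into memory before the clock starts, the document is parsed a fixed number of times, and throughput is derived as total bytes parsed divided by elapsed time.

```python
import time
import xml.parsers.expat


def time_parse(data: bytes, repeats: int = 24):
    """Parse an in-memory document `repeats` times.

    Returns (elapsed_seconds, throughput_in_MB_per_s).
    Hypothetical helper for illustration, not the archive's benchmark code.
    """
    def on_start(name, attrs):
        pass  # no-op handler, so element and attribute events are still delivered

    start = time.perf_counter()
    for _ in range(repeats):
        # An Expat parser object handles a single document, so a fresh one
        # is created per run; its (small) setup cost is inside the timing.
        parser = xml.parsers.expat.ParserCreate()
        parser.StartElementHandler = on_start
        parser.Parse(data, True)
    elapsed = time.perf_counter() - start

    megabytes = repeats * len(data) / (1024 * 1024)
    return elapsed, megabytes / elapsed


def benchmark_file(path: str, repeats: int = 24):
    """Load the file first, then time only the parsing."""
    with open(path, "rb") as f:
        data = f.read()  # disk I/O happens before the clock starts
    return time_parse(data, repeats)
```

Calling `benchmark_file("employees-big.xml")` would return the elapsed time for 24 parses and the matching throughput. The numbers from an interpreted wrapper like this are not comparable to the native results reported here; the sketch only shows how the measurement is structured.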
Time to parse employees-big.xml (10MB) 24 times:
The equivalent throughput chart:
It is difficult to do a fair benchmark because these parsers don't do the same job:
- AsmXml does not copy attribute values and text unless necessary (that is, when they include entity or character references), so if you need a zero-terminated string, you must copy the value first.
- On the other hand, it does more work, since it also decodes attributes (giving O(1) access time), which saves the caller of the library a lot of time.
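
The two points above can be sketched in a few lines. This is only an illustration of the general idea, not AsmXml's actual API or data layout: `Span`, `span_of`, `to_str`, `SLOTS` and `decode_attributes` are all hypothetical names. A value is kept as a pair of offsets into the document buffer instead of a copy, and attributes are placed into fixed slots defined ahead of time, so the caller reads them by index instead of searching by name.

```python
from typing import List, NamedTuple, Optional, Tuple


class Span(NamedTuple):
    """A value kept as offsets into the document buffer, not as a copy."""
    begin: int
    limit: int  # one past the last byte of the value


def span_of(buffer: bytes, value: bytes) -> Span:
    """Stand-in for the parser: locate the first occurrence of `value`."""
    begin = buffer.index(value)
    return Span(begin, begin + len(value))


def to_str(buffer: bytes, span: Span) -> str:
    """Copy the bytes out of the buffer to obtain a standalone string."""
    return buffer[span.begin:span.limit].decode("utf-8")


# Hypothetical schema: every known attribute name gets a fixed slot.
SLOTS = {"id": 0, "name": 1}
ID, NAME = SLOTS["id"], SLOTS["name"]


def decode_attributes(raw: List[Tuple[str, Span]]) -> List[Optional[Span]]:
    """Turn (name, span) pairs in document order into a slot-indexed list."""
    slots: List[Optional[Span]] = [None] * len(SLOTS)
    for name, span in raw:
        slots[SLOTS[name]] = span
    return slots


buffer = b'<employee name="Ann" id="7"/>'
raw = [("name", span_of(buffer, b"Ann")), ("id", span_of(buffer, b"7"))]
attrs = decode_attributes(raw)

# O(1) access by slot; the copy happens only when a string is really needed.
employee_id = to_str(buffer, attrs[ID])
employee_name = to_str(buffer, attrs[NAME])
```

The name lookup is paid once, while decoding, and the copy is paid only for the values the caller actually converts. That is why a raw timing comparison with a parser that copies everything, or that hands back unsorted name/value pairs, does not measure the same job.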
It seems that keeping only the essential features of XML and implementing the parser in assembler can pay off considerably. So, if your XML documents can be parsed with AsmXml and you are looking for a very fast XML parser, you might consider trying it.
I've performed other tests on different machines and observed that, on a Pentium 4, the difference between AsmXml and Expat is smaller (newer processors seem to benefit C programs more). I also ran tests on my Arch Linux machine, but I couldn't build Xerces-C there, so I am not publishing those results; Expat was faster, though, maybe because of 686 optimizations.
Finally, if you're not convinced, run your own benchmarks on a machine and with XML files that better fit your needs. The source code of the benchmark is available in the download archive.