More on XML performance: seeking limits

The results discussed so far raise the question of what the highest achievable performance is. To find out, I wrote a dummy parser that simply scans the input file for start tags (it is not a real XML parser by any means...) and checked how different design decisions affect raw performance. Surprisingly, direct access to a String holding the XML content is slower than using a Reader with a preallocated buffer (though allocating the buffer for each run is more expensive). I attribute this to the cost of calling String.charAt() for every character, compared with transferring the data to a read buffer and accessing the XML content from a char array.
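The two access styles compared here can be sketched roughly as follows. This is only an illustration, not the code behind the measurements: the method names scanCount and scanCountBuf are taken from the result labels, but their bodies, the class name DummyScan, the helper opensStartTag, and the rule for what counts as a start tag ('<' not followed by '/', '!' or '?') are all assumptions.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical sketch of a "dummy parser" that only counts start tags.
final class DummyScan {

    // Assumed rule: the character after '<' begins a start tag unless it
    // marks an end tag ('/'), a comment or doctype ('!'), or a PI ('?').
    static boolean opensStartTag(char next) {
        return next != '/' && next != '!' && next != '?';
    }

    // Variant 1: scan the String directly, one charAt() call per character.
    static int scanCount(String xml) {
        int count = 0;
        int last = xml.length() - 1;
        for (int i = 0; i < last; i++) {
            if (xml.charAt(i) == '<' && opensStartTag(xml.charAt(i + 1))) {
                count++;
            }
        }
        return count;
    }

    // Variant 2: fill a caller-supplied char[] from the Reader and scan the
    // array. Passing the same buf on every call corresponds to the
    // "buffer reused" case; passing a fresh array to "allocated for each parsing".
    static int scanCountBuf(Reader in, char[] buf) throws IOException {
        if (buf.length == 0) {
            throw new IllegalArgumentException("buffer must not be empty");
        }
        int count = 0;
        boolean afterLt = false; // previous char was '<', possibly in the previous chunk
        int n;
        while ((n = in.read(buf, 0, buf.length)) != -1) {
            for (int i = 0; i < n; i++) {
                char c = buf[i];
                if (afterLt && opensStartTag(c)) {
                    count++;
                }
                afterLt = (c == '<');
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        String xml = "<list><item>a</item><item>b</item></list>";
        System.out.println(scanCount(xml));                                      // prints 3
        System.out.println(scanCountBuf(new StringReader(xml), new char[1024])); // prints 3
    }
}
```

The afterLt flag carries the "just saw '<'" state across buffer refills, so the count does not depend on the buffer size, even with buf.length=1.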
  scanCountBuf(Reader) buffer allocated for each parsing, buf.length=1
  tp600_jdk13     count   dummy_parser    list10k 3.164000034
  tp600_jdk13     count   dummy_parser    list100 0.025030000
  tp600_jdk13     count   dummy_parser    list1   0.000409600

  scanCountSimple(Reader) - using int read()
  tp600_jdk13     count   no_parser       list10k 1.192000031
  tp600_jdk13     count   no_parser       list100 0.011220000
  tp600_jdk13     count   no_parser       list1   0.000194300

  scanCountBuf(Reader) buffer allocated for each parsing, buf.length=64KB
  tp600_jdk13     count   dummy_parser    list10k 0.480999947
  tp600_jdk13     count   dummy_parser    list100 0.006409999
  tp600_jdk13     count   dummy_parser    list1   0.001932800

  scanCountBuf(Reader) buffer allocated for each parsing, buf.length=8096
  tp600_jdk13     count   dummy_parser    list10k 0.459999919
  tp600_jdk13     count   dummy_parser    list100 0.003700000
  tp600_jdk13     count   dummy_parser    list1   0.000294500

  scanCountBuf(Reader) buffer allocated for each parsing, buf.length=1024
  tp600_jdk13     count   dummy_parser    list10k 0.389999986
  tp600_jdk13     count   dummy_parser    list100 0.003610001
  tp600_jdk13     count   dummy_parser    list1   0.000101100

  scanCount(String) - simple String scanning with charAt()
  tp600_jdk13     count   dummy_parser    list10k 0.470999956
  tp600_jdk13     count   dummy_parser    list100 0.003310000
  tp600_jdk13     count   dummy_parser    list1   0.000098100

  scanCountBuf(Reader, buf) buf.length=1024 and buffer reused in all tests
  tp600_jdk13     count   dummy_parser    list10k 0.450999975
  tp600_jdk13     count   dummy_parser    list100 0.003000000
  tp600_jdk13     count   dummy_parser    list1   0.000073100
From this test it is easy to estimate that a hand-written XML parser can be at most about 10 times faster than AElfred or KXML.


Aleksander Slominski