Woodstox already features impressive performance optimizations, to which I would like to add a few small ones. I will submit them as individual PRs, so we can discuss the changes separately.
I benchmarked with a JMH test using namespace-aware StAX parsing with very little value extraction; other use cases may show less significant improvements (but still benefit slightly).
- Profilers identify `WstxInputData.isNameStartChar(char)` and `WstxInputData.isNameChar(char)` as hotspots. I suggest making the common case (ASCII) slightly faster there by using a `boolean` lookup table. This takes ~256 bytes of extra memory, but I suppose that this is a reasonable trade-off. I benchmarked with different JVM versions and results varied, but my JMH test showed improvements of e.g. ~3%, sometimes better.
- A low-hanging fruit is replacing `StringBuffer` with `StringBuilder`. The latter is well known to be faster due to lack of synchronization. If my analysis is correct, `StringBuffer` is never used in a place where thread-safety is relevant, so it can safely be replaced. This has already been done in `stax2-api`, just a leftover in Woodstox I guess.
- Direct field access in the (now private) inner class `Bucket` should be fine and faster than calling getters.
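
To illustrate the lookup-table idea from the first bullet, here is a minimal sketch. It is not the actual Woodstox code: the class name, the table names, and the `slow...` fallback methods are hypothetical, the table size of 128 entries is my assumption for "ASCII", and the fallback uses the BMP ranges of the XML 1.0 (Fifth Edition) `NameStartChar`/`NameChar` productions without any namespace-specific handling of `:`.

```java
// Sketch only: names and structure are hypothetical, not Woodstox's implementation.
final class AsciiNameCharSketch {
    // Two tables of 128 flags; a boolean element typically occupies one byte on HotSpot.
    private static final boolean[] NAME_START = new boolean[128];
    private static final boolean[] NAME_CHAR = new boolean[128];

    static {
        // Fill the tables once from the range-based checks.
        for (int c = 0; c < 128; c++) {
            NAME_START[c] = slowIsNameStartChar((char) c);
            NAME_CHAR[c] = slowIsNameChar((char) c);
        }
    }

    // Fast path: a single array read for ASCII, range checks otherwise.
    static boolean isNameStartChar(char c) {
        return c < 128 ? NAME_START[c] : slowIsNameStartChar(c);
    }

    static boolean isNameChar(char c) {
        return c < 128 ? NAME_CHAR[c] : slowIsNameChar(c);
    }

    // BMP part of the XML 1.0 (Fifth Edition) NameStartChar production.
    static boolean slowIsNameStartChar(char c) {
        return c == ':' || c == '_'
            || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
            || (c >= 0xC0 && c <= 0xD6) || (c >= 0xD8 && c <= 0xF6)
            || (c >= 0xF8 && c <= 0x2FF) || (c >= 0x370 && c <= 0x37D)
            || (c >= 0x37F && c <= 0x1FFF) || (c >= 0x200C && c <= 0x200D)
            || (c >= 0x2070 && c <= 0x218F) || (c >= 0x2C00 && c <= 0x2FEF)
            || (c >= 0x3001 && c <= 0xD7FF) || (c >= 0xF900 && c <= 0xFDCF)
            || (c >= 0xFDF0 && c <= 0xFFFD);
    }

    // NameChar adds '-', '.', digits, U+00B7 and two combining/connector ranges.
    static boolean slowIsNameChar(char c) {
        return slowIsNameStartChar(c)
            || c == '-' || c == '.' || (c >= '0' && c <= '9')
            || c == 0xB7 || (c >= 0x300 && c <= 0x36F)
            || (c >= 0x203F && c <= 0x2040);
    }

    public static void main(String[] args) {
        System.out.println(isNameStartChar('a'));
        System.out.println(isNameStartChar('1'));
        System.out.println(isNameChar('1'));
    }
}
```

Whether the table lookup actually beats the chained comparisons depends on the JVM and the input, so it should be measured with a benchmark rather than assumed.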
Approaches that did not work (maybe someone else tries with different results?):
- Use `Arrays.copyOf()` or just `.clone()` (called by `Arrays.copyOf()` in newer Java versions) instead of `System.arraycopy()` for cloning arrays: While theoretically `.clone()` should be fastest (also according to Joshua Bloch in a quite old statement), my microbenchmarking showed no significant difference; often `System.arraycopy()` was even faster. But the code reads nicer (shorter, easier to understand), I have to admit. (For more details, see the comment below.)
- Use the bit mask trick (`(1 << currToken) & MASK_GET_TEXT_XXX`) also for masks with just 2 different types. This seems to perform worse than just `type == A || type == B`.
- `Attribute.getQName()` seems to have optimization potential, but it's not a hotspot.
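
For readers unfamiliar with the bit mask trick, the two variants compared in the second bullet can be sketched as follows. The class name, the mask name `MASK_TEXT_OR_CDATA`, and the choice of `CHARACTERS`/`CDATA` as the two types are hypothetical and only serve as an illustration; the event constants come from the JDK's `javax.xml.stream.XMLStreamConstants`.

```java
import javax.xml.stream.XMLStreamConstants;

// Sketch only: names are hypothetical, not taken from the Woodstox sources.
final class TokenMaskSketch {
    // Hypothetical mask with exactly two event types set.
    static final int MASK_TEXT_OR_CDATA =
        (1 << XMLStreamConstants.CHARACTERS) | (1 << XMLStreamConstants.CDATA);

    // Variant 1: shift a single bit to the token's position and test it against the mask.
    static boolean viaMask(int currToken) {
        return ((1 << currToken) & MASK_TEXT_OR_CDATA) != 0;
    }

    // Variant 2: two plain comparisons.
    static boolean viaComparison(int currToken) {
        return currToken == XMLStreamConstants.CHARACTERS
            || currToken == XMLStreamConstants.CDATA;
    }

    public static void main(String[] args) {
        // Both variants should agree for every StAX event type.
        for (int t = XMLStreamConstants.START_ELEMENT; t <= XMLStreamConstants.ENTITY_DECLARATION; t++) {
            System.out.println(t + ": " + viaMask(t) + " / " + viaComparison(t));
        }
    }
}
```

Both methods return the same result for all StAX event types; the sketch says nothing about which one is faster, which is exactly what a benchmark has to decide.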