Replace name and entity regular expressions with specific functions for ~15% performance improvement #216

Open
lovell wants to merge 1 commit into isaacs:main from lovell:perf-name-start-name-body

lovell (Contributor) commented Jun 28, 2017

Hello,

Regular expressions are generally fast and become increasingly efficient with longer strings.

However, this module tests character by character, so extracting the character code and using equality and range checks greatly increases the performance of element and entity name detection.
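To illustrate the idea, here is a minimal sketch comparing a single-character regular expression test with an equivalent char code check. The function names are hypothetical and the character set is restricted to ASCII for brevity, so this is not the code from this pull request; the real XML name rules also cover a number of Unicode ranges.

```javascript
// Illustrative only: hypothetical helpers, ASCII subset of XML name characters.

// Regex approach: one RegExp test per character.
const NAME_START_RE = /[:_A-Za-z]/;

function isNameStartRegex(ch) {
  return NAME_START_RE.test(ch);
}

// Char code approach: extract the code once, then use equality and range checks.
function isNameStartCode(ch) {
  const c = ch.charCodeAt(0);
  return (
    (c >= 0x41 && c <= 0x5a) || // A-Z
    (c >= 0x61 && c <= 0x7a) || // a-z
    c === 0x3a ||               // :
    c === 0x5f                  // _
  );
}

// A name body character is a name start character, a digit, '-' or '.'.
function isNameBodyCode(ch) {
  const c = ch.charCodeAt(0);
  return (
    isNameStartCode(ch) ||
    (c >= 0x30 && c <= 0x39) || // 0-9
    c === 0x2d ||               // -
    c === 0x2e                  // .
  );
}
```

Both name start variants accept the same ASCII characters; the second avoids invoking the regular expression engine for every single character, which is where the gain is expected to come from in a character-by-character parser.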

Running the node-expat benchmark tests shows this gain is around 15%.

sax v1.2.4:

sax x 174,389 ops/sec ±1.67% (86 runs sampled)
node-xml x 138,412 ops/sec ±1.25% (88 runs sampled)
libxmljs x 240,261 ops/sec ±1.00% (84 runs sampled)
node-expat x 468,442 ops/sec ±0.88% (90 runs sampled)

with this change:

sax x 208,245 ops/sec ±0.77% (88 runs sampled)
node-xml x 138,796 ops/sec ±1.07% (88 runs sampled)
libxmljs x 253,781 ops/sec ±0.98% (85 runs sampled)
node-expat x 469,078 ops/sec ±0.84% (90 runs sampled)

The existing test suite continues to pass after this change.

This is the third and most likely final performance improvement I'm going to be able to make to sax, at least in the short term. When this change is viewed together with #204 and #208, it appears we've been able to improve performance by at least a factor of 3 since v1.2.1.

Once again, thank you for all your time maintaining this highly depended-upon module.

with specific functions using char code equality or range checks

Increases performance by 10-15%

jacktuck commented Jun 28, 2017

LGTM - more readable too :)

Nice work @lovell!

Curious what sax would yield now on https://github.com/AndreasMadsen/htmlparser-benchmark
