Make regex dependency optional #385
`regex` was used only in four trivial cases that could be implemented more simply, either naively or using memchr, without losing performance. As such the dependency needlessly increases build time, size of binary and attack surface.

This change makes `regex` optional and defaults to `naive`/`memchr` implementations. This *improves* performance a bit. The dependency could've been removed entirely but was kept in case regression is discovered on another platform and to make comparing the performance easier. It can be removed in the future if the code is proven to be reliable.

Signed-off-by: Martin Habovstiak <[email protected]>
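To illustrate the kind of replacement described above, here is a minimal sketch of a "naive" escaping routine that needs no `regex`. It is not the code from this PR: the function name `escape_label_value` is hypothetical, the set of escaped characters (backslash, double quote, line feed) is assumed from the Prometheus text exposition format, and it uses `str::find` from the standard library rather than the `memchr` crate so that it has no dependencies.

```rust
use std::borrow::Cow;

/// Hypothetical helper (not taken from the PR): escapes `\`, `"` and
/// line feeds in a label value without using `regex`.
fn escape_label_value(value: &str) -> Cow<'_, str> {
    // Fast path: find the first character that needs escaping.
    // If there is none, borrow the input and allocate nothing.
    let first = match value.find(|c: char| matches!(c, '\\' | '"' | '\n')) {
        Some(idx) => idx,
        None => return Cow::Borrowed(value),
    };

    // Slow path: copy the clean prefix, then escape the rest char by char.
    let mut escaped = String::with_capacity(value.len() + 2);
    escaped.push_str(&value[..first]);
    for c in value[first..].chars() {
        match c {
            '\\' => escaped.push_str("\\\\"),
            '"' => escaped.push_str("\\\""),
            '\n' => escaped.push_str("\\n"),
            other => escaped.push(other),
        }
    }
    Cow::Owned(escaped)
}
```

Returning `Cow` keeps the common case, where nothing needs escaping, free of allocation. A `memchr`-based variant would presumably swap the `find` call for a byte search over the same three characters.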
Oh, I wasn't aware of In the meantime, would you mind posting the difference of the |
Great! Forgot to mention: Attempted to use critcmp but hit an error
$ cargo +stable bench --features=regex --bench text_encoder -- --save-baseline regex
$ cargo +stable bench --bench text_encoder -- --save-baseline noregex
Outputs:
regex
noregex
BTW I noticed even bigger difference in |
This change makes regex optional and defaults to naive/memchr implementations. This improves performance a bit. The dependency could've been removed entirely but was kept in case regression is discovered on another platform and to make comparing the performance easier. It can be removed in the future if the code is proven to be reliable.
I would advocate for removing `regex` entirely. In case it does turn out to be less performant than using its parent crate (`regex`) on some platforms, it is not exposed on the API surface and thus can be switched back without a breaking change if we don't find the actual cause.
Benchmarks look great to me.
$ critcmp regex no_regex
group no_regex regex
----- -------- -----
text_encoder_with_escaping 1.00 88.4±2.01ms ? B/sec 1.08 95.7±1.79ms ? B/sec
text_encoder_without_escaping 1.00 59.4±1.22ms ? B/sec 1.18 70.2±1.98ms ? B/sec
$ cat /proc/cpuinfo | grep "model name" | head -1
model name : Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
Attempted to use critcmp but hit an error
Haven't seen the issue before even though I am running the same version.
$ critcmp --version
critcmp 0.1.4
$ cargo tree | grep criterion
├── criterion v0.3.3
│ ├── criterion-plot v0.4.3
Sorry for the long delays here @Kixunil. Again, I would be in favor of removing the |
Sure, I wanted to do it sooner, just got distracted by a ton of other stuff.
Done, also added two more comments that seemed like a good idea. |
Hmm, something must have confused DCO. I added the line into the commit yet it fails. |
@Kixunil the bot needs to see the sign-off on every single commit in the PR. To that extent, it looks like the first commit is fine while the remaining ones are missing that. |
Co-authored-by: Max Inden <[email protected]>
Signed-off-by: Martin Habovstiak <[email protected]>

This removes `regex` as suggested in the PR.

Signed-off-by: Martin Habovstiak <[email protected]>
@lucab ah, I thought Co-authored-by serves as replacement. Fixed. |
Thanks @Kixunil for bearing with us! |
Thank you too! Any specific plans for the date of next release? |
Nothing planned as far as I know, happy to do one though, if it can be of some help. I still need to review #387 which seems worth including. What do you think? |
Definitely worth including! |
Benchmarked on x86_64. Thumbs up for providing convenient benches. :)