MM SPM, bench tool and general Simd/v8 #14295

victorjulien · 2025-11-08T10:27:23Z

Replaces #11617

The bundled compare.py can be used to compare results from 2 branches, or to compare the SPMs within a single result CSV file.

Here is the result of the new mm vs bm in

It generally performs a lot better, but it seems to have a slightly higher startup cost, which shows in the tests that take the shortest time.

Deduplicate counter registration.

Rename to match coding style. Update callers.

Systems with SSE 4.1 as the highest SSE version are getting pretty rare, so it's hard to test.

AVX2 implementation that compares 32 bytes at a time. Rearrange code to make parts reusable. Fall back to smaller SIMD for remaining buffer. When (remaining) buffer is smaller than 32 bytes fall back to other SIMD implementations that deal with 16 bytes of data per iteration. Add 16/32/64 byte implementations using AVX512.
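The fallback chain described above (32 bytes per iteration, then 16, then whatever is left) could look roughly like the sketch below. This is not the code from this PR: the name `SketchMemcmp` is hypothetical, the return convention (0 for equal, 1 for different, no ordering) is an assumption, and the final byte-wise loop is an illustration choice. The AVX512 variants are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__AVX2__) || defined(__SSE2__)
#include <immintrin.h>
#endif

/* Hypothetical sketch: returns 0 when the first n bytes of a and b are
 * equal, 1 otherwise. The return convention is an assumption. */
static int SketchMemcmp(const uint8_t *a, const uint8_t *b, size_t n)
{
    assert(n == 0 || (a != NULL && b != NULL));
    size_t i = 0;
#if defined(__AVX2__)
    /* Compare 32 bytes per iteration while at least 32 bytes remain. */
    for (; i + 32 <= n; i += 32) {
        __m256i va = _mm256_loadu_si256((const __m256i *)(a + i));
        __m256i vb = _mm256_loadu_si256((const __m256i *)(b + i));
        /* One mask bit per byte; all 32 bits set (-1 as int) means equal. */
        if (_mm256_movemask_epi8(_mm256_cmpeq_epi8(va, vb)) != -1)
            return 1;
    }
#endif
#if defined(__SSE2__)
    /* Fewer than 32 bytes left (or no AVX2): fall back to 16-byte steps. */
    for (; i + 16 <= n; i += 16) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        if (_mm_movemask_epi8(_mm_cmpeq_epi8(va, vb)) != 0xFFFF)
            return 1;
    }
#endif
    /* Remaining tail of fewer than 16 bytes: plain byte-wise compare. */
    for (; i < n; i++) {
        if (a[i] != b[i])
            return 1;
    }
    return 0;
}
```

The preprocessor guards mean the sketch still builds without `-mavx2`, in which case only the 16-byte and byte-wise paths are compiled in.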

Implement for AVX512, AVX2 and SSE42.
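As an illustration of the general idea only, a 16-byte variant of a copy-with-tolower could be sketched as below. The name `SketchMemcpyToLower` is hypothetical and the code is not taken from this PR. It only uses intrinsics that are available from SSE2 onward, which is an illustration choice; the commit itself targets AVX512, AVX2 and SSE42.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical sketch: copy n bytes from src to dst, lowercasing ASCII
 * 'A'..'Z' and leaving every other byte value untouched. */
static void SketchMemcpyToLower(uint8_t *dst, const uint8_t *src, size_t n)
{
    assert(n == 0 || (dst != NULL && src != NULL));
    size_t i = 0;
#if defined(__SSE2__)
    const __m128i below_a = _mm_set1_epi8('A' - 1);
    const __m128i above_z = _mm_set1_epi8('Z' + 1);
    const __m128i case_bit = _mm_set1_epi8(0x20);
    for (; i + 16 <= n; i += 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)(src + i));
        /* Signed compares: bytes >= 0x80 are negative and never match. */
        __m128i is_upper =
                _mm_and_si128(_mm_cmpgt_epi8(v, below_a), _mm_cmplt_epi8(v, above_z));
        /* Set bit 0x20 only in the lanes that hold an uppercase letter. */
        v = _mm_or_si128(v, _mm_and_si128(is_upper, case_bit));
        _mm_storeu_si128((__m128i *)(dst + i), v);
    }
#endif
    /* Scalar tail, also the only path on targets without SSE2. */
    for (; i < n; i++) {
        uint8_t c = src[i];
        dst[i] = (c >= 'A' && c <= 'Z') ? (uint8_t)(c | 0x20) : c;
    }
}
```

Setting bit 0x20 works because every ASCII uppercase letter has that bit clear and differs from its lowercase form only in that bit.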

Wrapper around `memmem`. The case-sensitive search is implemented by calling `memmem` directly. As there is no case-insensitive variant available, a wrapper around `memmem` is created that takes a sliding window approach:

1. take a slice of the haystack
2. convert it to lowercase
3. search it using `memmem`
4. move the window forward
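The sliding window steps above can be sketched as follows. This is a minimal illustration, not the code from this PR: the name `SketchMemmemNocase` and the 64-byte `WINDOW_SIZE` are hypothetical, the needle is assumed to be lowercase already, and the overlap of `needle_len - 1` bytes between consecutive windows is an assumption made here so that a match straddling two windows is not missed.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* memmem() is an extension, not ISO C */
#endif
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Redeclared so the sketch also builds if the feature-test macro above is
 * seen too late to take effect. */
void *memmem(const void *haystack, size_t haystack_len, const void *needle, size_t needle_len);

#define WINDOW_SIZE 64 /* hypothetical window size, chosen for illustration */

/* Hypothetical helper: case-insensitive search built on memmem().
 * Assumes needle is already lowercase and 0 < needle_len <= WINDOW_SIZE.
 * Returns a pointer into haystack, or NULL when there is no match. */
static const uint8_t *SketchMemmemNocase(const uint8_t *haystack, size_t haystack_len,
        const uint8_t *needle, size_t needle_len)
{
    uint8_t window[WINDOW_SIZE];

    assert(haystack != NULL && needle != NULL);
    if (needle_len == 0 || needle_len > WINDOW_SIZE || haystack_len < needle_len)
        return NULL;

    size_t offset = 0;
    while (haystack_len - offset >= needle_len) {
        /* 1. take a slice of the haystack */
        size_t slice_len = haystack_len - offset;
        if (slice_len > WINDOW_SIZE)
            slice_len = WINDOW_SIZE;

        /* 2. convert it to lowercase */
        for (size_t i = 0; i < slice_len; i++)
            window[i] = (uint8_t)tolower(haystack[offset + i]);

        /* 3. search it using memmem */
        const uint8_t *found = memmem(window, slice_len, needle, needle_len);
        if (found != NULL)
            return haystack + offset + (size_t)(found - window);

        /* 4. move the window forward, keeping needle_len - 1 bytes of overlap */
        offset += slice_len - (needle_len - 1);
    }
    return NULL;
}
```

Because each slice is at least `needle_len` bytes long, the window advances by at least one byte per iteration, so the loop terminates.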

Tool to benchmark detection engine content inspection, which is the inspection of individual groups of content, etc. matches for a buffer. Also add a set of basic tests for the various single pattern matching implementations. Output is in CSV: to files for the rule-based tests, to stdout for the SPM tests.

To show differences between 2 result files or between SPM algos in a single result file.

codecov · 2025-11-08T11:24:47Z

Codecov Report

❌ Patch coverage is 67.49226% with 105 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.12%. Comparing base (6bd3605) to head (1b9c77e).

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #14295      +/-   ##
==========================================
- Coverage   84.17%   84.12%   -0.06%     
==========================================
  Files        1012     1013       +1     
  Lines      261868   262201     +333     
==========================================
+ Hits       220421   220570     +149     
- Misses      41447    41631     +184

Flag	Coverage Δ
fuzzcorpus	`63.18% <53.40%> (-0.14%)`	⬇️
livemode	`18.71% <36.05%> (-0.07%)`	⬇️
pcap	`44.55% <53.03%> (-0.10%)`	⬇️
suricata-verify	`64.86% <54.54%> (-0.06%)`	⬇️
unittests	`59.22% <68.81%> (-0.01%)`	⬇️


suricata-qa · 2025-11-08T12:50:03Z

Information: QA ran without warnings.

Pipeline = 28416

suricata-qa · 2025-11-08T15:15:03Z

WARNING:

field	baseline	test	%
	SURI_TLPR1_stats_chk
.uptime	654	634	96.94%

Pipeline = 28417

jasonish · 2025-11-09T17:58:16Z

src/util-memcpy.h

+}
+
+#if defined(__AVX2__)
+static inline void MemcmpyToLowerAVX2(uint8_t *dst, const uint8_t *src, size_t n);


victorjulien · 2025-11-10T10:48:12Z

Going to split this up as not all parts are as useful. Esp the memcmp stuff is not always faster than the libc implementation, sometimes a lot slower. So that will need some more research.

inashivb · 2025-11-12T02:45:10Z

> Esp the memcmp stuff is not always faster than the libc implementation, sometimes a lot slower.

For smaller data set?

victorjulien · 2025-11-12T07:59:50Z

The opposite actually. I can get better perf with large data (~9k) with avx512+loop unrolls, but between the various systems I have the results are inconsistent. With small data I think the start up cost of my code is somewhat better. But also, with small data it all matters somewhat less :)

Will focus on getting the mm and bench tool merged first, then we can also reason about further changes with the bench results.

victorjulien requested review from a team and jasonish as code owners November 8, 2025 10:27

victorjulien mentioned this pull request Nov 8, 2025

MM SPM, bench tool and general Simd/v3 #11617

Closed

victorjulien added 3 commits November 8, 2025 11:31

detect/engine: minor thread init cleanup

8a986bb

Deduplicate counter registration.

memcpy: rename memcpy_tolower

2b94ddc

Rename to match coding style. Update callers.

memcmp: remove SSE 4.1 implementation

e7f344d

Systems with SSE 4.1 as the highest SSE version are getting pretty rare, so it's hard to test.

victorjulien force-pushed the simd/v8 branch from c3c7565 to d552424 Compare November 8, 2025 10:31

victorjulien added 7 commits November 8, 2025 11:54

memcpy: implement tolower using SIMD

d541cdc

Implement for AVX512, AVX2 and SSE42.

spm: minor unittest cleanup

ea537a8

github-actions: build and run bench tool

6625ef0

tools/bench-content-inspect: add python compare script

1b9c77e

To show differences between 2 result files or between SPM algos in a single result file.

victorjulien force-pushed the simd/v8 branch from d552424 to 1b9c77e Compare November 8, 2025 10:54

jasonish reviewed Nov 9, 2025


src/util-memcpy.h

}

#if defined(__AVX2__)

static inline void MemcmpyToLowerAVX2(uint8_t *dst, const uint8_t *src, size_t n);


Memcpy?

victorjulien marked this pull request as draft November 10, 2025 10:44
