Optimize events pipeline #4829
Open
yanivagman wants to merge 15 commits into aquasecurity:main from yanivagman:optimize_events_pipeline
+377
−332
c1f2e76
to
a671563
… lookups
- Reuse already-fetched eventDefinition when syscall ID matches event ID
- Eliminates redundant map lookups with read locking in hot path
- Improves performance when processing events with syscall information
- Add decoderPool using sync.Pool
- Add SetBuffer method to EbpfDecoder for efficient reuse
- Modify decodeEvents to use pooled decoders with proper cleanup
- Significantly reduces memory allocations in event decoding hot path
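The pooling pattern named in this commit can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `Decoder` type, its `NextByte` method, and the `firstByte` helper are hypothetical stand-ins, and only the `decoderPool` and `SetBuffer` names come from the commit message.

```go
package main

import "sync"

// Decoder is a hypothetical stand-in for EbpfDecoder: it reads from a byte
// buffer and tracks a cursor into it.
type Decoder struct {
	buf    []byte
	cursor int
}

// SetBuffer points the decoder at a new buffer and rewinds the cursor, so a
// pooled decoder can be reused instead of allocating a new one.
func (d *Decoder) SetBuffer(b []byte) {
	d.buf = b
	d.cursor = 0
}

// NextByte returns the next byte and reports whether one was available.
func (d *Decoder) NextByte() (byte, bool) {
	if d.cursor >= len(d.buf) {
		return 0, false
	}
	b := d.buf[d.cursor]
	d.cursor++
	return b, true
}

// decoderPool hands out reusable decoders; New runs only when the pool is empty.
var decoderPool = sync.Pool{
	New: func() any { return &Decoder{} },
}

// firstByte borrows a decoder, reads one byte, and returns the decoder to
// the pool on every exit path.
func firstByte(data []byte) (byte, bool) {
	d := decoderPool.Get().(*Decoder)
	defer func() {
		d.SetBuffer(nil) // drop the reference so the pool does not pin data
		decoderPool.Put(d)
	}()
	d.SetBuffer(data)
	return d.NextByte()
}
```

Clearing the buffer before `Put` is the "proper cleanup" part of the pattern as assumed here: a pooled object that still references the last event's bytes would keep them alive until the pool is drained.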
Optimize string conversion in decodeEvents function by pre-computing trimmed byte slices before string allocation. This reduces string allocation overhead by avoiding repeated operations on the same data. The optimization separates TrimTrailingNUL operations from string conversion, allowing the byte slices to be computed once and then converted to strings only when needed.
- Add early exit when bitmap is 0 to avoid unnecessary processing
- Cache frequently accessed event fields (UID, PID, return value)
- Reorder filters by efficiency: UID/PID ranges → return value → scope → data
- Cache rule lookups to avoid repeated map access
- Add early exit when bitmap becomes 0 during iteration
- Significantly improves policy matching performance through early exits
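The early-exit idea in this commit can be illustrated with a small sketch. Everything here is assumed for illustration: the `event` and `policy` types, the `matchPolicies` function, and the restriction to UID/PID range filters are hypothetical, and the sketch assumes at most 64 policies with bits set only for existing policies.

```go
package main

// event is a hypothetical, trimmed-down event with only the fields the
// filters below read.
type event struct {
	UID, PID uint32
	Matched  uint64 // bit i set means policy i still matches
}

// policy is a hypothetical policy with inclusive UID and PID ranges.
type policy struct {
	MinUID, MaxUID uint32
	MinPID, MaxPID uint32
}

// matchPolicies clears the bit of every policy whose range filters reject
// the event, returning as soon as no policy can match any more.
func matchPolicies(ev event, policies []policy) uint64 {
	bitmap := ev.Matched
	if bitmap == 0 {
		return 0 // nothing matched upstream, skip all filter work
	}
	uid, pid := ev.UID, ev.PID // read the hot fields once
	for i, p := range policies {
		bit := uint64(1) << uint(i)
		if bitmap&bit == 0 {
			continue // policy already ruled out
		}
		if uid < p.MinUID || uid > p.MaxUID || pid < p.MinPID || pid > p.MaxPID {
			bitmap &^= bit
			if bitmap == 0 {
				return 0 // last candidate removed, stop iterating
			}
		}
	}
	return bitmap
}
```

Cheap integer range checks run first in this sketch for the same reason the commit orders its filters: the sooner the bitmap reaches zero, the less often the costlier scope and data filters would run.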
…line
- Move struct creation closer to where structs are used
- Remove unnecessary intermediate variables
- Improve code readability by reducing variable scope
- Pre-compute submittable events map in engineEvents goroutine
- Replace IsEventToSubmit() policy manager call with fast map lookup
- Eliminates mutex locking and map traversal overhead for each event
- Significant performance improvement for high-volume event processing
- Cache is created once per engineEvents lifecycle and reused
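A rough sketch of the pre-computation described here, under stated assumptions: `eventID`, `buildSubmittable`, and the `isSubmit` callback are hypothetical names, with the callback standing in for the policy manager's `IsEventToSubmit()` check.

```go
package main

// eventID is a hypothetical event identifier type.
type eventID int32

// buildSubmittable asks the (possibly lock-protected) isSubmit check once
// per known event ID and records the positive answers in a set.
func buildSubmittable(ids []eventID, isSubmit func(eventID) bool) map[eventID]struct{} {
	set := make(map[eventID]struct{}, len(ids))
	for _, id := range ids {
		if isSubmit(id) {
			set[id] = struct{}{}
		}
	}
	return set
}

// shouldSubmit is the per-event fast path: a plain map lookup with no locking.
func shouldSubmit(set map[eventID]struct{}, id eventID) bool {
	_, ok := set[id]
	return ok
}
```

The set is a snapshot taken when it is built, so this approach fits only if the answer cannot change during the lifetime of the goroutine that owns it, which matches the "created once per engineEvents lifecycle" note in the commit.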
- Combine 4 separate loops into a single loop to reduce iterations
- Pre-compute all selector patterns to avoid repeated struct creation
- Replace slices.Clone() with make() and copy() for exact capacity allocation
- Avoid over-allocation that can occur with slices.Clone()
- Add nil check for empty args slice to avoid unnecessary allocation
- Reduces memory overhead for argument cloning in signature engine
- Maintains identical functionality while improving memory efficiency
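The make-and-copy cloning described here might look like the sketch below. The `argument` type and `cloneArgs` name are hypothetical. Note one assumption about behaviour: this sketch returns nil for any empty input, including a non-nil empty slice.

```go
package main

// argument is a hypothetical stand-in for an event argument.
type argument struct {
	Name  string
	Value any
}

// cloneArgs returns a shallow copy whose capacity equals its length.
// Empty input yields nil, so no backing array is allocated for it.
func cloneArgs(args []argument) []argument {
	if len(args) == 0 {
		return nil
	}
	out := make([]argument, len(args)) // exact capacity, no growth slack
	copy(out, args)
	return out
}
```

The copy is shallow: the slice elements are duplicated, but any pointer, slice, or map stored in `Value` is still shared with the original.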
Add metadata cache to Engine struct and populate it during signature loading. Use cached metadata in filterDispatchInPipeline instead of calling GetMetadata() for every event dispatch. Clean up cache entries when signatures are unloaded.
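A minimal sketch of such a cache, assuming a simplified shape: the `metadata`, `signature`, `engine`, and `countingSig` types and the `load`, `unload`, and `cachedMetadata` methods are all hypothetical, and only `GetMetadata()` is named in the commit message.

```go
package main

import "sync"

// metadata and signature are hypothetical stand-ins for the real types.
type metadata struct {
	ID   string
	Name string
}

type signature interface {
	GetMetadata() (metadata, error)
}

// countingSig is a demo signature that counts GetMetadata calls.
type countingSig struct {
	md    metadata
	calls int
}

func (s *countingSig) GetMetadata() (metadata, error) {
	s.calls++
	return s.md, nil
}

// engine keeps one metadata entry per loaded signature.
type engine struct {
	mu    sync.RWMutex
	cache map[signature]metadata
}

func newEngine() *engine {
	return &engine{cache: make(map[signature]metadata)}
}

// load fetches the metadata once and stores it for later dispatches.
func (e *engine) load(s signature) error {
	md, err := s.GetMetadata()
	if err != nil {
		return err
	}
	e.mu.Lock()
	e.cache[s] = md
	e.mu.Unlock()
	return nil
}

// unload removes the entry so the cache does not outlive the signature.
func (e *engine) unload(s signature) {
	e.mu.Lock()
	delete(e.cache, s)
	e.mu.Unlock()
}

// cachedMetadata is the per-event path: a read-locked map lookup.
func (e *engine) cachedMetadata(s signature) (metadata, bool) {
	e.mu.RLock()
	md, ok := e.cache[s]
	e.mu.RUnlock()
	return md, ok
}
```

Keying the map by the interface value assumes signatures are implemented by pointer types (or other comparable types); a non-comparable dynamic type would panic at insertion.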
Pre-fetch signature metadata once at the start of signatureStart goroutine instead of calling GetMetadata() on every error. Move wg.Done() to defer for cleaner resource management.
Increase LRU cache size by 10x to improve cache hit rates for file hashing. This reduces the frequency of expensive SHA256 computations by keeping more file hashes in memory.
…tention
Replace the shared hashBuffer protected by bufferMutex with a sync.Pool of buffers. This eliminates the serialization bottleneck where all hash computations had to wait for the single shared buffer.
Key improvements:
- Removes mutex contention from hash computations
- Allows parallel hashing of multiple files
- Maintains 32KB buffer size for optimal I/O performance
- Reduces lock contention contributing to 9.16% CPU in runtime.futex
…nt copying
This commit implements a comprehensive optimization for event derivation:
Core Changes:
- Updated DeriveFunction signature: func(trace.Event) -> func(*trace.Event)
- Modified DeriveEvent method: removed separate args parameter, now takes pointer directly
- Eliminated expensive event struct copying in deriveEvents pipeline
- Removed argument slice cloning (slices.Clone) - no longer needed
Pipeline Optimization:
- Changed: derivatives, errors := t.eventDerivations.DeriveEvent(eventCopy, argsCopy)
- To: derivatives, errors := t.eventDerivations.DeriveEvent(event)
- Eliminates: eventCopy := *event and argsCopy := slices.Clone(event.Args)
Safety Measures:
- Reordered pipeline: derivation now happens BEFORE sending event downstream to prevent race conditions
- Deep copy shared slices: buildDerivedEvent uses slices.Clone() for MatchedPolicies to prevent data corruption
- Derive functions are read-only: they only read from the original event, never modify it
- Any modifications are isolated to new derived events, preserving original event integrity
- No mutable state sharing between original and derived events
Performance Benefits:
- Eliminates struct and slice copying in hot path
- Reduces memory allocations
- Maintains data integrity and thread safety
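The pointer-based derivation contract can be illustrated with a reduced sketch. The `Event` struct here is a hypothetical three-field stand-in for `trace.Event`, and `deriveEvents` is a simplified loop rather than the PR's `DeriveEvent` method; only the `DeriveFunction` and `buildDerivedEvent` names and the `slices.Clone()` of `MatchedPolicies` come from the commit message.

```go
package main

import "slices"

// Event is a hypothetical, reduced stand-in for trace.Event.
type Event struct {
	Name            string
	Args            []string
	MatchedPolicies []string
}

// DeriveFunction receives the original event by pointer and must treat it
// as read-only; everything it produces lives in new Event values.
type DeriveFunction func(*Event) ([]Event, error)

// buildDerivedEvent clones MatchedPolicies so that later writes to the
// derived event cannot reach the original's backing array.
func buildDerivedEvent(base *Event, name string, args []string) Event {
	return Event{
		Name:            name,
		Args:            args,
		MatchedPolicies: slices.Clone(base.MatchedPolicies),
	}
}

// deriveEvents runs every derive function against the same pointer,
// without copying the event struct or its argument slice first.
func deriveEvents(ev *Event, fns []DeriveFunction) ([]Event, []error) {
	var derived []Event
	var errs []error
	for _, fn := range fns {
		out, err := fn(ev)
		if err != nil {
			errs = append(errs, err)
			continue
		}
		derived = append(derived, out...)
	}
	return derived, errs
}
```

Nothing in the type system enforces the read-only rule in this sketch; it holds only by convention, which is presumably why the commit also moves derivation ahead of the downstream send.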
- Remove argument parsing logic from signature engine stage
- Sink stage now handles all argument parsing for output formatting
- Preserve event copying for concurrency safety between engine and sink
- Keep ToProtocol optimization for performance
- Update comments to reflect new architecture
This improves separation of concerns:
- Engine stage: Pure event routing (signatures get raw arguments)
- Sink stage: Output formatting (handles parsing for display)
The event copy mechanism remains necessary to protect signature engine from concurrent modifications (argument parsing and policy updates) that happen in the sink stage.
- Add ParseArgsSlice() and ParseArgsFDsSlice() functions that operate on argument slices instead of full events for better separation of concerns
- Refactor parseArguments() to use the new slice-based parsing functions
- Update comments to accurately reflect the current implementation
- Maintain backward compatibility with existing ParseArgs() and ParseArgsFDs()
This refactoring provides a cleaner API where parsing functions work directly on argument data without requiring full event objects, improving modularity and making the code easier to test and maintain.
a671563
to
56af6fb
1. Explain what the PR does
See commits
2. Explain how to test it
3. Other comments