-
Notifications
You must be signed in to change notification settings - Fork 18
Feat/implement set filter state #131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Feat/implement set filter state #131
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Summary of ChangesHello @Eeshu-Yadav, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces the Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements support for the SetFilterState HTTP filter, which allows dynamically setting filter state objects based on request data. This is particularly useful for Istio's HBONE protocol where the connect authority needs to be extracted from CONNECT requests and propagated to upstream connections.
- Adds new
FilterStatemodule for thread-safe per-request metadata storage with read-only protection and upstream sharing modes - Implements
FormatStringEvaluatorfor Envoy-style format string substitution (e.g.,%REQ(:authority)%) - Integrates SetFilterState filter into the HTTP connection manager request processing pipeline
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| orion-lib/src/filter_state.rs | Adds FilterState module for storing dynamic per-request metadata with read-only semantics and upstream propagation support |
| orion-lib/src/format_string.rs | Implements Envoy-style format string evaluator with support for request headers, downstream addresses, and other dynamic values |
| orion-lib/src/lib.rs | Exports new FilterState and FormatStringEvaluator types |
| orion-lib/src/listeners/http_connection_manager.rs | Integrates SetFilterState filter processing before routing, adds new filter variant, and includes comprehensive test coverage |
| orion-configuration/src/config/network_filters/http_connection_manager/http_filters/set_filter_state.rs | Defines configuration structures and Envoy protobuf conversions for the SetFilterState filter |
| orion-configuration/src/config/network_filters/http_connection_manager/http_filters.rs | Registers SetFilterState as a supported HTTP filter type |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| let host = host.clone(); | ||
| ctx = ctx.with_header(":authority", host); |
Copilot
AI
Nov 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The clone operation on line 296 is unnecessary. The with_header method accepts impl Into<CompactString>, so you can pass the borrowed host string slice directly without cloning. Change to ctx = ctx.with_header(\":authority\", host.as_str()); to avoid the extra allocation.
| let host = host.clone(); | |
| ctx = ctx.with_header(":authority", host); | |
| ctx = ctx.with_header(":authority", host.as_str()); |
| &manager.format_evaluator, | ||
| ); | ||
|
|
||
| // process request, get the response and calcuate the first byte time |
Copilot
AI
Nov 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected spelling of 'calcuate' to 'calculate'.
| // process request, get the response and calcuate the first byte time | |
| // process request, get the response and calculate the first byte time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces the set_filter_state HTTP filter, a significant feature for dynamic request metadata handling, inspired by Envoy. The implementation is comprehensive, adding new configuration definitions, a runtime storage mechanism for filter state, and a format string evaluator, complete with good test coverage. My review focuses on improving robustness in concurrent scenarios, fixing a subtle bug in string handling, and increasing idiomaticity and efficiency in a few areas.
orion-lib/src/filter_state.rs
Outdated
| pub fn get_upstream_shared( | ||
| &self, | ||
| mode: SharedWithUpstream, | ||
| ) -> HashMap<CompactString, CompactString> { | ||
| self.storage | ||
| .read() | ||
| .ok() | ||
| .map(|storage| { | ||
| storage | ||
| .iter() | ||
| .filter(|(_, entry)| match mode { | ||
| SharedWithUpstream::None => false, | ||
| SharedWithUpstream::Once => { | ||
| entry.shared_with_upstream == SharedWithUpstream::Once | ||
| || entry.shared_with_upstream == SharedWithUpstream::Transitive | ||
| } | ||
| SharedWithUpstream::Transitive => { | ||
| entry.shared_with_upstream == SharedWithUpstream::Transitive | ||
| } | ||
| }) | ||
| .map(|(k, v)| (k.clone(), v.value.clone())) | ||
| .collect() | ||
| }) | ||
| .unwrap_or_default() | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's an inconsistency in handling poisoned RwLocks. While methods like get and set correctly propagate a LockPoisoned error, this method (get_upstream_shared) and others (contains_key, len, is_empty, clear) silently ignore it by using .ok() or if let Ok(...) and returning a default value.
A poisoned lock indicates that a thread panicked while holding the lock, which may leave the data in an inconsistent state. Silently ignoring this can hide serious bugs and lead to unpredictable behavior.
It's recommended to change the signature of these methods to return a Result<..., FilterStateError> to consistently propagate lock poisoning errors, allowing the caller to handle them appropriately.
pub fn get_upstream_shared(
&self,
mode: SharedWithUpstream,
) -> Result<HashMap<CompactString, CompactString>, FilterStateError> {
let storage = self
.storage
.read()
.map_err(|_| FilterStateError::LockPoisoned)?;
Ok(storage
.iter()
.filter(|(_, entry)| match mode {
SharedWithUpstream::None => false,
SharedWithUpstream::Once => {
entry.shared_with_upstream == SharedWithUpstream::Once
|| entry.shared_with_upstream == SharedWithUpstream::Transitive
}
SharedWithUpstream::Transitive => {
entry.shared_with_upstream == SharedWithUpstream::Transitive
}
})
.map(|(k, v)| (k.clone(), v.value.clone()))
.collect())
}| let factory_key = if envoy.factory_key.is_empty() { | ||
| None | ||
| } else { | ||
| Some(envoy.factory_key.into()) | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The initialization of factory_key can be made more idiomatic and concise. Instead of an if/else block, you can use (!envoy.factory_key.is_empty()).then(...). This is a common pattern in Rust for conditional Option creation.
let factory_key = (!envoy.factory_key.is_empty()).then(|| envoy.factory_key.into());| fn apply_max_length(&self, value: CompactString, max_len: Option<usize>) -> CompactString { | ||
| if let Some(max_len) = max_len { | ||
| if value.len() > max_len { | ||
| return value.chars().take(max_len).collect::<String>().into(); | ||
| } | ||
| } | ||
| value | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation for string truncation has a subtle bug and can be made more efficient.
- Correctness: The check
value.len() > max_lencompares byte length with character count, which is not reliable for strings with multi-byte characters. This can lead to incorrect truncation decisions. - Performance: Collecting characters into a new
String(.collect::<String>()) involves a new allocation, which can be avoided by slicing the original string.
A more correct and idiomatic approach is to find the byte index of the max_len-th character and slice up to that point. This avoids both the bug and the unnecessary allocation.
| fn apply_max_length(&self, value: CompactString, max_len: Option<usize>) -> CompactString { | |
| if let Some(max_len) = max_len { | |
| if value.len() > max_len { | |
| return value.chars().take(max_len).collect::<String>().into(); | |
| } | |
| } | |
| value | |
| } | |
| fn apply_max_length(&self, value: CompactString, max_len: Option<usize>) -> CompactString { | |
| if let Some(max_len) = max_len { | |
| if let Some((end_index, _)) = value.char_indices().nth(max_len) { | |
| return value[..end_index].into(); | |
| } | |
| } | |
| value | |
| } |
| for (name, value) in request.headers() { | ||
| if let Ok(value_str) = value.to_str() { | ||
| ctx = ctx.with_header(name.as_str(), value_str); | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The loop to extract headers silently ignores headers with non-UTF8 values because value.to_str() returns an Err. While many headers are ASCII/UTF8, HTTP headers can technically contain arbitrary byte sequences. Envoy's format string evaluation often operates on byte sequences.
Ignoring these headers without any notification could hide issues or lead to unexpected behavior if downstream logic relies on them. Consider logging a warning when a header cannot be converted to a string, to make this behavior more transparent during debugging.
| for (name, value) in request.headers() { | |
| if let Ok(value_str) = value.to_str() { | |
| ctx = ctx.with_header(name.as_str(), value_str); | |
| } | |
| } | |
| for (name, value) in request.headers() { | |
| match value.to_str() { | |
| Ok(value_str) => ctx = ctx.with_header(name.as_str(), value_str), | |
| Err(_) => tracing::warn!(header_name = %name.as_str(), "Skipping non-UTF8 header value in format string context"), | |
| } | |
| } |
| if ctx.get_header(":authority").is_none() { | ||
| if let Some(host) = ctx.get_header("host") { | ||
| let host = host.clone(); | ||
| ctx = ctx.with_header(":authority", host); | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic to fall back from the :authority header to the host header can be written more concisely. The current implementation with nested if let and a clone is a bit verbose.
You can make this more idiomatic by chaining Option methods.
if ctx.get_header(":authority").is_none() {
if let Some(host) = ctx.get_header("host").cloned() {
ctx = ctx.with_header(":authority", host);
}
}| ClusterLoadAssignmentBuilder, | ||
| }; | ||
| pub use filter_state::{FilterState, FilterStateError, SharedWithUpstream}; | ||
| pub use format_string::{FormatStringEvaluator, RequestContext}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this used ?
Ideally we don't want to use any regex type stuff on the hot path as it is very slow.
We also have similar logic for audit logging which doesn't rely on regex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed all regex usage (Regex, OnceLock, lazy initialization),
Implemented fast character-by-character parser that scans the template string once
dawid-nowak
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some comments.
I would like to understand better what we are trying to achieve. I think we should re-think how the state is stored/propagated.
Ideally, we wouldn't be storing globally anything per request. If downstream needs data from the request to make a decision this should be propagated with the request and not stored globally.
orion-lib/src/filter_state.rs
Outdated
| #[derive(Debug, Clone)] | ||
| pub struct FilterState { | ||
| /// Internal storage protected by RwLock | ||
| storage: Arc<RwLock<HashMap<CompactString, FilterStateEntry>>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally speaking we should be avoiding locks as much as possible.
I would like to know how is this suppose to work. If this is propagated in request then we should just send it alongside the request something like Hyper request extensions.
For example we might need something like that to propagate metadata etc.
I think hyper provides an extension mechanims to Request/Response for similar purposes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed Arc<RwLock<HashMap>> global storage ,implemented per-request storage using HTTP request.extensions() following the same pattern as ResponseFlags and EventKind
03b2874 to
eb0f17c
Compare
- Add SetFilterState config structures with Envoy conversions - Support format strings (text/structured) and upstream sharing modes - Integrate with HTTP filter chain (runtime support pending) - Add comprehensive implementation documentation Signed-off-by: Eeshu-Yadav <[email protected]>
Add FilterState module with thread-safe RwLock-based storage for dynamic per-request metadata. Supports read-only enforcement, upstream sharing modes (NONE/ONCE/TRANSITIVE), and concurrent access. Signed-off-by: Eeshu-Yadav <[email protected]>
- Add FormatStringEvaluator with regex-based command operator parser - Support 9 core operators: REQ, DOWNSTREAM_REMOTE_ADDRESS, UPSTREAM_HOST, etc. - Implement RequestContext for evaluation with headers and metadata - Add max length truncation and missing value handling (default '-') - Support case-insensitive header lookup with :authority fallback Signed-off-by: Eeshu-Yadav <[email protected]>
- Add SetFilterState variant to HttpFilterValue enum - Implement apply_set_filter_state_filters() to evaluate format strings - Build RequestContext from HTTP request (headers, method, path, addresses) - Integrate FilterState and FormatStringEvaluator into HttpConnectionManager - Hook into request pipeline before routing decisions - Store evaluated values with proper metadata (read_only, shared_with_upstream) Signed-off-by: Eeshu-Yadav <[email protected]>
eb0f17c to
71cb4b3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (2)
orion-lib/src/listeners/http_connection_manager.rs:193
- Duplicate enum variant
SetFilterStatedefined on lines 188 and 193. The variant on line 193 should be removed as it appears to be leftover from before the filter was actually implemented. Line 188 contains the actual implementation that holds the configuration.
SetFilterState(orion_configuration::config::network_filters::http_connection_manager::http_filters::set_filter_state::SetFilterState),
Ignored,
/// Istio peer metadata filter - parsed but not executed (metadata/telemetry only)
PeerMetadata,
/// Envoy set filter state - parsed but not executed (metadata only)
SetFilterState,
orion-lib/src/listeners/http_connection_manager.rs:206
- Duplicate match arm for
HttpFilterType::SetFilterState(_). This pattern is matched on both lines 202 and 206, making line 206 unreachable. Remove the match arm on line 206 which appears to be a leftover from when SetFilterState was a stub.
HttpFilterType::SetFilterState(sfs) => HttpFilterValue::SetFilterState(sfs),
HttpFilterType::Ingored => HttpFilterValue::Ignored,
// Istio-specific filters: parsed but not executed (metadata/telemetry only)
HttpFilterType::PeerMetadata(_) => HttpFilterValue::PeerMetadata,
HttpFilterType::SetFilterState(_) => HttpFilterValue::SetFilterState,
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| }, | ||
| HttpFilterValue::Ignored => FilterDecision::Continue, | ||
| // Istio-specific filters: no-op execution (metadata/telemetry only) | ||
| HttpFilterValue::PeerMetadata | HttpFilterValue::SetFilterState => FilterDecision::Continue, |
Copilot
AI
Nov 9, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicate match arm for HttpFilterValue::SetFilterState. This pattern is matched on both lines 217-221 and 224. The pattern on line 224 is unreachable and should be removed (it appears to be a leftover from when SetFilterState was a stub).
| HttpFilterValue::PeerMetadata | HttpFilterValue::SetFilterState => FilterDecision::Continue, | |
| HttpFilterValue::PeerMetadata => FilterDecision::Continue, |
| | HttpFilterValue::SetFilterState(_) | ||
| | HttpFilterValue::Ignored => FilterDecision::Continue, | ||
| // Istio-specific filters: no-op on response path | ||
| HttpFilterValue::PeerMetadata | HttpFilterValue::SetFilterState => FilterDecision::Continue, |
Copilot
AI
Nov 9, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicate match arm for HttpFilterValue::SetFilterState. This pattern is matched on both lines 232 and 235. The pattern on line 235 is unreachable and should be removed (it appears to be a leftover from when SetFilterState was a stub).
| HttpFilterValue::PeerMetadata | HttpFilterValue::SetFilterState => FilterDecision::Continue, | |
| HttpFilterValue::PeerMetadata => FilterDecision::Continue, |
- Add 3 integration tests for xDS config parsing (Istio waypoint) - Add 5 HTTP pipeline tests for filter application - Fix skip_if_empty to handle '-' default values - Add :authority and :method pseudo-headers to RequestContext Signed-off-by: Eeshu-Yadav <[email protected]>
71cb4b3 to
ad9576c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
fixes : #126