-
Notifications
You must be signed in to change notification settings - Fork 1.9k
[TST] add proptest for config & schema reconciliation #5847
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: jai/schema-prop-tests
Are you sure you want to change the base?
Conversation
Reviewer ChecklistPlease leverage this checklist to ensure your code review is thorough before approving Testing, Bugs, Errors, Logs, Documentation
System Compatibility
Quality
|
|
Warning This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite. Learn more about stacking. |
This comment has been minimized.
This comment has been minimized.
fe605a3 to
67d62e4
Compare
|
Introduce property-based tests for config⇄schema reconciliation and patch edge-case bugs This PR adds a comprehensive proptest suite that randomly generates CollectionConfiguration / CollectionSchema pairs to ensure the conversion and reconciliation logic behaves correctly across thousands of permutations. While wiring up the tests, several latent corner-case bugs were surfaced and fixed—most notably around distance function mismatches, default embedding dimensions, metadata validation, and stricter TryFrom error handling. The change strengthens the type layer of the Rust vector-store backend without altering the public API and raises overall confidence in future schema-evolution work. Key Changes• New proptest modules (strategies.rs, proptest_utils.rs) with Arbitrary implementations for CollectionConfiguration and CollectionSchema Affected Areas• rust/types/src/collection_schema.rs This summary was automatically generated by @propel-code-bot |
f1b64d9 to
95c36fd
Compare
6f9aad9 to
fa48c9c
Compare
95c36fd to
6cd92fc
Compare
| ) -> impl Strategy<Value = InternalSpannConfiguration> { | ||
| ( | ||
| ( | ||
| 1u32..=128, // search_nprobe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[BestPractice]
The validation ranges in comments like // search_nprobe (max 128) and // split_threshold (min 50, max 200) are excellent documentation, but they should match the actual validation constraints. Consider adding compile-time assertions or centralizing these constants to ensure the test ranges stay in sync with the actual validation rules.
// Ensure test ranges match validation constraints
const MAX_SEARCH_NPROBE: u32 = 128;
const MIN_SPLIT_THRESHOLD: u32 = 50;
const MAX_SPLIT_THRESHOLD: u32 = 200;
// In strategy:
1u32..=MAX_SEARCH_NPROBE, // search_nprobe
MIN_SPLIT_THRESHOLD..=MAX_SPLIT_THRESHOLD, // split_thresholdContext for Agents
[**BestPractice**]
The validation ranges in comments like `// search_nprobe (max 128)` and `// split_threshold (min 50, max 200)` are excellent documentation, but they should match the actual validation constraints. Consider adding compile-time assertions or centralizing these constants to ensure the test ranges stay in sync with the actual validation rules.
```rust
// Ensure test ranges match validation constraints
const MAX_SEARCH_NPROBE: u32 = 128;
const MIN_SPLIT_THRESHOLD: u32 = 50;
const MAX_SPLIT_THRESHOLD: u32 = 200;
// In strategy:
1u32..=MAX_SEARCH_NPROBE, // search_nprobe
MIN_SPLIT_THRESHOLD..=MAX_SPLIT_THRESHOLD, // split_threshold
```
File: rust/types/src/collection_configuration.rs
Line: 1208fa48c9c to
5c80b55
Compare
6cd92fc to
a021c46
Compare
5c80b55 to
31b9e5d
Compare
| fn partial_spann_index_config_strategy() -> BoxedStrategy<SpannIndexConfig> { | ||
| // Use internal strategy and convert, allowing None values by randomly setting some to None | ||
| internal_spann_configuration_strategy() | ||
| .prop_map(|config| { | ||
| // Randomly set some fields to None to test partial configs | ||
| let spann = SpannIndexConfig { | ||
| search_nprobe: Some(config.search_nprobe), | ||
| search_rng_factor: Some(config.search_rng_factor), | ||
| search_rng_epsilon: Some(config.search_rng_epsilon), | ||
| nreplica_count: Some(config.nreplica_count), | ||
| write_rng_factor: Some(config.write_rng_factor), | ||
| write_rng_epsilon: Some(config.write_rng_epsilon), | ||
| split_threshold: Some(config.split_threshold), | ||
| num_samples_kmeans: Some(config.num_samples_kmeans), | ||
| initial_lambda: Some(config.initial_lambda), | ||
| reassign_neighbor_count: Some(config.reassign_neighbor_count), | ||
| merge_threshold: Some(config.merge_threshold), | ||
| num_centers_to_merge_to: Some(config.num_centers_to_merge_to), | ||
| write_nprobe: Some(config.write_nprobe), | ||
| ef_construction: Some(config.ef_construction), | ||
| ef_search: Some(config.ef_search), | ||
| max_neighbors: Some(config.max_neighbors), | ||
| }; | ||
| // For property testing, we'll test with full configs - the merge logic handles None correctly | ||
| spann | ||
| }) | ||
| .boxed() | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[TestCoverage]
The partial_spann_index_config_strategy function is intended to generate partial SpannIndexConfig objects where some fields can be None. However, the current implementation always sets all fields to Some(...), which means tests like merge_spann_configs_preserves_user_overrides are not correctly testing the fallback logic for individual fields when a user-provided configuration is partial.
The implementation should be updated to use proptest::option::of for each field, similar to partial_hnsw_index_config_strategy, to ensure that partial configurations are actually generated for testing the merge logic correctly.
Context for Agents
[**TestCoverage**]
The `partial_spann_index_config_strategy` function is intended to generate partial `SpannIndexConfig` objects where some fields can be `None`. However, the current implementation always sets all fields to `Some(...)`, which means tests like `merge_spann_configs_preserves_user_overrides` are not correctly testing the fallback logic for individual fields when a user-provided configuration is partial.
The implementation should be updated to use `proptest::option::of` for each field, similar to `partial_hnsw_index_config_strategy`, to ensure that partial configurations are actually generated for testing the merge logic correctly.
File: rust/types/src/collection_schema.rs
Line: 569731b9e5d to
4913df6
Compare
|
|
||
| match (&internal_config.vector_index, &opposite_result.vector_index) { | ||
| (VectorIndexConfiguration::Hnsw(original), VectorIndexConfiguration::Spann(spann)) => { | ||
| let expected_space = original.space.clone(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[BestPractice]
The Space enum derives Copy, so calling .clone() on it is redundant. You can remove the .clone() calls for cleaner, more idiomatic code. This applies to several places in this file where .clone() is called on a value of type Space.
For example, this line:
let expected_space = original.space.clone();can be simplified to:
let expected_space = original.space;Other instances are on lines 1249, 1282, 1283, 1340, 1343, 1344, 1372, 1375, and 1376.
Context for Agents
[**BestPractice**]
The `Space` enum derives `Copy`, so calling `.clone()` on it is redundant. You can remove the `.clone()` calls for cleaner, more idiomatic code. This applies to several places in this file where `.clone()` is called on a value of type `Space`.
For example, this line:
```rust
let expected_space = original.space.clone();
```
can be simplified to:
```rust
let expected_space = original.space;
```
Other instances are on lines 1249, 1282, 1283, 1340, 1343, 1344, 1372, 1375, and 1376.
File: rust/types/src/collection_configuration.rs
Line: 1245
Description of changes
Summarize the changes made by this PR.
Test plan
How are these changes tested?
pytestfor python,yarn testfor js,cargo testfor rustMigration plan
Are there any migrations, or any forwards/backwards compatibility changes needed in order to make sure this change deploys reliably?
Observability plan
What is the plan to instrument and monitor this change?
Documentation Changes
Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs section?