-
Notifications
You must be signed in to change notification settings - Fork 1.9k
[ENH] Add validation on js client if sparse source key provided->ef must be provided #5865
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Reviewer ChecklistPlease leverage this checklist to ensure your code review is thorough before approving Testing, Bugs, Errors, Logs, Documentation
System Compatibility
Quality
|
This stack of pull requests is managed by Graphite. Learn more about stacking. |
|
Stricter JS-client schema validation for sparse vector configs + block key-wide enable/disable operations The PR tightens configuration rules inside the JS client’s Implementation touches the core schema class and updates/extends the test-suite to reflect the new rules. No server-side code changes are involved. Key Changes• Added Affected Areas• This summary was automatically generated by @propel-code-bot |
This comment has been minimized.
This comment has been minimized.
668f943 to
47d43c1
Compare
| if (!configProvided && keyProvided && key) { | ||
| this.enableAllIndexesForKey(key); | ||
| return this; | ||
| throw new Error( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[BestPractice]
The enableAllIndexesForKey() method is now unused and should be removed. It was only called from the createIndex() method at line 375, which has been replaced with an error throw. The method definition starting around line 614 is now dead code.
Consider removing:
- The
enableAllIndexesForKey()method - The
disableAllIndexesForKey()method (also unused after line 421 change)
Context for Agents
[**BestPractice**]
The `enableAllIndexesForKey()` method is now unused and should be removed. It was only called from the `createIndex()` method at line 375, which has been replaced with an error throw. The method definition starting around line 614 is now dead code.
Consider removing:
- The `enableAllIndexesForKey()` method
- The `disableAllIndexesForKey()` method (also unused after line 421 change)
File: clients/new-js/packages/chromadb/src/schema.ts
Line: 375| private validateSparseVectorConfig(config: SparseVectorIndexConfig): void { | ||
| // Validate that if source_key is provided then embedding_function is also provided | ||
| // since there is no default embedding function | ||
| if (config.sourceKey !== null && config.sourceKey !== undefined && !config.embeddingFunction) { | ||
| throw new Error( | ||
| `If sourceKey is provided then embeddingFunction must also be provided since there is no default embedding function. Config: ${JSON.stringify(config)}`, | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[BestPractice]
Missing validation in constructor: The validateSparseVectorConfig() is only called when enabling an index via setIndexForKey() (line 567), but not when a SparseVectorIndexConfig is instantiated directly. If someone creates new SparseVectorIndexConfig({ sourceKey: 'foo' }) without an embedding function, the error won't be caught until the index is enabled.
Consider validating in the SparseVectorIndexConfig constructor itself:
constructor(options: SparseVectorIndexConfigOptions = {}) {
this.embeddingFunction = options.embeddingFunction;
this.sourceKey =
options.sourceKey instanceof Key
? options.sourceKey.name
: (options.sourceKey ?? null);
this.bm25 = options.bm25 ?? null;
// Validate sourceKey requires embeddingFunction
if (this.sourceKey !== null && this.sourceKey !== undefined && !this.embeddingFunction) {
throw new Error(
`If sourceKey is provided then embeddingFunction must also be provided since there is no default embedding function.`
);
}
}This provides fail-fast behavior and catches configuration errors at construction time rather than later during index creation.
Context for Agents
[**BestPractice**]
**Missing validation in constructor**: The `validateSparseVectorConfig()` is only called when enabling an index via `setIndexForKey()` (line 567), but not when a `SparseVectorIndexConfig` is instantiated directly. If someone creates `new SparseVectorIndexConfig({ sourceKey: 'foo' })` without an embedding function, the error won't be caught until the index is enabled.
Consider validating in the `SparseVectorIndexConfig` constructor itself:
```typescript
constructor(options: SparseVectorIndexConfigOptions = {}) {
this.embeddingFunction = options.embeddingFunction;
this.sourceKey =
options.sourceKey instanceof Key
? options.sourceKey.name
: (options.sourceKey ?? null);
this.bm25 = options.bm25 ?? null;
// Validate sourceKey requires embeddingFunction
if (this.sourceKey !== null && this.sourceKey !== undefined && !this.embeddingFunction) {
throw new Error(
`If sourceKey is provided then embeddingFunction must also be provided since there is no default embedding function.`
);
}
}
```
This provides fail-fast behavior and catches configuration errors at construction time rather than later during index creation.
File: clients/new-js/packages/chromadb/src/schema.ts
Line: 680
Description of changes
Summarize the changes made by this PR.
Test plan
How are these changes tested?
pytestfor python,yarn testfor js,cargo testfor rustMigration plan
Are there any migrations, or any forwards/backwards compatibility changes needed in order to make sure this change deploys reliably?
Observability plan
What is the plan to instrument and monitor this change?
Documentation Changes
Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs section?