@jairad26
Contributor

@jairad26 jairad26 commented Nov 14, 2025

Description of changes

Summarize the changes made by this PR.

  • Improvements & Bug fixes
    • The JS client did not validate that a sparse index config with a source key also provides an embedding function (ef). This PR adds that validation.
    • It also disallows creating or deleting all indexes on a key in a single call.
  • New functionality
    • ...

Test plan

How are these changes tested?

  • Tests pass locally with pytest for python, yarn test for js, cargo test for rust

Migration plan

Are there any migrations, or any forwards/backwards compatibility changes needed in order to make sure this change deploys reliably?

Observability plan

What is the plan to instrument and monitor this change?

Documentation Changes

Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs section?

@github-actions

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of an unexpectedly high quality (Readability, Modularity, Intuitiveness)?

Contributor Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

@propel-code-bot
Contributor

propel-code-bot bot commented Nov 14, 2025

Stricter JS-client schema validation for sparse vector configs + block key-wide enable/disable operations

The PR tightens configuration rules inside the JS client’s Schema implementation. A new validation guarantees that a SparseVectorIndexConfig providing a sourceKey must also specify an embeddingFunction, preventing silent mis-configurations. In addition, the generic shortcuts that previously enabled or disabled all index types for an arbitrary key have been removed—createIndex(undefined, key) and deleteIndex(undefined, key) now throw.

Implementation touches the core schema class and updates/extends the test-suite to reflect the new rules. No server-side code changes are involved.
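
As a rough illustration of the second rule, the sketch below models the blocked key-wide shortcut with a toy class. `MiniSchema`, `IndexConfig`, `isEnabled`, `throwsError`, and the error messages are hypothetical stand-ins invented for this example, not the client's actual `Schema` API; only the call shapes `createIndex(undefined, key)` and `deleteIndex(undefined, key)` come from the summary above.

```typescript
// Hypothetical, simplified stand-in for the client's Schema class.
// Only the "no config + key => throw" rule mirrors the PR summary.
type IndexConfig = { kind: string };

class MiniSchema {
  // Maps a key to the set of index kinds enabled for it.
  private readonly enabled = new Map<string, Set<string>>();

  createIndex(config?: IndexConfig, key?: string): this {
    if (config === undefined && key !== undefined) {
      // This path used to enable every index type for the key.
      throw new Error(
        `Cannot enable all index types for key '${key}'; pass an explicit config.`,
      );
    }
    if (config !== undefined && key !== undefined) {
      const kinds = this.enabled.get(key) ?? new Set<string>();
      kinds.add(config.kind);
      this.enabled.set(key, kinds);
    }
    return this; // other argument combinations are ignored in this sketch
  }

  deleteIndex(config?: IndexConfig, key?: string): this {
    if (config === undefined && key !== undefined) {
      // This path used to disable every index type for the key.
      throw new Error(
        `Cannot disable all index types for key '${key}'; pass an explicit config.`,
      );
    }
    if (config !== undefined && key !== undefined) {
      this.enabled.get(key)?.delete(config.kind);
    }
    return this;
  }

  isEnabled(key: string, kind: string): boolean {
    return this.enabled.get(key)?.has(kind) ?? false;
  }
}

// Small helper: reports whether a callback throws.
function throwsError(fn: () => unknown): boolean {
  try {
    fn();
    return false;
  } catch {
    return true;
  }
}

const schema = new MiniSchema();
schema.createIndex({ kind: "string_inverted" }, "title"); // explicit config: allowed
console.log(schema.isEnabled("title", "string_inverted"));
console.log(throwsError(() => schema.createIndex(undefined, "title")));
console.log(throwsError(() => schema.deleteIndex(undefined, "title")));
```

Callers that relied on the shortcut would presumably need to enable or disable each index type with its own explicit config.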

Key Changes

• Added validateSparseVectorConfig in clients/new-js/packages/chromadb/src/schema.ts to throw if sourceKey is set but embeddingFunction is missing.
• Hooked the new validation into createIndex and setIndexForKey paths and called it alongside existing single-sparse-index check.
• Removed auto enableAllIndexesForKey / disableAllIndexesForKey execution: these code paths now raise explicit errors, enforced in both createIndex and deleteIndex.
• Initialization now marks default per-key sparse vector index as disabled (enabled=false) instead of auto-enabled.
• Updated unit tests in clients/new-js/packages/chromadb/test/schema.test.ts to: (a) pass a mock sparse embedding function wherever a sourceKey is supplied, (b) assert that attempting to enable/disable all indexes for a key throws.
• Minor housekeeping: added { } braces in several constructors, clarified comments, fixed cloneObject indentation.

Affected Areas

clients/new-js/packages/chromadb/src/schema.ts core schema logic
clients/new-js/packages/chromadb/test/schema.test.ts extensive test suite

This summary was automatically generated by @propel-code-bot

@jairad26 jairad26 requested a review from itaismith November 14, 2025 19:08
@jairad26 jairad26 force-pushed the jai/validate-sparse-config-js branch from 668f943 to 47d43c1 Compare November 14, 2025 19:21
Comment on lines 374 to +375
  if (!configProvided && keyProvided && key) {
-   this.enableAllIndexesForKey(key);
-   return this;
+   throw new Error(
Contributor

[BestPractice]

The enableAllIndexesForKey() method is now unused and should be removed. It was only called from the createIndex() method at line 375, which has been replaced with an error throw. The method definition starting around line 614 is now dead code.

Consider removing:

  • The enableAllIndexesForKey() method
  • The disableAllIndexesForKey() method (also unused after line 421 change)

File: clients/new-js/packages/chromadb/src/schema.ts
Line: 375

Comment on lines +674 to +680
private validateSparseVectorConfig(config: SparseVectorIndexConfig): void {
  // Validate that if source_key is provided then embedding_function is also provided
  // since there is no default embedding function
  if (config.sourceKey !== null && config.sourceKey !== undefined && !config.embeddingFunction) {
    throw new Error(
      `If sourceKey is provided then embeddingFunction must also be provided since there is no default embedding function. Config: ${JSON.stringify(config)}`,
    );
Contributor

[BestPractice]

Missing validation in constructor: The validateSparseVectorConfig() is only called when enabling an index via setIndexForKey() (line 567), but not when a SparseVectorIndexConfig is instantiated directly. If someone creates new SparseVectorIndexConfig({ sourceKey: 'foo' }) without an embedding function, the error won't be caught until the index is enabled.

Consider validating in the SparseVectorIndexConfig constructor itself:

constructor(options: SparseVectorIndexConfigOptions = {}) {
  this.embeddingFunction = options.embeddingFunction;
  this.sourceKey =
    options.sourceKey instanceof Key
      ? options.sourceKey.name
      : (options.sourceKey ?? null);
  this.bm25 = options.bm25 ?? null;
  
  // Validate sourceKey requires embeddingFunction
  if (this.sourceKey !== null && this.sourceKey !== undefined && !this.embeddingFunction) {
    throw new Error(
      `If sourceKey is provided then embeddingFunction must also be provided since there is no default embedding function.`
    );
  }
}

This provides fail-fast behavior and catches configuration errors at construction time rather than later during index creation.
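
To make the timing difference concrete, here is a minimal sketch comparing the two placements of the check. All names in it (`DeferredConfig`, `EagerConfig`, `enableIndex`, `assertSourceKeyHasEmbedding`, `failsWith`, `mockEmbed`) are hypothetical and heavily simplified; they are not the real `SparseVectorIndexConfig` or `Schema` code, and the mock embedding function returns placeholder arrays rather than real sparse vectors.

```typescript
// Hypothetical, simplified stand-ins; not the real chromadb classes.
type EmbedFn = (texts: string[]) => number[][];

interface SparseOptions {
  sourceKey?: string | null;
  embeddingFunction?: EmbedFn;
}

// Shared rule: a sourceKey needs an embedding function.
function assertSourceKeyHasEmbedding(
  sourceKey: string | null,
  fn: EmbedFn | undefined,
): void {
  if (sourceKey !== null && !fn) {
    throw new Error("sourceKey requires embeddingFunction");
  }
}

// Models the PR as described: the constructor does not validate.
class DeferredConfig {
  readonly sourceKey: string | null;
  readonly embeddingFunction: EmbedFn | undefined;

  constructor(options: SparseOptions = {}) {
    this.sourceKey = options.sourceKey ?? null;
    this.embeddingFunction = options.embeddingFunction;
  }
}

// Models the suggestion: the constructor validates immediately.
class EagerConfig extends DeferredConfig {
  constructor(options: SparseOptions = {}) {
    super(options);
    assertSourceKeyHasEmbedding(this.sourceKey, this.embeddingFunction);
  }
}

// Stand-in for the step that enables an index and runs the deferred check.
function enableIndex(config: DeferredConfig): void {
  assertSourceKeyHasEmbedding(config.sourceKey, config.embeddingFunction);
}

// Returns the error message if the callback throws, otherwise null.
function failsWith(fn: () => unknown): string | null {
  try {
    fn();
    return null;
  } catch (e) {
    return e instanceof Error ? e.message : String(e);
  }
}

// Placeholder embedding function, for illustration only.
const mockEmbed: EmbedFn = (texts) => texts.map(() => [0]);

// Deferred: construction succeeds, the error surfaces only when enabling.
const late = new DeferredConfig({ sourceKey: "body" });
console.log(failsWith(() => enableIndex(late)));

// Eager: the same mistake is reported at construction time.
console.log(failsWith(() => new EagerConfig({ sourceKey: "body" })));

// A complete config passes in both models.
console.log(
  failsWith(() => new EagerConfig({ sourceKey: "body", embeddingFunction: mockEmbed })),
);
```

One trade-off to weigh: constructor validation also rejects configs that are built but never enabled, which may or may not be desirable for callers that assemble configs incrementally.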

File: clients/new-js/packages/chromadb/src/schema.ts
Line: 680

2 participants