Skip to content

[CHORE] Build fixes for new js client #4752

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Jun 6, 2025
Merged

[CHORE] Build fixes for new js client #4752

merged 10 commits into from
Jun 6, 2025

Conversation

itaismith
Copy link
Contributor

@itaismith itaismith commented Jun 4, 2025

Description of changes

Add EmbeddingFunctionSpace to DefaultEmbedFunction, and type for getCollections

Copy link
Contributor

propel-code-bot bot commented Jun 4, 2025

Comprehensive Build Fixes and Typing Cleanups for Chroma New JS Client

This major PR refactors the Chroma JS client, removing legacy code paths, streamlining collection and client APIs, improving interoperability with the updated backend API, and harmonizing type signatures throughout the codebase. It introduces architectural cleanups such as consolidating collection interfaces, simplifying row-format query/results methods, enhancing type safety for metadata (including nullability), provides unified container/server start logic, and applies post-generation fixes for OpenAPI client types. Documentation and tests were also updated to reflect the revised public APIs.

Key Changes:
• Removed 'getCollectionById' and related legacy APIs; 'getCollection(s)' now always issue API calls and provide full collection objects.
• Refined types for metadata, allowing null values and aligning with server-side representations.
• Standardized collection and client interfaces; unified and simplified row-format methods for get/query results.
• Refactored server/container startup scripts to support both ESM and CommonJS test environments, and extracted shared logic.
• Improved API client code generation-auto-patching the HashMap type to ensure correct null handling and deduplication.
• Reworked main client construction to better handle deprecated connection arguments, auto-resolve cloud/tenant/database context, and clarified documentation.
• Upgraded package versions in all affected packages to reflect breaking and non-breaking changes.
• Documentation and test updates in sync with the removal of the thin collection API and the revised row methods.

Affected Areas:
• clients/new-js/packages/chromadb/src/collection.ts
• chroma-client.ts
• types.ts
• api/types.gen.ts
• scripts/start-chroma.ts
• scripts/start-chroma-common.ts
• test code & utils
• docs
• package metadata (package.json)
• ai-embeddings/default-embed

Potential Impact:

Functionality: Existing apps relying on 'getCollectionById' or 'getCollectionsById' must migrate to the full-API client approach. Stronger type safety and up-to-date metadata type alignment help prevent subtle runtime errors.

Performance: Some calls (such as 'getCollections') now always perform an API roundtrip and may incur more API traffic versus direct/assumed-thin calls.

Security: API connection argument handling is made stricter; legacy methods that may inadvertently leak auth or allow incorrect usage are now removed.

Scalability: Unified code paths and centralized startup logic reduce code duplication and improve maintainability for future features or environments.

Review Focus:
• Removal of legacy/thin collection APIs: double-check for edge case regression or incomplete migration.
• Row-format conversion methods in GetResult/QueryResult for correct shape and compatibility.
• Correctness of metadata typing, especially in presence of nulls.
• Backwards compatibility in public methods and error handling for deprecated arguments.
• Automated codegen/script robustness for post-processing OpenAPI artifacts.

Testing Needed

• Verify collection retrieval, addition, update, and fork operations for correct type, metadata, and config propagation in all supported environments.
• Test container/server startup via all (esm/cjs/jest) scripts.
• Run suite with metadata fields containing null values, and check row-format methods on various result types.
• Ensure that OpenAPI types are properly fixed in CI (e.g., HashMap type in api/types.gen.ts).

Code Quality Assessment

clients/new-js/packages/chromadb/scripts/start-chroma.ts: Now thin wrappers for shared ESM/CJS-aware startup logic; more DRY.

clients/new-js/packages/chromadb/src/types.ts: More precise, aligns well with backend, null-handling clarified.

clients/new-js/packages/chromadb/src/api/types.gen.ts: Patch applied to correct post-generation issues; should reduce downstream type bugs.

clients/new-js/packages/chromadb/test/: Tests modified to validate new approach; coverage remains high.

clients/new-js/packages/chromadb/src/collection.ts: Refactored and unified, more maintainable; significant duplicate logic removed.

Best Practices

Testing:
• Refactored startup logic allows for easier multi-environment support in tests
• Updated tests reflect new public API

Types:
• Comprehensive type-safety for collection metadata, strong null handling
• Elimination of legacy/duplicate interfaces

Doc/Api:
• Docs and public interface descriptions kept current with code
• Breaking changes documented in user-facing and testing code

Upgrade:
• All affected packages have their versions bumped to signal breaking and non-breaking changes

Potential Issues

• Potential for breakage in external code relying on 'getCollectionById' or similar thin APIs.
• Automated type fix for HashMap could mask future upstream OpenAPI schema drifts-should be tested with future releases.
• Edge cases if any test scripts or user code indirectly depend on non-standard Node.js ESM/CJS semantics.
• Migration friction for users expecting full control of embeddingFunction injection on partial collection representations.

This summary was automatically generated by @propel-code-bot

Copy link

github-actions bot commented Jun 4, 2025

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of a unexpectedly high quality (Readability, Modularity, Intuitiveness)

@itaismith itaismith requested a review from philipithomas June 4, 2025 23:09
@itaismith itaismith requested a review from philipithomas June 5, 2025 23:37
@itaismith itaismith merged commit 4c5f9cb into main Jun 6, 2025
71 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants