@jairad26 jairad26 commented Nov 14, 2025

Description of changes

Summarize the changes made by this PR.

Test plan

How are these changes tested?

  • Tests pass locally with pytest for Python, yarn test for JS, cargo test for Rust

Migration plan

Are there any migrations, or any forwards/backwards compatibility changes needed in order to make sure this change deploys reliably?

Observability plan

What is the plan to instrument and monitor this change?

Documentation Changes

Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs section?

@jairad26 jairad26 marked this pull request as ready for review November 14, 2025 18:27

@github-actions

Reviewer Checklist

Please use this checklist to make sure your code review is thorough before approving.

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use cases in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of an unexpectedly high quality (Readability, Modularity, Intuitiveness)?

propel-code-bot bot commented Nov 14, 2025

Add documentation pages for eight previously undocumented embedding functions

This PR introduces eight new markdown files under docs/docs.trychroma.com/markdoc/content/integrations/embedding-models describing how to use additional embedding functions in both Python and TypeScript. Each page includes installation instructions, example code snippets, configurable parameters, and "tip" banners pointing to upstream resources. No application code is touched; the change set is purely additive documentation (409 LOC added, 0 removed).

Key Changes

• Added amazon-bedrock.md, chroma-bm25.md, chroma-cloud-qwen.md, chroma-cloud-splade.md, nomic.md, open-clip.md, sentence-transformer.md, and text2vec.md
• Each page follows the Markdoc front-matter format with id and name, code examples inside {% Tabs %} for Python and/or TypeScript, and parameter explanations
• Introduces consistent usage examples, default parameter values, dependency installation commands, and cross-links to external documentation

Affected Areas

• Documentation site content (docs.trychroma.com) under integrations/embedding-models

This summary was automatically generated by @propel-code-bot

@jairad26 jairad26 changed the title from "[DOCS] Add docs for missing embedding functions in python and typescript" to "[DOC] Add docs for missing embedding functions in python and typescript" Nov 14, 2025

```python
embeddings = bedrock_ef(texts)
```

You can pass in an optional `model_name` argument, which lets you choose which Amazon Bedrock embedding model to use. By default, Chroma uses `amazon.titan-embed-text-v1`.

[Documentation]

In several of the new documentation files, the example code explicitly sets a parameter to its default value, while the following text describes it as optional. This could be slightly confusing for users, as they might think the parameter is required. To improve clarity, consider rephrasing the explanation to acknowledge that the example shows the default being set explicitly, or remove the parameter from the example to demonstrate that it's optional.

For example, you could change this line to something like:

> The `model_name` argument is optional and defaults to `"amazon.titan-embed-text-v1"`. The example above shows how to set it explicitly, but it can be omitted to use the default.

This pattern also appears in:

  • docs/docs.trychroma.com/markdoc/content/integrations/embedding-models/chroma-cloud-splade.md (line 31)
  • docs/docs.trychroma.com/markdoc/content/integrations/embedding-models/text2vec.md (line 27)

File: docs/docs.trychroma.com/markdoc/content/integrations/embedding-models/amazon-bedrock.md
Line: 30
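
The point about optional parameters in the comment above can be illustrated with a minimal Python sketch. `SketchEmbeddingFunction` is a hypothetical stand-in written for this illustration, not Chroma's Amazon Bedrock embedding function. Only the `model_name` argument and its default value `amazon.titan-embed-text-v1` come from the reviewed text.

```python
from dataclasses import dataclass

# Default value quoted in the documentation line under review.
DEFAULT_MODEL_NAME = "amazon.titan-embed-text-v1"


@dataclass
class SketchEmbeddingFunction:
    """Hypothetical stand-in for an embedding function; NOT the Chroma class."""

    # Optional argument: falls back to the default when the caller omits it.
    model_name: str = DEFAULT_MODEL_NAME


# Omitting model_name and passing the default explicitly select the same model.
implicit_ef = SketchEmbeddingFunction()
explicit_ef = SketchEmbeddingFunction(model_name="amazon.titan-embed-text-v1")
```

Because both objects end up with the same configuration, a documentation example can leave the argument out and mention the default in the prose, or keep it and say that it is shown only for illustration.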

Comment on lines +42 to +60
// npm install @chroma-core/chroma-bm25

import { ChromaBm25EmbeddingFunction } from "@chroma-core/chroma-bm25";

const embedder = new ChromaBm25EmbeddingFunction({
  k: 1.2,
  b: 0.75,
  avgDocLength: 256.0,
  tokenMaxLength: 40,
});

// use directly
const sparseEmbeddings = await embedder.generate(["document1", "document2"]);

// pass documents to query for .add and .query
const collection = await client.createCollection({
  name: "name",
  embeddingFunction: embedder,
});

[Documentation]

The TypeScript code snippet uses a client variable without defining it, which can be confusing for users. To make the example self-contained and runnable, it's best to include the client initialization. Additionally, using a more descriptive collection name like "my_collection" instead of "name" would make the example clearer.

File: docs/docs.trychroma.com/markdoc/content/integrations/embedding-models/chroma-bm25.md
Line: 60


{% Tab label="python" %}

This embedding function relies on the `boto3` python package, which you can install with `pip install boto3`.

[Documentation]

Fix capitalization: 'python' should be 'Python'.

File: docs/docs.trychroma.com/markdoc/content/integrations/embedding-models/amazon-bedrock.md
Line: 14


{% Tab label="python" %}

This embedding function relies on the `httpx` python package, which you can install with `pip install httpx`.

[Documentation]

Fix capitalization: 'python' should be 'Python'.

File: docs/docs.trychroma.com/markdoc/content/integrations/embedding-models/chroma-cloud-qwen.md
Line: 14


{% Tab label="python" %}

This embedding function relies on the `httpx` python package, which you can install with `pip install httpx`.

[Documentation]

Fix capitalization: 'python' should be 'Python'.

File: docs/docs.trychroma.com/markdoc/content/integrations/embedding-models/chroma-cloud-splade.md
Line: 16


{% Tab label="python" %}

This embedding function relies on the `nomic` python package, which you can install with `pip install nomic`.

[Documentation]

Fix capitalization: 'python' should be 'Python'.

File: docs/docs.trychroma.com/markdoc/content/integrations/embedding-models/nomic.md
Line: 14


{% Tab label="python" %}

This embedding function relies on several python packages:

[Documentation]

Fix capitalization: 'python' should be 'Python'.

File: docs/docs.trychroma.com/markdoc/content/integrations/embedding-models/open-clip.md
Line: 14


{% Tab label="python" %}

This embedding function relies on the `sentence_transformers` python package, which you can install with `pip install sentence_transformers`.

[Documentation]

Fix capitalization: 'python' should be 'Python'.

File: docs/docs.trychroma.com/markdoc/content/integrations/embedding-models/sentence-transformer.md
Line: 14


{% Tab label="python" %}

This embedding function relies on the `text2vec` python package, which you can install with `pip install text2vec`.

[Documentation]

Fix capitalization: 'python' should be 'Python'.

File: docs/docs.trychroma.com/markdoc/content/integrations/embedding-models/text2vec.md
Line: 14
