[Bug]: Memory is not freed when using PersistentClient #5843

@dani29

Description

What happened?

We are running ChromaDB with PersistentClient and an index we've pre-built.

Due to the nature of our application, we create an instance per request: we download the tar.gz of the pre-built index and instantiate a PersistentClient over the extracted data. However, we observe that when the request completes, the memory is not freed. The leak seems to occur in native memory rather than in the Python heap, and as consumers of the library we cannot mitigate it ourselves.

I asked Claude Code to provide the reproduction steps and analysis, and they are attached below.

=======

ChromaDB Memory Leak Report

Issue Summary

ChromaDB 1.3.0 has a severe memory leak when using PersistentClient with temporary directories. Each unique persist_directory creates a new System singleton that caches HNSW indexes indefinitely in native C++ memory, with no API to release them. This causes unbounded memory growth in applications that create multiple short-lived PersistentClient instances.

Environment

  • ChromaDB Version: 1.3.0 (from uv.lock)
  • Python Version: 3.11
  • Platform: macOS (Darwin 25.0.0)
  • Usage Pattern: Creating PersistentClient with unique temp directories (e.g., /tmp/chroma_XXXXX)

Reproduction Scenario

import chromadb
import tempfile
import psutil
import os

process = psutil.Process(os.getpid())

for i in range(4):
    # Create unique temp directory for each iteration
    temp_dir = tempfile.mkdtemp(prefix="chroma_")

    # Create PersistentClient (in the real workload, temp_dir holds a
    # pre-built index with ~16,500 embeddings of 1536 dimensions)
    client = chromadb.PersistentClient(path=temp_dir)
    collection = client.get_or_create_collection("my_collection")

    # Use the collection...
    results = collection.query(query_embeddings=[[0.1] * 1536], n_results=10)

    # Client goes out of scope, but memory is NOT freed
    del client
    del collection

    print(f"RSS after iteration {i+1}: {process.memory_info().rss / 1024**2:.2f} MB")

# Expected: Memory should stabilize or decrease
# Actual: Memory grows ~150-200MB per iteration and NEVER decreases

Observed Memory Growth:

  • Iteration 1: 303 MB → 487 MB (+184 MB)
  • Iteration 2: 487 MB → 645 MB (+158 MB)
  • Iteration 3: 645 MB → 803 MB (+158 MB)
  • Iteration 4: 803 MB → 961 MB (+158 MB)

Total growth: ~658 MB across 4 iterations (303 MB → 961 MB), none of which is freed until process termination.

Root Cause Analysis

1. System Singleton Cache Never Evicts

File: chromadb/api/shared_system_client.py
Line: 11

class SharedSystemClient:
    _identifier_to_system: ClassVar[Dict[str, System]] = {}

Problem: This class variable caches System instances by persist_directory (line 56) and never evicts them. Each unique temp directory creates a new System that lives forever.

2. LocalSegmentManager Holds HNSW Indexes Indefinitely

File: chromadb/segment/impl/manager/local.py
Lines: 54, 68, 246-251

class LocalSegmentManager(SegmentManager):
    _instances: Dict[UUID, SegmentImplementation]  # Line 54

    def __init__(self, system: System):
        # ...
        self._instances = {}  # Line 68

    def _instance(self, segment: Segment) -> SegmentImplementation:
        if segment["id"] not in self._instances:
            cls = self._cls(segment)
            instance = cls(self._system, segment)
            instance.start()
            self._instances[segment["id"]] = instance  # Stored forever
        return self._instances[segment["id"]]

Problem: HNSW segment instances (containing C++ hnswlib.Index objects) are stored in _instances dict and only removed on explicit collection deletion or system reset. They are never garbage collected when the client goes out of scope.
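The effect of such a module-level cache is easy to demonstrate in isolation: `del client` cannot free anything that a long-lived dict still references. A minimal sketch with stand-in classes (not chromadb code):

```python
import gc
import weakref

class FakeSegment:
    """Stand-in for an HNSW segment instance that would hold native memory."""
    pass

_instances = {}                  # analogous to LocalSegmentManager._instances

seg = FakeSegment()
probe = weakref.ref(seg)         # lets us observe whether the object is freed
_instances["segment-id"] = seg   # cached, as _instance() does

del seg                          # the user-side reference is gone...
gc.collect()
alive_after_del = probe() is not None    # True: the cache still pins the object

_instances.clear()               # only explicit eviction releases it
gc.collect()
alive_after_clear = probe() is not None  # False: finally collectable
```

This is exactly the shape of the leak: the user can drop every reference they hold, but the cached entry keeps the segment (and its native index) alive.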

3. BasicCache Has No Eviction Policy

File: chromadb/segment/impl/manager/local.py
Lines: 69-82

self.segment_cache: Dict[SegmentScope, SegmentCache] = {
    SegmentScope.METADATA: BasicCache()
}
if (
    system.settings.chroma_segment_cache_policy == "LRU"
    and system.settings.chroma_memory_limit_bytes > 0
):
    self.segment_cache[SegmentScope.VECTOR] = SegmentLRUCache(...)
else:
    self.segment_cache[SegmentScope.VECTOR] = BasicCache()  # Default: unbounded

Problem: By default, BasicCache is used for VECTOR segments (HNSW indexes). This cache never evicts, accumulating segments indefinitely.
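For contrast, a size-bounded cache evicts least-recently-used entries once a byte budget is exceeded. A minimal sketch of that policy (illustrating the idea, not chromadb's actual SegmentLRUCache):

```python
from collections import OrderedDict

class LRUByteCache:
    """Evicts least-recently-used entries once total size exceeds a byte budget."""

    def __init__(self, limit_bytes):
        self.limit = limit_bytes
        self._data = OrderedDict()   # key -> (value, size), oldest first
        self._total = 0

    def set(self, key, value, size):
        if key in self._data:
            self._total -= self._data.pop(key)[1]
        self._data[key] = (value, size)
        self._total += size
        while self._total > self.limit and len(self._data) > 1:
            _, (_, evicted_size) = self._data.popitem(last=False)  # drop oldest
            self._total -= evicted_size

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key][0]

cache = LRUByteCache(limit_bytes=300)
cache.set("a", "idx_a", 150)
cache.set("b", "idx_b", 150)
cache.set("c", "idx_c", 150)   # total hits 450 -> "a" is evicted
```

With BasicCache there is no equivalent of the eviction loop, so every loaded vector segment stays resident.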

4. Native C++ Memory Cannot Be Freed from Python

File: chromadb/segment/impl/vector/local_hnsw.py
Lines: 45, 208-219

class LocalHnswSegment(VectorReader):
    _index: Optional[hnswlib.Index]  # Line 45

    def _init_index(self, dimensionality: int) -> None:
        index = hnswlib.Index(space=self._params.space, dim=dimensionality)
        index.init_index(
            max_elements=DEFAULT_CAPACITY,
            ef_construction=self._params.construction_ef,
            M=self._params.M,
        )
        # ...
        self._index = index  # C++ object stored as instance variable

Problem: hnswlib.Index allocates native C++ memory that Python's garbage collector cannot track or reclaim. Even when Python objects are deleted, the C++ HNSW index memory remains allocated.
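This also explains why Python-side profilers look clean: tracemalloc only traces allocations made through Python's allocator, so memory obtained directly from the C runtime is invisible to it while still counting toward RSS. A small demonstration using a raw `malloc` via ctypes as a stand-in for hnswlib's allocations (POSIX-style, assumes `CDLL(None)` can resolve libc symbols):

```python
import ctypes
import tracemalloc

libc = ctypes.CDLL(None)                 # resolve malloc/memset/free from the C runtime
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.memset.restype = ctypes.c_void_p
libc.memset.argtypes = [ctypes.c_void_p, ctypes.c_int, ctypes.c_size_t]
libc.free.argtypes = [ctypes.c_void_p]

tracemalloc.start()
SIZE = 50 * 1024 * 1024
p = libc.malloc(SIZE)                    # 50 MB of native memory
libc.memset(p, 0, SIZE)                  # touch the pages so RSS really grows
traced, _ = tracemalloc.get_traced_memory()
# traced is tiny (far below 50 MB): tracemalloc never saw the native allocation
libc.free(p)
```

The same asymmetry shows up in the diagnostics below: near-zero Python-heap growth alongside large RSS growth.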

Memory Leak Chain

  1. Create PersistentClient with unique temp path /tmp/chroma_abc123
  2. System singleton created and cached in SharedSystemClient._identifier_to_system["/tmp/chroma_abc123"] (shared_system_client.py:25-27)
  3. LocalSegmentManager created as part of System (local.py:62)
  4. HNSW index loaded into LocalSegmentManager._instances[segment_id] (local.py:250)
  5. HNSW index also cached in segment_cache[VECTOR] BasicCache (local.py:214)
  6. PersistentClient deleted by user code
  7. System singleton REMAINS in class variable dict ❌
  8. LocalSegmentManager REMAINS as part of System ❌
  9. HNSW index REMAINS in _instances dict ❌
  10. Native C++ memory NEVER freed
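Until an explicit cleanup API exists, the only reliable consumer-side mitigation we found is process isolation: do the per-request ChromaDB work in a short-lived child process, so the OS reclaims all native memory when it exits. A sketch (`serve_query` is a hypothetical stand-in for the real PersistentClient work; "fork" is used so the sketch needs no `__main__` guard, and is POSIX-only):

```python
import multiprocessing as mp

def serve_query(path, query_embedding):
    # Hypothetical stand-in: the real version would create
    # chromadb.PersistentClient(path=path), run the query, and return
    # only plain picklable data. Every native allocation made here
    # (HNSW, SQLite, NumPy) dies with the child process.
    return {"path": path, "n_results": 10}

def handle_request(path, query_embedding):
    # One short-lived child per request; its exit returns memory to the OS.
    ctx = mp.get_context("fork")
    with ctx.Pool(processes=1) as pool:
        return pool.apply(serve_query, (path, query_embedding))
```

This trades process startup and index-load latency for bounded memory, so it is a workaround rather than a fix.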

Memory Composition Per Instance

Diagnostic using tracemalloc + psutil shows:

Component                           Size          Type
HNSW Index (hnswlib)                ~100-120 MB   Native C++
NumPy arrays                        ~10-20 MB     Native C (NumPy)
SQLite metadata                     ~5 MB         Native C (SQLite)
httpx client (embedding function)   ~5-10 MB      Native C++
ChromaDB metadata                   ~5-10 MB      Python/C
Total per unique persist_directory  ~125-165 MB

Python heap growth: Near zero
RSS (Resident Set Size) growth: ~158 MB per iteration that NEVER decreases

Proposed Fixes

Fix 1: Add Explicit Cleanup API (Recommended)

Add a .close() or .cleanup() method to PersistentClient:

# In chromadb/api/client.py
class Client(SharedSystemClient, ClientAPI):
    def close(self) -> None:
        """Release resources held by this client."""
        # Stop and delete segment instances
        if hasattr(self, '_server'):
            segment_manager = self._server._manager
            if isinstance(segment_manager, LocalSegmentManager):
                for instance in list(segment_manager._instances.values()):
                    instance.stop()
                segment_manager._instances.clear()
                segment_manager.segment_cache[SegmentScope.VECTOR].reset()
                segment_manager.segment_cache[SegmentScope.METADATA].reset()

        # Remove System singleton if this is the last client
        if self._identifier in SharedSystemClient._identifier_to_system:
            system = SharedSystemClient._identifier_to_system[self._identifier]
            system.stop()
            del SharedSystemClient._identifier_to_system[self._identifier]

Fix 2: Make System Cache Weak References

File: chromadb/api/shared_system_client.py

import weakref
from typing import ClassVar, Dict

class SharedSystemClient:
    _identifier_to_system: ClassVar[Dict[str, weakref.ref[System]]] = {}

    @classmethod
    def _create_system_if_not_exists(cls, identifier: str, settings: Settings) -> System:
        if identifier not in cls._identifier_to_system or cls._identifier_to_system[identifier]() is None:
            new_system = System(settings)
            cls._identifier_to_system[identifier] = weakref.ref(new_system)
            # ...

This allows System instances to be garbage collected when no clients reference them.
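An equivalent, slightly simpler variant (again a sketch, not chromadb's code) uses `weakref.WeakValueDictionary`, which drops entries automatically when the last strong reference dies:

```python
import gc
import weakref

class System:
    """Stand-in for chromadb's System; holds only settings here."""
    def __init__(self, settings):
        self.settings = settings

_identifier_to_system = weakref.WeakValueDictionary()

def get_or_create_system(identifier, settings):
    system = _identifier_to_system.get(identifier)
    if system is None:
        system = System(settings)
        _identifier_to_system[identifier] = system   # auto-evicted on collection
    return system

s = get_or_create_system("/tmp/chroma_a", {"persist_directory": "/tmp/chroma_a"})
cached_while_alive = "/tmp/chroma_a" in _identifier_to_system   # True

del s
gc.collect()
cached_after_del = "/tmp/chroma_a" in _identifier_to_system     # False: entry dropped
```

Note this fix is only sufficient if nothing else (such as LocalSegmentManager's `_instances`) keeps a strong reference back to the System's components, so it likely needs to land together with Fix 1.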

Fix 3: Enable LRU Cache by Default

File: chromadb/segment/impl/manager/local.py

# Set reasonable defaults instead of unbounded BasicCache
if system.settings.chroma_segment_cache_policy is None:
    system.settings.chroma_segment_cache_policy = "LRU"
if system.settings.chroma_memory_limit_bytes <= 0:
    system.settings.chroma_memory_limit_bytes = 1024 * 1024 * 1024  # 1GB default

if (
    system.settings.chroma_segment_cache_policy == "LRU"
    and system.settings.chroma_memory_limit_bytes > 0
):
    self.segment_cache[SegmentScope.VECTOR] = SegmentLRUCache(...)
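A consumer-side corollary: since the LRU branch above already exists, callers can opt into it today by passing these settings explicitly. We have not verified this as a full mitigation; eviction bounds the vector cache within one System, but System instances themselves still accumulate per unique persist_directory:

```python
import chromadb
from chromadb.config import Settings

client = chromadb.PersistentClient(
    path="/tmp/chroma_example",
    settings=Settings(
        chroma_segment_cache_policy="LRU",
        chroma_memory_limit_bytes=1024**3,  # ~1 GB budget for vector segments
    ),
)
```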

Fix 4: Add Context Manager Support

# In chromadb/api/client.py
class Client(SharedSystemClient, ClientAPI):
    def __enter__(self) -> "Client":
        return self

    def __exit__(self, exc_type, exc_val, exc_tb) -> None:
        self.close()

# Usage:
with chromadb.PersistentClient(path=temp_dir) as client:
    collection = client.get_collection("my_collection")
    # ... use collection
# Automatically cleaned up on exit

Versions

Chroma v1.3.4, Python 3.11, macOS (Darwin 25.0.0)
