[Bug]: Memory is not freed when using PersistentClient #5843

@dani29

Description

What happened?

We are running ChromaDB with PersistentClient and an index we've pre-built.

Due to the nature of our application, we create an instance per request: we download the tar.gz of the pre-built index and instantiate a PersistentClient over the extracted data. However, we observe that when the request completes, the memory is not freed. The leak seems to occur in native memory rather than in the Python heap, and as consumers of the library we cannot mitigate it ourselves.

I asked Claude Code to provide the reproduction steps and analysis, and they are attached below.

=======

ChromaDB Memory Leak Report

Issue Summary

ChromaDB 1.3.0 has a severe memory leak when using PersistentClient with temporary directories. Each unique persist_directory creates a new System singleton that caches HNSW indexes indefinitely in native C++ memory, with no API to release them. This causes unbounded memory growth in applications that create multiple short-lived PersistentClient instances.

Environment

  • ChromaDB Version: 1.3.0 (from uv.lock)
  • Python Version: 3.11
  • Platform: macOS (Darwin 25.0.0)
  • Usage Pattern: Creating PersistentClient with unique temp directories (e.g., /tmp/chroma_XXXXX)

Reproduction Scenario

import chromadb
import tempfile
import psutil
import os

process = psutil.Process(os.getpid())

for i in range(4):
    # Create unique temp directory for each iteration
    temp_dir = tempfile.mkdtemp(prefix="chroma_")

    # Create PersistentClient (in the real workload, temp_dir holds a
    # pre-built index with ~16,500 embeddings of 1536 dimensions)
    client = chromadb.PersistentClient(path=temp_dir)
    collection = client.get_or_create_collection("my_collection")

    # Use the collection...
    results = collection.query(query_embeddings=[[0.1] * 1536], n_results=10)

    # Client goes out of scope, but memory is NOT freed
    del client
    del collection

    print(f"RSS after iteration {i+1}: {process.memory_info().rss / 1024**2:.2f} MB")

# Expected: Memory should stabilize or decrease
# Actual: Memory grows ~150-200MB per iteration and NEVER decreases

Observed Memory Growth:

  • Iteration 1: 303 MB → 487 MB (+184 MB)
  • Iteration 2: 487 MB → 645 MB (+158 MB)
  • Iteration 3: 645 MB → 803 MB (+158 MB)
  • Iteration 4: 803 MB → 961 MB (+158 MB)

Total growth: ~658 MB across 4 iterations (303 MB → 961 MB), none of which is freed until process termination.

Root Cause Analysis

1. System Singleton Cache Never Evicts

File: chromadb/api/shared_system_client.py
Line: 11

class SharedSystemClient:
    _identifier_to_system: ClassVar[Dict[str, System]] = {}

Problem: This class variable caches System instances by persist_directory (line 56) and never evicts them. Each unique temp directory creates a new System that lives forever.

2. LocalSegmentManager Holds HNSW Indexes Indefinitely

File: chromadb/segment/impl/manager/local.py
Lines: 54, 68, 246-251

class LocalSegmentManager(SegmentManager):
    _instances: Dict[UUID, SegmentImplementation]  # Line 54

    def __init__(self, system: System):
        # ...
        self._instances = {}  # Line 68

    def _instance(self, segment: Segment) -> SegmentImplementation:
        if segment["id"] not in self._instances:
            cls = self._cls(segment)
            instance = cls(self._system, segment)
            instance.start()
            self._instances[segment["id"]] = instance  # Stored forever
        return self._instances[segment["id"]]

Problem: HNSW segment instances (containing C++ hnswlib.Index objects) are stored in _instances dict and only removed on explicit collection deletion or system reset. They are never garbage collected when the client goes out of scope.
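The effect of such a module-level cache is easy to demonstrate in isolation: `del client` cannot free anything that a long-lived dict still references. A minimal sketch with stand-in classes (not chromadb code):

```python
import gc
import weakref

class FakeSegment:
    """Stand-in for an HNSW segment instance that would hold native memory."""
    pass

_instances = {}                  # analogous to LocalSegmentManager._instances

seg = FakeSegment()
probe = weakref.ref(seg)         # lets us observe whether the object is freed
_instances["segment-id"] = seg   # cached, as _instance() does

del seg                          # the user-side reference is gone...
gc.collect()
alive_after_del = probe() is not None    # True: the cache still pins the object

_instances.clear()               # only explicit eviction releases it
gc.collect()
alive_after_clear = probe() is not None  # False: finally collectable
```

This is exactly the shape of the leak: the user can drop every reference they hold, but the cached entry keeps the segment (and its native index) alive.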

3. BasicCache Has No Eviction Policy

File: chromadb/segment/impl/manager/local.py
Lines: 69-82

self.segment_cache: Dict[SegmentScope, SegmentCache] = {
    SegmentScope.METADATA: BasicCache()
}
if (
    system.settings.chroma_segment_cache_policy == "LRU"
    and system.settings.chroma_memory_limit_bytes > 0
):
    self.segment_cache[SegmentScope.VECTOR] = SegmentLRUCache(...)
else:
    self.segment_cache[SegmentScope.VECTOR] = BasicCache()  # Default: unbounded

Problem: By default, BasicCache is used for VECTOR segments (HNSW indexes). This cache never evicts, accumulating segments indefinitely.
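For contrast, a size-bounded cache evicts least-recently-used entries once a byte budget is exceeded. A minimal sketch of that policy (illustrating the idea, not chromadb's actual SegmentLRUCache):

```python
from collections import OrderedDict

class LRUByteCache:
    """Evicts least-recently-used entries once total size exceeds a byte budget."""

    def __init__(self, limit_bytes):
        self.limit = limit_bytes
        self._data = OrderedDict()   # key -> (value, size), oldest first
        self._total = 0

    def set(self, key, value, size):
        if key in self._data:
            self._total -= self._data.pop(key)[1]
        self._data[key] = (value, size)
        self._total += size
        while self._total > self.limit and len(self._data) > 1:
            _, (_, evicted_size) = self._data.popitem(last=False)  # drop oldest
            self._total -= evicted_size

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key][0]

cache = LRUByteCache(limit_bytes=300)
cache.set("a", "idx_a", 150)
cache.set("b", "idx_b", 150)
cache.set("c", "idx_c", 150)   # total hits 450 -> "a" is evicted
```

With BasicCache there is no equivalent of the eviction loop, so every loaded vector segment stays resident.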

4. Native C++ Memory Cannot Be Freed from Python

File: chromadb/segment/impl/vector/local_hnsw.py
Lines: 45, 208-219

class LocalHnswSegment(VectorReader):
    _index: Optional[hnswlib.Index]  # Line 45

    def _init_index(self, dimensionality: int) -> None:
        index = hnswlib.Index(space=self._params.space, dim=dimensionality)
        index.init_index(
            max_elements=DEFAULT_CAPACITY,
            ef_construction=self._params.construction_ef,
            M=self._params.M,
        )
        # ...
        self._index = index  # C++ object stored as instance variable

Problem: hnswlib.Index allocates native C++ memory that Python's garbage collector cannot track or reclaim. Even when Python objects are deleted, the C++ HNSW index memory remains allocated.
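This also explains why Python-side profilers look clean: tracemalloc only traces allocations made through Python's allocator, so memory obtained directly from the C runtime is invisible to it while still counting toward RSS. A small demonstration using a raw `malloc` via ctypes as a stand-in for hnswlib's allocations (POSIX-style, assumes `CDLL(None)` can resolve libc symbols):

```python
import ctypes
import tracemalloc

libc = ctypes.CDLL(None)                 # resolve malloc/memset/free from the C runtime
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.memset.restype = ctypes.c_void_p
libc.memset.argtypes = [ctypes.c_void_p, ctypes.c_int, ctypes.c_size_t]
libc.free.argtypes = [ctypes.c_void_p]

tracemalloc.start()
SIZE = 50 * 1024 * 1024
p = libc.malloc(SIZE)                    # 50 MB of native memory
libc.memset(p, 0, SIZE)                  # touch the pages so RSS really grows
traced, _ = tracemalloc.get_traced_memory()
# traced is tiny (far below 50 MB): tracemalloc never saw the native allocation
libc.free(p)
```

The same asymmetry shows up in the diagnostics below: near-zero Python-heap growth alongside large RSS growth.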

Memory Leak Chain

  1. Create PersistentClient with unique temp path /tmp/chroma_abc123
  2. System singleton created and cached in SharedSystemClient._identifier_to_system["/tmp/chroma_abc123"] (shared_system_client.py:25-27)
  3. LocalSegmentManager created as part of System (local.py:62)
  4. HNSW index loaded into LocalSegmentManager._instances[segment_id] (local.py:250)
  5. HNSW index also cached in segment_cache[VECTOR] BasicCache (local.py:214)
  6. PersistentClient deleted by user code
  7. System singleton REMAINS in class variable dict ❌
  8. LocalSegmentManager REMAINS as part of System ❌
  9. HNSW index REMAINS in _instances dict ❌
  10. Native C++ memory NEVER freed
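Until an explicit cleanup API exists, the only reliable consumer-side mitigation we found is process isolation: do the per-request ChromaDB work in a short-lived child process, so the OS reclaims all native memory when it exits. A sketch (`serve_query` is a hypothetical stand-in for the real PersistentClient work; "fork" is used so the sketch needs no `__main__` guard, and is POSIX-only):

```python
import multiprocessing as mp

def serve_query(path, query_embedding):
    # Hypothetical stand-in: the real version would create
    # chromadb.PersistentClient(path=path), run the query, and return
    # only plain picklable data. Every native allocation made here
    # (HNSW, SQLite, NumPy) dies with the child process.
    return {"path": path, "n_results": 10}

def handle_request(path, query_embedding):
    # One short-lived child per request; its exit returns memory to the OS.
    ctx = mp.get_context("fork")
    with ctx.Pool(processes=1) as pool:
        return pool.apply(serve_query, (path, query_embedding))
```

This trades process startup and index-load latency for bounded memory, so it is a workaround rather than a fix.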

Memory Composition Per Instance

Diagnostic using tracemalloc + psutil shows:

Component                           Size          Type
HNSW Index (hnswlib)                ~100-120 MB   Native C++
NumPy arrays                        ~10-20 MB     Native C (NumPy)
SQLite metadata                     ~5 MB         Native C (SQLite)
httpx client (embedding function)   ~5-10 MB      Native C++
ChromaDB metadata                   ~5-10 MB      Python/C
Total per unique persist_directory  ~125-165 MB

Python heap growth: Near zero
RSS (Resident Set Size) growth: ~158 MB per iteration that NEVER decreases

Proposed Fixes

Fix 1: Add Explicit Cleanup API (Recommended)

Add a .close() or .cleanup() method to PersistentClient:

# In chromadb/api/client.py
class Client(SharedSystemClient, ClientAPI):
    def close(self) -> None:
        """Release resources held by this client."""
        # Stop and delete segment instances
        if hasattr(self, '_server'):
            segment_manager = self._server._manager
            if isinstance(segment_manager, LocalSegmentManager):
                for instance in list(segment_manager._instances.values()):
                    instance.stop()
                segment_manager._instances.clear()
                segment_manager.segment_cache[SegmentScope.VECTOR].reset()
                segment_manager.segment_cache[SegmentScope.METADATA].reset()

        # Remove System singleton if this is the last client
        if self._identifier in SharedSystemClient._identifier_to_system:
            system = SharedSystemClient._identifier_to_system[self._identifier]
            system.stop()
            del SharedSystemClient._identifier_to_system[self._identifier]

Fix 2: Make System Cache Weak References

File: chromadb/api/shared_system_client.py

import weakref
from typing import ClassVar, Dict

class SharedSystemClient:
    _identifier_to_system: ClassVar[Dict[str, weakref.ref[System]]] = {}

    @classmethod
    def _create_system_if_not_exists(cls, identifier: str, settings: Settings) -> System:
        if identifier not in cls._identifier_to_system or cls._identifier_to_system[identifier]() is None:
            new_system = System(settings)
            cls._identifier_to_system[identifier] = weakref.ref(new_system)
            # ...

This allows System instances to be garbage collected when no clients reference them.
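An equivalent, slightly simpler variant (again a sketch, not chromadb's code) uses `weakref.WeakValueDictionary`, which drops entries automatically when the last strong reference dies:

```python
import gc
import weakref

class System:
    """Stand-in for chromadb's System; holds only settings here."""
    def __init__(self, settings):
        self.settings = settings

_identifier_to_system = weakref.WeakValueDictionary()

def get_or_create_system(identifier, settings):
    system = _identifier_to_system.get(identifier)
    if system is None:
        system = System(settings)
        _identifier_to_system[identifier] = system   # auto-evicted on collection
    return system

s = get_or_create_system("/tmp/chroma_a", {"persist_directory": "/tmp/chroma_a"})
cached_while_alive = "/tmp/chroma_a" in _identifier_to_system   # True

del s
gc.collect()
cached_after_del = "/tmp/chroma_a" in _identifier_to_system     # False: entry dropped
```

Note this fix is only sufficient if nothing else (such as LocalSegmentManager's `_instances`) keeps a strong reference back to the System's components, so it likely needs to land together with Fix 1.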

Fix 3: Enable LRU Cache by Default

File: chromadb/segment/impl/manager/local.py

# Set reasonable defaults instead of unbounded BasicCache
if system.settings.chroma_segment_cache_policy is None:
    system.settings.chroma_segment_cache_policy = "LRU"
if system.settings.chroma_memory_limit_bytes <= 0:
    system.settings.chroma_memory_limit_bytes = 1024 * 1024 * 1024  # 1GB default

if (
    system.settings.chroma_segment_cache_policy == "LRU"
    and system.settings.chroma_memory_limit_bytes > 0
):
    self.segment_cache[SegmentScope.VECTOR] = SegmentLRUCache(...)
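A consumer-side corollary: since the LRU branch above already exists, callers can opt into it today by passing these settings explicitly. We have not verified this as a full mitigation; eviction bounds the vector cache within one System, but System instances themselves still accumulate per unique persist_directory:

```python
import chromadb
from chromadb.config import Settings

client = chromadb.PersistentClient(
    path="/tmp/chroma_example",
    settings=Settings(
        chroma_segment_cache_policy="LRU",
        chroma_memory_limit_bytes=1024**3,  # ~1 GB budget for vector segments
    ),
)
```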

Fix 4: Add Context Manager Support

# In chromadb/api/client.py
class Client(SharedSystemClient, ClientAPI):
    def __enter__(self) -> "Client":
        return self

    def __exit__(self, exc_type, exc_val, exc_tb) -> None:
        self.close()

# Usage:
with chromadb.PersistentClient(path=temp_dir) as client:
    collection = client.get_collection("my_collection")
    # ... use collection
# Automatically cleaned up on exit

Versions

Chroma v1.3.4, Python 3.11, macOS (Darwin 25.0.0)
