feat: global_id mapping serialization #6475

BorysTheDev · 2026-01-26T15:53:44Z

Part of the HNSW index replication task: adds save/load for the hash/JSON global_id mapping.

Copilot

Pull request overview

This PR adds serialization and deserialization support for search index global_id mappings in the RDB format. The feature enables replication of vector search indices by preserving document ID mappings across master-replica synchronization.

Changes:

Introduces RDB_OPCODE_GLOBAL_ID (221) to store index_name and global_id pairs before key entries
Adds serialization logic in SliceSnapshot::SerializeEntry to save global_ids for HASH/JSON keys indexed by search
Implements deserialization in RdbLoader to restore global_id mappings on the replica
Adds infrastructure methods SetMasterDocId, GetMasterDocId, and ClearMasterMappings to ShardDocIndex

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

File	Description
src/server/rdb_extensions.h	Defines new RDB opcode RDB_OPCODE_GLOBAL_ID (221) for search index global_id storage
src/server/rdb_save.h	Declares SaveGlobalId method for serializing global_id entries
src/server/rdb_save.cc	Implements SaveGlobalId to write opcode, index name, and 8-byte global_id
src/server/rdb_load.h	Adds global_ids vector to Item struct for storing loaded mappings
src/server/rdb_load.cc	Parses RDB_OPCODE_GLOBAL_ID and stores mappings; transfers them to search indices
src/server/snapshot.cc	Serializes global_ids for indexed HASH/JSON keys during snapshot creation
src/server/search/doc_index.h	Adds master_doc_ids_ map and related methods to ShardDocIndex
src/server/search/doc_index.cc	Implements ForEachGlobalDocId callback iterator and master mapping methods

src/server/search/doc_index.h

src/server/rdb_extensions.h

augmentcode · 2026-01-26T16:00:22Z

🤖 Augment PR Summary

Summary: Adds RDB-level serialization for search index global_id mappings so replicas can restore master document id relationships.

Changes:

Introduced a new DF RDB opcode RDB_OPCODE_GLOBAL_ID (221) to store (index_name, global_id) pairs before a key entry
Extended RDB load/save paths to emit and parse these global-id records and attach them to loaded items
During snapshot serialization, emits global-id records for HASH/JSON keys that are indexed by search
Added search-side helpers to iterate per-key global ids and to store “master doc id” mappings during RDB load

Technical Notes: Global ids are stored as little-endian uint64_t and may appear multiple times per key (one per matching index).
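The record layout described above (opcode, index name, 8-byte little-endian id, one record per matching index) can be sketched as a small round trip. This is a minimal illustration and not the PR's code: `AppendGlobalIdRecord` and `ParseGlobalIdRecord` are hypothetical helpers, and the 1-byte name-length prefix is an assumption that stands in for the real RDB string encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>
#include <utility>

// Opcode value taken from the PR summary; the constant name is hypothetical.
constexpr uint8_t kOpcodeGlobalId = 221;

// Appends one record: opcode, 1-byte name length, name bytes, 8-byte little-endian id.
// Assumption: the 1-byte length prefix is a simplification of RDB string encoding.
inline void AppendGlobalIdRecord(std::string_view index_name, uint64_t global_id,
                                 std::string* out) {
  assert(index_name.size() <= 255);
  out->push_back(static_cast<char>(kOpcodeGlobalId));
  out->push_back(static_cast<char>(index_name.size()));
  out->append(index_name);
  for (int i = 0; i < 8; ++i)  // least significant byte first
    out->push_back(static_cast<char>((global_id >> (8 * i)) & 0xFF));
}

// Parses one record starting at *pos and advances *pos on success.
// Returns std::nullopt if the opcode does not match or the buffer is too short.
inline std::optional<std::pair<std::string, uint64_t>> ParseGlobalIdRecord(
    std::string_view buf, size_t* pos) {
  size_t p = *pos;
  if (p + 2 > buf.size() || static_cast<uint8_t>(buf[p]) != kOpcodeGlobalId)
    return std::nullopt;
  size_t len = static_cast<uint8_t>(buf[p + 1]);
  p += 2;
  if (p + len + 8 > buf.size()) return std::nullopt;
  std::string name(buf.substr(p, len));
  p += len;
  uint64_t id = 0;
  for (int i = 0; i < 8; ++i)
    id |= static_cast<uint64_t>(static_cast<uint8_t>(buf[p + i])) << (8 * i);
  *pos = p + 8;
  return std::make_pair(std::move(name), id);
}
```

A key that matches two indices would be preceded by two such records, which a loader can consume in a loop until the parser stops matching the opcode.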

augmentcode

Review completed. 1 suggestions posted.

src/server/search/doc_index.h

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.

src/server/search/doc_index.cc

src/server/snapshot.cc

src/server/rdb_save.cc

src/server/search/doc_index.h

dranikpg · 2026-01-27T20:41:52Z

src/server/rdb_load.cc

+  // Store master doc_id mappings for search indices
+  if (!item->global_ids.empty()) {
+    if (auto* search_indices = db_slice->shard_owner()->search_indices(); search_indices) {
+      for (const auto& [index_name, global_id] : item->global_ids) {
+        search_indices->SetMasterDocId(index_name, item->key, global_id);
+      }
+    }
+  }


What if the indices haven't been created yet? They're loaded from an aux field in the shard 0 snapshot.

In the next PR I will create the index automatically if such fields are present.

Then it is better to add some proper utilities for loading indices rather than abusing the indices themselves.

I didn't get your point.
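One possible reading of the "proper utilities for loading" suggestion is a staging buffer that holds mappings until the target index exists. The sketch below only illustrates that idea and is not taken from the PR: `PendingGlobalIds`, `Add`, and `Drain` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical staging buffer for (key, global_id) pairs, grouped by index name.
class PendingGlobalIds {
 public:
  using Entry = std::pair<std::string, uint64_t>;  // (key, global_id)

  // Called from the load path when the target index may not exist yet.
  void Add(const std::string& index_name, std::string key, uint64_t global_id) {
    pending_[index_name].emplace_back(std::move(key), global_id);
  }

  // Called once an index with this name has been created.
  // Returns the buffered entries in insertion order and forgets them.
  std::vector<Entry> Drain(const std::string& index_name) {
    auto it = pending_.find(index_name);
    if (it == pending_.end()) return {};
    std::vector<Entry> out = std::move(it->second);
    pending_.erase(it);
    return out;
  }

  // Number of index names that still have buffered entries.
  size_t IndexCount() const { return pending_.size(); }

 private:
  std::unordered_map<std::string, std::vector<Entry>> pending_;
};
```

With a buffer like this, the load path never has to touch an index object directly, and index creation decides when the mappings are applied.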

dranikpg · 2026-01-27T20:45:21Z

src/server/rdb_save.h

                                uint32_t mc_flags, DbIndex dbid);

+  // Write a single global_id entry for search-indexed keys.
+  // Format: RDB_OPCODE_GLOBAL_ID + index_name (string) + global_id (8 bytes).


Imagine how wasteful it is to send the index name each time.

This is why I wanted to have global_id for all indexes.
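A back-of-the-envelope comparison shows the size concern. The helpers below are hypothetical and assume a 1-byte length prefix for the index name (short names); the "shared" variant is only one interpretation of a single id used by all indexes, not a format defined in this PR.

```cpp
#include <cassert>
#include <cstddef>

// Bytes for one per-index record: opcode + length prefix + name + 8-byte id.
// Assumption: the length prefix fits in one byte.
constexpr size_t PerIndexRecordBytes(size_t name_len) {
  return 1 + 1 + name_len + 8;
}

// Overhead per key when each matching index gets its own record.
constexpr size_t PerKeyOverheadNamed(size_t name_len, size_t num_indices) {
  return PerIndexRecordBytes(name_len) * num_indices;
}

// Overhead per key for a single id shared by all indexes: opcode + 8-byte id.
constexpr size_t PerKeyOverheadShared() { return 1 + 8; }
```

Under these assumptions, a key covered by three indexes with 10-byte names costs 60 extra bytes per key, against 9 bytes for a single shared id.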

dranikpg · 2026-01-27T22:06:20Z

src/server/rdb_load.cc

+    if (type == RDB_OPCODE_GLOBAL_ID) {
+      /* GLOBAL_ID: search index global document id (index_name + global_id) */
+      string index_name;
+      SET_OR_RETURN(FetchGenericString(), index_name);
+      uint64_t global_id;
+      SET_OR_RETURN(FetchInt<uint64_t>(), global_id);
+      settings.global_ids.emplace_back(std::move(index_name), global_id);
+      continue; /* Read next opcode. */


I don't understand it from a consistency point of view. If we plan to support writes in the future we have to plan ahead. I'll message in a private group

I plan to implement the same mechanism as we use for replication.

feat: global_id mapping serialization

a7b7ade

Copilot AI review requested due to automatic review settings January 26, 2026 15:53

Copilot started reviewing on behalf of BorysTheDev January 26, 2026 15:54

BorysTheDev changed the title ~~feat: global_id mapping serialization~~ feat: global_id mapping serialization NOT READY FOR REVIEW Jan 26, 2026

Copilot AI reviewed Jan 26, 2026

augmentcode bot reviewed Jan 26, 2026

BorysTheDev added 3 commits January 27, 2026 10:50

refactor: address comments

3890038

refactor: address comments

7bb6455

refactor: update comments

ca4720d

Copilot AI review requested due to automatic review settings January 27, 2026 09:21

BorysTheDev changed the title ~~feat: global_id mapping serialization NOT READY FOR REVIEW~~ feat: global_id mapping serialization Jan 27, 2026

BorysTheDev requested a review from dranikpg January 27, 2026 09:22

Copilot started reviewing on behalf of BorysTheDev January 27, 2026 09:22

BorysTheDev requested a review from mkaruza January 27, 2026 09:22

Copilot AI reviewed Jan 27, 2026

dranikpg reviewed Jan 27, 2026

BorysTheDev closed this Jan 30, 2026
