-
Notifications
You must be signed in to change notification settings - Fork 108
Serialize Vamana index with SSD sector alignment per MSFT DiskANN format, generate quantized dataset for integration with DiskANN #846
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: branch-25.08
Are you sure you want to change the base?
Changes from 13 commits
8c650ab
bad5463
ac4199f
4b6a4d4
c931f87
765ffcd
d8b1101
0bf35ee
d0aabc6
6e5d908
90eaf18
42630fe
38646cf
81cbd56
62b1392
44f479b
a40237e
673c087
f78ebcb
31a1ddb
b0feec1
912a651
40beb58
f38fd45
9118ad2
957fee1
5b1bc2e
bc1c546
7ba37ab
7901903
ce7b0d6
f4b3d9d
ed1e90d
6a36cfc
a716f7d
8464d13
868e6cb
910b465
c8660f2
e6078db
3fec92e
57d8554
950e5ca
7a92b2e
26feaaf
d36f91c
e84cbf0
111b7c7
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -72,6 +72,10 @@ struct index_params : cuvs::neighbors::index_params { | |
uint32_t queue_size = 127; | ||
/** Max batchsize of reverse edge processing (reduces memory footprint) */ | ||
uint32_t reverse_batchsize = 1000000; | ||
/** Path prefix to pq pivots and rotation matrix files. Expects pq pivots file at | ||
* "${codebook_prefix}_pq_pivots.bin" and rotation matrix file at | ||
* "${codebook_prefix}_pq_pivots.bin_rotation_matrix.bin". */ | ||
std::string codebook_prefix = ""; | ||
}; | ||
|
||
/** | ||
|
@@ -127,6 +131,13 @@ struct index : cuvs::neighbors::index { | |
return *dataset_; | ||
} | ||
|
||
/** Quantized dataset [size, codes_rowlen] */ | ||
[[nodiscard]] inline auto quantized_data() const -> const cuvs::neighbors::dataset<int64_t>& | ||
{ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wanted to model this function after There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The return type of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. While I agree that it would be nice to let users avoid the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah I did not realize that data() returns a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Updated to const |
||
RAFT_EXPECTS(quantized_dataset_, "Invalid quantized dataset"); | ||
return *quantized_dataset_; | ||
} | ||
|
||
/** vamana graph [size, graph-degree] */ | ||
[[nodiscard]] inline auto graph() const noexcept | ||
-> raft::device_matrix_view<const IdxT, int64_t, raft::row_major> | ||
|
@@ -212,11 +223,28 @@ struct index : cuvs::neighbors::index { | |
graph_view_ = graph_.view(); | ||
} | ||
|
||
/** | ||
* Replace the quantized dataset with a new dataset. | ||
* | ||
* If `force_ownership` is set, we create a copy of the quantized dataset on the device, | ||
* and the index manages the lifetime of this copy. | ||
* Otherwise, we store a reference to the device data in `new_quantized_dataset`, and it | ||
* is the caller's responsibility to ensure that the data stays alive as long as the index. | ||
*/ | ||
void update_quantized_dataset( | ||
raft::resources const& res, | ||
raft::device_matrix_view<uint8_t, int64_t, raft::row_major> new_quantized_dataset, | ||
bool force_ownership) | ||
{ | ||
quantized_dataset_ = make_aligned_dataset(res, new_quantized_dataset, 16, force_ownership); | ||
} | ||
|
||
private: | ||
cuvs::distance::DistanceType metric_; | ||
raft::device_matrix<IdxT, int64_t, raft::row_major> graph_; | ||
raft::device_matrix_view<const IdxT, int64_t, raft::row_major> graph_view_; | ||
std::unique_ptr<neighbors::dataset<int64_t>> dataset_; | ||
std::unique_ptr<neighbors::dataset<int64_t>> quantized_dataset_; | ||
IdxT medoid_id_; | ||
}; | ||
/** | ||
|
@@ -457,13 +485,15 @@ auto build(raft::resources const& res, | |
* @param[in] file_prefix prefix of path and name of index files | ||
* @param[in] index Vamana index | ||
* @param[in] include_dataset whether or not to serialize the dataset | ||
* @param[in] sector_aligned whether output file should be aligned to disk sectors of 4096 bytes | ||
* | ||
*/ | ||
|
||
void serialize(raft::resources const& handle, | ||
const std::string& file_prefix, | ||
const cuvs::neighbors::vamana::index<float, uint32_t>& index, | ||
bool include_dataset = true); | ||
bool include_dataset = true, | ||
bool sector_aligned = false); | ||
|
||
/** | ||
* Save the index to file. | ||
|
@@ -486,12 +516,14 @@ void serialize(raft::resources const& handle, | |
* @param[in] file_prefix prefix of path and name of index files | ||
* @param[in] index Vamana index | ||
* @param[in] include_dataset whether or not to serialize the dataset | ||
* @param[in] sector_aligned whether output file should be aligned to disk sectors of 4096 bytes | ||
* | ||
*/ | ||
void serialize(raft::resources const& handle, | ||
const std::string& file_prefix, | ||
const cuvs::neighbors::vamana::index<int8_t, uint32_t>& index, | ||
bool include_dataset = true); | ||
bool include_dataset = true, | ||
bool sector_aligned = false); | ||
|
||
/** | ||
* Save the index to file. | ||
|
@@ -514,12 +546,14 @@ void serialize(raft::resources const& handle, | |
* @param[in] file_prefix prefix of path and name of index files | ||
* @param[in] index Vamana index | ||
* @param[in] include_dataset whether or not to serialize the dataset | ||
* @param[in] sector_aligned whether output file should be aligned to disk sectors of 4096 bytes | ||
* | ||
*/ | ||
void serialize(raft::resources const& handle, | ||
const std::string& file_prefix, | ||
const cuvs::neighbors::vamana::index<uint8_t, uint32_t>& index, | ||
bool include_dataset = true); | ||
bool include_dataset = true, | ||
bool sector_aligned = false); | ||
|
||
/** | ||
* @} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not opposed to this flag, but what is the reasoning behind the
force_ownership
flag? From my understanding, if the stride matches and the ptr is device accessible it should be okay to make it non-owning. And it is the calling process's responsibility to ensure the dataset is live until the end if the strided dataset is non-owning.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you might've suspected, setting this flag is only needed in specific situations like this: during
vamana::index::index()
, the function optionally computes a matrix for the quantized vectors using additional input files pointed to byindex_params::codebook_prefix
. Since this matrix is created withinindex()
, its lifetime is limited to the index build (as opposed to the full dataset matrix which is provided toindex()
as an input) so we need to force ownership of the matrix. It may be possible to improve this by somehow moving the data instead of copying but that is not the focus for now.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @tarang-jain here. I think there's a cleaner way to do this. Please don't treat public APIs like this as a "just need to get this done". These APIs tend to stick around for a long time and we need to make sure they are as clean and flexible as they can be. In fact, we often spend more time discussing our public APIs than we do the impl details because once these are exposed they are very hard to change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have talked a bit about how to handle this, and without the ability to force ownership here, we would need to have the
quantized_data
device matrix as a member of create/destroy it during thevamana::index
constructor/destructor. Does this seem like a reasonable solution to everyone?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems like a reasonable solution to me.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated as discussed.