[BUG] KNNMapperSearcherIT Failure When Building Index Remotely #30
Comments
I would like to work on this issue, thanks.
The doc_count is causing the error. When doc_count is 3 or 4, the error occurs, but when doc_count is 2, 5, or 6, the remote build works and the test passes. So if the test data at https://github.com/anntians/k-NN/blob/main/src/test/java/org/opensearch/knn/index/KNNMapperSearcherIT.java#L40-L52 is updated to 5 docs, the test passes. This issue doesn't seem to be related to vector dimension, as the same behavior shows up when testing vector dimensions 2, 3, and 10.
The remote index builder is freezing at the method Looking at the logs below, the hyperparameters of the failed and successful builds are the same, so the difference is potentially in the
Logs for test run with doc_count 4, build fails
Logs for test run with doc_count 5, build succeeds
Faiss GitHub issue: facebookresearch/faiss#4260
What is the bug?
While testing the KNN plugin test KNNMapperSearcherIT against the remote index builder, the tests testKNNResultsUpdateDocAndForceMerge and testKNNResultsWithForceMerge failed due to a read timeout after sending a build request to the remote index builder.

Looking at the remote index builder logs, it rejects (422) several build requests coming from the addKnnDoc method when adding test data. This is expected, as the remote builder rejects build requests with doc_count <= 1 (ref: #26). It then accepts the build request coming from the forceMergeKnnIndex method and gets stuck, causing the tests to time out and fail.

Previously, the index builder froze because of incorrect test data, such as wrong dimensions or data type. Here, however, the .knndid and .knnvec file sizes look correct: .knnvec is 2x the size of .knndid because the test vectors have dimension 2.

Remote index builder logs:
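The file-size observation in the description (.knnvec being 2x the size of .knndid for dimension 2) can be turned into a quick sanity check. The sketch below is only an illustration: it assumes vectors are stored as 4-byte floats and doc IDs as 4-byte integers, which is consistent with the 2x ratio reported but is not confirmed by the issue. The function names are hypothetical.

```python
from typing import Tuple

FLOAT32_BYTES = 4  # assumption: each vector component is a 32-bit float
DOC_ID_BYTES = 4   # assumption: each doc ID is a 32-bit integer


def expected_sizes(doc_count: int, dimension: int) -> Tuple[int, int]:
    """Return (knnvec_bytes, knndid_bytes) under the assumptions above."""
    vector_bytes = doc_count * dimension * FLOAT32_BYTES
    doc_id_bytes = doc_count * DOC_ID_BYTES
    return vector_bytes, doc_id_bytes


def sizes_consistent(vector_bytes: int, doc_id_bytes: int, dimension: int) -> bool:
    """True when the vector file is exactly `dimension` times the doc ID file."""
    return doc_id_bytes > 0 and vector_bytes == dimension * doc_id_bytes


if __name__ == "__main__":
    # The failing case from the comments: 4 docs, dimension 2.
    vec, ids = expected_sizes(doc_count=4, dimension=2)
    print(vec, ids, sizes_consistent(vec, ids, dimension=2))
```

A check like this only rules out malformed input files. It passes for the failing doc_count values too, which matches the report that the file sizes look correct even when the build hangs.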
How can one reproduce the bug?
What is the expected behavior?
The expected behavior is that the remote index build succeeds and uploads the .faiss file to the S3 bucket for KNN to download.
What is your host/environment?
g5.4xlarge EC2 instance with the Deep Learning Base OSS Nvidia Driver GPU AMI (Amazon Linux 2023)
Do you have any screenshots?
Do you have any additional context?
Other tests such as testKNNResultsWithoutForceMerge pass because the remote build is not triggered without the forceMergeKnnIndex method call. Calls to addKnnDoc cause 422 build request errors from the remote index builder because _refresh is called, leading to build requests with doc_count <= 1.
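To make the request pattern above concrete, here is a minimal sketch of the rejection rule as described in this issue: build requests with doc_count <= 1 get a 422, larger ones are accepted. The function and constant names are hypothetical and do not come from the remote index builder's code. The one-document-per-refresh pattern is an assumption based on the description above.

```python
from typing import List, Optional

UNPROCESSABLE_ENTITY = 422
MIN_DOC_COUNT = 2  # from the issue: requests with doc_count <= 1 are rejected


def rejection_status(doc_count: int) -> Optional[int]:
    """Return 422 if a build request would be rejected, else None (accepted)."""
    return UNPROCESSABLE_ENTITY if doc_count < MIN_DOC_COUNT else None


def simulate_test_run(total_docs: int, force_merge: bool) -> List[Optional[int]]:
    """Model the sequence of build requests seen during one test.

    Assumption: each addKnnDoc call is followed by a refresh that produces a
    build request for a single document; a force merge then produces one
    request covering all documents.
    """
    statuses = [rejection_status(1) for _ in range(total_docs)]
    if force_merge:
        statuses.append(rejection_status(total_docs))
    return statuses


if __name__ == "__main__":
    print(simulate_test_run(total_docs=4, force_merge=False))
    print(simulate_test_run(total_docs=4, force_merge=True))
```

In this model, only the force-merge request is accepted, so it is the only one that reaches the actual index build. That is consistent with tests without forceMergeKnnIndex passing.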