Vector indexer support in GigaMap? #478
hrstoyanov started this conversation in General
Replies: 1 comment
We have been working on vector storage for ES/GigaMap for a while now. Currently, it's only used internally; it has not yet been decided how we will deliver it. The LEANN stuff sounds very interesting. We will definitely check it out.
-
ES GigaMap already provides bitmap and Lucene indexing. With the rise of AI, another type of indexing is needed: vector embedding indexing. It is somewhat similar to Lucene, but allows "search by similarity" and is used in RAG systems for LLMs.
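To make "search by similarity" concrete, here is a minimal Java sketch of a brute-force nearest-neighbor lookup using cosine similarity. It is only an illustration: the class and method names (`SimilaritySearchSketch`, `cosine`, `topK`) are hypothetical and not part of GigaMap, JVector, or LEANN, and a real vector index would use an approximate structure rather than scanning every stored vector.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.stream.IntStream;

// Hypothetical illustration only; not a GigaMap, JVector, or LEANN API.
public class SimilaritySearchSketch {

    // Cosine similarity between two vectors of equal length.
    static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Brute-force top-k: indices of the k stored vectors most similar to the query,
    // ordered from most to least similar.
    static int[] topK(float[][] stored, float[] query, int k) {
        return IntStream.range(0, stored.length)
                .boxed()
                .sorted(Comparator.comparingDouble((Integer i) -> -cosine(stored[i], query)))
                .limit(k)
                .mapToInt(Integer::intValue)
                .toArray();
    }

    public static void main(String[] args) {
        // Toy 3-dimensional "embeddings"; real ones typically have hundreds of dimensions.
        float[][] embeddings = {
            {1f, 0f, 0f},
            {0.9f, 0.1f, 0f},
            {0f, 1f, 0f},
            {0f, 0f, 1f}
        };
        float[] query = {0.2f, 1f, 0f};
        System.out.println(Arrays.toString(topK(embeddings, query, 2)));
    }
}
```

Unlike a Lucene term lookup, the query here never has to match a stored value exactly; the result is simply whatever lies closest in the embedding space.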
There is the excellent JVector Java library that could be integrated. However, if this is of interest to the ES team, I would highly recommend looking into the latest technology, LEANN. Traditional engines (like JVector) require a lot of storage for multidimensional embeddings, and the only option to reduce that storage is vector quantization, at the cost of accuracy. LEANN uses a clever trick (delayed vectorization) that reduces storage requirements by 97%. It might be easier to fit into GigaMap.
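For a rough sense of the storage pressure and of the quantization trade-off, here is a small Java sketch. The numbers (one million vectors, 768 dimensions) and all names are assumptions chosen for illustration, and the simple symmetric int8 scheme shown is a generic technique, not JVector's or LEANN's actual method. It also assumes every component lies in [-1, 1], as with unit-normalized embeddings.

```java
// Hypothetical illustration only; not JVector's or LEANN's implementation.
public class QuantizationSketch {

    // Bytes needed to store n vectors of the given dimension as float32.
    static long float32Bytes(long n, int dim) {
        return n * dim * Float.BYTES;
    }

    // Bytes needed for the same vectors with one signed byte per component.
    static long int8Bytes(long n, int dim) {
        return n * dim;
    }

    // Symmetric scalar quantization: map each component from [-1, 1] to [-127, 127].
    // Values outside [-1, 1] are clamped (assumption: inputs are roughly normalized).
    static byte[] quantize(float[] v) {
        byte[] q = new byte[v.length];
        for (int i = 0; i < v.length; i++) {
            float clamped = Math.max(-1f, Math.min(1f, v[i]));
            q[i] = (byte) Math.round(clamped * 127f);
        }
        return q;
    }

    // Inverse mapping; the result differs from the original by up to 0.5/127 per component.
    static float[] dequantize(byte[] q) {
        float[] v = new float[q.length];
        for (int i = 0; i < q.length; i++) {
            v[i] = q[i] / 127f;
        }
        return v;
    }

    public static void main(String[] args) {
        long n = 1_000_000L; // assumed corpus size
        int dim = 768;       // assumed embedding dimension
        System.out.println("float32: " + float32Bytes(n, dim) + " bytes");
        System.out.println("int8:    " + int8Bytes(n, dim) + " bytes");

        float[] original = {0.1234f, -0.9876f, 0.5f};
        float[] restored = dequantize(quantize(original));
        for (int i = 0; i < original.length; i++) {
            System.out.println(original[i] + " -> " + restored[i]);
        }
    }
}
```

Under these assumptions the raw float32 embeddings take about 3 GB, and int8 quantization cuts that by a factor of four while introducing a small rounding error in every component.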
Such an index should be generic enough for many use cases (text, images, genetic sequences, media files, etc.), not just AI RAG.