ERROR: pgvecto.rs: IPC connection is closed unexpected (v0.2.1) #500
Can you try manually deleting all the files under …
Sorry for the late reply. I don't have a … I also don't understand what you mean by "reindex all the vector column". Thanks!
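For anyone landing here with the same question: the suggestion above appears to be about rebuilding the vector indexes after removing their on-disk files. A minimal sketch of the reindex part, assuming a hypothetical table `items` with a vector index named `items_embedding_idx` (both names are placeholders, not from this thread):

```sql
-- Rebuild one vector index by name (hypothetical index name):
REINDEX INDEX items_embedding_idx;

-- Or rebuild every index on the table that holds the vector column (hypothetical table name):
REINDEX TABLE items;
```

`REINDEX` is plain PostgreSQL and rebuilds the index from the table data, so it should also recreate whatever the extension keeps on disk for that index; treat this as a sketch rather than the maintainers' confirmed recovery procedure.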
I am currently testing a PostgreSQL setup in a container (Kubernetes) with an 8-core, 32GB pod, where I have installed the pgvectors extension (image tag is …).

While continuously inserting data into the tables, I noticed that although the PostgreSQL pod itself did not terminate, its logs indicated that it had shut down. It restarted shortly afterward, but for a long time the following log messages were repeatedly output, causing the service that was inserting data to stop functioning:
After this, a considerable number of connection errors occurred. After waiting for a longer period, the log messages stating "LOG: Find directory pg_vectors/indexes/xxxxx" stopped appearing, and I confirmed that the service could resume inserting data. Below is a portion of the PostgreSQL logs:
postgresql.conf
While I am proficient in software development, my knowledge of databases (PostgreSQL) is limited. I suspect the issue could be that I am trying to insert such a large amount of data with too little CPU and memory.

Is there any way to determine whether "there is a problem with PostgreSQL"? Is there a way to know whether pgvectors is still indexing, has finished indexing, and is ready to accept more data? And is there a general guideline on the amount of CPU and RAM needed for the volume of data I mentioned? I would appreciate even a simple response like "It depends on the situation, but it seems too small" or "Generally, it should be sufficient."

Since I lack expertise here, I plan to use Citus or StackGres for sharding to handle large volumes of data. However, if a single instance can manage sufficiently large data, I will only set up an HA (High Availability) configuration. Thank you for reading my novice question.
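Not an authoritative answer, but a few queries that may help with the "is it still indexing?" and "is the setup sane?" questions from psql. The monitoring view name `pg_vector_index_stat` is taken from the pgvecto.rs documentation for the 0.2.x series (worth double-checking against the version actually installed), and `items` is again a placeholder table name:

```sql
-- Confirm the extension is preloaded and see the memory-related settings in effect:
SHOW shared_preload_libraries;   -- should list vectors.so when pgvecto.rs is set up
SHOW shared_buffers;
SHOW maintenance_work_mem;

-- Background indexing status of pgvecto.rs indexes (view documented for 0.2.x):
SELECT * FROM pg_vector_index_stat;

-- Rough on-disk footprint of the table holding the vectors (placeholder name):
SELECT pg_size_pretty(pg_total_relation_size('items'));
```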
@cho-thinkfree-com Sorry for the problem. We haven't tested tables at such a large scale with indexes so far. The problem you met might be due to insufficient memory: 30M dense vectors + 30M sparse vectors need about 160GB for 1024-dimensional vectors, and the index needs some extra memory for its own structures. In the latest version, it should be able to accept new data immediately after …
Thank you so very much for your kind and detailed response. You mentioned that approximately 160GB is required to handle 1024-dimensional vectors when processing 30M dense vectors and 30M sparse vectors. Could you please clarify whether this 160GB refers to memory (RAM) or disk size? Also, does the 160GB estimate come from calculating the size of the dense vectors (30M * 1024 dim * 4 bytes) plus the sparse vectors (...), or is there another method for this calculation?

Is it necessary to load all vectors into memory in order to create the index? (I apologize if this is a basic question, but I have limited knowledge of databases and indexing.) Perhaps I'm attempting something unrealistic! :-) Thank you once again for your kind response.
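For the dense part of that estimate, the arithmetic guessed at above can be checked directly; the numbers below only cover raw dense-vector storage with 4-byte floats and ignore the sparse vectors (whose size depends on how many non-zero entries each one has) and any index overhead:

```sql
-- 30M dense vectors x 1024 dimensions x 4 bytes per float32:
SELECT pg_size_pretty(30000000::bigint * 1024 * 4);
-- prints roughly "114 GB" (binary units), i.e. about 123 GB in decimal units
```

So the raw dense data alone is already in the 120 GB range, which makes a combined figure of roughly 160GB for dense + sparse + index structures plausible.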
Hello. I was using the `pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0` image as instructed by Immich, but I started getting this error: …

I switched to `pgvecto-rs:pg14-v0.2.1`, as seemed to be suggested in #409 and #376, but I still get the same error. What can I do? Thanks.
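Not necessarily the fix for this particular error, but one cheap thing to verify after swapping image tags is that the extension inside the database was actually updated along with the binaries. If the versions disagree, the standard PostgreSQL commands below show and update it (the extension shipped by pgvecto.rs is named `vectors`):

```sql
-- Version of the extension objects installed in this database:
SELECT extversion FROM pg_extension WHERE extname = 'vectors';

-- Version the new image makes available:
SELECT default_version FROM pg_available_extensions WHERE name = 'vectors';

-- Bring the installed extension up to the default version shipped with the image:
ALTER EXTENSION vectors UPDATE;
```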