I'm referring to this absolutely amazing experiment with database.build (formerly postgres.new).
Describe the problem
Question 1: Is there any documentation for database.build?
Question 2: How do I work with embeddings?
I'm using the same dataset of athletes from the 2024 Paris Olympics as in the blog post and intro video, and I'd like to ask: Get the names of the athletes whose nick_names are not similar to their names.
How can I do this?
I'm aware that database.build uses transformers.js and PGlite, which supports pgvector.
Describe the improvement
This could be the start of the documentation:
Using embeddings: database.build uses pgvector to store embeddings and transformers.js to create embeddings inside the browser. Instead of storing the embeddings "next to" the data the user provides, database.build keeps them in a separate table, meta.embeddings, and the user's data refers to them by id (because embeddings can be large). When the LLM sees a reference to meta.embeddings, it knows it can "fetch" that data later when it's needed (for RAG etc).
Example, using the dataset of athletes from the 2024 Paris Olympics from the blog post and intro video: Get the names of the athletes whose nick_names are not similar to their names. (tbc. ...)
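To make the reference pattern described above more concrete, here is a minimal sketch in TypeScript. It is not database.build's actual implementation: `EmbeddingStore`, `AthleteRow`, and `nickNameEmbeddingId` are hypothetical names, the store is an in-memory stand-in for the meta.embeddings table, and the row and vector are made up. It only illustrates the idea that a data row carries a small id while the large vector lives elsewhere and is resolved on demand.

```typescript
// Hypothetical in-memory stand-in for the meta.embeddings table:
// each row has an id, the embedded text, and the vector itself.
interface EmbeddingRow {
  id: number;
  content: string;
  embedding: number[];
}

// Hypothetical user row: it stores only the id of the embedding, not the vector.
interface AthleteRow {
  name: string;
  nickNameEmbeddingId: number;
}

class EmbeddingStore {
  private rows = new Map<number, EmbeddingRow>();
  private nextId = 1;

  // Store a vector once and hand back a small id to reference it by.
  insert(content: string, embedding: number[]): number {
    const id = this.nextId++;
    this.rows.set(id, { id, content, embedding });
    return id;
  }

  // Resolve an id back to the full row only when the vector is needed.
  get(id: number): EmbeddingRow {
    const row = this.rows.get(id);
    if (row === undefined) {
      throw new Error(`no embedding with id ${id}`);
    }
    return row;
  }

  get size(): number {
    return this.rows.size;
  }
}

const store = new EmbeddingStore();
const athlete: AthleteRow = {
  name: "Example Athlete", // made-up row, not from the Olympics dataset
  nickNameEmbeddingId: store.insert("Ex", [0.1, 0.2, 0.3]), // made-up vector
};

// The athlete row stays small; the vector is fetched through the id when required.
const resolved = store.get(athlete.nickNameEmbeddingId);
```

This may also explain the initial surprise of seeing ids where embeddings were expected: under this pattern, query results show the reference, and the vector has to be looked up separately.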
Additional context
A quick tip here on how to achieve a similarity search would be enough for me in the first place.
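As a rough illustration of what such a similarity search computes, here is a self-contained TypeScript sketch. It assumes cosine distance as the measure (the quantity pgvector's `<=>` operator returns) and a cut-off of 0.8 chosen arbitrarily for the example. `toyEmbed` is a deliberately crude stand-in that counts character bigrams; it is not what transformers.js produces, and the two sample athletes are made up. In database.build itself the comparison would presumably run in SQL over the stored vectors rather than in application code.

```typescript
// Toy stand-in for an embedding model: counts of lowercase character bigrams.
// A real setup would use model-generated vectors instead (assumption).
function toyEmbed(text: string): Map<string, number> {
  const counts = new Map<string, number>();
  const s = text.toLowerCase();
  for (let i = 0; i + 1 < s.length; i++) {
    const gram = s.slice(i, i + 2);
    counts.set(gram, (counts.get(gram) ?? 0) + 1);
  }
  return counts;
}

// Cosine distance = 1 - cosine similarity; 0 means same direction, 1 means no overlap.
function cosineDistance(a: Map<string, number>, b: Map<string, number>): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((value, key) => {
    normA += value * value;
    dot += value * (b.get(key) ?? 0);
  });
  b.forEach((value) => {
    normB += value * value;
  });
  // Treat an empty vector as maximally distant to avoid dividing by zero.
  if (normA === 0 || normB === 0) {
    return 1;
  }
  return 1 - dot / Math.sqrt(normA * normB);
}

interface Athlete {
  name: string;
  nickName: string;
}

// Keep the names of athletes whose nickname is farther from the name than the threshold.
function dissimilarNickNames(athletes: Athlete[], threshold: number): string[] {
  return athletes
    .filter((a) => cosineDistance(toyEmbed(a.name), toyEmbed(a.nickName)) > threshold)
    .map((a) => a.name);
}

// Made-up sample rows, not taken from the Olympics dataset.
const sample: Athlete[] = [
  { name: "Alexander Smith", nickName: "Alex" },
  { name: "Maria Lopez", nickName: "Rocket" },
];

const result = dissimilarNickNames(sample, 0.8);
console.log(result); // [ 'Maria Lopez' ]
```

The threshold is the main design choice here: "not similar" has no fixed meaning, so any cut-off needs to be tuned against the actual embeddings and data.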
Yes, I can do that PR, but I need your help first. I did get some results, after initially being confused by the id references where I expected the embeddings themselves. Now I'm stuck with this:
Why aren't there at least as many rows in meta.embeddings (currently 6?) as there are rows with unique nick_names in athletes (2,623)?
What is the number of unique nick_names?
Executed SQL
The number of unique nicknames in the athletes table is 2,623. If you need more information or further analysis, feel free to ask!
Show statistics about the meta table.
Executed SQL
Here are the statistics for the meta.embeddings table:
| Statistic | Value |
| --- | --- |
| Total Rows | 6 |
| Unique IDs | 6 |
| Unique Contents | 6 |
P.S. I was just about to send you a PM about this: I think database.build is so cool that I would like to use it in the exercises for my lecture (possibly with our own LLM). Is database.build really just an experiment, or will it still be around next year?