Set Up Embeddings in Default Index Settings #74

Open
alexaryn opened this issue Oct 2, 2023 · 1 comment

Comments


alexaryn commented Oct 2, 2023

Is your feature request related to a problem? Please describe.
Sycamore's out-of-the-box experience doesn't handle embeddings well. Since embeddings are the main use case, the default index settings should provide a k-NN index for them.

Describe the solution you'd like
Default index settings in sycamore.writers.opensearch should be set up for embedding-based indexing and retrieval. Reasonable names should be used consistently for text, embeddings, title, author, etc.
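For illustration, here is a minimal sketch of what such defaults might look like as an OpenSearch index body. The function name `default_index_settings`, the field names (`text`, `title`, `author`, `embedding`), the dimension of 384, and the choice of the `lucene` engine with `l2` distance are all assumptions for the example, not Sycamore's actual defaults.

```python
def default_index_settings(embedding_dimension=384):
    """Return a hypothetical default index body for embedding-based retrieval.

    Field names, dimension, engine, and space type are illustrative
    assumptions, not values taken from Sycamore.
    """
    return {
        # index.knn enables the k-NN plugin for this index.
        "settings": {"index.knn": True},
        "mappings": {
            "properties": {
                "text": {"type": "text"},
                "title": {"type": "text"},
                "author": {"type": "keyword"},
                # knn_vector fields must declare their dimension up front.
                "embedding": {
                    "type": "knn_vector",
                    "dimension": embedding_dimension,
                    "method": {
                        "name": "hnsw",
                        "engine": "lucene",
                        "space_type": "l2",
                    },
                },
            }
        },
    }


settings = default_index_settings()
```

The dimension has to match the embedding model in use, so it is probably the one value that should stay easy to override.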

Describe alternatives you've considered
Pass index_settings to docset.write.opensearch() at every call site. These settings will be roughly 90% copy-pasted, but the copies can diverge over time and cause problems.
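One way to limit that divergence, sketched here under assumptions, is to let callers pass only the values that differ and merge them over shared defaults. The helper `merge_settings` and the `DEFAULTS` dict are hypothetical and not part of Sycamore.

```python
import copy


def merge_settings(defaults, overrides):
    """Recursively merge overrides into a copy of defaults (hypothetical helper)."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge key by key instead of replacing.
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Illustrative shared defaults; field names and dimension are assumptions.
DEFAULTS = {
    "settings": {"index.knn": True},
    "mappings": {
        "properties": {
            "text": {"type": "text"},
            "embedding": {"type": "knn_vector", "dimension": 384},
        }
    },
}

# A caller overrides only the embedding dimension.
custom = merge_settings(
    DEFAULTS,
    {"mappings": {"properties": {"embedding": {"dimension": 768}}}},
)
```

With this shape, a caller states only what is different, and everything else continues to track the shared defaults.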

Additional context
We should also revisit the integration tests to see how/if they can be simplified via defaults.


alexaryn commented Oct 2, 2023

It appears that index_settings can be specified when the index is created explicitly. It also appears that the index can be created implicitly, simply by ingesting a document. We should check whether it matters which way it's done.
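A rough sketch of the two paths, assuming a client object with the same method shapes as opensearch-py's `OpenSearch` client (`indices.exists`, `indices.create`, `index`). The helper names `ensure_index` and `ingest` are hypothetical. One thing worth verifying: with implicit creation, OpenSearch's dynamic mapping would likely infer a plain `float` field for a list of numbers rather than `knn_vector`, and `index.knn` would not be set, so the order may well matter.

```python
def ensure_index(client, index_name, index_settings):
    """Explicit path: create the index with settings if it does not exist yet.

    Returns True if the index was created by this call.
    """
    if client.indices.exists(index=index_name):
        return False
    client.indices.create(index=index_name, body=index_settings)
    return True


def ingest(client, index_name, doc):
    """Ingest one document.

    If the index does not exist, OpenSearch creates it implicitly here,
    without any of the settings passed to ensure_index.
    """
    return client.index(index=index_name, body=doc)
```

Calling `ensure_index` before the first `ingest` keeps the settings under the writer's control regardless of how implicit creation behaves.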
