Is your feature request related to a problem? Please describe.
The out-of-the-box experience for Sycamore doesn't handle embeddings well. Since embedding-based retrieval is the main use case, the default index settings should provide a k-NN index for embeddings.
Describe the solution you'd like
Default index settings in sycamore.writers.opensearch should be set up for embedding-based indexing and retrieval. Reasonable names should be used consistently for text, embeddings, title, author, etc.
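To make the request concrete, here is a rough sketch of what such a default could look like. It assumes the OpenSearch k-NN plugin's `index.knn` setting and `knn_vector` field type. The function name `default_index_body`, the field names (`text`, `embedding`, `title`, `author`), and the dimension of 384 are placeholders for illustration only, not names Sycamore currently uses.

```python
from typing import Any, Dict


def default_index_body(embedding_dim: int = 384) -> Dict[str, Any]:
    """Build a hypothetical default index body for embedding-based retrieval.

    All field names here are illustrative assumptions, not Sycamore's schema.
    """
    return {
        # Turn on k-NN search for the index (OpenSearch k-NN plugin setting).
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                # Raw chunk text, analyzed for keyword search.
                "text": {"type": "text"},
                # Dense vector; the dimension must match the embedding model.
                "embedding": {"type": "knn_vector", "dimension": embedding_dim},
                "title": {"type": "text"},
                "author": {"type": "keyword"},
            }
        },
    }
```

The embedding dimension is left as a parameter because it depends on the embedding model, so it probably can't be a fixed default.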
Describe alternatives you've considered
Pass in index_settings to docset.write.opensearch(). These will be 90% copy-pasta across callers, and the copies can diverge over time and cause problems.
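One way to avoid the copy-paste problem would be to let callers pass only the parts they want to change and merge those over shared defaults. The sketch below is plain Python and doesn't touch Sycamore's API. `merge_index_settings` and the sample `DEFAULTS` dict are hypothetical, and the values are illustrative.

```python
from copy import deepcopy
from typing import Any, Dict, Mapping

# Hypothetical shared defaults; the values are illustrative only.
DEFAULTS: Dict[str, Any] = {
    "settings": {"index": {"knn": True, "number_of_shards": 1}},
    "mappings": {"properties": {"text": {"type": "text"}}},
}


def merge_index_settings(
    defaults: Mapping[str, Any], overrides: Mapping[str, Any]
) -> Dict[str, Any]:
    """Recursively merge overrides into a copy of defaults.

    Nested dicts are merged key by key; any other value replaces the default.
    Neither input is mutated.
    """
    merged = deepcopy(dict(defaults))
    for key, value in overrides.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), dict):
            merged[key] = merge_index_settings(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# A caller changes only the shard count and keeps everything else.
custom = merge_index_settings(
    DEFAULTS, {"settings": {"index": {"number_of_shards": 3}}}
)
```

With something like this, a caller's `index_settings` would only contain the differences, so there is less to drift out of sync with the defaults.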
Additional context
We should also revisit the integration tests to see how/if they can be simplified via defaults.
It appears that it's possible to specify index_settings at the time the index is explicitly created. It also appears possible to have the index created implicitly by simply ingesting a document. We should see if it matters which way it's done.