Releases: raphaelsty/cherche
2.0.0
Excited to announce the release of Cherche 2.0, a comprehensive open-source search engine toolkit for Python. This new version comes with a host of new features and improvements, including:
- Batch-computation
- Optimization
- Progress bars
- Cross-Encoders compatibility
- Focus on retrievers, rankers and indexes compatible with Python.
- Requirements are lighter and modular
1.0.1
Removed the dependency on grpcio, which can cause problems during installation.
1.0.0
What's Changed
Here is an essential update for Cherche! 🥳
- Added compatibility with two new open-source retrievers: Meilisearch and TypeSense.
- Compatibility with the Milvus index to use the retriever.Encoder and retriever.DPR models on massive corpora.
- Compatibility with the Milvus index to store ranker embeddings in a database rather than in memory.
- Progress bar when pre-computing embeddings with the Encoder and DPR retrievers and the Encoder and DPR rankers.
- The path parameter is no longer used.
- All pipelines (voting, intersection, concatenation) produce a similarity score. To do so, the pipeline object applies a softmax to normalize the scores, thus allowing us to "compare" the scores of two distinct models.
- Integration of collaborative filtering models by adding a Recommend retriever and a Recommend ranker (indexed via Faiss and compatible with Milvus) to take users' preferences into account in the search.
Cherche is now fully compatible with large-scale corpora and deeply integrates collaborative filtering. This update retains the previous API and is compatible with previous versions.
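As an illustration of the score normalization that pipelines now apply, here is a minimal sketch of a softmax over raw scores. It is not Cherche's actual code: the function name `softmax` and the two score lists are assumptions made for the example, and the library may apply the normalization over a different set of scores.

```python
import math


def softmax(scores):
    """Map raw scores to positive values that sum to 1 (hypothetical helper)."""
    if not scores:
        return []
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


# Made-up raw scores from two models that live on different scales.
lexical_scores = [12.0, 7.5, 3.1]
cosine_scores = [0.83, 0.79, 0.41]

normalized_lexical = softmax(lexical_scores)
normalized_cosine = softmax(cosine_scores)
```

After normalization both lists lie in the same range and preserve the ranking of each model, which is what makes scores from distinct models roughly comparable inside a voting, intersection or concatenation pipeline.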
0.1.0
Added compatibility with the ONNX environment and quantization to significantly speed up sentence transformers and question answering models. 🏎
It is now possible to choose the type of index for the Encoder and DPR retrievers in order to process the largest corpora while using the GPU.
0.0.9
Voting operator dedicated to retrievers and rankers.
0.0.8
Avoid checking similarities in TF-IDF retrievers while filtering documents.
0.0.7
- Significant improvement in the speed of the TF-IDF retriever by using a sparse CSC matrix.
- The setup.py file loads the readme file as UTF-8.
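The UTF-8 change to setup.py can be pictured with the sketch below. It assumes a helper named `read_long_description` and a temporary README created only for the demonstration; the actual setup.py in the repository may read the file differently.

```python
import tempfile
from pathlib import Path


def read_long_description(path):
    """Read a readme with an explicit encoding (hypothetical helper)."""
    # Passing encoding="utf-8" avoids relying on the platform default,
    # which can fail on non-ASCII characters such as emoji.
    return Path(path).read_text(encoding="utf-8")


# Demonstration on a throwaway file containing non-ASCII characters.
with tempfile.TemporaryDirectory() as tmp:
    readme = Path(tmp) / "README.md"
    readme.write_text("Cherche 🥳 café", encoding="utf-8")
    long_description = read_long_description(readme)
```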
0.0.6
- Update documentation
- Update the Encoder and DPR retrievers; path is optional
- Add deployment documentation
- Update similarity type
- Avoid rounding similarity scores
0.0.5
- Loading and Saving tutorial
- Fuzzy retriever
- Similarities everywhere (retrievers, union, intersection provide similarity scores)
- RAG generation
0.0.4
Update of the Encoder retriever and the DPR retriever. Documents in the Faiss index will not be duplicated. Query embeddings can now be pre-computed for the Encoder ranker and the DPR ranker to speed up evaluation without having to compute them again.
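The idea behind pre-computing query embeddings can be sketched as a small cache keyed by query text. Everything here is hypothetical and independent of Cherche's API: the class `QueryEmbeddingCache`, the `toy_encoder` function and its two-number "embeddings" only stand in for a real sentence encoder.

```python
class QueryEmbeddingCache:
    """Store query embeddings so each query is encoded at most once."""

    def __init__(self, encoder):
        # encoder: callable taking a list of strings, returning one vector per string.
        self.encoder = encoder
        self.store = {}

    def add(self, queries):
        # Keep only unseen queries, dropping duplicates while preserving order.
        missing = list(dict.fromkeys(q for q in queries if q not in self.store))
        if missing:
            for query, embedding in zip(missing, self.encoder(missing)):
                self.store[query] = embedding
        return self

    def __call__(self, query):
        # Fall back to encoding on demand when the query was not pre-computed.
        if query not in self.store:
            self.add([query])
        return self.store[query]


calls = []


def toy_encoder(batch):
    # Stand-in for a real model: records each batch and returns fake vectors.
    calls.append(list(batch))
    return [[float(len(text)), float(text.count(" "))] for text in batch]


cache = QueryEmbeddingCache(toy_encoder)
cache.add(["paris", "new york", "paris"])  # one batched encoder call
first = cache("paris")
second = cache("paris")  # served from the cache, no new encoder call
```

During evaluation the same queries are typically scored many times, so encoding them once in a batch and looking them up afterwards removes the repeated model calls.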