
Commit

Allow to choose faiss index
raphaelsty committed Jun 16, 2022
1 parent b468967 commit fb041ca
Showing 7 changed files with 46 additions and 37 deletions.
2 changes: 1 addition & 1 deletion docs/api/rank/DPR.md
@@ -30,7 +30,7 @@ DPR ranks documents using distinct models to encode the query and document contents.

Path to the file dedicated to storing the embeddings. The ranker will read this file if it already exists to load the embeddings and will update it when documents are added.

- **similarity** – defaults to `<function dot at 0x1793078b0>`
- **similarity** – defaults to `<function dot at 0x144f058b0>`

Similarity measure to compare documents embeddings and query embedding (similarity.cosine or similarity.dot).
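The difference between the two measures can be sketched in plain NumPy (an illustration only; Cherche's `similarity.dot` and `similarity.cosine` may differ in details such as batching):

```python
import numpy as np

def dot(q, d):
    # Unnormalized inner product: sensitive to vector magnitude.
    return float(np.dot(q, d))

def cosine(q, d):
    # Inner product of L2-normalized vectors: magnitude-invariant, in [-1, 1].
    return float(np.dot(q, d) / (np.linalg.norm(q) * np.linalg.norm(d)))

q = np.array([1.0, 2.0, 0.0])
d = np.array([2.0, 4.0, 0.0])

print(dot(q, d))     # 10.0
print(cosine(q, d))  # 1.0 (same direction, different magnitude)
```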

2 changes: 1 addition & 1 deletion docs/api/rank/Encoder.md
@@ -26,7 +26,7 @@ SentenceBert Ranker.

Path to the file dedicated to storing the embeddings. The ranker will read this file if it already exists to load the embeddings and will update it when documents are added.

- **similarity** – defaults to `<function cosine at 0x179307820>`
- **similarity** – defaults to `<function cosine at 0x144f05820>`

Similarity measure to compare documents embeddings and query embedding (similarity.cosine or similarity.dot).

6 changes: 4 additions & 2 deletions docs/api/retrieve/DPR.md
@@ -24,6 +24,8 @@ DPR as a retriever using Faiss Index.

- **path** (*str*) – defaults to `None`

- **index** (*faiss.swigfaiss.IndexFlatL2*) – defaults to `None`


## Attributes

@@ -100,7 +102,7 @@ DPR retriever

???- note "__call__"

Call self as a function.
Search for documents.

**Parameters**

@@ -120,7 +122,7 @@ DPR retriever

**Parameters**

- **tree** (*faiss.swigfaiss.IndexFlatL2*)
- **index** (*faiss.swigfaiss.IndexFlatL2*)
- **documents_embeddings** (*list*)

???- note "dump_embeddings"
6 changes: 4 additions & 2 deletions docs/api/retrieve/Encoder.md
@@ -22,6 +22,8 @@ Encoder as a retriever using Faiss Index.

- **path** (*str*) – defaults to `None`

- **index** (*faiss.swigfaiss.IndexFlatL2*) – defaults to `None`


## Attributes

@@ -97,7 +99,7 @@ Encoder retriever

???- note "__call__"

Call self as a function.
Search for documents.

**Parameters**

@@ -117,7 +119,7 @@ Encoder retriever

**Parameters**

- **tree** (*faiss.swigfaiss.IndexFlatL2*)
- **index** (*faiss.swigfaiss.IndexFlatL2*)
- **documents_embeddings** (*list*)

???- note "dump_embeddings"
2 changes: 1 addition & 1 deletion docs/api/retrieve/Fuzz.md
@@ -18,7 +18,7 @@

Number of documents to retrieve. Defaults to `None`, i.e. all documents that match the query are retrieved.

- **fuzzer** – defaults to `<cyfunction partial_ratio at 0x17b21aad0>`
- **fuzzer** – defaults to `<cyfunction partial_ratio at 0x14555a930>`

[RapidFuzz scorer](https://maxbachmann.github.io/RapidFuzz/Usage/fuzz.html): fuzz.ratio, fuzz.partial_ratio, fuzz.token_set_ratio, fuzz.partial_token_set_ratio, fuzz.token_sort_ratio, fuzz.partial_token_sort_ratio, fuzz.token_ratio, fuzz.partial_token_ratio, fuzz.WRatio, fuzz.QRatio, string_metric.levenshtein, string_metric.normalized_levenshtein

31 changes: 16 additions & 15 deletions docs/retrieve/dpr.md
@@ -47,24 +47,22 @@ If we want to deploy this retriever, we should rely on Pickle to serialize the retriever.
[{'id': 1, 'similarity': 0.01113}, {'id': 0, 'similarity': 0.01113}]
```

## Retriever DPR on GPU
## Index

To speed up the search for the most relevant documents, we can:
`retrieve.DPR` is built on [Faiss indexes](https://github.com/facebookresearch/faiss/wiki/Faiss-indexes) and is compatible with every index structure the library provides. By default it uses an `IndexFlatL2` index, stored in memory and queried on the CPU. Faiss offers a wide range of algorithms suited to different corpus sizes and speed constraints.

- Use the GPU to speed up the DPR model.
- Use the GPU to speed up faiss to retrieve documents.
[Here are the guidelines to choose an index](https://github.com/facebookresearch/faiss/wiki/Guidelines-to-choose-an-index).

To use faiss GPU, we need first to install faiss-gpu; we have to update the attribute `tree` of the retriever with the `faiss.index_cpu_to_gpu` method. After that, Faiss GPU significantly speeds up the search.
Let's create a Faiss index stored in memory that runs on the GPU, with the DPR model also running on the GPU.

```sh
pip install faiss-gpu
```

```python
>>> import faiss

>>> from cherche import retrieve
>>> from sentence_transformers import SentenceTransformer
>>> import faiss

>>> documents = [
... {
@@ -87,9 +85,15 @@
... }
... ]

>>> encoder = SentenceTransformer('facebook-dpr-ctx_encoder-single-nq-base', device="cuda")

>>> d = encoder.encode("Embeddings size.").shape[0]
>>> index = faiss.IndexFlatL2(d)
>>> index = faiss.index_cpu_to_gpu(faiss.StandardGpuResources(), 0, index) # 0 is the id of the GPU

>>> retriever = retrieve.DPR(
... encoder = SentenceTransformer('facebook-dpr-ctx_encoder-single-nq-base', device="cuda").encode,
... query_encoder = SentenceTransformer('facebook-dpr-question_encoder-single-nq-base', device="cuda").encode,
... encoder = encoder.encode,
... query_encoder = SentenceTransformer('facebook-dpr-question_encoder-single-nq-base').encode,
... key = "id",
... on = ["title", "article"],
... k = 2,
@@ -98,12 +102,9 @@

>>> retriever.add(documents)

# 0 is the id of the GPU.
>>> retriever.tree = faiss.index_cpu_to_gpu(faiss.StandardGpuResources(), 0, retriever.tree)

>>> retriever("paris")
[{'id': 0, 'similarity': 0.9025790931437582},
{'id': 2, 'similarity': 0.8160134832855334}]
[{'id': 1, 'similarity': 0.012779952697248447},
{'id': 0, 'similarity': 0.012022932290377224}]
```

## Map keys to documents
@@ -155,7 +156,7 @@ class CustomDPR:
model = CustomDPR()

# Your model should pass these tests, i.e Sentence Bert API.
assert model.documents(["Paris", "France", "Bordeaux"]).shape[0] == 3
assert isinstance(model.documents(["Paris", "France", "Bordeaux"]), np.ndarray)

assert len(model.documents("Paris").shape) == 1
34 changes: 19 additions & 15 deletions docs/retrieve/encoder.md
@@ -55,24 +55,22 @@ If we want to deploy this retriever, we should move the pickle file that contains the embeddings.
{'id': 2, 'similarity': 0.8160134832855334}]
```

## Retriever Encoder on GPU
## Index

To speed up the search for the most relevant documents, we can:
`retrieve.Encoder` is built on [Faiss indexes](https://github.com/facebookresearch/faiss/wiki/Faiss-indexes) and is compatible with every index structure the library provides. By default it uses an `IndexFlatL2` index, stored in memory and queried on the CPU. Faiss offers a wide range of algorithms suited to different corpus sizes and speed constraints.

- Use the GPU to speed up the encoder.
- Use the GPU to speed up faiss to retrieve documents.
[Here are the guidelines to choose an index](https://github.com/facebookresearch/faiss/wiki/Guidelines-to-choose-an-index).

To use faiss GPU, we need first to install faiss-gpu; we have to update the attribute `tree` of the retriever with the `faiss.index_cpu_to_gpu` method. After that, Faiss GPU significantly speeds up the search.
Let's create a Faiss index stored in memory that runs on the GPU, with the sentence-transformer encoder also running on the GPU.

```sh
pip install faiss-gpu
```

```python
>>> import faiss

>>> from cherche import retrieve
>>> from sentence_transformers import SentenceTransformer
>>> import faiss

>>> documents = [
... {
@@ -95,22 +93,28 @@
... }
... ]

>>> encoder = SentenceTransformer("sentence-transformers/all-mpnet-base-v2", device="cuda")

>>> d = encoder.encode("Embeddings size.").shape[0]
>>> index = faiss.IndexFlatL2(d)
>>> index = faiss.index_cpu_to_gpu(faiss.StandardGpuResources(), 0, index) # 0 is the id of the GPU

>>> retriever = retrieve.Encoder(
... key = "id",
... on = ["title", "article"],
... encoder = SentenceTransformer("sentence-transformers/all-mpnet-base-v2", device="cuda").encode,
... encoder = encoder.encode,
... k = 2,
... path = "all-mpnet-base-v2.pkl"
... path = "all-mpnet-base-v2.pkl",
... index = index,
... )

>>> retriever.add(documents)

# 0 is the id of the GPU.
>>> retriever.tree = faiss.index_cpu_to_gpu(faiss.StandardGpuResources(), 0, retriever.tree)

>>> retriever("paris")
[{'id': 0, 'similarity': 0.9025790931437582},
{'id': 2, 'similarity': 0.8160134832855334}]
[{'id': 0, 'similarity': 0.9025790931437582},
{'id': 2, 'similarity': 0.8160134832855334}]
```

## Map keys to documents
@@ -154,7 +158,7 @@ class CustomEncoder:
model = CustomEncoder()

# Your model should pass these tests, i.e Sentence Bert API.
assert model.encode(["Paris", "France", "Bordeaux"]).shape[0] == 3
assert isinstance(model.encode(["Paris", "France", "Bordeaux"]), np.ndarray)

assert len(model.encode("Paris").shape) == 1
