Commit f048200

Merge pull request #226 from NillionNetwork/feat/nilrag-2
Add code segments to nilRAG and performance expectations
2 parents c9f5cb7 + 5bb2f8a commit f048200

File tree

1 file changed

+48
-5
lines changed

docs/build/nilRAG.md

Lines changed: 48 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -46,20 +46,45 @@ Let us deep dive into the entities and their roles in the system.
4646
```
4747
:::
4848
49+
Let's dive a bit more into the example of employee records. First, Data
50+
Owners need to create a schema and a query in SecretVault:
51+
<details>
52+
<summary>Full 1.init_schema_query.py</summary>
53+
```py reference showGithubLink
54+
https://github.com/NillionNetwork/nilrag/blob/main/examples/1.init_schema_query.py
55+
```
56+
</details>
57+
58+
Now that the schema and the query are ready, Data Owners can upload their data:
59+
<details>
60+
<summary>Full 2.data_owner_upload.py</summary>
61+
```py reference showGithubLink
62+
https://github.com/NillionNetwork/nilrag/blob/main/examples/2.data_owner_upload.py
63+
```
64+
</details>
65+
4966
5067
2. **Client:** The client submits a query to search against the data owners'
51-
uploaded files in SecretVault, retrieve the most relevant data, and use the
52-
top-k results for privacy-preserving inference in SecretLLM. Similar to the
53-
encoding by data owners, the query is processed into its corresponding
54-
embeddings.
68+
uploaded files in SecretVault, retrieve the most relevant data, and use the
69+
top-k results for privacy-preserving inference in SecretLLM. Similar to the
70+
encoding by data owners, the query is processed into its corresponding
71+
embeddings.
5572
56-
Going back to our example, the client can query SecretLLM asking about Danielle:
73+
Going back to our example, the client can query SecretLLM asking about Danielle:
5774
:::note Employees Example
5875
```
5976
Who is Danielle Miller?
6077
```
6178
:::
6279
80+
Here is an example of how clients can run such a query:
81+
<details>
82+
<summary>Full 3.client_query.py</summary>
83+
```py reference showGithubLink
84+
https://github.com/NillionNetwork/nilrag/blob/main/examples/3.client_query.py
85+
```
86+
</details>
87+
6388
6489
3. **SecretVault:** SecretVault stores the blinded chunks and embeddings
6590
provided by data owners. When a client submits a query, SecretVault computes
@@ -89,3 +114,21 @@ nilRAG is a standalone library available through
89114
a feature of [SecretLLM](https://docs.nillion.com/build/secretLLM/quickstart) to
90115
enhance the inference with context that has been uploaded to [SecretVault](https://docs.nillion.com/build/secret-vault).
91116
117+
118+
### Performance Expectations
119+
120+
We have performed a series of benchmarks to evaluate the performance of nilRAG.
121+
Currently, nilRAG scales linearly with the number of rows stored in nilDB.
122+
The following table shows the latency of uploading multiple paragraphs, each a few sentences long, to nilDB, as well as the runtime of AI inference using SecretLLM with nilRAG.
123+
124+
| Number of Paragraphs Stored in nilDB | Upload Time to nilDB (sec.) | Query Time (Inference + RAG) (sec.) |
125+
| -------------- | ------------------ | ----------------- |
126+
| 1 | 0.2 | 2.4 |
127+
| 10 | 0.4 | 3.1 |
128+
| 100 | 1.0 | 5.8 |
129+
| 1000 | 10.5 | 13.2 |
130+
| 10000 | 51.3 | 21.9 |
131+
132+
Additionally, the query time for inference with nilRAG increases with the number of concurrent users.
133+
Performing inference with nilRAG over 100 stored paragraphs takes approximately 5 seconds for a single user, while with ten concurrent users the inference time for the same content rises to almost 9 seconds.
134+
We're developing new research to further accelerate nilRAG and make it more scalable. Stay tuned!
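
To make the scaling behaviour in the table easier to read, the following minimal sketch computes how much the upload and query times grow at each 10x increase in stored paragraphs, and the average upload time per paragraph. The figures are copied from the table above; the names `BENCHMARKS`, `growth_factors`, and `per_paragraph_upload_ms` are illustrative only and are not part of nilRAG.

```python
# Figures copied from the benchmark table: (paragraphs, upload_sec, query_sec).
BENCHMARKS = [
    (1, 0.2, 2.4),
    (10, 0.4, 3.1),
    (100, 1.0, 5.8),
    (1000, 10.5, 13.2),
    (10000, 51.3, 21.9),
]


def growth_factors(rows):
    """For each consecutive pair of rows, return
    (paragraphs at the larger size, upload-time ratio, query-time ratio)."""
    factors = []
    for (_, upload_prev, query_prev), (n, upload, query) in zip(rows, rows[1:]):
        factors.append((n, upload / upload_prev, query / query_prev))
    return factors


def per_paragraph_upload_ms(rows):
    """Return (paragraphs, average upload time per paragraph in milliseconds)."""
    return [(n, 1000 * upload / n) for n, upload, _ in rows]


if __name__ == "__main__":
    for n, upload_ratio, query_ratio in growth_factors(BENCHMARKS):
        print(f"up to {n:>5} paragraphs: upload x{upload_ratio:.2f}, query x{query_ratio:.2f}")
    for n, ms in per_paragraph_upload_ms(BENCHMARKS):
        print(f"{n:>5} paragraphs: {ms:.2f} ms upload per paragraph")
```

Each row holds ten times as many paragraphs as the previous one, so a ratio of 10 at a given step would correspond to strictly linear growth over that step. Ratios well below 10 at small sizes suggest that fixed overhead dominates there.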