
Commit 5bb2f8a

Add nilRAG performance expectations
1 parent 0c5f63e commit 5bb2f8a

1 file changed: +18 -0 lines changed
docs/build/nilRAG.md

Lines changed: 18 additions & 0 deletions
@@ -114,3 +114,21 @@ nilRAG is a standalone library available through
a feature of [SecretLLM](https://docs.nillion.com/build/secretLLM/quickstart) to
enhance the inference with context that has been uploaded to [SecretVault](https://docs.nillion.com/build/secret-vault).

### Performance Expectations

We have run a series of benchmarks to evaluate the performance of nilRAG.
Currently, nilRAG scales linearly with the number of rows stored in nilDB.
The following table shows the latency of uploading paragraphs (each a few sentences long) to nilDB, as well as the runtime of AI inference using SecretLLM with nilRAG.

| Number of Paragraphs Stored in nilDB | Upload Time to nilDB (s) | Query Time (Inference + RAG) (s) |
| -------------- | ------------------ | ----------------- |
| 1 | 0.2 | 2.4 |
| 10 | 0.4 | 3.1 |
| 100 | 1.0 | 5.8 |
| 1000 | 10.5 | 13.2 |
| 10000 | 51.3 | 21.9 |
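
To get a rough latency estimate for a paragraph count that falls between two benchmarked rows, one option is to interpolate linearly between the neighboring measurements. The sketch below is illustrative only: the `BENCHMARKS` list and the `estimate` helper are hypothetical names, not part of nilRAG, and the figures are copied from the table above. Actual latency will depend on your deployment.

```python
from bisect import bisect_left

# Benchmark rows copied from the table above:
# (paragraphs stored, upload time in s, query time in s).
BENCHMARKS = [
    (1, 0.2, 2.4),
    (10, 0.4, 3.1),
    (100, 1.0, 5.8),
    (1000, 10.5, 13.2),
    (10000, 51.3, 21.9),
]


def estimate(paragraphs: int) -> tuple[float, float]:
    """Interpolate (upload_s, query_s) linearly between measured points.

    Hypothetical helper for illustration; it is not a nilRAG API.
    """
    sizes = [row[0] for row in BENCHMARKS]
    if not sizes[0] <= paragraphs <= sizes[-1]:
        raise ValueError("outside the benchmarked range; not extrapolating")
    i = bisect_left(sizes, paragraphs)
    if sizes[i] == paragraphs:
        # Exact match with a measured row: return it unchanged.
        return BENCHMARKS[i][1], BENCHMARKS[i][2]
    (n0, u0, q0), (n1, u1, q1) = BENCHMARKS[i - 1], BENCHMARKS[i]
    t = (paragraphs - n0) / (n1 - n0)  # position between the two rows
    return u0 + t * (u1 - u0), q0 + t * (q1 - q0)


if __name__ == "__main__":
    upload_s, query_s = estimate(550)
    print(f"550 paragraphs: upload ~{upload_s:.2f} s, query ~{query_s:.2f} s")
```

For 550 paragraphs, halfway between the 100 and 1000 rows, this yields roughly 5.75 s for upload and 9.5 s for a query. Treat such values as ballpark figures rather than guarantees.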

Query time also increases with the number of concurrent users.
For example, inference with nilRAG over 100 stored paragraphs takes approximately 5 seconds for a single user, while with ten concurrent users the inference time for the same content rises to almost 9 seconds.
We are conducting new research to further accelerate nilRAG and make it more scalable. Stay tuned!
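
If you want to check how concurrency affects query time in your own setup, a minimal sketch along the following lines may help. It assumes you wrap your SecretLLM + nilRAG request in a plain Python callable. The `fake_query` function here is a hypothetical stand-in that only sleeps, and `timed_call` and `mean_latency` are illustrative helper names, not part of any nilRAG API.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def timed_call(query, prompt):
    """Return the wall-clock seconds that a single query call takes."""
    start = time.perf_counter()
    query(prompt)
    return time.perf_counter() - start


def mean_latency(query, prompt, users):
    """Issue `users` simultaneous calls and return their mean latency in seconds."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        latencies = list(
            pool.map(lambda _: timed_call(query, prompt), range(users))
        )
    return mean(latencies)


def fake_query(prompt):
    # Hypothetical stand-in: replace with a real SecretLLM + nilRAG request.
    time.sleep(0.01)
    return "response"


if __name__ == "__main__":
    for users in (1, 10):
        latency = mean_latency(fake_query, "What is nilRAG?", users)
        print(f"{users} concurrent user(s): mean latency {latency:.3f} s")
```

Threads are used here because the work is assumed to be network-bound. With the sleeping stand-in, latency will not grow with the number of users; any growth you observe against a real endpoint reflects server-side contention.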
