python/src/cairo_coder/dspy/retrieval_judge.py
Lines changed: 56 additions & 16 deletions
@@ -29,21 +29,61 @@
 
 # Note: examples here should be auto-generated from using an optimizer.
 class RetrievalRecallPrecision(dspy.Signature):
-    """Compare a system's retrieval response to the query and rate how much it can be leveraged to answer the query.
-
-    When asked to reason, enumerate key ideas in each response, and whether they are present in the expected output.
-    A document is considered useful if it is directly relevant to the query, or if it is informative and can be useful for context.
-
-    For example, if the query is about creating or fixing a smart contract, then, an example of a smart contract, even if not _directly_ related, is considered useful. If the query is about a specific Cairo language feature, then a document about that feature is considered useful.
-
-    If the query is about learning about a concept, like cryptography, STARKs, AIRs, and the document is related to that concept, then it can be considered useful.
-
-    Contract and test templates are always considered useful.
-
-    Examples:
-    - The query asks about writing an AIR with STWO. The provided document is titled 'Basic building blocks' and covers the basic building blocks used to build the Cairo AIR. While it mentions Cairo AIR, it does not specifically address how to write an AIR with STWO but could be useful for context. Therefore, it is indirectly relevant to the user's query about STWO and AIRs.
-    - The query asks about writing an AIR with STWO. The provided document discusses writing a spreadsheet and mentions STWO in the context of SIMD operations and prover speed-up. While it touches upon STWO, it does not explain how to write an AIR with it. The document focuses on creating tables for proofs, which is a related but distinct topic. Therefore, the document is not indirectly relevant to answering the query about writing an AIR and can be kept.
-    - The query asks about writing an ERC20 contract. The provided document is called 'Components in Cairo' and covers the syntax of composable components in Cairo. While it does not directly address the ERC20 contract, it provides valuable context about including components in contracts, which can be useful to integrate the ERC20 Openzeppelin component. Therefore, it is relevant to the user's query about writing an ERC20 contract.
+    """
+    Goal
+    ----
+    Given a user query and a single technical resource (content + minimal metadata),
+    judge how useful the resource is for answering the query.
+
+    How to read inputs
+    ------------------
+    - query: what the user needs. Extract the main intent (task / concept / error) and key entities
 [...]
+      Good reasoning: “Resource <Cairo core arithmetic>: documents `saturating_sub` with examples; directly answers.” → 1.00
+    - Query: “ERC721 policy token on Starknet”
+      Context reasoning: “Resource <Components in Cairo>: explains components/modularity used when composing
+      ERC721; helpful context but not full implementation.” → ~0.50
+    - Query: “fees structure in Starknet”
+      Not useful: “Resource <General L1 gas primer>: EVM-only overview; no Starknet specifics.” → 0.00–0.25
+    - QUERY: "How to write a S-Two AIR"
+      Context reasoning: "Resource <Mersenne Prime>: explains what the Mersenne Prime is and how it's used in S-Two” → ~0.50
     """
 
     query: str = dspy.InputField()
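
The example scores in the new docstring (1.00 for a direct answer, ~0.50 for helpful context, 0.00–0.25 for not useful) suggest three coarse bands. Below is a minimal sketch of how a caller might bucket a `resource_note` along those lines. The helper name `usefulness_band` and the cut-offs 0.25 and 0.75 are assumptions made for illustration; neither appears in this diff.

```python
def usefulness_band(note: float) -> str:
    """Map a resource_note in [0.0, 1.0] to a coarse label.

    The cut-offs (0.25 and 0.75) are illustrative assumptions loosely based
    on the docstring examples; they are not values defined by the signature.
    """
    # Reject anything outside the documented range (this also rejects NaN,
    # because every comparison with NaN is False).
    if not 0.0 <= note <= 1.0:
        raise ValueError(f"note must be in [0.0, 1.0], got {note!r}")
    if note <= 0.25:
        return "not useful"
    if note < 0.75:
        return "context"
    return "direct"


if __name__ == "__main__":
    for score in (1.0, 0.5, 0.1):
        print(score, usefulness_band(score))
```

Raising on out-of-range input, rather than silently bucketing it, is one possible design choice; it makes a misbehaving judge visible early.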
@@ -54,7 +94,7 @@ class RetrievalRecallPrecision(dspy.Signature):
         desc="A short sentence, on why a selected resource will be useful. If it's not selected, reason about why it's not going to be useful. Start by Resource <resource_title>..."
     )
     resource_note: float = dspy.OutputField(
-        desc="A note between 0 and 1.0 on how useful the resource is to directly answer the query. 0 being completely unrelated, 1.0 being very relevant, 0.5 being 'not directly related but still informative and can be useful for context'."
+        desc="Float in [0.0, 1.0] per the scoring anchors."
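
The shortened `desc` states the expected range but cannot enforce it, so downstream code may still want to guard against out-of-range or non-numeric model output, and to check that the reasoning follows the "Resource <resource_title>" prefix requested above. This is a rough sketch under those assumptions. `normalize_note` and `reasoning_names_resource` are hypothetical helpers, not part of this PR or of DSPy, and clamping (instead of rejecting) is just one option.

```python
import math
import re

# Matches a reasoning string that starts with "Resource <some title>".
_REASONING_PREFIX = re.compile(r"^Resource <[^<>]+>")


def normalize_note(raw) -> float:
    """Coerce a raw model output to a float and clamp it into [0.0, 1.0].

    Hypothetical helper: raises ValueError for non-numeric or NaN input.
    """
    note = float(raw)  # raises ValueError for strings such as "high"
    if math.isnan(note):
        raise ValueError("note is NaN")
    return min(1.0, max(0.0, note))


def reasoning_names_resource(reasoning: str) -> bool:
    """Return True if the reasoning starts with 'Resource <title>'."""
    return _REASONING_PREFIX.match(reasoning.strip()) is not None


if __name__ == "__main__":
    print(normalize_note("0.5"))
    print(reasoning_names_resource("Resource <Components in Cairo>: helpful context."))
```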