-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Receipt Leaf Circuit #405
base: feat/receipt-trie
Are you sure you want to change the base?
Receipt Leaf Circuit #405
Conversation
f0c0e28
to
434479c
Compare
Removed some unneeded code used for debugging circuits. |
434479c
to
4bd1f0a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, very nice PR ! Left a bunch of comments but nothing critical
}) | ||
.collect(); | ||
|
||
// We need to express `at` in base 64, we are also assuming that the initial array was smaller than 64^2 = 4096 which we enforce with a range check. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you need to do a range check for the length of an Array
, isn't it a const generic i.e. constant (and therefore a simple assert! for us would be sufficient to make sure we're not abusing this API but as long as we put the constant < 4096 then we should be good ?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could just assert but the paranoid aspect of me (which comes from years of dealing with doing privacy) wanted the actual circuit to enforce that the prover didn't supply an index outside the size of the array possibly resulting in some unwanted behaviour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right the check is done on the at
but the comment says we enforce via a range check
but it's talking about the size of the array so that was confusing for me. Can you quickly re-write the comment then ? All good for the range check on the at
// We need to express `at` in base 64, we are also assuming that the initial array was smaller than 64^2 = 4096 which we enforce with a range check. | ||
// We also check that `at` is smaller that the size of the array. | ||
let array_size = b.constant(F::from_noncanonical_u64(SIZE as u64)); | ||
let less_than_check = less_than_unsafe(b, at, array_size, 12); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why unsafe is safe there ? Can you put that in a comment ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, it was because on line 637 I range check at
to be a 12-bit number anyway so it seemed pointless to do it twice.
let true_target = b._true(); | ||
b.connect(less_than_check.target, true_target.target); | ||
b.range_check(at, 12); | ||
let (low_bits, high_bits) = b.split_low_high(at, 6, 12); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry i'm a bit lost in this function. Can you write an example or just more doc about how you approach the decomposition and how you do the random access and the recombination at the end ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah sure thing
@@ -815,6 +907,51 @@ mod test { | |||
run_circuit::<F, D, C, _>(ValueAtCircuit { arr, idx, exp }); | |||
} | |||
|
|||
#[test] | |||
fn test_random_access_large_array() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you write a quick comparison of gates in either a comment on the function or in the test to see what is actually the gain ? Not asking for a full fledged benchmark but at least an idea of the gain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
mp2-common/src/array.rs
Outdated
/// This is more expensive than [`Self::extract_array`] due to using [`Self::random_access_large_array`] | ||
/// instead of [`Self::value_at`]. This function enforces that the values extracted are within the array. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it is more expensive only if the array is small. Can you say more about when it is beneficial to using this function ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should now be in the code, for the case I tested the old method took 129 rows and this new method took 15.
mp2-v1/src/receipt_extraction/mod.rs
Outdated
}; | ||
use plonky2::field::types::Field; | ||
|
||
/// Calculate `metadata_digest = D(key_id || value_id || slot)` for receipt leaf. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code below doesn't seem to reflect this comment. Can you clarify ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies, I was using some of the already existing code as a starting point and adapting it, I must have forgotten to update the comment I'll fix that now
const H_RANGE: PublicInputRange = 0..PACKED_HASH_LEN; | ||
/// - `K : [6]F` : Length of the transaction index in nibbles | ||
const K_RANGE: PublicInputRange = H_RANGE.end..H_RANGE.end + MAX_INDEX_NIBBLES; | ||
/// `T : F` pointer in the MPT indicating portion of the key already traversed (from 6 → 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
6 should be a const generic right ? why 6 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was working off of the 30 million gas block limit on Ethereum. Since any transaction costs a minimum of 21 000 gas the theoretical maximum number of transactions is 1428 in a block which is 3 bytes long, so 6 nibbles.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correction its two bytes long so should only be 4 nibbles, I forgot how to do maths there for a little bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, can you write this down in the source code then ? 🙏
002ed27
to
af6589f
Compare
mp2-test/src/mpt_sequential.rs
Outdated
@@ -111,7 +112,7 @@ pub fn generate_receipt_proofs() -> Vec<ReceiptProofInfo> { | |||
rt.block_on(async { | |||
// Spin up a local node. | |||
|
|||
let rpc = ProviderBuilder::new().on_anvil(); | |||
let rpc = ProviderBuilder::new().on_anvil_with_config(|anvil| Anvil::block_time(anvil, 1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mhhh why are you introducing time effects now ? I don't understand the reason. Is it to have multiple tx inside a block or stg ?
Couldn't we use multicalls instead for example ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was to have multiple transactions in a block, I did not know the multicall functionality existed so I'll update it to use that instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's not too much more complicated it'd be great. Reason is i have personally very bad experience relying on timing in tests in CI, it's often randomly failing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it turns out Multicall as a crate didn't play nice with the version of alloy that we were using. So instead I remembered that I can just make Anvil
only mine blocks when its told to rather than at set intervals. So we now simply send all the transactions and mine the block after they have all been sent.
6a94288
to
f4d4a4b
Compare
The latest commit does a number of things. Firstly it rebases the branch on to a more up to date version of the main repository. It also makes changes to reflect the documentation as per the notion page. We now treat receipt extraction as a subset of value extraction, along with simple and mapping. The APIs in In In SInce these commits have made some sizeable changes to the previous state of the PR if @nicholas-mainardi could look over again to check the logic as well as @nikkolasg that would be much appreciated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dense PR but really good work given the complexity of the circuits. Main changes are:
- Define methods to compute column ids uniquely for the receipt trie table
- Bind transaction index to the MPT key used to extract leaves
- Compute values digests coherently with the structure expected by the DB creation circuit after generic struct extraction
mp2-common/src/array.rs
Outdated
let arrays: Vec<Array<T, RANDOM_ACCESS_SIZE>> = (0..padded_size) | ||
.map(|i| Array { | ||
arr: create_array(|j| { | ||
let index = 64 * i + j; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
64 should be RANDOM_ACCESS_SIZE
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, will change
mp2-common/src/array.rs
Outdated
let less_than_check = less_than_unsafe(b, at, array_size, 12); | ||
let true_target = b._true(); | ||
b.connect(less_than_check.target, true_target.target); | ||
b.range_check(at, 12); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this range-check redundant? It seems to me that at
will be implicitly range-checked inside split_low_high
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it is, I forgot that it performed a range check inside split_low_high
so I'll remove it
|
||
// We know that the rlp encoding of the compact encoding of the key is going to be in roughly the first 10 bytes of | ||
// the node since the node is list byte, 2 bytes for list length (maybe 3), key length byte (1), key compact encoding (4 max) | ||
// so we take 10 bytes to be safe since this won't effect the number of random access gates we use. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIU, the value 10 depends on the maximum number of nibbles we expect for transaction index, which should be 4. So, shouldn't we compute this constant depending from the maximum number of nibbles? Otherwise, if we decide to change this upper bound later on, then this estimation will no longer hold
|
||
/// Given a `PartitionWitness` that has only inputs set, populates the rest of the witness using the | ||
/// given set of generators. | ||
pub fn debug_generate_partial_witness< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this debugging utility. So, it looks to me that this method is basically a copy of generate_partial_witness
of Plonky2 with some print information, is it correct? If it is, then wdyt about modifying directly the method in our Plonky2 fork (maybe adding a debug input parameter that specifies whether to print out this data or not)? In this way, we don't duplicate the logic. Wdyt?
/// The topics for this Log | ||
topics: [LogColumn; 3], | ||
/// The extra data stored by this Log | ||
data: [LogColumn; 2], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we use constants in place of integers in this struct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For data
yes we probably should, but topics
is hard limited by solidity at 4 (with the first being the event signature hash) so I think it should be fine to leave this as three rather than add another constant.
b: &mut CircuitBuilder<GoldilocksField, 2>, | ||
block_pi: &[Target], | ||
value_pi: &[Target], | ||
) -> ReceiptExtractionWires { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to return wires here? build
should return wires either if there are input wires that need to be assigned, or if this a gadget with output wires that need to be employed by other circuit components. But it looks to me we are in neither of these 2 cases.
// We also keep track of which log this is in the receipt to avoid having identical rows in the table in the case | ||
// that the event we are tracking can be emitted multiple times in the same transaction but has no topics or data. | ||
let log_number = b.constant(F::from_canonical_usize(index + 1)); | ||
let log_no_digest = b.map_to_curve_point(&[one, log_number]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as for gas used digest: I think we should employ a column identifier provided as input and computed off-circuit rather than the hard-coded constant 1
f07a164
to
04f6d13
Compare
This PR adds the functionality for verifying leaves in Receipt MPT Trie in a circuit, together with helper functions for extracting relevant information about specific event from and extra functionality for extract values from large arrays in circuit.