Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Receipt Leaf Circuit #405

Open
wants to merge 15 commits into
base: feat/receipt-trie
Choose a base branch
from
Open

Receipt Leaf Circuit #405

wants to merge 15 commits into from

Conversation

Zyouell
Copy link
Contributor

@Zyouell Zyouell commented Nov 8, 2024

This PR adds the functionality for verifying leaves in Receipt MPT Trie in a circuit, together with helper functions for extracting relevant information about specific event from and extra functionality for extract values from large arrays in circuit.

@Zyouell
Copy link
Contributor Author

Zyouell commented Nov 8, 2024

Removed some unneeded code used for debugging circuits.

@Zyouell Zyouell force-pushed the zyouell/receipt-leaf branch from 434479c to 4bd1f0a Compare November 8, 2024 14:23
Copy link
Collaborator

@nikkolasg nikkolasg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, very nice PR ! Left a bunch of comments but nothing critical

})
.collect();

// We need to express `at` in base 64, we are also assuming that the initial array was smaller than 64^2 = 4096 which we enforce with a range check.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need to do a range check for the length of an Array , isn't it a const generic i.e. constant (and therefore a simple assert! for us would be sufficient to make sure we're not abusing this API but as long as we put the constant < 4096 then we should be good ?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could just assert but the paranoid aspect of me (which comes from years of dealing with doing privacy) wanted the actual circuit to enforce that the prover didn't supply an index outside the size of the array possibly resulting in some unwanted behaviour.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right the check is done on the at but the comment says we enforce via a range check but it's talking about the size of the array so that was confusing for me. Can you quickly re-write the comment then ? All good for the range check on the at

// We need to express `at` in base 64, we are also assuming that the initial array was smaller than 64^2 = 4096 which we enforce with a range check.
// We also check that `at` is smaller that the size of the array.
let array_size = b.constant(F::from_noncanonical_u64(SIZE as u64));
let less_than_check = less_than_unsafe(b, at, array_size, 12);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why unsafe is safe there ? Can you put that in a comment ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, it was because on line 637 I range check at to be a 12-bit number anyway so it seemed pointless to do it twice.

let true_target = b._true();
b.connect(less_than_check.target, true_target.target);
b.range_check(at, 12);
let (low_bits, high_bits) = b.split_low_high(at, 6, 12);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry i'm a bit lost in this function. Can you write an example or just more doc about how you approach the decomposition and how you do the random access and the recombination at the end ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah sure thing

@@ -815,6 +907,51 @@ mod test {
run_circuit::<F, D, C, _>(ValueAtCircuit { arr, idx, exp });
}

#[test]
fn test_random_access_large_array() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you write a quick comparison of gates in either a comment on the function or in the test to see what is actually the gain ? Not asking for a full fledged benchmark but at least an idea of the gain.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 664 to 678
/// This is more expensive than [`Self::extract_array`] due to using [`Self::random_access_large_array`]
/// instead of [`Self::value_at`]. This function enforces that the values extracted are within the array.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it is more expensive only if the array is small. Can you say more about when it is beneficial to using this function ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should now be in the code, for the case I tested the old method took 129 rows and this new method took 15.

};
use plonky2::field::types::Field;

/// Calculate `metadata_digest = D(key_id || value_id || slot)` for receipt leaf.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code below doesn't seem to reflect this comment. Can you clarify ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies, I was using some of the already existing code as a starting point and adapting it, I must have forgotten to update the comment I'll fix that now

const H_RANGE: PublicInputRange = 0..PACKED_HASH_LEN;
/// - `K : [6]F` : Length of the transaction index in nibbles
const K_RANGE: PublicInputRange = H_RANGE.end..H_RANGE.end + MAX_INDEX_NIBBLES;
/// `T : F` pointer in the MPT indicating portion of the key already traversed (from 6 → 0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

6 should be a const generic right ? why 6 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was working off of the 30 million gas block limit on Ethereum. Since any transaction costs a minimum of 21 000 gas the theoretical maximum number of transactions is 1428 in a block which is 3 bytes long, so 6 nibbles.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correction its two bytes long so should only be 4 nibbles, I forgot how to do maths there for a little bit.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, can you write this down in the source code then ? 🙏

rustc-ice-2024-11-04T12_36_50-74186.txt Outdated Show resolved Hide resolved
rustc-ice-2024-11-04T12_37_01-74253.txt Outdated Show resolved Hide resolved
rustc-ice-2024-11-04T12_37_13-74307.txt Outdated Show resolved Hide resolved
@Zyouell Zyouell force-pushed the zyouell/receipt-leaf branch 2 times, most recently from 002ed27 to af6589f Compare November 11, 2024 19:11
@@ -111,7 +112,7 @@ pub fn generate_receipt_proofs() -> Vec<ReceiptProofInfo> {
rt.block_on(async {
// Spin up a local node.

let rpc = ProviderBuilder::new().on_anvil();
let rpc = ProviderBuilder::new().on_anvil_with_config(|anvil| Anvil::block_time(anvil, 1));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mhhh why are you introducing time effects now ? I don't understand the reason. Is it to have multiple tx inside a block or stg ?
Couldn't we use multicalls instead for example ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was to have multiple transactions in a block, I did not know the multicall functionality existed so I'll update it to use that instead.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's not too much more complicated it'd be great. Reason is i have personally very bad experience relying on timing in tests in CI, it's often randomly failing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it turns out Multicall as a crate didn't play nice with the version of alloy that we were using. So instead I remembered that I can just make Anvil only mine blocks when its told to rather than at set intervals. So we now simply send all the transactions and mine the block after they have all been sent.

@Zyouell Zyouell force-pushed the zyouell/receipt-leaf branch from 6a94288 to f4d4a4b Compare December 2, 2024 13:53
@Zyouell
Copy link
Contributor Author

Zyouell commented Dec 2, 2024

The latest commit does a number of things. Firstly it rebases the branch on to a more up to date version of the main repository. It also makes changes to reflect the documentation as per the notion page.

We now treat receipt extraction as a subset of value extraction, along with simple and mapping. The APIs in value_extraction/api.rs have been updated to reflect this along with added unit testing for the receipt case.

In block_extraction/circuit.rs we now specify the type of extraction we are performing, this allows us to select which of the three MPT roots we wish to verify against depending on if this extraction is to do with storage, receipts or transactions. The relevant type is selected using an enum.

In final_extraction we add a new circuit that does not use BaseCircuit since receipt extraction does not require verifying account storage roots in the state trie root.

SInce these commits have made some sizeable changes to the previous state of the PR if @nicholas-mainardi could look over again to check the logic as well as @nikkolasg that would be much appreciated.

Copy link
Contributor

@nicholas-mainardi nicholas-mainardi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dense PR but really good work given the complexity of the circuits. Main changes are:

  • Define methods to compute column ids uniquely for the receipt trie table
  • Bind transaction index to the MPT key used to extract leaves
  • Compute values digests coherently with the structure expected by the DB creation circuit after generic struct extraction

let arrays: Vec<Array<T, RANDOM_ACCESS_SIZE>> = (0..padded_size)
.map(|i| Array {
arr: create_array(|j| {
let index = 64 * i + j;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

64 should be RANDOM_ACCESS_SIZE?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, will change

let less_than_check = less_than_unsafe(b, at, array_size, 12);
let true_target = b._true();
b.connect(less_than_check.target, true_target.target);
b.range_check(at, 12);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this range-check redundant? It seems to me that at will be implicitly range-checked inside split_low_high

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it is, I forgot that it performed a range check inside split_low_high so I'll remove it


// We know that the rlp encoding of the compact encoding of the key is going to be in roughly the first 10 bytes of
// the node since the node is list byte, 2 bytes for list length (maybe 3), key length byte (1), key compact encoding (4 max)
// so we take 10 bytes to be safe since this won't effect the number of random access gates we use.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIU, the value 10 depends on the maximum number of nibbles we expect for transaction index, which should be 4. So, shouldn't we compute this constant depending from the maximum number of nibbles? Otherwise, if we decide to change this upper bound later on, then this estimation will no longer hold


/// Given a `PartitionWitness` that has only inputs set, populates the rest of the witness using the
/// given set of generators.
pub fn debug_generate_partial_witness<
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this debugging utility. So, it looks to me that this method is basically a copy of generate_partial_witness of Plonky2 with some print information, is it correct? If it is, then wdyt about modifying directly the method in our Plonky2 fork (maybe adding a debug input parameter that specifies whether to print out this data or not)? In this way, we don't duplicate the logic. Wdyt?

/// The topics for this Log
topics: [LogColumn; 3],
/// The extra data stored by this Log
data: [LogColumn; 2],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we use constants in place of integers in this struct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For data yes we probably should, but topics is hard limited by solidity at 4 (with the first being the event signature hash) so I think it should be fine to leave this as three rather than add another constant.

mp2-v1/src/final_extraction/receipt_circuit.rs Outdated Show resolved Hide resolved
mp2-v1/src/final_extraction/receipt_circuit.rs Outdated Show resolved Hide resolved
b: &mut CircuitBuilder<GoldilocksField, 2>,
block_pi: &[Target],
value_pi: &[Target],
) -> ReceiptExtractionWires {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to return wires here? build should return wires either if there are input wires that need to be assigned, or if this a gadget with output wires that need to be employed by other circuit components. But it looks to me we are in neither of these 2 cases.

// We also keep track of which log this is in the receipt to avoid having identical rows in the table in the case
// that the event we are tracking can be emitted multiple times in the same transaction but has no topics or data.
let log_number = b.constant(F::from_canonical_usize(index + 1));
let log_no_digest = b.map_to_curve_point(&[one, log_number]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as for gas used digest: I think we should employ a column identifier provided as input and computed off-circuit rather than the hard-coded constant 1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants