Skip to content

Conversation

@silathdiir
Copy link
Contributor

@silathdiir silathdiir commented Jan 23, 2024

Related issue https://github.com/Lagrange-Labs/mapreduce-plonky2/issues/34

Summary

  • implement standard Poseidon hash, reference the hash_n_to_m_no_pad function in plonky2.
  • implement 16-arity digest circuit, handle differently for the leaf and branch of Merkle tree, the leaf input is type of [u8; 32] and the branch inputs is Vec<[F; 4]>.
  • The leaf input needs to be converted from [u8; 32] to [u32; 8], then make the hash.
  • add an unit test test_standard_poseidon for hash once.
  • add a recursive test to prove and verify with Merkle tree in src/benches/merkle_tree.rs.

@silathdiir silathdiir requested a review from nikkolasg January 23, 2024 06:37
@silathdiir silathdiir marked this pull request as draft January 23, 2024 06:38
@silathdiir silathdiir changed the title [WIP] feat: implement MPT digest [WIP] feat: implement 16-arity digest circuit Jan 25, 2024
Copy link
Collaborator

@nikkolasg nikkolasg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice first PR ! Feels to me it's almost there, I left a bunch of questions and comments.
One general thing though is this code drastically needs more comments. Comments on all structs, and most functions is strictly required ! I will be super annoying on this, sorry already, but the goal is to create a library easy to read and understand and comments is the absolute minimum.
Thank you 🙏

Comment on lines 63 to 65
let init_proof = circuit.prove_init(U::new(vec![])).unwrap().0;

*proof_result = Some(prove_once(circuit, vec![(input, init_proof)]));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I understand why are you proving twice for the leaves ? prove_init already returns a proof so you can directly give that proof to its parent in the tree.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, fix to use proof result of proof_init for leaves in commit 0c2ae16:

*proof_result = Some(circuit.prove_init(U::new(inputs)).unwrap().0);

max_level: u64,
}

impl DigestTree<F, C, D> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can call it a merkle tree in this module ;)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to MerkleTree in commit a77bb13.

Branch(
Vec<Self>,
HashOut<F>,
Option<ProofWithPublicInputs<F, C, D>>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. While it's technically correct, there is no real reasons to keep the children proofs as option here because we know exactly how many are there, we don't care outside the circuit about the dummy proofs, it's only when you proof you need to fill the array
  2. This assumes (and it's what we want) that the children are always arranged from left to right, i.e. there are only real proofs then followed by dummy proofs. No [real, dummy, dummy, real ... ] for example. I think it's worth noting this in a comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comments in commit dec3381.
Another reason of setting proof to Option is that proofs haven't been generated when creating the tree, it's set to None when init. If I understand right, proofs are generated when proving from lowest leaves to high branches recursively.

src/digest.rs Outdated
Comment on lines 62 to 108
inputs
.chunks(SPONGE_RATE)
.enumerate()
.for_each(|(i, chunks)| {
let chunk_offset = SPONGE_RATE * i;

chunks.iter().enumerate().for_each(|(i, elt)| {
let elt_offset = b.constant(F::from_canonical_usize(chunk_offset + i));
let is_elt = less_than(b, elt_offset, real_len, 8);

let elt = b.select(is_elt, *elt, zero);
state.set_elt(elt, i);
});

let new_state = b.permute::<PoseidonHash>(state);
let chunk_offset = b.constant(F::from_canonical_usize(chunk_offset));
let is_new_state = less_than(b, chunk_offset, real_len, 8);

let elts: Vec<_> = state
.as_ref()
.iter()
.zip(new_state.as_ref())
.map(|(elt, new_elt)| b.select(is_new_state, *new_elt, *elt))
.collect();

elts.into_iter()
.enumerate()
.for_each(|(i, elt)| state.set_elt(elt, i));
});

let mut outputs = Vec::with_capacity(NUM_HASH_OUT_ELTS);
loop {
for &s in state.squeeze() {
outputs.push(s);
if outputs.len() == NUM_HASH_OUT_ELTS {
let output = HashOutTarget::from_vec(outputs);

return Self::Wires {
inputs,
real_len,
output,
};
}
}
state = b.permute::<PoseidonHash>(state);
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be great if this logic can be extracted in its own function, thoroughly commented and tested.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another reason is that this could be used as an alternative to the hash to curve design if it turns out to be too expensive. The api should be the same hash this array of [Option<F>].

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And once extracted, we can just call that new function inside build().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extract this Poseidon hash logic to build_standard_poseidon function, and add a test case in commit fcd4e80 🙏

src/digest.rs Outdated

let new_state = b.permute::<PoseidonHash>(state);
let chunk_offset = b.constant(F::from_canonical_usize(chunk_offset));
let is_new_state = less_than(b, chunk_offset, real_len, 8);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be real_len but offset to the next SPONGE_RATE chunk ?
i.e. sponge_rate = 3, real_len = 5, so first iteration passes, but second does not change the state while it should (2 permutation to do normally, one with 3 non zero elts, one with 2 non zero elts) ?
In this case, we should compare 3 * 2 with real_len "padded" to 6, so 6=6, the second iteration updates the state.
Does this make sense ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I fixed the variable name chunk_offset to chunk_start. The data of [chunk_start..chunk_start + SPONGE_RATE] should be added if chunk_start < real_len.
In above example, the offset (chunk_start) starts from zero, both first and second iterations should pass, since both 0 * sponge_rate and 1 * sponge_rate are less than 5.

I alos added comments to this Poseidon calculation process in commit 37a6f42.

src/digest.rs Outdated

let mut outputs = Vec::with_capacity(NUM_HASH_OUT_ELTS);
loop {
for &s in state.squeeze() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SHouldn't it already spit out NUM_HASH_OUT_ELTS normally, always ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I referenced the hash_n_to_m_no_pad code before.
I check the SPONGE_RATE = 8 and NUM_HASH_OUT_ELTS = 4, then simply this code in commit c3cab5e.

let (inputs, proofs): (Vec<_>, Vec<_>) = inputs_proofs.into_iter().unzip();

let dummy_n = ARITY - proofs.len();
let proofs: [Option<ProofWithPublicInputs<F, C, D>>; ARITY] = proofs
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Especially because you're already building the array with the None here, you don't need to keep the options in the node structs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems the same issue as I replied in this #37 (comment).

@silathdiir silathdiir changed the title [WIP] feat: implement 16-arity digest circuit feat: implement 16-arity digest circuit Jan 29, 2024
@silathdiir silathdiir requested a review from nikkolasg January 29, 2024 04:44
@silathdiir silathdiir marked this pull request as ready for review January 29, 2024 04:44
Copy link
Collaborator

@nikkolasg nikkolasg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's great !
Ironically, hopefully we won't use it, but at least we have plan B in place now :) And you very quickly upped your plonky2 skills, well done !!!

LGTM !

src/digest.rs Outdated
Comment on lines 67 to 69
let mut inputs: Vec<_> = value.into_iter().map(F::from_canonical_u8).collect();
inputs.resize(ARITY * 4 + 28, F::ZERO);
let inputs = inputs.try_into().unwrap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let mut inputs: Vec<_> = value.into_iter().map(F::from_canonical_u8).collect();
inputs.resize(ARITY * 4 + 28, F::ZERO);
let inputs = inputs.try_into().unwrap();
let inputs: [F;ARITY*4+28] = core::array::from_fn(|i| if i < 32 { F::from_canonical_u8(value[i]) } else { F::ZERO });

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(not sure if it compiles but i just love this method for creating arrays!)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update to use this method in commit 2c295e9.

@silathdiir silathdiir merged commit 6a36ff9 into main Feb 5, 2024
@silathdiir silathdiir deleted the feat/digest-tree branch February 5, 2024 08:53
lopeetall pushed a commit that referenced this pull request Feb 22, 2024
* Implement a simple version of MPT digest with ARITY 16.

* Add `src/benches/digest_tree.rs` to test recursive digest circuit.

* Update digest circuit recursive test.

* Finish 16-arity digest circuit.

* Fix to use init proof for the leaves (only prove once).

* Rename digest tree to merkle tree.

* Fix failed merkle tree test.

* Add comments for child proof arrangement.

* Add comments to the poseidon process in circuit.

* Simply the Poseidon squeeze code.

* Add more comments to circuit and test.

* Extract standard Poseidon hash logic to `build_standard_poseidon`, and add an unit test.

* fix on CyclicCircuit assign. to empty proof wires

* removed comments

* Fix the leaf value to `[u8; 32]`, and handle differently with branch.

* Fix to use `core::array` to build inputs argument.

---------

Co-authored-by: nikkolasg <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants