Skip to content

Conversation

@lopeetall
Copy link
Contributor

Trait for MapReduce computations

Each Map needs an input type and output type, each Reduce needs an input type (which is the same as its output type).

The input and output types are DataItems: an atomic piece of data that can be represented as a vector of field elements.

To define a Map computation you must provide a function that converts between the input type and the output type and a function adding constraints to a CircuitBuilder which verify the transformation. Defining a Reduce is similar except that the functions take a pair of inputs and produce an output of the same type.

Then a MapReduce computation is an ordered list of Maps and Reduces.

Copy link
Collaborator

@nikkolasg nikkolasg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good starting point ! I think we need some iteration over this PR to make sure we are both aligned on the vision and the usage of its API.
Left some long comments :D

&self,
builder: &mut CircuitBuilder<F, D>,
) -> Vec<Target> {
let cons = builder.constants(&self.encode());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should this be a constant ? Isn't it a variable if it's in a public input ? (otherwise what's the point of exposing a constant as public input, since both verifier and prover would know it already)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep thanks for catching. This should be a non-constant target

}
}

struct PublicByteString(Vec<u8>);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you implement the trait directly on Vec<u8> rather a wrapper ? I'm not sure I see the utility of having the wrapper here, maybe I'm missing something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are opportunities to simplify here but I want to think it through. I realized yesterday that every DataItem is going to correspond to a list of field elements and so maybe a trait isn't necessary and a struct will do. Some kind of wrapper holding metadata or doing out-of-circuit bookkeeping might still be necessary. I'm thinking this over

&self,
builder: &mut CircuitBuilder<F, D>,
) -> Vec<Target> {
let cons = builder.constants(&self.encode());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.


/// An instance of DataItem must provide a function that encodes
/// the data it contains as a vector of field elements
fn encode<F: RichField + Extendable<D>, const D: usize>(&self) -> Vec<F>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given F and D are all present in each function, I'd suggest making it part of the trait constant, wdyt ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good to me!

type Output: DataItem;

// The map computation to be performed on a single item.
// The input and output can have different types but each
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmmhh... that is not sure.

I come from the cyclic circuit place and here, you must have the same input output pair for all the circuit, it's basically an IVC, one circuit you recurse over.

I'm ok to keep it like this here but can you note that in the case of cyclic, there must be only ONE input/output set ?

BTW in this case, we would need to have a MapReduce trait that define the union of the public inputs of the Map and Reduce step, don't we ? Can we do that ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the issue here. The Input and Output types can be made the same.

An initial Map in an IVC chain could be written as a "different" map which doesn't take in a proof, only initial data. Then there could be a different cyclic map following whose input and outputs are the same.

I(x) = (y, Proof)
C(y, Proof) = (z, Proof)

output = C(C(C(...(I(x))))

Or, a dummy proof could be supplied for the initial map allowing the initial map to be identical to the cyclic map. Both approaches are doable with this setup.

output = C(C(C(...C(x, DummyProof))))

What would need added to the MapReduce trait is a way to compose n Maps of the same kind by supplying n instead of manually writing them out. This would expand out to one of the two approaches above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for your last paragraph I'm not sure what you mean. Can you give an example where we need this? (I'm sure we can do it but a concrete example would be helpful.)

BTW since a Union is a binary operation on sets it can be formulated as a Reduce in this framework

Comment on lines 46 to 47
input: &Self::Input,
output: &Self::Output,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe this is gonna work when we apply the separation of building vs proving.
During building time, we have no inputs at all. That would be up to the trait to decide if it must instantiate one, two etc Target as u64 etc but we can not give directly the inputs during building time.
It would be great if you can re-work the traits so it fits the build and prove steps (similar to the ones I made in the #21 PR)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I imagine things, "Proving" is a kind of Map, and proofs can be part of DataItems

The way I have thought about it is that each Map or Reduce takes in the input data and a mutable builder, adds its own constraints, and produces its output data with its eval method. When we reach a point where we want to produce a proof we can apply a special Map whose eval method builds the circuit that has accumulated and produces a proof for it, which becomes part of its output data. This gives us the flexibility choose when proofs are produced in the chain of computations instead of being forced to prove after each computational step.

Self { map, reduce }
}

fn apply<F: RichField + Extendable<D>, const D: usize>(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not seeing in which situation do we want to apply map AND reduce in the same circuit.
Could we just separate the map and reduce step in two functions (and therefore have a build_map / build_reduce and prove_map and prove_reduce?

The reason I'm saying this is because I have the following two different mental model:

1. Different Circuit
Where we have defined a circuit for the map, and a circuit for the reduce step. Ideally that's what we would like to do but recursion in plonky2 and our settings (MPT) make it a bit hard (but maybe we can do it).

struct MapLagrange { m: MapReduce }
impl MapLagrange {
    fn build_map( .... ) {
        let wires = self.m.map.build();
     }
    fn prove_map( ... ) {
        self.m.map.prove(wires);
    }
    fn build_reduce( ... ) {
        let wires = self.m.reduce.build();
   }
   fn prove_reduce ( ... ) {
       self.m.reduce.prove(wires);
    }
}

2. Unified Circuit
This is in the case where we are stuck with a single monolithic circuit that we can't divide, because of the verification keys problem. So in this case, we have to apply both map and reduce and "filter" the output by having
a bool or some indication that we are in a map or reduce case.

struct LagrangeCircuit{ m: MapReduce }

impl LagrangeCircuit {
   fn build( ... ) {
       let is_map = builder.new_virtual_target();
       // the wires that need assignement during proving
       let map_wires = self.m.build_map();
       // the wires that need assignement during proving
       let reduce_wires = self.m.build_reduce();
       let map_output = self.m.map_outputs(map_wires);
       let reduce_output = self.m.reduce_outputs(reduce_wire);
       let outputs  = vec![Target(0); MAX_OUTPUT];
       for i in 0..MAX_OUTPUT {
           outputs[i] = builder.conditional(is_map, map_output[i], reduce_output[i]);
       }
      b.register_public_outputs(outputs);
     // save wires to be assigned to save them on disk etc
   }
   fn prove(map_wires, reduce_wires, map_or_reduce: bool, ...  ) {
       self.m.prove_map(map_wires);
       self.m.prove_reduce(reduce_wires);
       pw.assign_target(is_map, map_or_reduce);
   }
}

This is the first time I wrote it down like this so no worries if it's new for you :D
Does that make sense to you ? How have you seen the integration in your head while writing the traits ? It feels different than mine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are on the same page. The MR trait is designed to be flexible. A Map m:A->A followed by a Reduce r: A×A-> A can be formulated as a single Reduce r(m(x), m(y)) in this framework if we wish. Or they can be separated with a proving step in-between. Whichever we need to optimize the circuit according to the VK issue.

Users might find it convenient to separate complex computations into smaller steps for ease of writing and reasoning about the computation, but I don't want the proving points to be dictated by their choices.

// A function adding constraints to a circuit builder which should
// be satisfied by the reduce computation in `eval`. That is,
// if eval(x, y) = z, then circuit(x, y, z) should be true.
fn add_constraints<F: RichField + Extendable<D>, const D: usize>(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, we can't have real values at building time, it's only about creating wires.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this at all. We can have "real values" but not do anything with them, right? Can you shed some light?

Comment on lines 159 to 119
let map_outs = self.map.apply_map(inputs, builder);
self.reduce.apply_reduce(&map_outs, builder)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrt my previous comment, I don't see in which situation we apply map AND reduce in circuit at the same time. Can you share some lights ? And if it's a valid scenario, then we should probably let the user choose if he wants to apply one or the other or both no ?

lopeetall and others added 4 commits January 23, 2024 16:45
* Implement a simple version of MPT digest with ARITY 16.

* Add `src/benches/digest_tree.rs` to test recursive digest circuit.

* Update digest circuit recursive test.

* Finish 16-arity digest circuit.

* Fix to use init proof for the leaves (only prove once).

* Rename digest tree to merkle tree.

* Fix failed merkle tree test.

* Add comments for child proof arrangement.

* Add comments to the poseidon process in circuit.

* Simply the Poseidon squeeze code.

* Add more comments to circuit and test.

* Extract standard Poseidon hash logic to `build_standard_poseidon`, and add an unit test.

* fix on CyclicCircuit assign. to empty proof wires

* removed comments

* Fix the leaf value to `[u8; 32]`, and handle differently with branch.

* Fix to use `core::array` to build inputs argument.

---------

Co-authored-by: nikkolasg <[email protected]>
silathdiir and others added 5 commits February 22, 2024 16:12
* Add a trait for digest circuits, both the arity circuit and multiset hashing circuit implement this trait, and could be reused in the same benchmark and testing functions.

* Initialize multiset hashing circuit.

* Extract Merkle tree structures.

* Add test case for multiset hashing circuit.

* Extract field extension and hash-to-field functions.

* Add crate `plonky2_ecgfp5` to it's `Point`, `CurveTarget` and associated functions.

* Rename `benches/digest/arity.rs` to `canonical_hashing.rs`.

* Update dep `ahash`.

* Implement `map_to_curve_point` for field arithmetic.

* Implement `map_to_curve_target` for circuit.

* Fix wrong io number in circuit.

* Add `map_to_curve` unit tests.

* Fix test case.

* Delete `child_num`, and add `is_leaf` flag.

* Add `debug_assertions` to `set_curve_target` code.

* Add more comments for Merkle tree functions, and `debug_assertions`.

* Fix to use `flat_map` when constructs inputs.

* Update `plonky2_ecgfp5`.

* Refactor to `group_hashing`.

* Add `CircuitBuilderGroupHashing` trait for circuit builder.

* Update comments.

* Update comments.

* Add `num_gates` benchmark.

* Rename public functions in group-hashing.

* Fix to use constants of inputs and outputs in `test_simple_swu_for_curve_point`.

* Fix to check the output with public input of proof in tests.
@nikkolasg nikkolasg closed this Jun 18, 2024
@delehef delehef deleted the feat/mr-trait branch July 18, 2024 11:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants