
feat: add dummy circuit to final extraction for indexing Data with no provable extraction #415

Open · wants to merge 153 commits into base: main
Conversation

@silathdiir (Contributor) commented Nov 29, 2024

The related document: Indexing Data with No Provable Extraction

Summary

  • Add a dummy circuit to the final extraction; it exports the input digests directly as public values (without proving them). See the sketch after this list.
  • Add compute_table_row_digest and merge_table_row_digests for table row digest computation.
  • Add run_no_provable_test_cases to wrap the original test cases, running the final extraction and checking against the DB result directly (without values extraction).
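For reference, here is a minimal sketch of what exporting the input digests as public values means in plonky2 terms. This is not the PR's actual wiring: the struct and function names are illustrative, and plain `Target`s stand in for the project's digest/curve target types.

use plonky2::field::goldilocks_field::GoldilocksField;
use plonky2::iop::target::Target;
use plonky2::plonk::circuit_builder::CircuitBuilder;

type F = GoldilocksField;
const D: usize = 2;

/// Hypothetical wires of a dummy circuit: the caller-provided digests are
/// simply wired through to the public inputs; nothing about them is proven.
struct DummyWiresSketch {
    values_digest: Vec<Target>,
    metadata_digest: Vec<Target>,
}

fn build_dummy(builder: &mut CircuitBuilder<F, D>, digest_len: usize) -> DummyWiresSketch {
    // Allocate witness targets for the digests supplied by the caller.
    let values_digest = builder.add_virtual_targets(digest_len);
    let metadata_digest = builder.add_virtual_targets(digest_len);
    // Expose them directly as public inputs; no constraints tie them to any
    // values-extraction proof.
    builder.register_public_inputs(&values_digest);
    builder.register_public_inputs(&metadata_digest);
    DummyWiresSketch { values_digest, metadata_digest }
}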

@silathdiir silathdiir marked this pull request as ready for review December 1, 2024 07:08
@silathdiir silathdiir force-pushed the indexing-with-no-probvable-ext branch from 776bf1a to e110090 on December 1, 2024 07:14
@nicholas-mainardi (Contributor) left a comment

Circuit logic looks good. I think the API needs some changes; I left some comments about it. The integration test should use a generic off-chain table instead of blockchain data, as we discussed offline. The data to fill this table can be randomly generated, so there is no need to use blockchain data.

#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct DummyWires {
    #[serde(serialize_with = "serialize", deserialize_with = "deserialize")]
    is_merge: BoolTarget,
I think this flag doesn't need to be an input to the circuit: it's not meaningful in this case, so we can just hardcode it to false in the circuit?
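A minimal sketch of that suggestion, assuming a plonky2 CircuitBuilder (the helper name is illustrative, not the PR's code):

use plonky2::field::goldilocks_field::GoldilocksField;
use plonky2::iop::target::BoolTarget;
use plonky2::plonk::circuit_builder::CircuitBuilder;

type F = GoldilocksField;
const D: usize = 2;

/// Instead of allocating `is_merge` as a witness input
/// (e.g. via `add_virtual_bool_target_safe()`), pin it to the constant false.
fn hardcoded_is_merge(builder: &mut CircuitBuilder<F, D>) -> BoolTarget {
    // `_false()` returns a BoolTarget fixed to the constant 0.
    builder._false()
}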

/// Compute the row value digest of one table.
/// The `digests` parameter is computed by `compute_leaf_single_values_digest` or
/// `compute_leaf_mapping_values_digest`.
pub fn compute_table_row_digest(dimension: TableDimension, digests: &[Digest]) -> Digest {
Since this method should be used in cases where we don't prove provenance of the data, I think it should be more generic and work for any table we might define, mimicking the source-agnostic computation that is done inside the rows tree circuits. More specifically, I think this method should take as input a table description (i.e., the set of columns in the table) and the table itself, and then:

  • Compute the digest of each row as D(D(id1||v1)+D(id2||v2)...)
  • Compute the cumulative digest of the table by summing up the digests of each row

Indeed, in the case where we don't prove provenance, we might combine variables from a smart contract in more ways than what we support with provable extraction, so the computation of the digest should work for a generic table (see the sketch below). Wdyt?
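A rough sketch of that source-agnostic computation, assuming placeholder primitives: `Digest` stands in for the project's curve-point digest type, `d` for the D(.) hash-to-digest function, and `Digest::add` for the group addition used to accumulate digests.

/// Placeholder for the project's digest type (a curve point in practice).
#[derive(Clone, Copy)]
struct Digest(/* omitted */);

impl Digest {
    /// Group addition used to accumulate digests (placeholder).
    fn add(self, _other: Digest) -> Digest { unimplemented!() }
}

/// Placeholder for the D(.) hash-to-digest primitive.
fn d(_payload: &[u8]) -> Digest { unimplemented!() }

/// Placeholder: D applied to an accumulated digest (serialization left abstract).
fn d_of_digest(_digest: &Digest) -> Digest { unimplemented!() }

/// One cell of a table row: a column identifier and its value.
struct Cell { id: u64, value: Vec<u8> }

/// Proposed row digest: D(D(id1 || v1) + D(id2 || v2) + ...)
fn compute_row_digest(row: &[Cell]) -> Digest {
    let acc = row
        .iter()
        .map(|c| {
            let mut payload = c.id.to_be_bytes().to_vec();
            payload.extend_from_slice(&c.value);
            d(&payload)
        })
        .reduce(Digest::add)
        .expect("row must have at least one cell");
    d_of_digest(&acc)
}

/// Proposed cumulative table digest: the sum of the per-row digests.
fn compute_table_digest(table: &[Vec<Cell>]) -> Digest {
    table
        .iter()
        .map(|row| compute_row_digest(row))
        .reduce(Digest::add)
        .expect("table must have at least one row")
}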

}

/// Merge the row value digests of multiple tables.
pub fn merge_table_row_digests(
What's the purpose of this function? There should no longer be a merge case when we don't prove provenance, so there should be no need to compute the row digests according to the merge case.

/// Compute the metadata hash for a table that includes data with no provable extraction.
/// The input is a metadata digest set by the caller; the metadata hash is then generated
/// in the same way as the DB computation.
pub fn no_provable_metadata_hash(metadata_digest: &Digest) -> MetadataHash {
I wouldn't require the caller to compute the metadata digest: I think we can just define a method to compute it. Indeed, in this case the provenance is not proven, so the metadata hash is kind of pointless. For instance, we might simply compute it as D(id1||id2...||idN), where id1, id2, ..., idN are the identifiers of the columns of the table. So we could define a method like no_provenance_metadata_digest, which takes as input the set of column identifiers and computes the metadata digest. no_provable_metadata_hash would then just call no_provenance_metadata_digest, providing as input the set of column identifiers, which would be requested as input in place of metadata_digest. Does it make sense?
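Something along those lines, again with placeholder Digest/D(.) primitives (the real ones live in the codebase):

/// Placeholder for the digest type and the D(.) hash-to-digest primitive.
struct Digest(/* omitted */);
fn d(_payload: &[u8]) -> Digest { unimplemented!() }

/// Proposed: derive the metadata digest from the column identifiers alone,
/// as D(id1 || id2 || ... || idN), instead of asking the caller for it.
fn no_provenance_metadata_digest(column_ids: &[u64]) -> Digest {
    let payload: Vec<u8> = column_ids
        .iter()
        .flat_map(|id| id.to_be_bytes())
        .collect();
    d(&payload)
}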

    block_hash: HashOutput,
    prev_block_hash: HashOutput,
    metadata_digest: Digest,
    row_digest: Digest,
I would avoid requiring the caller to compute metadata_digest and row_digest. We could instead ask the caller to provide the inputs necessary to compute those digests (i.e., the set of column identifiers and the table itself), and then call inside this method the API functions we define to compute these digests (e.g. no_provenance_metadata_digest and `compute_row_digest`). Wdyt?
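Concretely, the suggestion amounts to an API shape roughly like the following; all type and helper names here are placeholders taken from the sketches above, not the crate's actual definitions.

/// Placeholder types standing in for the project's actual ones.
struct HashOutput([u8; 32]);
struct Digest(/* omitted */);
struct Cell { id: u64, value: Vec<u8> }

/// Hypothetical shape of the revised entry point: the caller provides the
/// column identifiers and the table itself, and the digests are derived
/// internally rather than being passed in.
fn no_provable_circuit_input(
    block_hash: HashOutput,
    prev_block_hash: HashOutput,
    column_ids: &[u64],
    table: &[Vec<Cell>],
) {
    // let metadata_digest = no_provenance_metadata_digest(column_ids);
    // let row_digest = compute_table_digest(table);
    // ...then assemble the circuit inputs from block_hash, prev_block_hash
    // and the derived digests.
    let _ = (block_hash, prev_block_hash, column_ids, table);
}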

@nicholas-mainardi nicholas-mainardi changed the base branch from main to fix/remove_old_query_circuits December 13, 2024 17:17
Base automatically changed from fix/remove_old_query_circuits to feat/batching_integration_test January 8, 2025 11:39
Base automatically changed from feat/batching_integration_test to feat/batching_circuits January 8, 2025 13:07
Base automatically changed from feat/batching_circuits to feat/batching_circuit_gadgets January 8, 2025 14:15
Base automatically changed from feat/batching_circuit_gadgets to feat/merkle-path-with-successor-gadget January 8, 2025 14:27
Base automatically changed from feat/merkle-path-with-successor-gadget to main January 9, 2025 13:11