Add Cell Dissemination via Partial Message Specification #4558
Conversation
specs/fulu/das-core.md (outdated diff)
#### `PartialDataColumnSidecar`

```python
class PartialDataColumnSidecar(Container):
    # Only provided if the index can not be inferred from the Gossipsub topic
    index: ColumnIndex | None
    # Encoded the same as an IHAVE bitmap
    cells_present_bitmap: ByteVector[NUMBER_OF_COLUMNS / 8]  # ceiling if NUMBER_OF_COLUMNS is not divisible by 8
    column: List[Cell, MAX_BLOB_COMMITMENTS_PER_BLOCK]
    kzg_proofs: List[KZGProof, MAX_BLOB_COMMITMENTS_PER_BLOCK]
    # The following are only provided on eager pushes.
    kzg_commitments: None | List[KZGCommitment, MAX_BLOB_COMMITMENTS_PER_BLOCK]
    signed_block_header: None | SignedBeaconBlockHeader
    kzg_commitments_inclusion_proof: None | Vector[Bytes32, KZG_COMMITMENTS_INCLUSION_PROOF_DEPTH]
```
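For illustration only (not part of the PR), here is a minimal sketch of how a receiver might expand the presence bitmap into the blob row indices whose cells are included. The helper name and the IHAVE-style bit order are assumptions; the review below also proposes changing the field to a `Bitlist`.

```python
def get_present_row_indices(cells_present_bitmap: bytes) -> list[int]:
    """Return the blob row indices whose cells are carried by this partial column.

    Assumes the same most-significant-bit-first order within each byte as a
    Gossipsub IHAVE bitmap; adjust if the partial-messages spec settles on a
    different bit order.
    """
    present = []
    for byte_index, byte in enumerate(cells_present_bitmap):
        for bit in range(8):
            if byte & (0x80 >> bit):
                present.append(byte_index * 8 + bit)
    return present


# Example: cells for blob rows 0, 2 and 9 are present
assert get_present_row_indices(bytes([0b10100000, 0b01000000])) == [0, 2, 9]
```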
I think there are three possible ways to go with this:

- Add an `Optional` SSZ type, which we are currently missing (see https://ethereum-magicians.org/t/eip-6475-ssz-optional/12891):

```python
class PartialDataColumnSidecar(Container):
    index: ColumnIndex
    cells_bitmap: Bitlist[MAX_BLOB_COMMITMENTS_PER_BLOCK]
    cells: List[Cell, MAX_BLOB_COMMITMENTS_PER_BLOCK]
    kzg_proofs: List[KZGProof, MAX_BLOB_COMMITMENTS_PER_BLOCK]
    kzg_commitments: Optional[List[KZGCommitment, MAX_BLOB_COMMITMENTS_PER_BLOCK]]
    signed_block_header: Optional[SignedBeaconBlockHeader]
    kzg_commitments_inclusion_proof: Optional[Vector[Bytes32, KZG_COMMITMENTS_INCLUSION_PROOF_DEPTH]]
```

- Separate the data part from the KZG commitments, header, and proof part:

```python
class PartialDataColumnSidecar(Container):
    index: ColumnIndex
    cells_bitmap: Bitlist[MAX_BLOB_COMMITMENTS_PER_BLOCK]
    cells: List[Cell, MAX_BLOB_COMMITMENTS_PER_BLOCK]
    kzg_proofs: List[KZGProof, MAX_BLOB_COMMITMENTS_PER_BLOCK]

class KZGCommitmentsSidecar(Container):
    kzg_commitments: List[KZGCommitment, MAX_BLOB_COMMITMENTS_PER_BLOCK]
    signed_block_header: SignedBeaconBlockHeader
    kzg_commitments_inclusion_proof: Vector[Bytes32, KZG_COMMITMENTS_INCLUSION_PROOF_DEPTH]
```

- Have two different types for a column with and without commitments (hopefully with nicer names):

```python
class PartialDataColumnSidecar(Container):
    index: ColumnIndex
    cells_bitmap: Bitlist[MAX_BLOB_COMMITMENTS_PER_BLOCK]
    cells: List[Cell, MAX_BLOB_COMMITMENTS_PER_BLOCK]
    kzg_proofs: List[KZGProof, MAX_BLOB_COMMITMENTS_PER_BLOCK]

class PartialDataColumnSidecarWithCommitments(Container):
    index: ColumnIndex
    cells_bitmap: Bitlist[MAX_BLOB_COMMITMENTS_PER_BLOCK]
    cells: List[Cell, MAX_BLOB_COMMITMENTS_PER_BLOCK]
    kzg_proofs: List[KZGProof, MAX_BLOB_COMMITMENTS_PER_BLOCK]
    kzg_commitments: List[KZGCommitment, MAX_BLOB_COMMITMENTS_PER_BLOCK]
    signed_block_header: SignedBeaconBlockHeader
    kzg_commitments_inclusion_proof: Vector[Bytes32, KZG_COMMITMENTS_INCLUSION_PROOF_DEPTH]
```

Also:

- changed `ByteVector` to `Bitlist` (but could also be `Bitvector`)
- `NUMBER_OF_COLUMNS` isn't the length/max-length of a column, should be `MAX_BLOB_COMMITMENTS_PER_BLOCK`
- I'd just leave the `column_index` in all the time, 8 bytes doesn't seem worth worrying about imho
Preference for 1, as having two containers makes things a bit more annoying: you then need to define when one or the other (or both) are transmitted.

> I'd just leave the `column_index` in all the time, 8 bytes doesn't seem worth worrying about imho

The only reason we would need it is when the subnet != column index, right? 8 bytes is small, but a couple of bytes here and a couple of bytes there and soon we have a whole packet of redundant data. My preference might be to just remove this field and leave it as future work to define how to derive a column index when the subnet count != column count.
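To ground this, here is a rough sketch of the mapping being discussed, modeled on `compute_subnet_for_data_column_sidecar` from the fulu p2p spec; the constant values are illustrative, not normative.

```python
DATA_COLUMN_SIDECAR_SUBNET_COUNT = 128  # illustrative; equal to NUMBER_OF_COLUMNS today
NUMBER_OF_COLUMNS = 128


def compute_subnet_for_data_column_sidecar(column_index: int) -> int:
    # Mirrors the fulu p2p spec: columns are assigned to subnets modulo the subnet count
    return column_index % DATA_COLUMN_SIDECAR_SUBNET_COUNT


# While NUMBER_OF_COLUMNS == DATA_COLUMN_SIDECAR_SUBNET_COUNT, the topic's subnet
# uniquely determines the column index, so an explicit `index` field is redundant.
# If the subnet count were ever smaller, several columns would share a subnet and
# the index could no longer be inferred from the topic alone.
assert all(
    compute_subnet_for_data_column_sidecar(c) == c for c in range(NUMBER_OF_COLUMNS)
)
```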
Option 1 may seem better, but the other options move the complexity from the type system (and the requisite push to add a new concept to the SSZ spec) to other layers of the stack.

That said, we have had `Optional` as a non-canonical type for some time now, and it may be time to go ahead and add formal support for it.
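For readers unfamiliar with the non-canonical type, a minimal sketch of the idea behind SSZ `Optional` as discussed in EIP-6475, where `Optional[T]` behaves like a `List[T, 1]` ("absent" is the empty list, "present" is a one-element list). The helper names are hypothetical and only show the wrapping, not real SSZ serialization or merkleization.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def optional_to_list(value: Optional[T]) -> list[T]:
    # Encode Optional[T] as at most one element, mirroring List[T, 1]
    return [] if value is None else [value]


def list_to_optional(values: list[T]) -> Optional[T]:
    # Decode the List[T, 1] representation back into Optional[T]
    if len(values) > 1:
        raise ValueError("Optional encodes at most one element")
    return values[0] if values else None


assert list_to_optional(optional_to_list(None)) is None
assert list_to_optional(optional_to_list(42)) == 42
```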
Including the commitments here is an optimization. We could rely on just receiving them in the block body.
Something worth paying attention to here is also how `DataColumnSidecar` evolves in the gloas spec (though these changes will likely pre-date gloas). At the moment it seems like header + inclusion proof will be removed in favor of `beacon_block_root` only, requiring the beacon block to be seen in advance (though it still will not contain the KZG commitments), which is somewhat natural with ePBS.
We could do the same thing here, requiring the block (or a full sidecar) to have been seen before receiving a partial column (and excluding the commitments, since at this point they're still in the beacon block). Similar to what you did below, but with `beacon_block_root` instead of the whole `signed_beacon_block_header`:
```python
class PartialDataColumnSidecar(Container):
    index: ColumnIndex
    cells_present_bitmap: Bitlist[MAX_BLOB_COMMITMENTS_PER_BLOCK]
    partial_column: List[Cell, MAX_BLOB_COMMITMENTS_PER_BLOCK]
    kzg_proofs: List[KZGProof, MAX_BLOB_COMMITMENTS_PER_BLOCK]
    beacon_block_root: Root
```
This is what I am referring to re: gloas specs btw https://github.com/ethereum/consensus-specs/pull/4527/files
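A hedged sketch (not from the PR) of what "require the block to be seen first" could look like on the receive path, assuming a hypothetical local `block_cache` keyed by block root and reusing `verify_cell_kzg_proof_batch` from the existing spec:

```python
def process_partial_data_column(sidecar: PartialDataColumnSidecar, block_cache: dict) -> bool:
    """Sketch: only accept a partial column once the referenced block is known,
    taking the commitments from that block instead of from the sidecar."""
    block = block_cache.get(sidecar.beacon_block_root)
    if block is None:
        # Block not seen yet: either drop, or queue briefly until the block arrives
        return False

    present_rows = [i for i, bit in enumerate(sidecar.cells_present_bitmap) if bit]
    commitments = [block.body.blob_kzg_commitments[i] for i in present_rows]

    # Each cell in a column shares the column index as its cell index
    return verify_cell_kzg_proof_batch(
        commitments_bytes=commitments,
        cell_indices=[sidecar.index] * len(present_rows),
        cells=sidecar.partial_column,
        proofs_bytes=sidecar.kzg_proofs,
    )
```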
raulk left a comment:
While this mechanistically introduces cell-level deltas, how they precisely integrate into existing PeerDAS processes isn't very clear. How about a section covering the end-to-end behaviour:

- When nodes push `DataColumnSidecar`s
- When nodes send partial IWANTs
- When nodes send partial IHAVEs
- When nodes send `PartialDataColumnSidecar`s
- Dependency on `getBlobsV3`
We've been in the process of implementing this and have run into a few issues, specifically around validation. We probably need to think about and add the validation rules to this spec.

Our current thinking is to include the KZG commitments in the cell data so that we can verify each cell as we get it. If we don't have them in there (as in the current design), we have to wait for the block data before we can verify whether the cells are accurate. This also entails adding some kind of cache to keep track of which peer sent what, in order to penalize them at a future time if we find out the data is invalid. That opens up a few attack vectors, where peers can send us invalid data, fill the cache, prevent us from getting valid data, etc. We would also have to be careful with the cache to account for periods of non-finality with a high degree of forking, where there can be many valid chains.

Currently it seems the complexity of handling all these edge cases might outweigh the relatively small bandwidth savings we make by excluding the KZG commitments. Maybe there is a clear implementation path here that we are missing, however. cc @dknopik
Agreed.

One thing to point out is that this is only an issue for cells you get via an eager push. Without eager push, peers only give you cells that they know you're missing, and giving a peer this information only happens after getting the block.

The cache is only relevant in the eager push case. If the cache is full, it wouldn't prevent you from getting valid data after you've received your block.

> We would have to be careful with the cache also, to account for periods of non-finality with a high degree of forking where there can be many valid chains.

I'm unfamiliar with this scenario. Can you explain a bit more, please? Do we expect multiple valid proposers as well?
No strong opinion on what we do here overall, but I’ll point out that the gloas spec also removes some of the fields and requires the validator to wait for the builder’s bid first.
My plan was to limit the eager push cache to 2-3 messages per topic. On first partial publish, we can validate the cache and downscore peers appropriately. I don't think it's an interesting attack to poison this cache, as it, in the worst case, costs the victim 1 RTT of delay to receive the cells, at the expense of the attacker getting downscored and pruned.

That said, if we wanted to include additional information for validating eager pushes of cells before we see a block, we could send the relevant parts (KZG commitments, inclusion proof, signed beacon block header) only when it's an eager push.

@fradamt, do you have any extra context on why gloas removes from the `DataColumnSidecar` those fields useful for validating the column before the block?
I don't see how this is viable. Cells are 2 KB (and might get down to even 1 KB in the future), so including all KZG commitments means 48 bytes * number of blobs, which is 100% overhead as soon as we get to ~40 blobs. The bandwidth savings are very large if we care about being able to efficiently send individual cells and not just large-ish partial columns (which imo is essential).

Something else we could do is to only include the relevant commitments, each with its own inclusion proof:

```python
class PartialDataColumnSidecar(Container):
    cells_present_bitmap: Bitlist[MAX_BLOB_COMMITMENTS_PER_BLOCK]
    partial_column: List[Cell, MAX_BLOB_COMMITMENTS_PER_BLOCK]
    kzg_proofs: List[KZGProof, MAX_BLOB_COMMITMENTS_PER_BLOCK]
    kzg_commitments: List[KZGCommitment, MAX_BLOB_COMMITMENTS_PER_BLOCK]
    inclusion_proofs: List[Vector[Bytes32, SINGLE_KZG_COMMITMENT_INCLUSION_PROOF_DEPTH], MAX_BLOB_COMMITMENTS_PER_BLOCK]
```
```python
def verify_partial_data_column_sidecar(sidecar: PartialDataColumnSidecar, column_index: ColumnIndex) -> bool:
    """
    Verify that the KZG proofs are correct.
    """
    # The column index (inferred from the topic) also represents the cell index
    cell_indices = [column_index] * len(sidecar.partial_column)
    # Batch verify that the cells match the corresponding commitments and proofs
    return verify_cell_kzg_proof_batch(
        commitments_bytes=sidecar.kzg_commitments,
        cell_indices=cell_indices,
        cells=sidecar.partial_column,
        proofs_bytes=sidecar.kzg_proofs,
    )


def verify_data_column_sidecar_inclusion_proofs(sidecar: PartialDataColumnSidecar, body_root: Root) -> bool:
    """
    Verify that the given KZG commitments are included in the beacon block body
    identified by ``body_root`` (obtained from the already-seen beacon block).
    """
    return all(
        is_valid_merkle_branch(
            leaf=hash_tree_root(sidecar.kzg_commitments[i]),
            branch=sidecar.inclusion_proofs[i],
            depth=SINGLE_KZG_COMMITMENT_INCLUSION_PROOF_DEPTH,
            index=get_subtree_index(get_generalized_index(BeaconBlockBody, "blob_kzg_commitments", i)),
            root=body_root,
        )
        for i in range(len(sidecar.kzg_commitments))
    )
```
Yes, the thinking was that in gloas the beacon block is explicitly intended to come first; the columns are not even supposed to be sent before the beacon block has reached everyone (because the structure of the slot becomes: propagate beacon block -> attest -> propagate payload and blobs -> PTC vote). However, for gloas purposes we could go back to "self-verifying columns" (KZG commitments + inclusion proof) if the added complexity turns out not to be worth it (I don't think the current work on gloas devnet0 has surfaced these issues, so maybe worth bringing it up to the people working on it).

Whether we do that or not doesn't change things for partial messages, though: including all KZG commitments + inclusion proof is fine in a column, since there's only one KZG commitment per cell, but it is not fine in a small partial message.
```python
class PartialDataColumnSidecar(Container):
    cells_present_bitmap: Bitlist[MAX_BLOB_COMMITMENTS_PER_BLOCK]
    partial_column: List[Cell, MAX_BLOB_COMMITMENTS_PER_BLOCK]
    kzg_proofs: List[KZGProof, MAX_BLOB_COMMITMENTS_PER_BLOCK]
```
The block root is the partial message group_id, which is sent separately from the message data. So adding it to the message is redundant.
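For illustration, a sketch of how a receiver could recover the context the minimal container omits; the handler and the `column_index_from_topic` helper are hypothetical, not part of the spec.

```python
def on_partial_column_message(topic: str, group_id: Root, sidecar: PartialDataColumnSidecar):
    # The container itself carries only cells, proofs, and the presence bitmap;
    # everything else is recovered from the transport:
    beacon_block_root = group_id                   # the group_id is the block root
    column_index = column_index_from_topic(topic)  # hypothetical helper: parse the subnet topic

    present_rows = [i for i, bit in enumerate(sidecar.cells_present_bitmap) if bit]
    return beacon_block_root, column_index, present_rows
```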
In draft until the libp2p/specs work is merged and we have multiple implementations of this. Also requires TODOs to be addressed.
This PR specifies how consensus clients use Gossipsub's Partial Messages to disseminate cells rather than only full columns.
More context in this ethresearch post.