EIP-7594: Ask for recommendation about sampling #3825

Closed
@nalepae

Description

Let's define:

  • c as the number of columns (currently, 128)
  • q as the number of columns a node should custody (with a current minimum of 4)
  • s as the number of columns a node should sample (currently, 16)

According to the specifications:

If a node already has a column because of custody, it is not required to send out queries for that column.

We can consider 4 scenarios:

Scenario A

Scenario A-1: The s columns to be sampled are chosen randomly from all available columns. If the node already custodies some of the columns to be sampled, these columns are then excluded from the sample set.

  • Case 1: 0 <= q < s: The node will sample between s - q and s columns.
  • Case 2: s <= q <= c: The node will sample between 0 and min(s, c - q) columns. In particular, if the s columns to be sampled are a subset of the q custody columns, then the node won't sample any columns.
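
The range of outcomes in Scenario A-1 can be checked with a short Python sketch. The function names are hypothetical, and the custody set here is drawn uniformly at random purely for illustration; it is not the spec's custody derivation. The upper bound is written as min(s, c - q) because a node never draws more than s columns and never queries more than the c - q columns it does not custody.

```python
import random


def a1_bounds(c, q, s):
    """Return (fewest, most) columns a node can end up querying in Scenario A-1."""
    fewest = max(0, s - q)  # as many drawn columns as possible fall inside custody
    most = min(s, c - q)    # limited by s and by the number of non-custody columns
    return fewest, most


def a1_sample(c, custody, s, rng):
    """Draw s columns from all c, then drop the ones the node already custodies."""
    drawn = rng.sample(range(c), s)
    return sorted(set(drawn) - set(custody))


rng = random.Random(7594)            # fixed seed, only to make the illustration repeatable
custody = rng.sample(range(128), 4)  # hypothetical custody set with q = 4
to_query = a1_sample(128, custody, 16, rng)
fewest, most = a1_bounds(128, 4, 16)
assert fewest <= len(to_query) <= most
```

With the current parameters (c = 128, q = 4, s = 16) the node queries between 12 and 16 columns, and the exact number changes from one draw to the next.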

Scenario A-2: The node's custody columns are excluded before selecting the s columns for sampling.

  • Case 1: 0 <= q < c - s: The node will sample exactly s columns.
  • Case 2: c - s <= q <= c: The node will sample exactly c - q columns.
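
As a minimal sketch of Scenario A-2, the selection below removes the custody columns first and then draws from what is left. The helper names are hypothetical and the uniform draw is an assumption; the point is only that the number of queried columns is min(s, c - q), whatever the draw.

```python
import random


def a2_count(c, q, s):
    """Number of columns queried in Scenario A-2: s, unless fewer than s remain."""
    return min(s, c - q)


def a2_sample(c, custody, s, rng):
    """Exclude custody columns first, then draw up to s of the remaining ones."""
    custody = set(custody)
    remaining = [col for col in range(c) if col not in custody]
    return sorted(rng.sample(remaining, min(s, len(remaining))))


rng = random.Random(0)
custody = list(range(120))  # hypothetical node custodying 120 of the 128 columns
picked = a2_sample(128, custody, 16, rng)
assert len(picked) == a2_count(128, 120, 16)  # only 8 columns are left to sample
```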

Scenario B (hypothesis: c is an even number)

For scenario B, we introduce the following constraint: q + s <= c/2.
This constraint is based on the fact that when q + s = c/2 and sampling is successful, the node holds half of the columns, which is already enough to reconstruct all of them. Therefore, having s > c/2 - q is unnecessary.

Scenario B-1: The s columns to be sampled are chosen randomly from all available columns. If the node already custodies some of the columns to be sampled, these columns are then excluded from the sample set.

  • Case 1: 0 <= q < s: The node will sample between s - q and s columns.
  • Case 2: s <= q < c/2: The node will sample between 0 and min(s, c/2 - q) columns. In particular, if the s columns to be sampled are a subset of the q custody columns, then the node won't sample any columns.
  • Case 3: c/2 <= q <= c: The node won't sample any column.
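
One way to read Scenario B-1 is sketched below. It assumes the constraint q + s <= c/2 is enforced by shrinking the number of drawn columns to at most c/2 - q; that interpretation and the function name are assumptions, not something stated in the specification.

```python
def b1_bounds(c, q, s):
    """Return (fewest, most) columns queried in Scenario B-1 under the assumed cap."""
    target = max(0, min(s, c // 2 - q))  # assumption: apply q + s <= c/2 by reducing s
    fewest = max(0, target - q)          # every drawn column that can be in custody is
    most = target                        # target <= c/2 - q, so enough non-custody columns exist
    return fewest, most


# q values chosen to fall in Case 1, Case 2, Case 2 near c/2, and Case 3.
for q in (4, 20, 60, 64):
    print(q, b1_bounds(128, q, 16))
```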

Scenario B-2: The node's custody columns are excluded before selecting the s columns for sampling.

  • Case 1: 0 <= q < c/2 - s: The node will sample exactly s columns.
  • Case 2: c/2 - s <= q < c/2: The node will sample exactly c/2 - q columns.
  • Case 3: c/2 <= q <= c: The node won't sample any column.
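
The three cases of Scenario B-2 collapse into a single expression, max(0, min(s, c/2 - q)). The sketch below (hypothetical function names, c assumed even) writes out both forms and checks that they agree for every possible q.

```python
def b2_count_by_cases(c, q, s):
    """Scenario B-2 sample count, written out case by case."""
    half = c // 2
    if q < half - s:
        return s         # Case 1: room for the full s columns
    if q < half:
        return half - q  # Case 2: only enough to reach c/2 columns in total
    return 0             # Case 3: custody alone already allows reconstruction


def b2_count(c, q, s):
    """Same count as a single closed-form expression."""
    return max(0, min(s, c // 2 - q))


# Both forms agree over the whole range of custody sizes.
assert all(b2_count_by_cases(128, q, 16) == b2_count(128, q, 16) for q in range(129))
```

Under this reading, the sample count depends only on c, q, and s, never on which columns happen to be drawn.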

Advantage of scenarios A-1 and B-1: The selection of columns to sample is independent of the node's custody set.
Disadvantage of scenarios A-1 and B-1: The number of columns actually sampled is not known in advance and can be zero in Case 2.

Advantage of scenarios A-2 and B-2: The number of columns to sample is fully determined by the values of c, q, and s, and is zero only when q = c (supernodes) for A-2 or when q >= c/2 for B-2.
Disadvantage of scenarios A-2 and B-2: The selection of columns to sample is not independent of the node's custody set.

Currently, Prysm implements scenario B-2 on its peerDAS branch.

Question: Are there any recommendations or warnings regarding the use of any of the described scenarios?
