EIP-7594: Request for recommendations about sampling

Let's define:

- `c` as the number of columns (currently, `128`)
- `q` as the number of columns a node should custody (with a current minimum of `4`)
- `s` as the number of columns a node should sample (currently, `16`)

According to the [specifications](https://github.com/ethereum/consensus-specs/blob/dev/specs/_features/eip7594/das-core.md#sample-queries):
> If a node already has a column because of custody, it is not required to send out queries for that column.

We can consider 4 scenarios:

## Scenario A

### Scenario A-1: The `s` columns to be sampled are chosen randomly from all available columns. If the node already custodies some of the columns to be sampled, these columns are then excluded from the sample set.

- Case 1: `0 <= q < s`: The node will sample between `s - q` and `s` columns.
- Case 2: `s <= q <= c`: The node will sample between `0` and `min(s, c - q)` columns. In particular, if the `s` columns to be sampled are a subset of the `q` custody columns, then the node won't sample any columns.
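
To make A-1 concrete, here is a minimal Python sketch. The function name `sample_a1` and its parameters are hypothetical, chosen for illustration only; they come from neither the specification nor any client.

```python
import random
from typing import Set


def sample_a1(c: int, custody: Set[int], s: int, rng: random.Random) -> Set[int]:
    """Hypothetical sketch of scenario A-1 (not taken from the spec or a client).

    Draw s distinct columns from all c columns, ignoring custody,
    then drop the ones the node already custodies.
    """
    chosen = set(rng.sample(range(c), s))
    return chosen - custody


rng = random.Random(0)  # fixed seed only to make the example repeatable
to_query = sample_a1(c=128, custody={0, 1, 2, 3}, s=16, rng=rng)
print(len(to_query))  # somewhere between 12 and 16, depending on the overlap
```

The number of queries depends on how many of the drawn columns happen to fall inside the custody set, which is why it is not fixed in advance.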

### Scenario A-2: The node's custody columns are excluded *before* selecting the `s` columns for sampling.

- Case 1: `0 <= q < c - s`: The node will sample exactly `s` columns.
- Case 2: `c - s <= q <= c`: The node will sample exactly `c - q` columns.
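
A-2 could look like the following Python sketch. The name `sample_a2` is hypothetical and the code only illustrates the rule described above.

```python
import random
from typing import Set


def sample_a2(c: int, custody: Set[int], s: int, rng: random.Random) -> Set[int]:
    """Hypothetical sketch of scenario A-2 (not taken from the spec or a client).

    Remove the custody columns first, then draw s columns from what is left
    (or all remaining columns if fewer than s are left).
    """
    candidates = [col for col in range(c) if col not in custody]
    k = min(s, len(candidates))
    return set(rng.sample(candidates, k))


rng = random.Random(0)  # fixed seed only to make the example repeatable
print(len(sample_a2(128, set(range(4)), 16, rng)))    # q = 4   -> 16 columns
print(len(sample_a2(128, set(range(120)), 16, rng)))  # q = 120 -> 8 columns (c - q)
```

Here the count is always `min(s, c - q)`, matching the two cases above.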

## Scenario B (hypothesis: `c` is an even number)

For scenario B, we introduce the following constraint: `q + s <= c/2`.
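This constraint is based on the fact that any `c/2` columns are enough to reconstruct all `c` columns. When `q + s = c/2` and sampling is successful, the node holds `c/2` columns (custodied plus sampled) and can already reconstruct the rest. Therefore, having `s > c/2 - q` is unnecessary.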

### Scenario B-1: The `s` columns to be sampled are chosen randomly from all available columns. If the node already custodies some of the columns to be sampled, these columns are then excluded from the sample set.

- Case 1: `0 <= q < s`: The node will sample between `s - q` and `s` columns.
- Case 2: `s <= q < c/2`: The node will sample between `0` and `min(s, c/2 - q)` columns. In particular, if the `s` columns to be sampled are a subset of the `q` custody columns, then the node won't sample any columns.
- Case 3: `c/2 <= q <= c`: The node won't sample any column.
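
One possible reading of B-1, as a Python sketch: the configured `s` is first capped at `c/2 - q` to respect the constraint, and the draw then proceeds as in A-1. The name `sample_b1` and the capping step are assumptions made for illustration; the cases above do not spell out exactly how the cap is applied.

```python
import random
from typing import Set


def sample_b1(c: int, custody: Set[int], s: int, rng: random.Random) -> Set[int]:
    """Hypothetical sketch of scenario B-1 (an assumed reading, not from the spec).

    Cap s so that q + s <= c/2, draw that many columns from all c columns,
    then drop the ones already in custody.
    """
    q = len(custody)
    s_eff = max(0, min(s, c // 2 - q))  # assumed way of enforcing q + s <= c/2
    chosen = set(rng.sample(range(c), s_eff))
    return chosen - custody


rng = random.Random(0)  # fixed seed only to make the example repeatable
print(len(sample_b1(128, set(range(4)), 16, rng)))   # between 12 and 16
print(len(sample_b1(128, set(range(64)), 16, rng)))  # q = c/2 -> 0
```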

### Scenario B-2: The node's custody columns are excluded *before* selecting the `s` columns for sampling.

- Case 1: `0 <= q < c/2 - s`: The node will sample exactly `s` columns.
- Case 2: `c/2 - s <= q < c/2`: The node will sample exactly `c/2 - q` columns.
- Case 3: `c/2 <= q <= c`: The node won't sample any column.
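
The three B-2 cases can be reproduced with a short Python sketch. The name `sample_b2` is hypothetical, and this is not Prysm's code, only an illustration of the rule as described here.

```python
import random
from typing import Set


def sample_b2(c: int, custody: Set[int], s: int, rng: random.Random) -> Set[int]:
    """Hypothetical sketch of scenario B-2 (illustrative, not a client implementation).

    Cap s so that q + s <= c/2, then draw that many columns from the
    columns that are NOT in custody.
    """
    q = len(custody)
    s_eff = max(0, min(s, c // 2 - q))  # enforces q + s <= c/2
    candidates = [col for col in range(c) if col not in custody]
    return set(rng.sample(candidates, s_eff))


rng = random.Random(0)  # fixed seed only to make the example repeatable
print(len(sample_b2(128, set(range(4)), 16, rng)))   # Case 1: exactly s = 16
print(len(sample_b2(128, set(range(56)), 16, rng)))  # Case 2: c/2 - q = 8
print(len(sample_b2(128, set(range(64)), 16, rng)))  # Case 3: 0
```

When `q < c/2` there are always more than `c/2` non-custody columns, so the draw never runs out of candidates.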

**Advantage of scenarios A-1 and B-1**: The selection of columns to sample is independent of the node's custody set.
**Disadvantage of scenarios A-1 and B-1**: The number of columns to be sampled is uncertain and could be zero in Case 2.

**Advantage of scenarios A-2 and B-2**: The number of columns to be sampled is fully determined by the values of `c`, `q`, and `s`, and will only be zero when `q = c` (for supernodes) for A-2 or when `q >= c/2` for B-2.
**Disadvantage of scenarios A-2 and B-2**: The selection of columns to sample is not independent of the node's custody set.

Currently, Prysm implements scenario B-2 on its `peerDAS` branch.

**Question:** Are there any recommendations or warnings regarding the use of any of the described scenarios?

