Description
Let's define:
- `c` as the number of columns (currently, `128`)
- `q` as the number of columns a node should custody (with a current minimum of `4`)
- `s` as the number of columns a node should sample (currently, `16`)
According to the specifications:
If a node already has a column because of custody, it is not required to send out queries for that column.
We can consider 4 scenarios:
Scenario A
Scenario A-1: The `s` columns to be sampled are chosen randomly from all available columns. If the node already custodies some of the columns to be sampled, these columns are then excluded from the sample set.
- Case 1: `0 <= q < s`: The node will sample between `s - q` and `s` columns.
- Case 2: `s <= q <= c`: The node will sample between `0` and `min(s, c - q)` columns. In particular, if the `s` columns to be sampled are a subset of the `q` custody columns, then the node won't sample any columns.
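A minimal Python sketch of Scenario A-1, to make the bounds concrete. The function names (`sample_a1`, `a1_bounds`) are hypothetical and are not taken from the specs or from any client. The bounds follow from the possible overlap between the `s` drawn columns and the `q` custody columns, and they assume `s <= c`.

```python
import random


def sample_a1(c: int, s: int, custody: set[int], rng: random.Random) -> set[int]:
    """Scenario A-1: draw s columns from all c, then drop those already in custody."""
    drawn = set(rng.sample(range(c), s))
    return drawn - custody


def a1_bounds(c: int, q: int, s: int) -> tuple[int, int]:
    """Smallest and largest number of columns A-1 can end up sampling."""
    min_overlap = max(0, s + q - c)  # overlap forced when q + s > c
    max_overlap = min(s, q)          # the draw falls inside custody as much as possible
    return s - max_overlap, s - min_overlap


print(a1_bounds(128, 4, 16))    # (12, 16)
print(a1_bounds(128, 64, 16))   # (0, 16)
print(a1_bounds(128, 120, 16))  # (0, 8)
```

With the current values (`c = 128`, `q = 4`, `s = 16`), a node would sample between 12 and 16 columns.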
Scenario A-2: The node's custody columns are excluded before selecting the `s` columns for sampling.
- Case 1: `0 <= q < c - s`: The node will sample exactly `s` columns.
- Case 2: `c - s <= q <= c`: The node will sample exactly `c - q` columns.
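Scenario A-2 could be sketched as follows. This is only an illustration under the definitions above; `sample_a2` is a hypothetical name, and the two cases collapse into a single `min(s, c - q)`.

```python
import random


def sample_a2(c: int, s: int, custody: set[int], rng: random.Random) -> set[int]:
    """Scenario A-2: remove custody columns first, then draw from what is left."""
    candidates = [col for col in range(c) if col not in custody]
    # Exactly s columns when enough candidates remain, otherwise all c - q of them.
    count = min(s, len(candidates))
    return set(rng.sample(candidates, count))
```

Because the draw happens after the exclusion, the result never intersects the custody set and its size depends only on `c`, `q`, and `s`.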
Scenario B (hypothesis: `c` is an even number)
For scenario B, we introduce the following constraint: `q + s <= c/2`.
This constraint is based on the fact that when `q + s = c/2`, we can already reconstruct all the columns when sampling is successful. Therefore, having `s > c/2 - q` is unnecessary.
Scenario B-1: The `s` columns to be sampled are chosen randomly from all available columns. If the node already custodies some of the columns to be sampled, these columns are then excluded from the sample set.
- Case 1: `0 <= q < s`: The node will sample between `s - q` and `s` columns.
- Case 2: `s <= q < c/2`: The node will sample between `0` and `min(s, c/2 - q)` columns. In particular, if the `s` columns to be sampled are a subset of the `q` custody columns, then the node won't sample any columns.
- Case 3: `c/2 <= q <= c`: The node won't sample any column.
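One possible reading of Scenario B-1 in Python. The sketch assumes that the constraint `q + s <= c/2` is enforced by capping the number of drawn columns at `c/2 - q` before the random draw; that interpretation, like the name `sample_b1`, is an assumption and not something stated in the specs.

```python
import random


def sample_b1(c: int, s: int, custody: set[int], rng: random.Random) -> set[int]:
    """Scenario B-1: cap the draw at c/2 - q, draw from all columns, drop custody."""
    if c % 2 != 0:
        raise ValueError("scenario B assumes an even number of columns")
    q = len(custody)
    # Assumed interpretation: never draw more than c/2 - q columns (none if q >= c/2).
    effective_s = max(0, min(s, c // 2 - q))
    drawn = set(rng.sample(range(c), effective_s))
    return drawn - custody
```

As in A-1, the draw itself ignores the custody set, so the final count can be anywhere between `max(0, effective_s - q)` and `effective_s`.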
Scenario B-2: The node's custody columns are excluded before selecting the `s` columns for sampling.
- Case 1: `0 <= q < c/2 - s`: The node will sample exactly `s` columns.
- Case 2: `c/2 - s <= q < c/2`: The node will sample exactly `c/2 - q` columns.
- Case 3: `c/2 <= q <= c`: The node won't sample any column.
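The three cases of Scenario B-2 reduce to sampling exactly `max(0, min(s, c/2 - q))` non-custody columns. The sketch below illustrates that; it is not Prysm's code, and `sample_b2` is a hypothetical name.

```python
import random


def sample_b2(c: int, s: int, custody: set[int], rng: random.Random) -> set[int]:
    """Scenario B-2: exclude custody columns, then draw min(s, c/2 - q) of the rest."""
    if c % 2 != 0:
        raise ValueError("scenario B assumes an even number of columns")
    q = len(custody)
    count = max(0, min(s, c // 2 - q))  # Cases 1, 2 and 3 in one expression
    candidates = [col for col in range(c) if col not in custody]
    return set(rng.sample(candidates, count))
```

With `c = 128` and `s = 16`, a node with `q = 4` samples 16 columns, a node with `q = 56` samples 8, and a node with `q >= 64` samples none.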
Advantage of scenarios A-1 and B-1: The selection of columns to sample is independent of the node's custody set.
Disadvantage of scenarios A-1 and B-1: The number of columns to be sampled is uncertain and could be zero in Case 2.
Advantage of scenarios A-2 and B-2: The number of columns to be sampled is fully determined by the values of `c`, `q`, and `s`, and will only be zero when `q = c` (for supernodes) for A-2 or when `q >= c/2` for B-2.
Disadvantage of scenarios A-2 and B-2: The selection of columns to sample is not independent of the node's custody set.
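To put a number on how uncertain the sample count is under A-1: the overlap between the `s` drawn columns and the `q` custody columns follows a hypergeometric distribution, so the distribution of the final count can be computed directly. This is a sketch with hypothetical function names, assuming the draw is uniform over all `c` columns.

```python
from math import comb


def a1_count_distribution(c: int, q: int, s: int) -> dict[int, float]:
    """Map each possible number of sampled columns under A-1 to its probability."""
    total = comb(c, s)
    dist = {}
    for overlap in range(max(0, s + q - c), min(s, q) + 1):
        # Ways to pick `overlap` custody columns and `s - overlap` other columns.
        ways = comb(q, overlap) * comb(c - q, s - overlap)
        dist[s - overlap] = ways / total
    return dist


def expected_sampled(c: int, q: int, s: int) -> float:
    """Mean number of columns actually sampled under A-1."""
    return s * (c - q) / c


print(expected_sampled(128, 4, 16))  # 15.5
```

For the current values the expected count is 15.5, so A-1 loses about half a column per slot on average compared with A-2. The probability of sampling nothing is `a1_count_distribution(c, q, s).get(0, 0.0)`, which is non-zero only when `q >= s`.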
Currently, Prysm implements scenario B-2 on its `peerDAS` branch.
Question: Are there any recommendations or warnings regarding the use of any of the described scenarios?