Eth2.0-specs: Shard count recap and discussion

Created on 1 Sep 2020 · 3 Comments · Source: ethereum/eth2.0-specs

We had some discussions during previous PR reviews. This issue recaps them so that we can discuss strategies for setting the phase 1 shard count.

Based on the changes in #2027.

Previous research:

https://ethresear.ch/t/registrations-shard-count-and-shuffling/2129: back when the committee size was fixed, @JustinDrake proposed doubling the shard count when the queue of pending validator registrations is very full.
https://ethresear.ch/t/a-proposal-for-structuring-committees-cross-links-etc/2118: back when crosslinks were infrequent (i.e., shards were not crosslinked at every slot), with committee size k and N active validators, @vbuterin proposed three solutions:
- Fix k = 135, set shard_count = N/k
- Fix shard_count = 100, set k = N/shard_count
- Fix k = shard_count = sqrt(N)

Note that the committee selection algorithm at that time was very different from the current one.
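
For illustration only, the three historical options can be written as small functions that map N to a (k, shard_count) pair. The function names are hypothetical, and integer division is assumed where the original proposal writes a plain quotient; this is a sketch of the arithmetic, not of the committee selection algorithm used then or now.

```python
import math
from typing import Tuple


def fix_committee_size(n: int, k: int = 135) -> Tuple[int, int]:
    """Option 1: fix k, derive shard_count = N / k (integer division assumed)."""
    return k, n // k


def fix_shard_count(n: int, shard_count: int = 100) -> Tuple[int, int]:
    """Option 2: fix shard_count, derive k = N / shard_count."""
    return n // shard_count, shard_count


def sqrt_scaling(n: int) -> Tuple[int, int]:
    """Option 3: k = shard_count = sqrt(N), rounded down."""
    root = math.isqrt(n)
    return root, root


if __name__ == "__main__":
    for n in (10_000, 13_500, 100_000):
        print(n, fix_committee_size(n), fix_shard_count(n), sqrt_scaling(n))
```

With N = 13,500 the first two options coincide at k = 135 and 100 shards; they diverge as N grows, which is the trade-off the three options were meant to expose.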

Status quo

- A fixed TARGET_COMMITTEE_SIZE.
- MAX_SHARDS := 2**10 as the upper bound of the shard count.
- A phase 1 INITIAL_ACTIVE_SHARDS := 2**6 as the initial number of shards.
- A stub function get_active_shard_count that returns INITIAL_ACTIVE_SHARDS.
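
A minimal sketch of that status quo, assuming the stub takes the beacon state as its only argument (the `state` parameter is left untyped here so the snippet runs on its own; the exact signature in the spec may differ):

```python
MAX_SHARDS = 2**10             # upper bound on the shard count
INITIAL_ACTIVE_SHARDS = 2**6   # phase 1 initial number of shards


def get_active_shard_count(state) -> int:
    """Stub: ignores `state` and always returns the initial shard count.

    A dynamic strategy would replace this body with logic that reads the state.
    """
    return INITIAL_ACTIVE_SHARDS


if __name__ == "__main__":
    print(get_active_shard_count(state=None), "of at most", MAX_SHARDS)
```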

TBD

1. Dynamic shard count or fixed shard count (updated only by hard forks)?

Potential criteria for automatic dynamic upsizing:
1. If there have been at least N active validators for E epochs, increase the shard count to N // (SLOTS_PER_EPOCH * TARGET_COMMITTEE_SIZE).
2. If (1) is met and the average gasprice is higher than P, increase the shard count.

Fixed shard count:
- Only increase the shard count with hard forks (with community consensus).
- We would not need the placeholder bound MAX_SHARDS in the spec.
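
As a rough sketch of criterion (1), assuming the phase 0 mainnet values SLOTS_PER_EPOCH = 32 and TARGET_COMMITTEE_SIZE = 128, the upsizing rule could look like the following. The function name, the per-epoch history list, and the choice to never shrink are all assumptions made for illustration; the gasprice condition in (2) is left out.

```python
from typing import Sequence

SLOTS_PER_EPOCH = 32         # assumed phase 0 mainnet value
TARGET_COMMITTEE_SIZE = 128  # assumed phase 0 mainnet value
MAX_SHARDS = 2**10


def next_shard_count(
    active_counts: Sequence[int],  # hypothetical: active validators per epoch, latest last
    current_shard_count: int,
    n_threshold: int,              # N in the criterion
    e_epochs: int,                 # E in the criterion, must be >= 1
) -> int:
    """Upsize once at least N validators have been active for the last E epochs."""
    recent = active_counts[-e_epochs:]
    sustained = len(recent) == e_epochs and all(c >= n_threshold for c in recent)
    if not sustained:
        return current_shard_count
    target = n_threshold // (SLOTS_PER_EPOCH * TARGET_COMMITTEE_SIZE)
    # Never shrink here, and never exceed the spec-level upper bound.
    return max(current_shard_count, min(target, MAX_SHARDS))


if __name__ == "__main__":
    print(next_shard_count([600_000] * 4, 64, 524_288, 4))
```

With these values, N = 524,288 corresponds to 128 shards, so the shard count only doubles from 64 once that many validators have stayed active for the whole window.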

2. Do we downsize shard count if a large number of validators become inactive?

If we go with a dynamic shard count, it's also possible to "freeze" the shards with the lowest gasprice when the validator participation rate is low.

Recap: since TARGET_COMMITTEE_SIZE is now fixed, not having enough validators results in infrequent crosslinking for all shards.
Pros of downsizing: in an apocalypse-like network situation with no crosslinking for a long time, downsizing may help reduce traffic.
Cons of downsizing: added complexity; unfair to the less active shards.
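
To make the recap concrete, here is a back-of-the-envelope sketch of how often a single shard gets a committee as the validator set shrinks. It assumes the phase 0 formula for committees per slot with SLOTS_PER_EPOCH = 32, TARGET_COMMITTEE_SIZE = 128 and MAX_COMMITTEES_PER_SLOT = 64, and it assumes committees are assigned to shards in simple round-robin order; `slots_between_crosslinks` is a hypothetical helper, not a spec function.

```python
SLOTS_PER_EPOCH = 32          # assumed phase 0 mainnet value
TARGET_COMMITTEE_SIZE = 128   # assumed phase 0 mainnet value
MAX_COMMITTEES_PER_SLOT = 64  # assumed phase 0 mainnet value


def committees_per_slot(active_validators: int) -> int:
    """Committees per slot, clamped to [1, MAX_COMMITTEES_PER_SLOT]."""
    raw = active_validators // SLOTS_PER_EPOCH // TARGET_COMMITTEE_SIZE
    return max(1, min(MAX_COMMITTEES_PER_SLOT, raw))


def slots_between_crosslinks(active_validators: int, shard_count: int = 64) -> int:
    """Slots a shard waits between committees under round-robin assignment."""
    per_slot = min(committees_per_slot(active_validators), shard_count)
    return -(-shard_count // per_slot)  # ceiling division


if __name__ == "__main__":
    for n in (262_144, 32_768, 1_000):
        print(n, slots_between_crosslinks(n))
```

Under these assumptions, 262,144 validators are enough to cover all 64 shards every slot, 32,768 validators give each shard a committee every 8 slots, and a very small validator set stretches that to every 64 slots.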

3. Initial shard count

We currently set 64 initial shards.
Capacity research: we need more network simulations and testnet benchmarking results to evaluate the capacity.
Market demand research: perhaps there is no strong demand for 64 shards initially.

I personally hope to keep it simple (i.e., a fixed shard count that is only increased by hard forks), but I also hope to hear more ideas. 🙂

discussion phase1

hwwhww

Most helpful comment

Regardless of what path we take, I do think that there are things we can do to make phase 1 more friendly to changing shard count. In particular, we currently compute the start shards of historical epochs by starting from state.current_epoch_start_shard and then walking backwards through each epoch, modular-subtracting the number of shards processed in that epoch. But this becomes much more unwieldy when the modulus that we are wrapping around changes.

So I propose that we instead store a variable state.total_committees_processed, which basically does the same thing as state.current_epoch_start_shard except without the modulo. That is, at the end of each epoch, set state.total_committees_processed += num_committees_in_this_epoch, and then we would get the epoch start shard for any epoch by starting with that variable, subtracting the committee counts between the desired epoch and the present one, and taking the result modulo whatever the shard count is at that time.

This lets us modify shard counts nearly seamlessly; the only issue is that at a shard count change boundary a shard might occasionally be skipped and have to wait for the next cycle (1-64 slots), which is a small cost in any case.
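
A toy sketch of this proposal, under stated assumptions: `ToyState`, `end_epoch` and the per-epoch `committee_counts` list are hypothetical stand-ins (the real beacon state would derive historical committee counts rather than store them), and the shard count in effect at the queried epoch is passed in by the caller.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyState:
    # Running total of committees processed in all finished epochs, with no modulo.
    total_committees_processed: int = 0
    # Hypothetical bookkeeping: committees processed in each finished epoch.
    committee_counts: List[int] = field(default_factory=list)

    @property
    def current_epoch(self) -> int:
        return len(self.committee_counts)


def end_epoch(state: ToyState, num_committees_in_this_epoch: int) -> None:
    """End-of-epoch update: bump the running total."""
    state.committee_counts.append(num_committees_in_this_epoch)
    state.total_committees_processed += num_committees_in_this_epoch


def get_epoch_start_shard(state: ToyState, epoch: int, shard_count: int) -> int:
    """Start shard of `epoch`, given the shard count in effect at that epoch."""
    assert 0 <= epoch <= state.current_epoch
    # Subtract the committees processed from `epoch` up to the present.
    processed_before = state.total_committees_processed - sum(state.committee_counts[epoch:])
    return processed_before % shard_count


if __name__ == "__main__":
    state = ToyState()
    for _ in range(3):
        end_epoch(state, 100)
    print(get_epoch_start_shard(state, 1, 64), get_epoch_start_shard(state, 2, 128))
```

Because the modulo is applied only at lookup time, the same running total serves both the old and the new shard count; only the final reduction changes at a shard count boundary.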