We had some discussions during previous PR reviews. This issue is a recap for discussing phase 1 shard count setting strategies.
Based on #2027 changes
k and N active validators, @vbuterin proposed three solutions:
1. k = 135, set shard_count = N/k
2. shard_count = 100, set k = N/shard_count
3. k = shard_count = sqrt(N)

Note that the committee selection algorithm was very different from now.
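For comparison, here is a rough sketch of how the three options above trade committee size against shard count for a given N. The function names are made up for illustration, and rounding down with integer division is an assumption (the options only say N/k, N/shard_count and sqrt(N)).

```python
import math
from typing import Tuple


def option_fixed_committee_size(n: int, k: int = 135) -> Tuple[int, int]:
    """Option 1: fix k, derive shard_count = N / k (rounded down here)."""
    return k, n // k


def option_fixed_shard_count(n: int, shard_count: int = 100) -> Tuple[int, int]:
    """Option 2: fix shard_count, derive k = N / shard_count (rounded down here)."""
    return n // shard_count, shard_count


def option_sqrt(n: int) -> Tuple[int, int]:
    """Option 3: k = shard_count = sqrt(N), using the integer square root."""
    root = math.isqrt(n)
    return root, root


# Each call returns (k, shard_count) for N = 10_000 active validators.
n = 10_000
print(option_fixed_committee_size(n))  # (135, 74)
print(option_fixed_shard_count(n))     # (100, 100)
print(option_sqrt(n))                  # (100, 100)
```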
- TARGET_COMMITTEE_SIZE
- MAX_SHARDS := 2**10 as the upper bound of shard count.
- INITIAL_ACTIVE_SHARDS := 2**6 as the initial number of shards.
- get_active_shard_count that returns INITIAL_ACTIVE_SHARDS.
- N active validators for E epochs, increase shard count to N // (SLOTS_PER_EPOCH * TARGET_COMMITTEE_SIZE).
- P, increase shard count.
- MAX_SHARDS in the spec.

If we go with dynamic shard count, it's also possible to "freeze" the shards with the lowest gasprice when we have a low validator participation rate.
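To make the validator-count-based rule concrete, here is a minimal sketch of what a dynamic shard count could look like. It is not a spec proposal: `dynamic_shard_count` is a hypothetical name, the values of SLOTS_PER_EPOCH and TARGET_COMMITTEE_SIZE are assumed (they are not stated in this thread), the "for E epochs" waiting condition is left out, and never dropping below INITIAL_ACTIVE_SHARDS is an assumption on my side.

```python
SLOTS_PER_EPOCH = 32           # assumed value, not stated in this thread
TARGET_COMMITTEE_SIZE = 128    # assumed value, not stated in this thread
MAX_SHARDS = 2**10             # upper bound of shard count
INITIAL_ACTIVE_SHARDS = 2**6   # initial number of shards


def dynamic_shard_count(active_validator_count: int) -> int:
    """Hypothetical dynamic variant of get_active_shard_count."""
    # Number of shards the validator set can support with one full
    # committee per shard per slot.
    supported = active_validator_count // (SLOTS_PER_EPOCH * TARGET_COMMITTEE_SIZE)
    # Assumption: never go below the initial count, never exceed MAX_SHARDS.
    return max(INITIAL_ACTIVE_SHARDS, min(MAX_SHARDS, supported))


print(dynamic_shard_count(100_000))     # 64   (24 supported, floored at the initial count)
print(dynamic_shard_count(1_000_000))   # 244
print(dynamic_shard_count(10_000_000))  # 1024 (2441 supported, capped at MAX_SHARDS)
```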
- TARGET_COMMITTEE_SIZE is fixed now, lacking enough validators results in infrequent crosslinking for all shards.
- 64 initial shards.

I personally hope to make it simple (i.e., fixed shard count, only increase shard count with hard forks), but also hope to hear more ideas. 🙂
Another con for downsizing is application-level complexity. Dapps will need to know when and how to migrate.
> I personally hope to make it simple (i.e., fixed shard count only increase shard count with hard forks)
yeah, I'd prefer this as well
Regardless of what path we take, I do think that there are things we can do to make phase 1 more friendly to changing shard count. In particular, we currently compute the start shards of historical epochs by starting from state.current_epoch_start_shard and then walking backwards through each epoch, modular-subtracting the number of shards processed in that epoch. But this becomes much more unwieldy when the modulus that we are wrapping around changes.
So I propose instead, we store a variable state.total_committees_processed, which basically does the same thing as state.current_epoch_start_shard except without the modulo. That is, at the end of each epoch, set state.total_committees_processed += num_committees_in_this_epoch, and then we would get the epoch start shard for any epoch by starting with that variable, subtracting the committee counts between the desired epoch and the present one, and taking the resulting value modulo whatever the shard count is at that time.
This lets us modify shard counts nearly seamlessly; the only issue is that at a shard count change boundary a shard might occasionally be skipped and have to wait for the next cycle (1-64 slots), which is a small cost in any case.
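A minimal sketch of the un-wrapped counter idea, assuming per-epoch committee counts and shard counts are available as plain lookups. The function signature and the dict-based lookups are hypothetical, and taking the modulus with the shard count in effect at the requested epoch is my reading of "the shard count at that time".

```python
from typing import Dict


def get_epoch_start_shard(total_committees_processed: int,
                          current_epoch: int,
                          epoch: int,
                          committee_counts: Dict[int, int],
                          shard_counts: Dict[int, int]) -> int:
    """Start shard of `epoch`, derived from the un-wrapped running counter."""
    assert epoch <= current_epoch
    counter = total_committees_processed
    # Undo the committees processed in every epoch from `epoch` up to,
    # but not including, the current one.
    for e in range(epoch, current_epoch):
        counter -= committee_counts[e]
    # Only now apply the modulus, using the shard count of the requested epoch.
    return counter % shard_counts[epoch]


# Epochs 0..2 are finished with 50 committees each; the shard count
# doubles from 64 to 128 at epoch 2.
committee_counts = {0: 50, 1: 50, 2: 50}
shard_counts = {0: 64, 1: 64, 2: 128, 3: 128}
total = sum(committee_counts.values())  # counter value at the start of epoch 3

start_shards = [
    get_epoch_start_shard(total, 3, e, committee_counts, shard_counts)
    for e in range(4)
]
print(start_shards)  # [0, 50, 100, 22]
```

Because the subtraction happens on the un-wrapped counter, the change of modulus at epoch 2 needs no special handling. The jump from start shard 50 to 100 across the boundary is the kind of occasional skip mentioned above.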