All the SSZ objects in pyspec are Pythonic-mutable.
For example, if there is a fuzzing test case with two blocks: the first block is invalid and the second block is valid. Although during the first block state transition it raises Exception, the mutable state might be updated.
state_transition function:In #1018, the state_transition function is updated to:
https://github.com/ethereum/eth2.0-specs/blob/4c1073fa2f898de7ffe26fd8aa197d38c25a8f35/specs/core/0_beacon-chain.md#beacon-chain-state-transition-function
def state_transition(state: BeaconState, block: BeaconBlock) -> BeaconState:
# Process slots (including those with no blocks) since block
process_slots(state, block.slot)
# Process block
process_block(state, block)
# Return post-state
return state
We can change it to:
def state_transition(state: BeaconState, block: BeaconBlock) -> BeaconState:
pre_state = copy.deepcopy(state)
try:
# Process slots (including those with no blocks) since block
process_slots(state, block.slot)
# Process block
process_block(state, block)
# Return post-state
except Exception:
return pre_state
else:
return state
Add copy.deepcopy(state) + return state in every helper function (maybe with decorator).
Generally, that's what Trinity is doing.
Although Option 1 might be good enough, I raise the possibility of other options here before the Frozen鈩笍. 馃槃
I like option 1 because it formalises the English-text comment "State transitions that trigger an unhandled exception (e.g. a failed assert or an out-of-range list access) are considered invalid." into code.
I'm not sure I'm a fan of option 1. It (1) brings in deepcopy which is pretty pythonic machinery into the spec, and (2) obfuscates the success or failure of the state transition. Returning an unmodified object is a pretty unexpected flag for an invalid block.
I would argue to either leave as is and require the caller to store whatever pre-state that wanted to remember _or_ to add the deepcopy at the start of state_transition to no mutate the argument, but still allow any failed assertion to bubble up uncaught.
There is an appeal to avoid introducing "heavy machinery" (deepcopy, try, except, ...) just for state_transition.
I have also been kind of bothered by all those functions that operate on state. But I agree that deepcopy would be a machinery that I wouldn't like to see in the spec.
I'm not sure if this is what Option 3 in @hwwhww's list refers to, but I think the cleanest options would be for the state transition functions to issue a number of "state updates" (basically state tree nodes with new value), which are to be all applied when all state transition functions returned without error. That way you avoid the "pure function" mess of having to copy everything that's in the state but still get (another) kind of pure function.
Oh good points!
We need to find a balance between clearly execute pyspec and clean spec. I think we should revert option 1 from #1018 in case it鈥檚 blocking.
I think a strong suggestion in the spec to have a way to apply changes only if they will be successful is enough (like commit-rollback for databases).
Still, the more can be done side-effect free the easier it is to test in isolation.
I'd say putting the deepcopy in the test files is cleaner than putting the deepcopy in the spec.
I definitely think rewriting the spec to be in pure functions will harm readability far too much to be a good idea.
I definitely think rewriting the spec to be in pure functions will harm readability far too much to be a good idea.
This is probably true.
Option 3b: Immutable everything, but with the help of a library such as pyrsistent to improve readability. Probably counts as "heavy machinery" too though.
Even that doesn't improve readability nearly enough.
Compare:
state.latest_state_roots[x] = y
And:
state = state.set("latest_state_roots", state.latest_state_roots.set(x, y))
And it gets worse for more deeply nested structures.
For nested structures there's transform:
state = state.transform(["latest_state_roots", x], y)
which is only a little harder to read and it doesn't get worse with additional nesting levels.
What about https://github.com/tobgu/pyrthon/?
It looks like there's rough consensus for the status quo, at least for spec freeze.
Most helpful comment
I'd say putting the deepcopy in the test files is cleaner than putting the deepcopy in the spec.
I definitely think rewriting the spec to be in pure functions will harm readability far too much to be a good idea.