Eth2.0-specs: Avoid side effect in state transition with mutable object

Created on 7 May 2019  路  13Comments  路  Source: ethereum/eth2.0-specs

Issue

All the SSZ objects in pyspec are Pythonic-mutable.

  • Pros: allowing us to update state easily
  • Cons: sometimes it causes an unexpected or undesirable effect

For example, if there is a fuzzing test case with two blocks: the first block is invalid and the second block is valid. Although during the first block state transition it raises Exception, the mutable state might be updated.

How to reduce the side effect:

Option 1: Add protection in the state_transition function:

In #1018, the state_transition function is updated to:

https://github.com/ethereum/eth2.0-specs/blob/4c1073fa2f898de7ffe26fd8aa197d38c25a8f35/specs/core/0_beacon-chain.md#beacon-chain-state-transition-function

def state_transition(state: BeaconState, block: BeaconBlock) -> BeaconState:
    # Process slots (including those with no blocks) since block
    process_slots(state, block.slot)
    # Process block
    process_block(state, block)
    # Return post-state
    return state

We can change it to:

def state_transition(state: BeaconState, block: BeaconBlock) -> BeaconState:
    pre_state = copy.deepcopy(state)
    try:
        # Process slots (including those with no blocks) since block
        process_slots(state, block.slot)
        # Process block
        process_block(state, block)
        # Return post-state
    except Exception:
        return pre_state
    else:
        return state
  • Pros: easiest way to hack this case
  • Cons: probably other cases may happen in the test suite that we don't know yet

Option 2: Make all helper functions to be pure functions

Add copy.deepcopy(state) + return state in every helper function (maybe with decorator).

  • Pros: pure functions yay!
  • Cons: verbosity; it's possible that someone will forget to add it and cause other bugs

Option 3: Immutable everything

Generally, that's what Trinity is doing.

  • Pros: it kills all side effect
  • Cons: verbosity

Although Option 1 might be good enough, I raise the possibility of other options here before the Frozen鈩笍. 馃槃

Ctests

Most helpful comment

I'd say putting the deepcopy in the test files is cleaner than putting the deepcopy in the spec.

I definitely think rewriting the spec to be in pure functions will harm readability far too much to be a good idea.

All 13 comments

I like option 1 because it formalises the English-text comment "State transitions that trigger an unhandled exception (e.g. a failed assert or an out-of-range list access) are considered invalid." into code.

I'm not sure I'm a fan of option 1. It (1) brings in deepcopy which is pretty pythonic machinery into the spec, and (2) obfuscates the success or failure of the state transition. Returning an unmodified object is a pretty unexpected flag for an invalid block.

I would argue to either leave as is and require the caller to store whatever pre-state that wanted to remember _or_ to add the deepcopy at the start of state_transition to no mutate the argument, but still allow any failed assertion to bubble up uncaught.

There is an appeal to avoid introducing "heavy machinery" (deepcopy, try, except, ...) just for state_transition.

I have also been kind of bothered by all those functions that operate on state. But I agree that deepcopy would be a machinery that I wouldn't like to see in the spec.

I'm not sure if this is what Option 3 in @hwwhww's list refers to, but I think the cleanest options would be for the state transition functions to issue a number of "state updates" (basically state tree nodes with new value), which are to be all applied when all state transition functions returned without error. That way you avoid the "pure function" mess of having to copy everything that's in the state but still get (another) kind of pure function.

Oh good points!

We need to find a balance between clearly execute pyspec and clean spec. I think we should revert option 1 from #1018 in case it鈥檚 blocking.

I think a strong suggestion in the spec to have a way to apply changes only if they will be successful is enough (like commit-rollback for databases).

Still, the more can be done side-effect free the easier it is to test in isolation.

I'd say putting the deepcopy in the test files is cleaner than putting the deepcopy in the spec.

I definitely think rewriting the spec to be in pure functions will harm readability far too much to be a good idea.

I definitely think rewriting the spec to be in pure functions will harm readability far too much to be a good idea.

This is probably true.

Option 3b: Immutable everything, but with the help of a library such as pyrsistent to improve readability. Probably counts as "heavy machinery" too though.

Even that doesn't improve readability nearly enough.

Compare:

state.latest_state_roots[x] = y

And:

state = state.set("latest_state_roots", state.latest_state_roots.set(x, y))

And it gets worse for more deeply nested structures.

For nested structures there's transform:

state = state.transform(["latest_state_roots", x], y)

which is only a little harder to read and it doesn't get worse with additional nesting levels.

It looks like there's rough consensus for the status quo, at least for spec freeze.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

econoar picture econoar  路  41Comments

protolambda picture protolambda  路  23Comments

JustinDrake picture JustinDrake  路  12Comments

JustinDrake picture JustinDrake  路  24Comments

JustinDrake picture JustinDrake  路  12Comments