Web3.py: Unified approach to string formatting.

Created on 29 Sep 2017 · 7 Comments · Source: ethereum/web3.py

  • Version: 3.15.0
  • Python: 2.7/3.4/3.5

What was wrong?

We need a well-thought-out string handling solution for bytesXX, bytes and string types being passed into functions as arguments.

Currently, users expect to be able to pass hex-encoded strings to some of these methods. This either fails because the method expects a byte string, or appears to succeed when in fact the provided hex string was sent as-is, without being decoded.
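To illustrate the silent failure, here is a small sketch using only the Python 3 standard library. No web3.py calls are involved and the value is arbitrary; it only shows how far apart the two interpretations of the same text are.

```python
# Hypothetical illustration: one piece of text, two interpretations.
hex_text = "0xdeadbeef"

# What the user most likely meant: four raw bytes decoded from hex.
decoded = bytes.fromhex(hex_text[2:])

# What goes out if the string is passed through as-is: ten ASCII characters.
as_is = hex_text.encode("ascii")

print(decoded)                   # b'\xde\xad\xbe\xef'
print(as_is)                     # b'0xdeadbeef'
print(len(decoded), len(as_is))  # 4 10
```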

How can it be fixed?

For web3.sha3 and web3.eth.sign we have a workable solution. See https://github.com/pipermerriam/web3.py/issues/289

For bytesXX we should be able to be smart and detect hex encoded values.

  • If the value is a byte string, validate len(v) == XX where XX is the bytesXX length.
  • If the value is a text string, assume hex encoding and validate len(v_without_0x_prefix) == 2 * XX where XX is the bytesXX length.
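The two bytesXX rules above could be sketched roughly as follows. The function name normalize_bytes_nn and its error messages are made up for illustration and are not part of web3.py; the sketch assumes Python 3, where bytes and str are distinct types.

```python
def normalize_bytes_nn(value, size):
    """Hypothetical helper: return exactly `size` raw bytes for a bytesXX argument."""
    if isinstance(value, bytes):
        raw = value
    elif isinstance(value, str):
        # Text is assumed to be hex; an optional 0x prefix is ignored.
        hex_body = value[2:] if value[:2].lower() == "0x" else value
        if len(hex_body) != 2 * size:
            raise ValueError(
                "expected %d hex characters, got %d" % (2 * size, len(hex_body))
            )
        try:
            raw = bytes.fromhex(hex_body)
        except ValueError as exc:
            raise ValueError("not valid hexadecimal: %r" % value) from exc
    else:
        raise TypeError("expected bytes or str, got %s" % type(value).__name__)

    # Applies to both branches; also catches hex text padded with whitespace.
    if len(raw) != size:
        raise ValueError("expected %d bytes, got %d" % (size, len(raw)))
    return raw


print(normalize_bytes_nn("0x" + "ab" * 4, 4))  # b'\xab\xab\xab\xab'
```

Checking the length on the decoded bytes as well as on the hex text keeps the rule identical for both input types.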

For bytes types.

  • If the value is a byte string, pass it through unmodified.
  • If the value is a text string, validate that it is a valid hexadecimal-encoded value.

For string types.

  • If the value is a byte string, pass it through unmodified.
  • If the value is a text string, encode it as UTF-8 into its bytes representation.
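A minimal sketch of the bytes and string rules, again with hypothetical helper names (normalize_bytes and normalize_string are not web3.py functions) and assuming Python 3. The point it makes is that the same text input produces different bytes depending on the ABI type it is bound to.

```python
def normalize_bytes(value):
    """Hypothetical helper for the dynamic `bytes` ABI type."""
    if isinstance(value, bytes):
        return value  # byte strings pass through unmodified
    if isinstance(value, str):
        # Text must be hex; an optional 0x prefix is ignored.
        hex_body = value[2:] if value[:2].lower() == "0x" else value
        try:
            return bytes.fromhex(hex_body)
        except ValueError as exc:
            raise ValueError("not valid hexadecimal: %r" % value) from exc
    raise TypeError("expected bytes or str, got %s" % type(value).__name__)


def normalize_string(value):
    """Hypothetical helper for the `string` ABI type."""
    if isinstance(value, bytes):
        return value  # byte strings pass through unmodified
    if isinstance(value, str):
        return value.encode("utf-8")  # text is never treated as hex here
    raise TypeError("expected bytes or str, got %s" % type(value).__name__)


print(normalize_bytes("0x1234"))   # b'\x124'  (two bytes: 0x12, 0x34)
print(normalize_string("0x1234"))  # b'0x1234' (six bytes of text)
```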

All 7 comments

  1. if the type of the value is bytes, pass it through as-is. This will allow certain silent failures in python2 but we're about to remove support so I'm ok with sweeping that under the rug.

Absolutely on board

  1. if the type of the value is text, it is required to be valid hexadecimal.

Mostly on board.

So let's look at all these cases for Python 3 (am I missing any critical ones for this discussion?):

  • python type of an argument to a contract function call -->

    • ABI type for the matching function call

  • bytes

    • bytes: pass through

    • bytesNN: pass through, after validating the length

    • string: pass through, after validating that the bytes can be UTF-8 decoded, else raise a validation error.

  • str

    • bytes: to_bytes(hexstr=arg)

    • bytesNN: to_bytes(hexstr=arg), and validate length

    • string: to_bytes(hexstr=arg)

Only the last bullet is problematic.

I would be very surprised for my str in python to be treated like hex when being sent to a string ABI type. Instead, I would expect:

  • str

    • string: to_bytes(text=arg) aka arg.encode('utf-8')

If I have a contract that has a setName(string) function in solidity, I expect to be able to call contract.setName("my name"). I expect it because I can send in the built-in python type for every other ABI type (int, bytes, bool, etc).
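What this proposal means for a string ABI argument can be sketched as below. The helper encode_string_argument is hypothetical and stands in for whatever web3.py ends up doing internally; it follows the table above, where bytes are accepted only if they decode as UTF-8 and str is encoded as UTF-8 rather than parsed as hex.

```python
def encode_string_argument(value):
    """Hypothetical helper: bytes to send for a Solidity `string` argument."""
    if isinstance(value, bytes):
        try:
            value.decode("utf-8")  # validation only; the bytes are sent unchanged
        except UnicodeDecodeError as exc:
            raise ValueError(
                "bytes passed for a string argument must be valid UTF-8"
            ) from exc
        return value
    if isinstance(value, str):
        return value.encode("utf-8")
    raise TypeError("expected bytes or str, got %s" % type(value).__name__)


print(encode_string_argument("my name"))  # b'my name'
```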

_PS~ I don't have a strong opinion about what we do with Python 2, as long as it doesn't add a ton of bloat._

I'm convinced. We treat string and bytes differently. And we need a prominent, well-written, easy-to-understand, easy-to-find place in the docs that explains this behavior, plus good error messages that point people in the right direction.

(Perhaps a tangent, since this brings Populus and Solidity into the mix.)

Suppose there is a contract called FormattingDemo on-chain with a function setString(string), and I want to call that function with the argument "0xtentacle". Using web3.py, and having an object demo for that contract's instance, what would my Python code look like?

@carver's suggesting it should be

demo.transact().setString("0xtentacle"); # 30 78 74 65 6e 74 61 63 6c 65

What would your approach require, @pipermerriam?


To be clear on what Piper says in OP:

Our users _expect_ to be able to pass in hex encoded data to these methods.

This _might_ be true (pending a census on all users q:p), and I do maybe agree myself, EDIT: depending on what's "hex encoded data".

However, when calling a contract that has a function in its ABI specified as taking a string argument, I also _expect_ to be able to pass it a Python str.

@veox I believe what we are trying to support is both of the following:

  • demo.transact().setString("0xtentacle"); # that must be a unicode string (so in python2 u"0xtent...")
  • demo.transact().setString(b"\x30\x78\x74\x65\x6e\x74\x61\x63\x6c\x65");
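As a quick check with the standard library only (no web3.py call involved), the two arguments above carry identical bytes once the text form is encoded as UTF-8:

```python
text_form = "0xtentacle"
bytes_form = b"\x30\x78\x74\x65\x6e\x74\x61\x63\x6c\x65"

encoded = text_form.encode("utf-8")

print(encoded == bytes_form)  # True
print(encoded.hex())          # 307874656e7461636c65
```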

Ah. Your laconic

I'm convinced.

got me thinking otherwise: that only the latter would be valid, and that the former would raise a Python exception.

I've updated the initial issue description to reflect this change.

Merged, and will be released with v4 beta, soon™
