Pydantic: Support of parsing strings with non-decimal integers

Created on 24 Jul 2019 · 5 Comments · Source: samuelcolvin/pydantic

Feature Request

I just realized that pydantic can't properly parse strings which contain non-decimal numbers:

from pydantic import BaseModel

class M(BaseModel):
    b: int
    f: int

bin_number = '0b10'
hex_number = '0xFF'
print(int(hex_number, 16) == 255)  # True

M(b=bin_number, f=hex_number)
'''
Traceback (most recent call last):
    ...
    raise ValidationError(errors)
pydantic.error_wrappers.ValidationError: 1 validation error
a
  value is not a valid integer (type=type_error.integer)
'''

Proposed solution

I don't know how to implement this yet, but this is what first came to mind:

from pydantic import BaseModel, BinInt, HexInt

class M(BaseModel):
    b: BinInt
    f: HexInt

Then I thought: what if we could generate a type for an integer of any base at runtime, like this:

from pydantic import BaseModel, Int

class M(BaseModel):
    b: Int(2)
    f: Int(16)
# or
    b: Int[2]
    f: Int[16]

print(M(b='11', f='0xFF').f)
# Int(base=16): 0xFF (255)
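
To make the idea concrete, here is a minimal sketch of a runtime-generated, base-aware int type in plain Python. The name `IntBase` and everything about its behavior are assumptions for illustration; nothing like this exists in pydantic, and hooking it into model validation is not shown.

```python
class IntBase(int):
    """Hypothetical base-aware int; not part of pydantic."""

    base = 10
    _cache = {}

    def __class_getitem__(cls, base):
        # IntBase[16] returns (and caches) a subclass that remembers its base.
        if not (isinstance(base, int) and 2 <= base <= 36):
            raise ValueError('base must be an int between 2 and 36')
        if base not in IntBase._cache:
            IntBase._cache[base] = type(f'IntBase[{base}]', (IntBase,), {'base': base})
        return IntBase._cache[base]

    def __new__(cls, value):
        # Text input is parsed in the declared base; other input goes to int() unchanged.
        if isinstance(value, (str, bytes, bytearray)):
            return super().__new__(cls, value, cls.base)
        return super().__new__(cls, value)


HexInt = IntBase[16]
BinInt = IntBase[2]

print(HexInt('0xFF'))  # 255
print(BinInt('11'))    # 3
```

Because the base is fixed per type, a plain decimal field would keep its current behavior, and only fields that opt in would parse differently.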

What do you think about such a feature?

feature request help wanted

MrMrRobat

All 5 comments

I think rather than creating a new int type, this could be supported by making use of the following feature:

print(int('0b100', base=0))  # 4
print(int('100', base=0))  # 100
print(int('0x100', base=0))  # 256

This would probably be a cheap extension to the existing int parser.

I just created a PR (#683) that uses base 0 for strings/bytes; @MrMrRobat would that address your use case?
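
For illustration only, the base-0 approach could be sketched roughly as below. The helper name `parse_int` is hypothetical, and this is not the code from the PR, just the idea of using base 0 for str/bytes input and leaving everything else to plain `int()`.

```python
def parse_int(value):
    """Hypothetical helper: let int() infer the base from a 0b/0o/0x prefix."""
    if isinstance(value, (str, bytes, bytearray)):
        # base=0 means "interpret the text like a Python integer literal".
        return int(value, 0)
    return int(value)


print(parse_int('0b100'))  # 4
print(parse_int('100'))    # 100
print(parse_int(b'0x100')) # 256
```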

dmontagu on 24 Jul 2019

see https://github.com/samuelcolvin/pydantic/pull/683#discussion_r306730304

Personally I'd rather we did something like IntBase[16] etc.

samuelcolvin on 24 Jul 2019

👍2

Thanks for the quick response, @dmontagu.

Personally, I agree with @samuelcolvin here. I guess we don't want to slow down or break existing decimal int parsing.

Also, it would be nice to have base-aware ints, so they can be represented in their own base ~(don't know how to implement this yet)~:

print(M(b='11', f='0xFF').f)
# Int(base=16): 0xFF (255)

This can be approached simply with:

In [6]: a = 16; f'a = 0b{a:b} or 0x{a:x}'
Out[6]: 'a = 0b10000 or 0x10'
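
As a small variation on the same idea, the `#` flag in the standard format spec adds the prefix automatically, so it does not have to be written by hand. This sketch uses only built-in formatting; the variable names are arbitrary.

```python
value = 16

binary = format(value, '#b')  # prefix 0b is added by the '#' flag
octal = format(value, '#o')
hexa = format(value, '#x')

print(binary, octal, hexa)  # 0b10000 0o20 0x10

# The prefixed strings round-trip through int() with base 0.
assert int(binary, 0) == int(octal, 0) == int(hexa, 0) == value
```

Note that `'#X'` also upper-cases the prefix (`0X10`), so getting `0xFF`-style output still needs the manual form `f'0x{value:X}'`.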

MrMrRobat on 24 Jul 2019

From @samuelcolvin's comment on #683

my only concern is int('0780') (or -0780) which was previously okay but would now be illegal.

I think it would be a good idea to add these as test cases for the int parser in a PR for this feature -- my (admittedly slow 😬) modification that would have broken them did not fail any tests.
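
A rough sketch of what such checks could look like, using only the built-in `int()` (the helper `parses` is made up for this example and is not part of pydantic's test suite):

```python
def parses(text, base):
    """Return True if int(text, base) succeeds, False on ValueError."""
    try:
        int(text, base)
    except ValueError:
        return False
    return True


for text in ('0780', '-0780'):
    # Decimal parsing tolerates leading zeros...
    assert parses(text, 10)
    # ...but base 0 follows literal syntax, where they are rejected.
    assert not parses(text, 0)

print(int('0780'), int('-0780'))  # 780 -780
```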

dmontagu on 24 Jul 2019

yes, I just happened to think of it when reading the docs on int().

samuelcolvin on 24 Jul 2019
