Pydantic: incorrect datetime parsing result

Created on 21 Jan 2020 · 6 Comments · Source: samuelcolvin/pydantic

Bug

Output of python -c "import pydantic.utils; print(pydantic.utils.version_info())":

>>> import pydantic.utils; print(pydantic.utils.version_info())
             pydantic version: 1.3
            pydantic compiled: True
                 install path: /home/dmig/.pyenv/versions/3.8.1/envs/marty-services-3.8.1/lib/python3.8/site-packages/pydantic
               python version: 3.8.1 (default, Dec 26 2019, 09:35:25)  [GCC 9.2.1 20191008]
                     platform: Linux-5.3.0-26-generic-x86_64-with-glibc2.29
     optional deps. installed: ['typing-extensions', 'email-validator']
...
>>> sys.version
'3.8.1 (default, Dec 26 2019, 09:35:25) \n[GCC 9.2.1 20191008]'

Actual results:

>>> from pydantic import datetime_parse
>>> datetime_parse.parse_date('13_18')
datetime.date(1970, 1, 1)
>>> datetime_parse.parse_datetime('13_18')
datetime.datetime(1970, 1, 1, 0, 21, 58, tzinfo=datetime.timezone.utc)
...

Expected results:

>>> from pydantic import datetime_parse
>>> datetime_parse.parse_date('13_18')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pydantic/datetime_parse.py", line 106, in pydantic.datetime_parse.parse_date
pydantic.errors.DateError: invalid date format
>>> datetime_parse.parse_datetime('13_18')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pydantic/datetime_parse.py", line 176, in pydantic.datetime_parse.parse_datetime
pydantic.errors.DateTimeError: invalid datetime format
...

This value '13_18' triggered a complicated bug in my application because I used a Union type for the value field:

value: Union[StrictBool, datetime, date, StrictInt, StrictFloat, None, str]

The problem here is the date/datetime parser, which accepts a string that is not a valid date or datetime value. Arguably '13_18' could be read as a valid time (13:18), but the date parser should raise an exception here.

Labels: bug, strictness

Most helpful comment

Thanks for reporting.

The reason for this is that

int('13_18') == 1318

As well as parsing date formats like 2020-02-20, pydantic's datetime (and date) parsing allows floats and ints and interprets them as unix timestamps. Pydantic's test for "is x a valid int" is int(x), hence the problem above.
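
The fall-through can be reproduced with the standard library alone. The sketch below does not call pydantic; it only assumes, per the explanation above, that a string accepted by int() is then treated as a number of seconds since the epoch.

```python
from datetime import datetime, timezone

raw = '13_18'

# PEP 515 allows underscores between digits, and int() accepts them in strings too.
seconds = int(raw)

# Read as a unix timestamp, that integer is 1318 seconds after the epoch.
as_timestamp = datetime.fromtimestamp(seconds, tz=timezone.utc)

print(seconds)       # 1318
print(as_timestamp)  # 1970-01-01 00:21:58+00:00
```

That matches the parse_datetime result in the report; parse_date returns the date part of the same instant.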

From this come two questions:

  1. What would the correct behaviour be?
  2. Can we switch to the correct behaviour now, or would that constitute a breaking change which needs to wait until v2?

The subject of "strictness" and what pydantic will and won't permit is a big and complex one without (as many people assume) a clear and simple answer. See all the issues under the "strictness" label for some context, or if you're suffering from insomnia.

I would suggest that the correct behaviour would be to only interpret values as unix timestamps if they are an int or float, not just if int() or float() work. However changing that now would be too big a change. We should move to that in v2.
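
A minimal sketch of that stricter rule, written against the standard library only. The helper name timestamp_to_datetime is hypothetical and is not part of pydantic; it just shows the difference between "is an int or float" and "int() or float() work".

```python
from datetime import datetime, timezone
from typing import Any, Optional


def timestamp_to_datetime(value: Any) -> Optional[datetime]:
    """Hypothetical helper: treat only real numbers as unix timestamps."""
    # bool is a subclass of int, so it is excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    return datetime.fromtimestamp(value, tz=timezone.utc)


print(timestamp_to_datetime(1318))     # 1970-01-01 00:21:58+00:00
print(timestamp_to_datetime('13_18'))  # None: a str is never a timestamp here
```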

For now we could not interpret values as unix timestamps unless they contain only [0-9.] or perhaps [0-9. ]? Are there other values except [0-9. ] that python permits in ints and floats?
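
To get a feel for the question, the following probe tries a handful of strings against int(), float() and the [0-9.] rule floated above. The sample list and the accepted_by helper are illustrative choices, not an exhaustive survey.

```python
import re

# The interim rule suggested above: digits and dots only.
PLAIN_NUMBER = re.compile(r'[0-9.]+')


def accepted_by(convert, text):
    """Return True if convert(text) succeeds without a ValueError."""
    try:
        convert(text)
    except ValueError:
        return False
    return True


samples = [
    '1318',
    '13_18',                     # underscore separator (PEP 515)
    ' 1318 ',                    # surrounding whitespace
    '+1318', '-1318',            # leading sign
    '1e3', 'inf', 'nan',         # float-only spellings
    '\u0661\u0663\u0661\u0668',  # Arabic-Indic digits for 1318
    '13,18',                     # comma: int() and float() do not consult the locale
]

for text in samples:
    print(
        repr(text),
        accepted_by(int, text),
        accepted_by(float, text),
        bool(PLAIN_NUMBER.fullmatch(text)),
    )
```

Note that the pattern alone still lets through strings such as '.' or '1.2.3', so it would have to be combined with the existing int()/float() attempt rather than replace it.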

All 6 comments

> I would suggest that the correct behaviour would be to only interpret values as unix timestamps if they are an int or float, not just if int() or float() work. However changing that now would be too big a change. We should move to that in v2.

Sounds reasonable to me. For now, I'll have to implement a custom StrictDatetime type.

> For now we could not interpret values as unix timestamps unless they contain only [0-9.] or perhaps [0-9. ]? Are there other values except [0-9. ] that python permits in ints and floats?

So far I've found only the underscore; a comma might also be allowed, depending on the locale settings.

> For now, I'll have to implement a custom StrictDatetime type

You could also use a simple validator, which rejects based on a regex or similar.
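
One way such a regex check might look, as a standalone sketch. The pattern and the strict_datetime name are assumptions made for illustration, and the function uses datetime.fromisoformat from the standard library rather than pydantic's own parser; how it gets wired into a model (a validator or a custom type) is left open.

```python
import re
from datetime import datetime

# Hypothetical pattern: YYYY-MM-DD, optionally followed by "T" or a space and HH:MM[:SS].
ISO_DATETIME = re.compile(
    r'[0-9]{4}-[0-9]{2}-[0-9]{2}([T ][0-9]{2}:[0-9]{2}(:[0-9]{2})?)?'
)


def strict_datetime(value: str) -> datetime:
    """Reject anything that is not ISO-like before attempting to parse it."""
    if not ISO_DATETIME.fullmatch(value):
        raise ValueError('invalid datetime format')
    return datetime.fromisoformat(value)


print(strict_datetime('2020-01-21 13:18'))  # 2020-01-21 13:18:00

try:
    strict_datetime('13_18')
except ValueError as exc:
    print(exc)  # invalid datetime format
```

Because the check runs before any numeric interpretation, strings like '13_18' never reach the unix-timestamp path.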

Here is the reason: https://www.python.org/dev/peps/pep-0515/

Is there a way to specify a variable type based on some external value?
For example, I always get a value of type str from the HTTP request, but sometimes I know the exact type it should be converted to. That way I could avoid using unions (which I believe also cause performance issues).

I wonder if there is a simple way to specify a type dynamically? Or maybe a way to specify "metaproperty" which will not appear in the model, but depending on another property value, will assign the value with type conversion to one of value_str/value_int/value_bool/value_date?

Not simply; you could perhaps use generics or wait for discriminated unions (#619).

There is something very close: https://pydantic-docs.helpmanual.io/usage/models/#parsing-data-into-a-specified-type

Suddenly found parse_obj_as -- that will help me a lot.
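
For the "type known from an external value" case, a sketch along these lines might work. It assumes pydantic 1.x, where parse_obj_as can be imported from the top-level package; the TYPE_BY_NAME mapping and the convert helper are hypothetical names introduced for the example.

```python
from datetime import date, datetime

from pydantic import parse_obj_as

# Hypothetical mapping from an externally supplied type name to a Python type.
TYPE_BY_NAME = {
    'str': str,
    'int': int,
    'bool': bool,
    'date': date,
    'datetime': datetime,
}


def convert(raw: str, type_name: str):
    """Parse the raw request string into the type named by type_name."""
    return parse_obj_as(TYPE_BY_NAME[type_name], raw)


print(convert('42', 'int'))           # 42
print(convert('2020-01-21', 'date'))  # 2020-01-21
print(convert('13_18', 'str'))        # 13_18
```

Since the target type is chosen up front, '13_18' stays a string when a string is expected and no Union is needed.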
