Pydantic: subclasses of BaseModel can be hashable

Created on 11 Mar 2020  ·  6 Comments  ·  Source: samuelcolvin/pydantic

Feature Request

Output of python -c "import pydantic.utils; print(pydantic.utils.version_info())":

             pydantic version: 1.4
            pydantic compiled: False
                 install path: /Users/username/Library/Caches/pypoetry/virtualenvs/virtualenvname/lib/python3.6/site-packages/pydantic
               python version: 3.6.4 (default, Mar  1 2018, 18:36:50)  [GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)]
                     platform: Darwin-18.7.0-x86_64-i386-64bit
     optional deps. installed: ['typing-extensions']

I use Pydantic extensively in place of dataclasses throughout my projects. It would be nice to be able to use some of the simpler models as dict keys, or to put them into sets:

import pydantic

class Foo(pydantic.BaseModel):
  foo: str = "foo"

d = { Foo(): "bar" }

I tried writing a superclass/mixin to selectively add this behavior to existing models:

class HashableMixin:
    def __hash__(self):
        return hash(
            (type(self),) + tuple(getattr(self, f) for f in self.__fields__.keys())
        )

This particular implementation has a problem, though: it only works when the mixin comes first in the list of base classes. I think this has something to do with Pydantic's initialization and maybe metaclasses, but I didn't dig too deep. So I wrote it as a decorator instead:

def hashable(cls):
    def h(self):
        return hash(
            (type(self),) + tuple(getattr(self, f) for f in self.__fields__.keys())
        )
    setattr(cls, "__hash__", h)
    return cls

This seems to work more or less all right, though I haven't really put it through its paces, so I don't know if I've missed anything.
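
A possible explanation for the mixin-ordering problem, sketched without Pydantic: Python sets `__hash__ = None` on any class that defines `__eq__` but not `__hash__`, and that `None` is found first in the method resolution order when the class is listed before the mixin. The `Base` class below is a hypothetical stand-in for `BaseModel`, on the assumption (not confirmed in this thread) that `BaseModel` defines `__eq__` without `__hash__`.

```python
class Base:
    """Hypothetical stand-in for BaseModel: defines __eq__ but not __hash__."""

    def __init__(self, foo="foo"):
        self.foo = foo

    def __eq__(self, other):
        return type(self) is type(other) and self.__dict__ == other.__dict__


class HashableMixin:
    def __hash__(self):
        return hash((type(self),) + tuple(self.__dict__.values()))


class MixinFirst(HashableMixin, Base):
    pass


class MixinLast(Base, HashableMixin):
    pass


# Python sets __hash__ to None on a class that defines __eq__ alone.
assert Base.__hash__ is None

# Mixin first: its __hash__ comes before Base's None in the MRO.
mixin_first_works = isinstance(hash(MixinFirst()), int)

# Mixin last: Base's __hash__ = None is found first, so hash() raises.
try:
    hash(MixinLast())
    mixin_last_works = True
except TypeError:
    mixin_last_works = False
```

If that assumption holds, the decorator avoids the issue because `setattr` places `__hash__` directly on the decorated class, which always comes first in its own method resolution order.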

Anyway, it would be great to have this baked in, even if it were default off. Maybe with something on Config?


All 6 comments

I would create your own MyBaseModel and use that in place of BaseModel to accomplish this.

from pydantic import BaseModel

class MyBaseModel(BaseModel):
    def __hash__(self):
        return hash((type(self),) + tuple(self.__dict__.values()))

class Foo(MyBaseModel):
    foo: str = 'foo'

f = Foo()
d = {f: 'bar'}
print(d)

Anyway, it would be great to have this baked in, even if it were default off. Maybe with something on Config?

I would argue that this is:

  1. Very easy to achieve, as demonstrated above
  2. Not that common a requirement
  3. Often would require custom implementations for the tradeoff of performance vs. completeness (e.g. accepting more complex field values, non-hashable fields, sub models, __fields_set__)

You'd basically need to implement an entire hashable subset of python e.g. for lists, dicts, custom types etc.

So let's stick with people implementing their own solution, as above, for now.

Yeah, I understand the desire to not bloat the API surface area. I was just missing this feature from dataclasses.

FWIW I don't think you'd need to implement a hashable subset of the standard library: I don't consider (or want) models with lists to be hashable, which is a nice side effect of the implementation above that just forwards to the tuple hash function.
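
To illustrate the point about forwarding to the tuple hash, here is a minimal sketch using plain classes rather than Pydantic models. `field_hash`, `Point`, and `Tagged` are hypothetical names introduced only for this example; the hash expression mirrors the one used in the snippets above.

```python
def field_hash(obj):
    """Hash an object as the snippets above do: its type plus its field values."""
    return hash((type(obj),) + tuple(vars(obj).values()))


class Point:
    """Hypothetical stand-in for a model with only hashable fields."""

    def __init__(self, x, y):
        self.x, self.y = x, y


class Tagged:
    """Hypothetical stand-in for a model with a list field."""

    def __init__(self, tags):
        self.tags = tags


simple_ok = isinstance(field_hash(Point(1, 2)), int)

try:
    field_hash(Tagged(["a", "b"]))
    list_rejected = False
except TypeError:
    # Hashing a tuple hashes each element, and lists are unhashable.
    list_rejected = True
```

No special handling of lists is needed: the `TypeError` from the built-in tuple hash is what keeps such objects out of dicts and sets.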

Anyway, for anybody who decides to do this on their own, note that you should also look into the "immutability" flag that Pydantic offers. I'm only trying to keep myself from shooting myself in the foot, so it doesn't need to be watertight (and indeed, it's a pain to try to make it so when subclasses are involved), but it's good hygiene to require the flag to be set.
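
A short sketch of why immutability matters here, again using a plain class rather than a Pydantic model. `Key` is a hypothetical stand-in for a mutable model with a field-based hash. The thread does not name the flag; in Pydantic 1.x it is, as far as I know, `allow_mutation = False` on `Config`, so treat that as an assumption.

```python
class Key:
    """Hypothetical stand-in for a mutable model with a field-based hash."""

    def __init__(self, foo):
        self.foo = foo

    def __eq__(self, other):
        return type(self) is type(other) and self.foo == other.foo

    def __hash__(self):
        return hash((type(self), self.foo))


k = Key("a")
d = {k: "bar"}

# Before mutation, an equal key finds the entry.
found_before = Key("a") in d

# Mutate the stored key after it has been hashed into the dict.
k.foo = "b"

# The old value matches the stored hash but no longer compares equal.
found_by_old_value = Key("a") in d

# The new value compares equal but hashes differently from the stored hash.
found_by_new_value = Key("b") in d
```

After the mutation the entry is still in the dict, but it can no longer be reached through an equal key, which is the foot-gun that requiring immutability is meant to prevent.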

  1. Often would require custom implementations for the tradeoff of performance vs. completeness (e.g. accepting more complex field values, non-hashable fields, sub models, __fields_set__)

This makes a lot of sense. It's just counterintuitive to me, because I treat pydantic.BaseModel as a drop-in replacement for dataclass.

Can you please document that __hash__ is unimplemented for pydantic.BaseModel?

PR welcome to add it to the documentation.

By the way, you can use the pydantic dataclass support for this, maybe it's sufficient?

from pydantic.dataclasses import dataclass as pyd_dataclass

@pyd_dataclass(eq=True, frozen=True)
class Foo:
  foo: str = "foo"

d = { Foo(): "bar" }
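
For reference, a sketch of the same `eq=True, frozen=True` combination with the standard library's `dataclasses` module, which Pydantic's dataclass support builds on. It uses only the standard library, so it shows the underlying hashing rules rather than Pydantic's validation; `MutableFoo` is a hypothetical name added for contrast.

```python
import dataclasses


@dataclasses.dataclass(eq=True, frozen=True)
class Foo:
    foo: str = "foo"


d = {Foo(): "bar"}

# Equal field values give equal hashes, so a fresh instance finds the entry.
looked_up = d[Foo()]

# frozen=True rejects attribute assignment after construction.
try:
    Foo().foo = "changed"
    frozen_enforced = False
except dataclasses.FrozenInstanceError:
    frozen_enforced = True


# With eq=True but without frozen=True, __hash__ is set to None.
@dataclasses.dataclass(eq=True)
class MutableFoo:
    foo: str = "foo"
```

The `frozen=True` flag is what allows a field-based `__hash__` to be generated; without it, the class compares by value but stays unhashable.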

Building on antonl's response, here is a custom decorator in pydantic's style:

import typing
import pydantic

def hashable_dataclass(
    _cls: typing.Optional[typing.Type[typing.Any]] = None,
    *,
    init: bool = True,
    repr: bool = True,
    order: bool = False,
    unsafe_hash: bool = False,
    config: typing.Type[typing.Any] = None,
) -> typing.Union[typing.Callable[[typing.Type[typing.Any]], typing.Type['Dataclass']], typing.Type['Dataclass']]:
    def wrap(cls: typing.Type[typing.Any]) -> typing.Type['Dataclass']:
        # Always pass eq=True and frozen=True so the resulting dataclass is hashable.
        return pydantic.dataclasses.dataclass(cls, init=init, repr=repr, eq=True, order=order, unsafe_hash=unsafe_hash,
                                              frozen=True, config=config)

    # Used as @hashable_dataclass(...) with arguments: return the decorator.
    if _cls is None:
        return wrap

    # Used as bare @hashable_dataclass: decorate the class directly.
    return wrap(_cls)