I noticed that I can use a schema to flatten nested fields. So, for example, this works:
from pydantic import BaseModel, Schema

class Flatten(BaseModel):
    one: str
    two: str = Schema(..., alias='nested.two')

d = {
    'one': 'one',
    'nested': {'two': 'two'}
}
flattened = Flatten.parse_obj(d)  # or Flatten.from_orm(d) if orm_mode is set
>>> flattened == <Flatten one='one', two='two'>
I'm wondering whether it is also possible to de-flatten an input that is flat.
So, for example,
class Customer(BaseModel):
    name: str
    address: str

class Offer(BaseModel):
    offer_number: str
    customer: Customer

...
d = {
    'offer_number': '12345',
    'customer_name': 'alice',
    'customer_address': 'Springfield'
}
unflattened = Offer.parse_obj(d)  # This, of course, doesn't work.
I'm also not sure whether there is a logical syntax to define here, but I did try setting a schema on the customer field, with no success.
I think not really, the closest we have (and I'm not sure it was a good design in the beginning) is DSN which builds a nested object by reading other fields during validation.
However, this does require you to include the customer_name and customer_address fields on Offer, which perhaps defeats the point.
Thanks! I actually do not need this feature right now, but I was thinking of designing a nested model like this, since it can make it easier to transform a normalized table structure from the database to a nested JSON response in an API.
Perhaps something like this would be conceivable?
class Customer(BaseModel):
    name: str
    address: str

class Offer(BaseModel):
    offer_number: str
    customer: Customer = Schema(..., flattened_prefix="customer_")

...
d = {
    'offer_number': '12345',
    'customer_name': 'alice',
    'customer_address': 'Springfield'
}
unflattened = Offer.parse_obj(d)
100% agree it would be useful.
I think rust's serde library has something like this I've used in the past.
from pydantic import BaseModel, Schema

class Flatten(BaseModel):
    one: str
    two: str = Schema(..., alias='nested.two')

d = {
    'one': 'one',
    'nested': {'two': 'two'}
}
flattened = Flatten.parse_obj(d)
When I run the above snippet, I get the following error:
pydantic.error_wrappers.ValidationError: 1 validation error for Flatten
nested.two
  field required (type=value_error.missing)
Is flattening of a dict really supported? If so, how can I do it?
No it's not supported.
This is a discussion about a possible future feature.
@samuelcolvin @omrihar Hi!
I'm confused.
I noticed that I can use a schema to flatten nested fields. So, for example, this works:
...
I'm wondering whether it is also possible to de-flatten an input that is flat.
So flattening input data is currently not possible using Field (previously Schema)?
What would be the best way for a user to address this need currently in Pydantic?
Probably root_validators, pre or post.
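As a rough illustration of the kind of reshaping a pre root validator could do for the Offer example above, here is a minimal sketch in plain Python. The helper name nest_prefixed_keys and its parameters are hypothetical and not part of pydantic; the assumption is that such a function would be called on the raw input before field validation runs.

```python
from typing import Any, Dict


def nest_prefixed_keys(data: Dict[str, Any], prefix: str, target: str) -> Dict[str, Any]:
    """Return a copy of `data` where keys starting with `prefix` are
    grouped into a nested dict stored under `target` (hypothetical helper)."""
    nested: Dict[str, Any] = {}
    result: Dict[str, Any] = {}
    for key, value in data.items():
        if key.startswith(prefix):
            # 'customer_name' -> 'name'
            nested[key[len(prefix):]] = value
        else:
            result[key] = value
    if nested:
        result[target] = nested
    return result


flat = {
    'offer_number': '12345',
    'customer_name': 'alice',
    'customer_address': 'Springfield',
}
nested = nest_prefixed_keys(flat, prefix='customer_', target='customer')
```

The resulting dict has the nested shape that Offer expects, so it could be passed to Offer.parse_obj or returned from a validator that receives the raw values.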
Okay, thanks.
I would add a +1 to the "flatten input data" feature described first (i.e. using alias with dot notation). I can't see a better API, other than the more general approach of a per-field custom getter function (a bit like GetterDict in orm mode).
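To make the dot-notation idea concrete, here is a minimal sketch of a lookup that resolves a dotted alias against a nested dict. The function name get_by_dotted_path is hypothetical; it only illustrates what a per-field getter might do and is not an existing pydantic API.

```python
from collections.abc import Mapping
from typing import Any


def get_by_dotted_path(data: Any, path: str, default: Any = None) -> Any:
    """Walk `data` one key at a time along a dotted path such as 'nested.two'.
    Returns `default` as soon as a segment is missing (hypothetical helper)."""
    current = data
    for part in path.split('.'):
        if not isinstance(current, Mapping) or part not in current:
            return default
        current = current[part]
    return current


d = {
    'one': 'one',
    'nested': {'two': 'two'},
}
two = get_by_dotted_path(d, 'nested.two')
missing = get_by_dotted_path(d, 'nested.three')
```

One limitation of this simple approach is that a key which itself contains a dot cannot be addressed, so a real feature would need an escaping rule or a different path syntax.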
With respect to flattening, I was facing a similar issue. I'll share how I solved it, in case it helps anybody. I had an input dictionary d that I wanted flattened to Flatten, just like the ones declared by @omrihar:
class Flatten(BaseModel):
    one: str
    two: str = Schema(..., alias='nested.two')

d = {
    'one': 'one',
    'nested': {'two': 'two'}
}
I used scalpl to make such aliasing possible. This is a library that provides an API for navigating through nested dictionaries through a syntax precisely like the one in the alias. So, I used
import typing as T

from scalpl import Cut
from pydantic import BaseModel, Field
from pydantic.utils import GetterDict

class ProxyGetterDict(GetterDict):
    # Delegate lookups to the wrapped Cut object, which understands dotted keys.
    def __getitem__(self, key: str) -> T.Any:
        try:
            return self._obj.get(key)
        except AttributeError as e:
            raise KeyError(key) from e

    def get(self, key: T.Any, default: T.Any = None) -> T.Any:
        return self._obj.get(key, default)

class Flatten(BaseModel):
    one: str
    two: str = Field(..., alias='nested.two')

    class Config:
        orm_mode = True
        getter_dict = ProxyGetterDict

d = {
    'one': 'one',
    'nested': {'two': 'two'}
}
proxy = Cut(d)
Flatten.from_orm(proxy)
>>> Flatten(one='one', two='two')
@omrihar I can also see how this could come in handy for BaseSettings. Devs seem to enjoy nested JSON or YAML files for their app configuration and having (only a single) Prefix in the model's Config is somewhat limiting. In large monoliths with lots of settings, I would consider it a good practice to create extremely narrow settings models with only the relevant kv pairs for a particular use case. Thus when all you want to do is extract a few of many values from a field lower down the config hierarchy, the property.access saves one from writing many extra models.
I think this is usually covered by the JSONPath spec, and it would presumably make sense to follow that standard. From the simplistic examples here, it seems we also haven't considered whether this feature should use the JSON/alias representation or the Python symbols.