Marshmallow: Schema generator for data classes

Created on 17 Apr 2018 · 14 Comments · Source: marshmallow-code/marshmallow

Python 3.7 will come with a new feature called Data Classes (see PEP 557). It is available for Python 3.6 on PyPI for demonstration purposes.

It looks like this:

import datetime
from dataclasses import dataclass
from typing import List

@dataclass
class Artist:
    name: str

@dataclass
class Album:
    title: str
    release_date: datetime.date
    artist: List[Artist]

You can immediately notice the resemblance with marshmallow:

from marshmallow import Schema, fields

class ArtistSchema(Schema):
    name = fields.Str()

class AlbumSchema(Schema):
    title = fields.Str()
    release_date = fields.Date()
    artist = fields.Nested(ArtistSchema, many=True)  # many=True mirrors List[Artist]

I think for most cases there's enough information in the dataclass for marshmallow to figure out the schemas by itself. I admit that the schema might change while the model will stay the same but for most cases, it won't be a problem.
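
As a rough illustration of how much a dataclass already exposes, the sketch below walks the resolved type hints of the two dataclasses above using only the standard library and maps each one to the name of a marshmallow field class. The describe_type and describe_fields helpers and the SIMPLE_TYPES table are hypothetical names made up for this example, not part of marshmallow, and typing.get_origin / typing.get_args assume Python 3.8 or later.

```python
import datetime
from dataclasses import dataclass, fields, is_dataclass
from typing import List, get_args, get_origin, get_type_hints

# Hypothetical lookup table: Python type -> name of a marshmallow field class.
SIMPLE_TYPES = {
    str: "Str",
    int: "Int",
    float: "Float",
    bool: "Bool",
    datetime.date: "Date",
    datetime.datetime: "DateTime",
}


def describe_type(tp):
    """Return a string naming the field that could represent the type tp."""
    if tp in SIMPLE_TYPES:
        return SIMPLE_TYPES[tp]
    if is_dataclass(tp):
        # A nested dataclass would map to a nested schema.
        return f"Nested({tp.__name__}Schema)"
    if get_origin(tp) is list:
        (item_type,) = get_args(tp)
        return f"List({describe_type(item_type)})"
    raise TypeError(f"no field mapping for {tp!r}")


def describe_fields(cls):
    """Map every dataclass field name to a field description."""
    hints = get_type_hints(cls)  # also resolves string annotations
    return {f.name: describe_type(hints[f.name]) for f in fields(cls)}


@dataclass
class Artist:
    name: str


@dataclass
class Album:
    title: str
    release_date: datetime.date
    artist: List[Artist]


print(describe_fields(Album))
```

The strings only stand in for real field objects. Per-field configuration such as data_key or required has no place in a bare type hint, so it would still have to come from somewhere else.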

My idea is to leverage the dataclass to build a Schema automatically, for DRY purposes, when it makes sense, while still getting the features of marshmallow under the hood. What's your opinion on that? Do you have an implementation idea that would be both DRY and extensible?

I may be able to work on a PR for this but I want to go in the right direction.

Diaoul

❤7

All 14 comments

You actually don't need @dataclass to use type annotations on classes. PEP 526 landed support for type annotations on class properties in Python 3.6.

I personally don't use type annotations for any of my projects yet, but I like the idea of having the option to use native Python syntax for schema declarations. Fields are types, so it would be nice to treat them as such. The main limitation I see is that the builtin types don't provide a way to express field configuration.

Can you provide an example of how you would like to see marshmallow interacting with a data class?

Here is what I was imagining:

class ArtistSchema(Schema):
    name: fields.Str()

class AlbumSchema(Schema):
    title: fields.Str()
    release_date: fields.Date()
    artist: fields.Nested(ArtistSchema())

This would actually eliminate the need for a nested Meta class, because type annotations are stored separately from actual attributes.

class Foo(Schema):
    name: fields.Str()
    only: fields.Int()

    only = ['name']

Foo.__annotations__
# {'name': <fields.String(...)>, 'only': <fields.Integer(...)>}

Foo.only
# ['name']
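
The same separation can be checked with builtin types only, without marshmallow. This is a minimal sketch of the PEP 526 behaviour the example relies on: a bare annotation is recorded in __annotations__ but does not create a class attribute, so an annotation and an attribute can share a name.

```python
class Foo:
    name: str        # annotation only: no class attribute is created
    only: int        # annotation for the name "only"

    only = ["name"]  # plain class attribute that happens to share the name


print(Foo.__annotations__)   # the two annotations, keyed by name
print(Foo.only)              # the class attribute, untouched by the annotation
print(hasattr(Foo, "name"))  # False: "name" exists only as an annotation
```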

/cc @sloria

deckar01 on 17 Apr 2018

@deckar01 https://github.com/justanr/marshmallow-annotations

I did that as kind of a joke a while ago, but if it's actually something useful and people want, I'd be open to either expanding it or PRing it into marshmallow if there's plans to drop py2

justanr on 17 Apr 2018

It probably isn't of much use to the core due to the 3.6+ requirement and it would be weird to allow both styles simultaneously. I might look into making marshmallow-annotations move class attributes to meta attributes.

deckar01 on 17 Apr 2018

🎉1

My idea was more something like this:

@dataclass
class Artist:
    name: str

@dataclass
class Album:
    title: str
    release_date: datetime.date
    artist: List[Artist]

@schema_for(Artist)
class ArtistSchema:
    pass

@schema_for(Album, required=True)  # global options for auto-generated fields
class AlbumSchema:
    title: fields.Str(data_key='album_title')  # override

This is just a helper so backward compatibility isn't an issue, it's all opt-in. It would also take care of adding the @post_load method so that objects are loaded / dumped as classes rather than plain dicts.
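
To make the "classes rather than plain dicts" part concrete, here is a minimal standard-library sketch of the conversion such a hook would have to perform. It is not marshmallow code and does no validation. The load and _convert helpers are hypothetical names for this example, and typing.get_origin / typing.get_args assume Python 3.8 or later.

```python
import datetime
from dataclasses import asdict, dataclass, fields, is_dataclass
from typing import List, get_args, get_origin, get_type_hints


def _convert(tp, value):
    """Convert a plain value to the annotated type tp where we know how."""
    if is_dataclass(tp):
        return load(tp, value)
    if get_origin(tp) is list:
        (item_type,) = get_args(tp)
        return [_convert(item_type, item) for item in value]
    if tp is datetime.date:
        return datetime.date.fromisoformat(value)
    return value  # anything else is passed through unchanged


def load(cls, data):
    """Build an instance of the dataclass cls from a plain dict."""
    hints = get_type_hints(cls)
    kwargs = {f.name: _convert(hints[f.name], data[f.name]) for f in fields(cls)}
    return cls(**kwargs)


@dataclass
class Artist:
    name: str


@dataclass
class Album:
    title: str
    release_date: datetime.date
    artist: List[Artist]


album = load(
    Album,
    {"title": "Example", "release_date": "2018-04-17", "artist": [{"name": "Someone"}]},
)
print(album)          # nested Artist instances, not dicts
print(asdict(album))  # back to plain dicts; the date stays a date object
```

In the proposal above, marshmallow would handle deserialization and validation, and the generated @post_load method would only need the final cls(**data) step.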

Or something like this even more DRY that does both with more magic:

@schema_dataclass
class Artist:
    name: str

@schema_dataclass(required=True)  # global options for auto-generated fields
class Album:
    title: str
    release_date: datetime.date
    artist: List[Artist]

    class Schema:  # override
        title: fields.Str(data_key='album_title')

This time the dataclass would act as a regular class but enhanced with load / dump methods with marshmallow under the hood.
I like this approach less as it is trying to mix both the class and the schema in a single object.

Diaoul on 17 Apr 2018

👍1

That looks similar to marshmallow-sqlalchemy. https://github.com/marshmallow-code/marshmallow-sqlalchemy

Instead of Meta.model you could use a dataclass property:

from marshmallow_annotations import Schema

@dataclass
class Artist:
    name: str

@dataclass
class Album:
    title: str
    release_date: datetime.date
    artist: List[Artist]

class ArtistSchema(Schema):
    dataclass = Artist

class AlbumSchema(Schema):
    title: fields.Str(data_key='album_title')  # override

    dataclass = Album
    required = True # global options for auto-generated fields

I would recommend opening an issue on https://github.com/justanr/marshmallow-annotations to continue this conversation unless anyone feels strongly about this being part of the core.

deckar01 on 17 Apr 2018

I think the point you made about the syntax being 3.6 only is a good reason not to mainline it into marshmallow. Doubly so for dataclasses being 3.7 only (and that's not even been properly released yet).

justanr on 18 Apr 2018

Agreed; this doesn't belong in marshmallow core. Closing this for now.

sloria on 18 Apr 2018

I published a library that does exactly that: generating schemas from dataclasses.

marshmallow-dataclass

from datetime import datetime
from typing import List

from marshmallow_dataclass import dataclass  # Importing from marshmallow_dataclass instead of dataclasses

@dataclass
class Artist:
    name: str

@dataclass
class Album:
    title: str
    release_date: datetime
    artist: List[Artist]

Album.Schema # This is a valid marshmallow Schema class that you can use

lovasoa on 5 Feb 2019

❤13

Consuming objects from the typing module has been a pretty unpleasant experience for me. Some of the most fundamental operations necessary for working with generics have no public interface. The private properties I found in the code to hack something together have consistently changed in breaking ways between minor python releases.

Until they drop the provisional classification, the typing module imposes a maximum python version constraint on any library that depends on its API.

deckar01 on 5 Feb 2019

@deckar01: I haven't experienced issues with that yet. The code is tested and works on all minor versions of Python 3.7 and 3.8.

lovasoa on 6 Feb 2019

3.6 -> 3.7 releases had breaking changes. 3.8 is still in alpha and could break compatibility at any time before the final release several months from now. As it is now, the typing module will still be provisional in 3.8.

deckar01/marsha@50dbdcc5

https://www.python.org/dev/peps/pep-0569/#schedule

https://docs.python.org/3.8/library/typing.html

__origin__ and __args__ are undocumented, yet are the only way to inspect generics.

https://docs.python.org/3.7/library/typing.html

The issue for stabilizing this API seems to have stalled waiting for 3rd party packages to become PEP candidates. typing_inspect is experimental and its primary purpose isn't really to maintain cross-version compatibility, but it is still probably a safer option than using the private interface of the typing module.

https://bugs.python.org/issue29262

https://github.com/ilevkivskyi/typing_inspect

I'm not suggesting that the typing module shouldn't be used, but supporting it in a library will come with more maintenance overhead than normal. Without a mechanism to enforce which python versions can be used, it's probably a good idea to document any library that depends on generic typing as experimental.

deckar01 on 6 Feb 2019

❤2 👍2

Thank you for the pointer to typing_inspect, I am going to use it.

lovasoa on 6 Feb 2019

👍1

Python 3.8 is adding public functions, typing.get_origin and typing.get_args, that normalize access to __origin__ and __args__.

https://docs.python.org/3.8/library/typing.html#typing.get_origin
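
A minimal sketch of what the public interface looks like in use. The fallback branch for older interpreters is an assumption added for illustration: it reads the undocumented __origin__ and __args__ attributes directly, so it does not normalize results across versions the way the 3.8 functions do.

```python
import sys
from typing import Dict, List

if sys.version_info >= (3, 8):
    from typing import get_args, get_origin
else:
    # Assumption: best-effort fallback that relies on private attributes.
    def get_origin(tp):
        return getattr(tp, "__origin__", None)

    def get_args(tp):
        return getattr(tp, "__args__", ())


print(get_origin(List[int]))      # the runtime class behind List[int]
print(get_args(Dict[str, int]))   # the type parameters as a tuple
print(get_origin(int))            # None for a type that is not generic
```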

deckar01 on 18 Jul 2019

🎉1

@deckar01 Great! Is there a backport to 3.7 and 3.6?

lovasoa on 22 Jul 2019
