The default behavior of Str fields is scary. Serialization passes None (from nullable ORM fields, etc.) through into the schema dump instead of omitting non-required Str fields. If I then feed the serialized dict back into the schema's load, it raises a ValidationError. To me this makes marshmallow seem almost incompatible with itself under the default options, unless I'm misunderstanding something.
I've switched to doing #3 by default; that is, I'm now specifying allow_none=True on all my String fields. I'm doing this because I do not have a nice option to omit Str fields with value None on serialization.
from marshmallow import Schema, fields
from marshmallow.exceptions import ValidationError

#1
class ArtistSchema(Schema):
    name = fields.Str()

bowie = {'name': None}
schema = ArtistSchema(strict=True)
result = schema.dump(bowie).data
try:
    print(schema.load(result))
except ValidationError:
    print('#1 failed')  # default behavior is scary

#2
class ArtistSchema(Schema):
    name = fields.Str(allow_none=False)

bowie = {'name': None}
schema = ArtistSchema(strict=True)
result = schema.dump(bowie).data  # doesn't fail because allow_none is for deserialization
try:
    print(schema.load(result))
except ValidationError:
    print('#2 failed')

#3
class ArtistSchema(Schema):
    name = fields.Str(allow_none=True)

bowie = {'name': None}
schema = ArtistSchema(strict=True)
result = schema.dump(bowie).data
try:
    print(schema.load(result))
except ValidationError:
    print('#3 failed')  # not raised
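For anyone who would rather omit None values than set allow_none=True everywhere, one possible workaround is to strip them from the dumped dict before loading it again. The sketch below is plain Python and is not part of marshmallow: the helper name drop_none and the 'genre' key are made up for illustration, and the dict stands in for the output of schema.dump(...).data. The same function could presumably be called from a post_dump hook on a schema, though that wiring is not shown here.

```python
def drop_none(dumped):
    """Return a copy of a dumped dict without the keys whose value is None."""
    return {key: value for key, value in dumped.items() if value is not None}


# Stand-in for a dump result; 'genre' is a hypothetical extra field.
dumped = {'name': None, 'genre': 'rock'}
cleaned = drop_none(dumped)
print(cleaned)  # {'genre': 'rock'}
```

Since a non-required field that is simply absent is skipped on load, the cleaned dict should avoid the failure seen in #1. The trade-off is that the round trip no longer distinguishes "missing" from "explicitly None".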
Can I add that marshmallow 2.x is a very powerful library, but the API seems a bit cobbled together over time, with pull requests and everyone's needs shoved into a backwards-compatible library. I'm really hoping that, with all the experience and feedback gained from 2.x, a lot of thought is put into 3.x when development starts.
Sorry it's taken so long to reply to this.
Honestly, I'm not sure where to go with this. It is common to have app-level objects with attributes that are None, e.g. nullable ORM fields, as you pointed out. Deserialization is meant to validate and construct app objects from user/client input, which may not necessarily be the same as the serialization output. In this case, I think it is better to disallow None by default.
This behavior is common amongst other (de)serialization libraries. I verified that both Django REST framework and colander do the same.
P.S. Development on 3.x is now underway. I greatly value any feedback you provide. I encourage you to comment on any of the issues labeled "feedback welcome", or open new issues with suggestions of your own.
Closing for now, since I think the current behavior is correct. Feel free to re-open if any further discussion is needed.
I don't think this default behavior is reasonable. None is a valid value on fields and database columns (NULL) by default. I would think you would want to explicitly specify when None is not an acceptable value, rather than have to specify when it is. The way it is now, one needs to specify that a particular value (None) is valid at the field level for the vast majority of fields in use, which is both onerous and confusing (since why wouldn't it be valid?).
The argument that other serializers do it this way isn't convincing to me. It's just an appeal to popularity.