This issue revisits this comment https://github.com/samuelcolvin/pydantic/issues/462#issuecomment-480326378 by @tiangolo.
I think it would be a good idea to have a standard way of exporting secrets so they can be propagated to other services.
This rarely comes up in three-tier apps, where e.g. the db creds are secret, but it happens _a lot_ in microservice architectures, where request payloads may be serialized and deserialized multiple times over the end-to-end request lifecycle.
.json() to me is semantically like .export, and as such defaulting to revealing secrets makes sense. But that would be a breaking change.
Other approaches:
.json(reveal_secrets=True)
.export()
But maybe we can take the breaking change path via https://github.com/samuelcolvin/pydantic/issues/576 and then:
.json(keep_secrets=True)
To be clear, I don't see .json as being something used for logging. Something like structlog would work with pydantic.dict() instead:
log.info('something', data=model.dict())
I _think_ .dict defaulting to maintaining secrets seems right. But we could have, too:
log.info('something', data=model.dict(reveal_secrets=True))
But then we should consider API consistency across methods and ensure usability is good overall, not just per case.
I think this is unnecessary.
The purpose of SecretStr is questionable anyway, or rather, it has relatively few useful applications. The environment in which you're processing "secret" values has to be trusted, so why not just use str?
Given this I think it's overkill to add another argument to dict() (and associated methods) and then pass the setting around recursively.
You can accomplish the same thing with your own custom encoder very easily:
from pydantic import BaseModel, SecretStr
from pydantic.json import pydantic_encoder

def custom_encoder(obj):
    if isinstance(obj, SecretStr):
        return obj.get_secret_value()
    else:
        return pydantic_encoder(obj)

class Model(BaseModel):
    password: SecretStr = None

    def json(self) -> str:
        return super().json(encoder=custom_encoder)

print(Model(password='testing').json())
If you really wanted you could use your own custom BaseModel that adds another keyword argument reveal_secrets=False to json() and uses the above encoder if reveal_secrets=True.
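The reveal_secrets idea can be sketched without touching pydantic itself. The following is a minimal, stdlib-only illustration, not pydantic's API: `Secret`, `make_encoder` and `to_json` are hypothetical stand-ins for SecretStr, the custom encoder above, and a json() override that accepts reveal_secrets.

```python
import json


class Secret:
    """Hypothetical stand-in for pydantic's SecretStr (not the real class)."""

    def __init__(self, value: str) -> None:
        self._value = value

    def get_secret_value(self) -> str:
        return self._value

    def __repr__(self) -> str:
        return "Secret('**********')"


def make_encoder(reveal_secrets: bool):
    """Build a json.dumps `default` hook that masks or reveals secrets."""

    def encoder(obj):
        if isinstance(obj, Secret):
            return obj.get_secret_value() if reveal_secrets else "**********"
        raise TypeError(f"Cannot serialise {type(obj).__name__}")

    return encoder


def to_json(data: dict, reveal_secrets: bool = False) -> str:
    # Secrets stay masked unless the caller opts in explicitly.
    return json.dumps(data, default=make_encoder(reveal_secrets))


payload = {"user": "alice", "password": Secret("testing")}
masked = to_json(payload)
revealed = to_json(payload, reveal_secrets=True)
print(masked)
print(revealed)
```

Keeping the default at reveal_secrets=False means existing callers would see no change in behaviour, which is the non-breaking variant discussed earlier in the thread.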
> The environment in which you're processing "secret" values has to be trusted, so why not just use str?
For example, it can be a breach of data policy to log patient-identifiable information. The trust levels between a tightly controlled internal database and, say, an external service such as Datadog are far from equal.
Yet in dev it can be very helpful to see this information to understand what is going on. We try to keep this kind of information at debug level and keep the prod log level at info. However, two layers of protection are good: if we had to enable debug logs in an emergency on prod, or made a configuration mistake, secrets would still be shielded from exposure.
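The "two layers" argument can be illustrated with the standard logging module. This is only a sketch under assumptions: `Secret` is a hypothetical stand-in for SecretStr, and the logger name and the value "patient-123" are made up for the example. Even with the level forced to DEBUG, the value is masked when it is formatted into the log line.

```python
import io
import logging


class Secret:
    """Hypothetical stand-in for SecretStr: masks itself when formatted."""

    def __init__(self, value: str) -> None:
        self._value = value

    def get_secret_value(self) -> str:
        return self._value

    def __str__(self) -> str:
        return "**********"

    def __repr__(self) -> str:
        return "Secret('**********')"


# Capture log output in memory so it can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

log = logging.getLogger("secret_demo")  # hypothetical logger name
log.addHandler(handler)
log.propagate = False
log.setLevel(logging.DEBUG)  # simulates debug logs being enabled on prod

patient_id = Secret("patient-123")
log.debug("patient lookup id=%s", patient_id)

output = stream.getvalue()
print(output)
```

The first layer (log level) has failed here, but the second layer (the masking type) still keeps the raw value out of the log stream.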
ok, I still think the above encoder solution would be best.
@samuelcolvin cool, and thanks for hearing the case out.
It occurs to me that we could use parameterised types, e.g. SecretStr['json'], or a function, SecretStr(json=True), like we do for constr, but I'm not sure it's worth it.
@samuelcolvin oooh... I love that. Controlling the secret string semantics at the field level seems great 😅
@samuelcolvin between SecretStr(json=True) and SecretStr['json'], the former (the function form) seems clearer and more extensible. The latter would be confusing to me.
The only reason for SecretStr['json'] is that it wouldn't upset mypy, but maybe that's a minor concern.
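To make the field-level idea concrete, here is a rough stdlib-only sketch of what the function form might behave like. Nothing here is pydantic API: `Secret`, `secret_str`, `reveal_in_json` and `encoder` are all hypothetical names invented for illustration, and the real design could differ considerably.

```python
from json import dumps


class Secret:
    """Hypothetical stand-in for SecretStr with a per-type JSON flag."""

    reveal_in_json = False  # assumed default: masked in JSON output

    def __init__(self, value: str) -> None:
        self._value = value

    def get_secret_value(self) -> str:
        return self._value

    def __repr__(self) -> str:
        # repr() stays masked regardless of the JSON flag.
        return f"{type(self).__name__}('**********')"


def secret_str(*, json: bool = False) -> type:
    """Return a Secret subclass configured like the proposed SecretStr(json=True)."""
    return type("ConfiguredSecret", (Secret,), {"reveal_in_json": json})


def encoder(obj):
    # Reveal only values whose type opted in to JSON output.
    if isinstance(obj, Secret):
        return obj.get_secret_value() if obj.reveal_in_json else "**********"
    raise TypeError(f"Cannot serialise {type(obj).__name__}")


Token = secret_str(json=True)
token = Token("abc")
password = Secret("hunter2")

body = dumps({"token": token, "password": password}, default=encoder)
print(repr(token))
print(body)
```

Because `secret_str(json=True)` is a call that builds a type at runtime, a static checker cannot treat it as a type when it appears directly in an annotation, which is the mypy concern mentioned above.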
General comment: I think it wouldn't hurt to get more use-case feedback before making a call on the nuances of the API, if any, like you said:
> but I'm not sure it's worth it.
If it is worth it, though, your field-based approach is clearly the way forward imo.
I've asked for feedback on lots of issues (see the "feedback" label), very few people (compared to the number of stars, projects using pydantic or downloads) give feedback.
Me being too utopian :D
@samuelcolvin shall we re-open this issue with a new title, maybe Configurable SecretStr?
Just bumped into this... I tried to use SecretStr to store token data, but then discovered that I can't render it to JSON and return it to the client (e.g. during login).
It would be great to have SecretStr['json'] or SecretStr(json=True), which would cause model.json() to include the secret but omit it from the model's repr() and str() when logging it.
P.S. Speaking of feedback :)
@samuelcolvin
I'm looking into implementing this, but it's been a while since I've looked at the pydantic library itself and how things are implemented so I wanted to ask before I proceed:
Is there any example of a type that is output differently using the json() call when compared to repr() or str(), and if so how is that specified in the library?
More generally: How would you like to see this be implemented?
It looks like the Color type maybe does something like it:
import pydantic
from pydantic.color import Color

class Config(pydantic.BaseModel):
    a: pydantic.SecretStr
    b: pydantic.PaymentCardNumber
    c: pydantic.ByteSize
    d: Color

def main():
    c = Config(a="abc", b=4065972557141631, c=52000, d='hsl(180, 100%, 50%)')
    print(str(c))
    print(repr(c))
    print(c.json())

if __name__ == '__main__':
    main()
=>
a=SecretStr('**********') b='4065972557141631' c=52000 d=Color('cyan', rgb=(0, 255, 255))
Config(a=SecretStr('**********'), b='4065972557141631', c=52000, d=Color('cyan', rgb=(0, 255, 255)))
{"a": "**********", "b": "4065972557141631", "c": 52000, "d": "cyan"}
I think the solution is to wait until I implement customised serialisation.
I'm hoping to work on it fairly soon.
@samuelcolvin I saw the json_encoders option that you can add:
import pydantic
from pydantic.color import Color

class Config(pydantic.BaseModel):
    a: pydantic.SecretStr
    b: pydantic.PaymentCardNumber
    c: pydantic.ByteSize
    d: Color

    class Config:
        json_encoders = {
            pydantic.SecretStr: lambda v: v.get_secret_value(),
        }

def main():
    c = Config(a="abc", b=4065972557141631, c=52000, d='hsl(180, 100%, 50%)')
    #print(c.__dict__.items())
    print(str(c))
    print(repr(c))
    print(c.json())

if __name__ == '__main__':
    main()
=>
a=SecretStr('**********') b='4065972557141631' c=52000 d=Color('cyan', rgb=(0, 255, 255))
Config(a=SecretStr('**********'), b='4065972557141631', c=52000, d=Color('cyan', rgb=(0, 255, 255)))
{"a": "abc", "b": "4065972557141631", "c": 52000, "d": "cyan"}
Isn't this good enough?
Yes that's a temporary solution, but not how it should look long term.
Would it be worthwhile to add the json_encoders bit to the documentation example code for SecretStr and SecretBytes, or would you rather just wait until you're working on customised serialisation?
Or we could fix this by editing pydantic_encoder; that would be a better temporary solution.
@samuelcolvin Made a PR, at #1313
@samuelcolvin made a different PR, at #1328 that just documents an approach to make it plain-text dumpable using the json method.