A lot of times I pass data around using dictionaries. When I receive the data, for example as a function argument, what I want is a dict that contains some specific keys. I don't mind if the dict has more data.
So, I tried typed dicts:
Movie = TypedDict('Movie', {'name': str, 'year': int})
This works:
movie = {'name': 'Blade Runner', 'year': 1982} # type: Movie
But mypy reports the error Extra key 'has_sequel' for TypedDict "Movie" for this:
movie = {'name': 'Blade Runner', 'year': 1982, 'has_sequel': True} # type: Movie
I can understand that the second value can't simply stand in for the first, because the result of keys() or items() is different.
But if I am only interested in having those keys, not iterating or other stuff, what are my options (if any)?
There is a cool new feature called total=False, does it work for you?
Here are some docs http://mypy.readthedocs.io/en/latest/kinds_of_types.html#totality
This looks like a feature request for TypedDict.
@gvanrossum Actually I am quite sure total=False covers this, or am I missing something?
Hm, I see one problem: potentially, the list of optional keys can be very long (since one still needs to list them).
Sadly no -- total=False allows omitting keys, it doesn't allow extra keys.
# ...imports...
A = TypedDict('A', {'x': int}, total=False)
def f(a: A) -> None:
    print(a['x'])
b: A = {'x': 0, 'y': 0} # E: Extra key 'y' for TypedDict "A"
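The difference between omitting keys and adding keys can also be seen at runtime. This is a minimal sketch, assuming Python 3.9+ where TypedDict is available from typing and exposes __required_keys__ and __optional_keys__. The name A mirrors the example above; the variable names are only for illustration.

```python
from typing import TypedDict

# total=False makes every declared key optional.
# It does not open the type up to undeclared keys.
A = TypedDict('A', {'x': int}, total=False)

empty: A = {}          # fine for a type checker: 'x' may be omitted
with_x: A = {'x': 0}   # fine: 'x' is a declared key

# Runtime introspection of which keys are required vs. optional.
required = A.__required_keys__
optional = A.__optional_keys__
print(required, optional)
```

Nothing in this metadata describes undeclared keys such as 'y', which is why the literal above is still rejected.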
Yes, I think this is a valid feature request, as enumerating keys in the TypedDict could be painful.
This is already supported for function arguments:
from mypy_extensions import TypedDict
A = TypedDict('A', {'x': int})
B = TypedDict('B', {'x': int, 'y': str})
def f(x: A) -> None: ...
b: B = {'x': 1, 'y': 'foo'}
f(b) # Ok
The issue seems specific to creating a typed dict -- mypy makes sure that no extra keys are provided (as these could be accidental typos or obsolete key names, for example).
I don't think that changing the semantics of total=False is a good idea, since we'd lose a lot of type safety. One option would be to introduce a new flag. Let's call it allow_extra=True for now. Here is an attempt to define it:
If allow_extra is true for the type, allow arbitrary extra keys and values when creating a typed dict.
Allow reading arbitrary keys, as in x['whatever']. The value type would have to be Any unless the key is explicitly defined for the typed dict.
Allow setting arbitrary keys, as in x['whatever'] = <anything>.
These semantics seem kind of arbitrary to me. To seriously consider such a feature I'd like to see more concrete examples where the current behavior is not sufficient. If the structure of a typed dict object is not well specified, the current recommendation is to use Dict[str, Any] instead. TypedDict is pretty restricted by design.
If we had intersection types, a similar effect to my proposed allow_extra flag could perhaps be achieved through Intersection[MyTypedDict, Dict[str, Any]]. We don't have any concrete plans to introduce intersection types, however.
Thanks.
I think that I can work with the current behavior.
JukkaL's example is what I am trying to do. Something like:
from typing import Iterable
from mypy_extensions import TypedDict
NamedThing = TypedDict('NamedThing', {'name': str})
Movie = TypedDict('Movie', {'name': str, 'year': int})
Replicant = TypedDict('Replicant', {'name': str, 'model': str})
def slug(x: NamedThing) -> str:
    return x['name'].lower().replace(' ', '-')
blade_runner: Movie = {'name': 'Blade Runner', 'year': 1982}
roy: Replicant = {'name': 'Roy', 'model': 'Nexus-6'}
things: Iterable[NamedThing] = [blade_runner, roy]
for thing in things:
    print(slug(thing))
When trying Mypy I was directly assigning the values or using variables without annotations, like:
blade_runner: NamedThing = {'name': 'Blade Runner', 'year': 1982}
slug({'name': 'Blade Runner', 'year': 1982})
blade_runner = {'name': 'Blade Runner', 'year': 1982}
slug(blade_runner)
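For comparison, here are two ways to keep a type checker happy with the same data. This is a sketch, assuming Python 3.8+ where TypedDict is importable from typing. The names first and second are only for illustration. As far as I understand mypy's behavior, an unannotated dict literal is inferred as an ordinary dict rather than a TypedDict, which is why the last form above fails.

```python
from typing import TypedDict, cast

NamedThing = TypedDict('NamedThing', {'name': str})
Movie = TypedDict('Movie', {'name': str, 'year': int})

def slug(x: NamedThing) -> str:
    return x['name'].lower().replace(' ', '-')

# Option 1: give the literal its full type first, then rely on the
# structural compatibility between Movie and NamedThing.
blade_runner: Movie = {'name': 'Blade Runner', 'year': 1982}
first = slug(blade_runner)

# Option 2: cast() asks the checker to trust you.
# Nothing is validated at runtime, so typos in keys go unnoticed.
second = slug(cast(NamedThing, {'name': 'Blade Runner', 'year': 1982}))
```

Option 1 keeps the typo protection; option 2 trades it away for brevity.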
I'm closing this since it seems that the current behavior is mostly good enough, and the solution to the larger issue would be messy.
I would like this issue to be re-opened. I think extra_keys is important to have.
The current solution of Dict[str, Union[str, int, float, bool]] is significantly less expressive than what other languages offer.
TypeScript example:
// Simple example
interface ElasticsearchDocument {
  _id: string
  _type: string
  _index: string
  "@timestamp": number  // quoted, since @ is not valid in a bare property name
  [key: string]: string|number|boolean;
}
// Complex example: Nested dict
interface SubObject {
  [key: string]: string|number|boolean|SubObject
}
interface DynamoDBDocument {
  index_id: number
  secondary_index_id: number
  [key: string]: string|number|boolean|SubObject
}
The above suggestion of allow_extra still has a capability gap where I cannot define the type of extra items (either the key or value).
I don't have a good sense of the restrictions on syntax that mypy has to deal with, so this proposal might be totally unreasonable. But the syntax could be:
TypedDict("ElasticsearchDocument",
{"_id": str},
extra_keys={str: Union[str,int,float,bool]})
This says: "this is a dictionary with _id as a guaranteed key. There might be other keys; they have string type and their values match Union[str,int,float,bool]."
Theoretically you could spec something like this:
TypedDict("ComplexDict",
{1: str, "a": int},
extra_keys={int: bool, str: str})
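The extra_keys syntax above is only a proposal, so a type checker cannot enforce it. To make the intended semantics concrete, here is a rough runtime sketch of the same rule for string keys. The function matches and the constant SCALARS are hypothetical helpers written for this illustration, not part of typing, mypy, or mypy_extensions.

```python
from typing import Any, Dict, Mapping, Tuple, Type

def matches(doc: Mapping[Any, Any],
            required: Dict[str, Type[Any]],
            extra_value_types: Tuple[Type[Any], ...]) -> bool:
    """Return True if doc has every required key with the expected type,
    and every other key is a str whose value is one of extra_value_types."""
    for key, expected in required.items():
        if key not in doc or not isinstance(doc[key], expected):
            return False
    for key, value in doc.items():
        if key in required:
            continue  # already checked above
        if not isinstance(key, str) or not isinstance(value, extra_value_types):
            return False
    return True

SCALARS = (str, int, float, bool)
ok = matches({'_id': 'abc', 'views': 3, 'draft': False}, {'_id': str}, SCALARS)
missing_id = matches({'views': 3}, {'_id': str}, SCALARS)
bad_extra = matches({'_id': 'abc', 'tags': ['a', 'b']}, {'_id': str}, SCALARS)
```

A check like this only runs at runtime; the point of the proposal is to get the same guarantee statically.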
I would also like to see extra keys allowed. My use case is in a HTTP/REST client:
from typing import TypedDict  # mypy_extensions.TypedDict on older setups

class SomeApiClient:
    class RegisterRequest(TypedDict):
        ...
    class RegisterResponse(TypedDict):
        ...
    def register(self, payload: RegisterRequest) -> RegisterResponse:
        ...
        return resp_json
As for the payload: this is an API that already exists, and it accepts both fixed (defined) keys and arbitrary key-value pairs at the top level. This cannot currently be mapped. Nesting the dynamic data under an extra key or similar:
class RegisterRequest(TypedDict):
    static_field: str
    extra: dict
...cannot be done, because the API already exists and cannot be changed just like that.
As for the response: the server returns some irrelevant fields, which are really not worth mapping (but calling code might be interested in them). Or maybe the server adds a new field that the caller is interested in, before RegisterResponse is updated in the next version.
I would happily risk typoing (as explained in https://github.com/python/mypy/issues/6223#issuecomment-456376214) just to have extra keys (it would be optional feature, anyway).
This might be solvable using Protocols, Literals and __getitem__. One could have a Protocol class with a generic __getitem__ and many @overloads of said method, each using a Literal key and the corresponding return type. Whilst this would require many overloads, it could work.
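Here is roughly what that Protocol idea could look like. Treat it as a sketch under assumptions: it needs Python 3.8+ for Protocol and Literal, the names MovieLike, describe, raw and label are invented for illustration, and whether a given checker accepts a particular dict value against this protocol is worth verifying. The value is annotated as Dict[str, Any] here because a narrower value type would probably not match the str and int overloads.

```python
from typing import Any, Dict, Literal, Protocol, overload

class MovieLike(Protocol):
    # One overload per key we care about ...
    @overload
    def __getitem__(self, key: Literal['name']) -> str: ...
    @overload
    def __getitem__(self, key: Literal['year']) -> int: ...
    # ... plus a catch-all so any other string key is still allowed.
    @overload
    def __getitem__(self, key: str) -> Any: ...

def describe(movie: MovieLike) -> str:
    # The checker sees movie['name'] as str and movie['year'] as int.
    return f"{movie['name']} ({movie['year']})"

raw: Dict[str, Any] = {'name': 'Blade Runner', 'year': 1982, 'has_sequel': True}
label = describe(raw)
```

The trade-off is that the protocol describes how keys are read, not how the dict is built, so it does not catch a missing 'name' key when the dict is created.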
Can this issue be reopened?
@JukkaL As @alexjurkiewicz mentions, the advantage of allowing extra keys is that TypedDict then functions as a sort of poor person's interface: "_You need to provide at least these fields_", with the implication that extra keys would be ignored.
From a practical perspective: Python is a language that integrates many different systems, protocols, etc., and it's not infrequent for various protocols to have slightly different extensions. The point of standardization is to try to avoid that, but it seems inevitable. Since a lot of protocols communicate in JSON, having TypedDicts that expect several fields for sure but allow any other fields is useful. This is also the case when you have several versions of a protocol: you make the newer one a superset of the older one to be backwards compatible.
This is not a duplication of the total flag: The total flag is a global way to address the possibility of optional fields.
So you could have total=False and extra=True for when you allow certain fields to be optional (providing the type just as a guide to which fields can be affected or tuned) but also want to ignore any superset.
From a programming language theoretical perspective: The topic of typed extensible records is not new, but dates to Mitchell & Wand in the 70s. A recent paper about this can be found here:
You may be amused to know the paper begins this way:
Records and variants provide flexible ways to construct datatypes, but the restrictions imposed by practical type systems can prevent them from being used in flexible ways. These limitations are often the result of concerns about efficiency or type inference, or of the difficulty in providing accurate types for key operations.
It then goes on:
Unfortunately, practical languages are often less flexible in the operations that they provide to manipulate records and variants. For example, many languages, from C to Standard ML (SML), will only allow the programmer to select the l component, r.l, from a record r if the type of r is uniquely determined at compile-time.
@gvanrossum I suspect having the extra field would enable the kind of polymorphism that this paper encourages, which is actually inherent in Python's original "duck" typing, and precisely what made Python such an attractive and sane language to use for so many people.
A use case that I think I have for this is integrating with a legacy MongoDB API where a given record could have a bunch of key/value pairs, but I'm really only concerned with, say, 3-4 of them in the codebase I'm working with. I want to strongly type the 3-4 that I'm concerned with, and not worry about the other ones but rather just pass them around.
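Until something like extra keys exists, one pattern that fits this use case is to split each record into a typed core and an untyped remainder. This is a sketch only: UserCore, its three fields, and split_record are hypothetical names chosen for illustration, and it assumes Python 3.8+ for the class-based TypedDict syntax.

```python
from typing import Any, Dict, Mapping, Tuple, TypedDict

class UserCore(TypedDict):
    # Hypothetical fields standing in for the 3-4 keys the codebase uses.
    name: str
    email: str
    age: int

def split_record(record: Mapping[str, Any]) -> Tuple[UserCore, Dict[str, Any]]:
    """Separate the strongly typed keys from everything else in the record."""
    core: UserCore = {
        'name': record['name'],
        'email': record['email'],
        'age': record['age'],
    }
    rest = {k: v for k, v in record.items() if k not in core}
    return core, rest

record = {'name': 'Roy', 'email': 'roy@example.com', 'age': 4, 'legacy_flag': True}
core, rest = split_record(record)
merged = {**rest, **core}  # reassemble before writing the record back
```

The remainder travels along untouched, at the cost of carrying two objects instead of one.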