We currently use dict[k1,k2] as a shorthand for dict[(k1,k2)], because it was deemed at the time that there was no other useful meaning for this syntax. After having done a lot of programming with multi-level dictionaries, however, I've come to feel that nested indexing might be a better use for this. dict[k1,k2] as shorthand for dict[k1][k2] is not in itself terribly useful, but it would dovetail nicely with haskey(dict, k1, k2), which would be equivalent to haskey(dict, k1) && haskey(dict[k1], k2), except more efficient. There are several other areas where APIs could be extended to make programming with multilevel dictionaries more convenient.
If we decide to go in this direction, the minimum requirement for 1.0 would be to deprecate the existing dict[k1,k2] definition for dictionaries and other associative collections.
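To make the efficiency point concrete, here is a minimal sketch of what a two-level haskey could look like. The name nested_haskey is hypothetical (it is not a Base function and deliberately does not overload haskey), and the sketch assumes that nothing is never stored as a value in the outer dictionary. It looks up k1 once with get, whereas haskey(dict, k1) && haskey(dict[k1], k2) looks it up twice.

```julia
# Hypothetical helper, not part of Base: a two-level membership test.
# Assumption: `nothing` is never stored as a value in the outer dict,
# so it can serve as the "missing" sentinel.
function nested_haskey(d::AbstractDict, k1, k2)
    inner = get(d, k1, nothing)        # single lookup of k1
    inner === nothing && return false  # k1 is absent
    return haskey(inner, k2)           # lookup of k2 in the inner dict
end

d = Dict("a" => Dict("b" => 1))
nested_haskey(d, "a", "b")  # k1 and k2 both present
nested_haskey(d, "a", "c")  # k2 missing
nested_haskey(d, "z", "b")  # k1 missing
```

A production version would probably use a private sentinel object instead of nothing, so that dictionaries which legitimately contain nothing still behave correctly.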
How about just getting rid of dict[k1, k2] and stopping there? The tuple meaning is much more useful for folks who are using a Dict as a SparseMatrix, so I don't think there's going to be one "best shortcut."
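For readers who have not seen the tuple meaning in action, this is roughly the pattern being described: a Dict keyed by (row, column) tuples, indexed with the comma shorthand. The variable names and values are illustrative only.

```julia
# A Dict keyed by (row, column) tuples, used as an ad hoc sparse matrix.
entries = Dict{Tuple{Int,Int},Float64}()

entries[1, 2] = 3.5   # shorthand for entries[(1, 2)] = 3.5
entries[4, 4] = -1.0

stored  = entries[(1, 2)]             # same entry, explicit tuple key
present = haskey(entries, (1, 2))     # haskey takes the tuple itself
default = get(entries, (2, 1), 0.0)   # absent entries need an explicit default
```

Note that only indexing and assignment accept the comma form; haskey and get still take the tuple key explicitly.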
Does this proposal extend to 3+-level dicts? That is, do you anticipate supporting dict[k1, k2, ..., kn]?
The sparse matrix case was the original use case, but the fact is that dicts are kind of bad sparse matrices and it's easy enough to wrap a dict in a type that makes a much better sparse matrix, which you'd almost immediately want to do if that's how you're using them. But yes, the conservative thing to do here is to just delete the relevant methods and leave dict[k1,k2...] as no method errors so that we can decide which use is better.
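As a rough illustration of the "wrap a dict in a type" idea, here is one possible sketch. The type name DictMatrix and its fields are made up for this example; it only assumes the standard AbstractArray interface (size, getindex, setindex!). Unstored entries read as zero, which is the main thing a bare Dict does not give you.

```julia
# Hypothetical wrapper type, for illustration only.
struct DictMatrix{T} <: AbstractMatrix{T}
    data::Dict{Tuple{Int,Int},T}   # stored (nonzero) entries
    dims::Tuple{Int,Int}           # fixed matrix dimensions
end

# Convenience constructor for an empty m-by-n matrix.
DictMatrix{T}(m::Integer, n::Integer) where {T} =
    DictMatrix{T}(Dict{Tuple{Int,Int},T}(), (Int(m), Int(n)))

Base.size(A::DictMatrix) = A.dims

# Entries that were never stored read as zero.
function Base.getindex(A::DictMatrix{T}, i::Int, j::Int) where {T}
    @boundscheck checkbounds(A, i, j)
    return get(A.data, (i, j), zero(T))
end

function Base.setindex!(A::DictMatrix, v, i::Int, j::Int)
    @boundscheck checkbounds(A, i, j)
    A.data[(i, j)] = v
    return A
end

A = DictMatrix{Float64}(3, 3)
A[1, 2] = 5.0
```

Because the wrapper is an AbstractMatrix, generic functions such as size and sum work on it, and A[i, j] no longer depends on any tuple shorthand for Dict.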
@sbromberger, yes, it would generalize to n levels, including, I suppose, dict[] === dict.
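The n-level generalization, including the zero-key case, can be sketched with a small recursive helper. The name nested_get is hypothetical and is used here instead of overloading getindex, so the sketch does not change how dict[k1, k2] behaves.

```julia
# Hypothetical helper: index through n nested levels.
nested_get(d) = d                                  # zero keys: the dict itself
nested_get(d, k, ks...) = nested_get(d[k], ks...)  # peel off one level per key

tree = Dict(:a => Dict(:b => Dict(:c => 42)))

leaf    = nested_get(tree, :a, :b, :c)  # same as tree[:a][:b][:c]
subtree = nested_get(tree, :a)          # same as tree[:a]
whole   = nested_get(tree)              # the zero-key case
```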
Continuing my recent trend of linking to things other programming languages do, this reminds me a bit of Mathematica's Dataset query sublanguage – Datasets are essentially nested dictionaries, and they use exactly the syntax under discussion here to query them.
http://reference.wolfram.com/language/ref/Dataset.html
data["a", "b"] desugars to Query["a", "b"][data], which would recursively go through the nested structure and assemble a result.
Queries have flexibility similar to Julia's normal multidimensional arrays, and also have the interesting feature that while some indices are interpreted "on the way down", others are interpreted "on the way up" – your query filters downwards hierarchically through the nested structure, and can then have aggregation functions applied at the various levels on the way up.
I agree with @timholy - perhaps getting rid of this entirely is a good option. Wrappers can let the appropriate data structure appear.
Otherwise, people will wonder why we don't have this (auto nested indexing) for arrays, tuples, named tuples, etc...
(BTW if anyone wants to play with the behaviour mentioned in the OP to see if you like it, you can try my WIP package Tabulars.jl.)
I kind of like the current behavior, and I've found it especially useful for novice users. Yes, I could point them to a better data structure, but they are overwhelmed already, and in my teaching this was a pattern that folks liked and picked up quickly.
foldl(getindex, dict, (k1, k2, ...))
:)
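The positional form above matches the foldl signature from before Julia 1.0. On Julia 1.0 and later the starting value is passed with the init keyword, so the same one-liner would look something like this (the dictionary contents are made up for the example):

```julia
# Nested lookup via a left fold over the keys, using the Julia 1.x
# keyword form: foldl(op, itr; init).
config = Dict("x" => Dict("y" => Dict("z" => 1)))

value = foldl(getindex, ("x", "y", "z"); init = config)  # config["x"]["y"]["z"]
same  = foldl(getindex, (); init = config)               # no keys: returns config
```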
Eh, let's not do this. Discussed on triage.