map.these() yields the keys in the map. This was initially based on Python's
dictionary interface.
A good amount of discussion about this starts with this comment.
In sum, Python's dict interface was designed that way so that there is a
symmetry between the different uses of the in keyword:
some_key in some_map # check if a key is in the map
for some_key in some_map: ... # iterate keys in the map
But Chapel doesn't have in (#5034). So, maybe we shouldn't follow this
precedent. Maybe map.these can iterate k-v pairs, and if we have in in the
future we can make it take k-v as the lhs argument, as well.
Personally, just looking at the dict interface, I am not a big fan of Python's
dictionary iterator yielding keys, and it is reasonable to think about making
Chapel's map.these() yield k-v pairs.
OTOH, I am reluctant because I am used to it from Python, and it may be
annoying for people coming from pythonland.
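For reference, here is a minimal sketch of the Python behavior described above. The dictionary name and its contents are made up for illustration; only the built-in dict operations are real.

```python
# Hypothetical example data; any dict behaves the same way.
inventory = {"apples": 3, "pears": 5}

# Membership test: `in` checks keys, not values.
has_apples = "apples" in inventory

# Plain iteration uses the same `in` keyword and yields keys only.
keys_seen = [key for key in inventory]

# Key-value pairs need an explicit call to items().
pairs_seen = [(key, value) for key, value in inventory.items()]
```

If map.these() yielded k-v pairs, a bare loop over a Chapel map would correspond to the items() case here rather than to the plain iteration case.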
One disadvantage of yielding both keys and values: in the case that we decide to have a no-reference map, we'd ostensibly have to yield both keys and values by value.
I think it is reasonable to yield keys by value and to expect that keys will generally be small value types suitable for copying around. However, I don't hold any such expectation for map values.
So if we decide (not saying we will) to prohibit yielding by reference, then there would be a cost to always yielding a value even if it isn't used.
However, if we continue to exist in a world where we can yield references to values (and const references to keys), I think it would be totally reasonable to have map.these() yield both keys and values.
The above uncertainties I've expressed make me think that the safest choice would be to have map.these() continue to yield only keys.
I think that yielding references is a separate problem from returning them because we historically have had the "no updating collection size while iterating" rule.
I personally prefer having map.these yield (key, value) tuples; however, I'm not sure if the language/compiler is up to the task of yielding modifiable values in this context. (Modifying the key in a loop over the map would definitely be a no-no, but modifying the value seems like something we should be able to support.)
@daviditen Any thoughts as to what map.these() should yield as the implementor of map?
@bradcray I thought I saw you give a thumbs up in the OP, do you have an opinion one way or the other?
I was really surprised by the current map.these() behavior, and I think it was my comment (linked in the OP above) that caused Engin to fork off this issue. (key,value) pairs is what I would've expected with no other information.
I have no objection to it yielding (key, value) tuples.
@ben-albrecht As a Python power-user, would you be comfortable or feel alarmed if map.these yielded key/value pairs instead of just keys? (Ah I just realized you thumbs-upped the OP, I take that to mean you're OK with the change.)
@lydia-duncan Ditto experience with Python - any preference?
I am OK with key,value pairs, and like that looping over keys or values requires an explicit iterator: map.keys() or map.values()
Seems reasonable to me!
A new thought just occurred to me, which is that even if we enabled returning tuples by const ref as in #15973, we'd unfortunately still have to return the entire tuple from map.these() by const ref. This is because we require keys to be immutable, and we'd have no way of specifying the constness of the individual elements of a returned referential tuple.
A bit of a shame, because it means if a user wants to mutate a value in a loop on map.these(), they'd have to grab a mutable reference:
for (k, v) in myMap {
ref v2 = myMap[k];
v2 = foo;
}
So I suppose as long as you don't plan to mutate values, map.these() still ends up being fairly idiomatic.
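As a loose analogy only, Python users hit a similar pattern when iterating over pairs: the loop variable cannot be used to update the stored value, so the write has to go back through the key. Python has no reference intents, so this sketch does not model Chapel's const ref tuple semantics, and the names in it are hypothetical.

```python
# Hypothetical example data.
counts = {"a": 1, "b": 2}

# Rebinding the loop variable only changes the local name, not the dict.
for key, value in counts.items():
    value = value * 10
unchanged = dict(counts)

# Writing back through the key updates the stored value. Only existing
# keys are assigned, so the dict's size does not change during iteration.
for key, value in counts.items():
    counts[key] = value * 10
```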
I wonder if we should make it possible to specify the return intent for tuple elements individually?
E.g. proc map.these(): (const ref keyType, ref valType)
Woah there. That's a very bold question you're asking...
> I wonder if we should make it possible to specify the return intent for tuple elements individually?
This is reminiscent of a question @vasslitvinov asked some time ago when we were wrestling with tuple semantics, but I don't recall whether we didn't like it or just didn't have the incentive to chase after it at that time. To me, it also seems a little entangled with the ongoing desire to have ref fields in objects #8481.
> To me, it also seems a little entangled with the ongoing desire to have ref fields in objects #8481.
@bradcray - is there some specific part of ref fields in objects that makes you concerned about this? For the tuple, it's more that it's an anonymous object, and the user wishes to make one field const ref and another ref.
For my part, I think the language could handle ref fields in objects at this point and getting that going is mainly an implementation effort.
I wouldn't say concerned, just that I think of our implementation of (heterogeneous) tuples as being record-like, which is why the two efforts seem related to me. I've been in favor of ref fields in general, just pointing out that implementing one may involve (or get us most of the way to) supporting the other.
Just to be clear - tuples already support ref fields but it's handled as special stuff in the compiler rather than a general feature.
We have notes from a deep dive on tuples on 2015-05-19. Reference components were a major topic. The deep dive was inconclusive.
Back then, we were still debating "tuple as a shortcut for multiple variables" vs. "tuple as a lightweight record" and how to pass an array by reference when returning it as a part of a tuple. Brad expressed unexcitement about the syntax proc foo((const in i, ref A)). The syntaxes return (1, ref A) and new MyRecord(1, ref A) were also proposed; I vaguely recall the majority not liking them because they were "too much".