By merging RFC 1194 (set recovery), we have acknowledged that the values of keys "matter". That is, it's reasonable to have an equal key but still want to know the details of the stored key.
That RFC added fn get(&T) -> Option<&T>, take(&T) -> Option<T>, and replace(T) -> Option<T>.
However, what if I have an entry-like situation?
Today, this is the best we can do:
fn get_or_insert(set: &mut HashSet<Key>, key: Key) -> &Key {
    let dupe = key.clone();
    if !set.contains(&key) {
        set.insert(key);
    }
    set.get(&dupe).unwrap()
}
Not only do we incur double-lookup (triple-lookup in the insertion case!), we also incur an unconditional Clone _even though we already had a by-value key_!
Optimally, we could write
fn get_or_insert(set: &mut HashSet<Key>, key: Key) -> &Key {
    set.entry(key).into_ref()
}
What's the entry API for sets? Well, a heck of a lot simpler. The entry API on maps is all about deferred value handling, and that doesn't make sense for sets.
- Vacant::insert and Occupied::insert don't make sense because we already have the key
- Occupied::get_mut and into_mut don't make sense because we don't acknowledge key mutation
- Occupied::get and into_ref (to mirror into_mut), and remove are the only ones that make sense
- replace() to explicitly overwrite the old key... or something..?
So basically it would be something like entry(K) -> WasVacant(Entry) | WasOccupied(Entry). Critically, you get the same interface no matter what state the world was in, because there's nothing to do in the Vacant case but insert what was already given.
Supporting this would probably mean expanding the Entry API to "care about keys".
I haven't thought about the full implications here, and I don't have the bandwidth to write a full RFC at the moment.
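To make the shape above concrete, here is a rough sketch. Everything in it is an assumption for illustration: VecSet is a toy set backed by a sorted Vec (so it needs Ord rather than Hash), and the names SetEntry, Entry, and into_entry are made up. It is not how HashSet would implement this, only a demonstration that one lookup can serve both the vacant and the occupied case.

```rust
/// Toy set backed by a sorted Vec. It stands in for HashSet purely to show the API shape.
pub struct VecSet<K> {
    keys: Vec<K>,
}

/// Handle to a key that is guaranteed to be in the set.
pub struct Entry<'a, K> {
    keys: &'a mut Vec<K>,
    index: usize,
}

/// Records the state of the set before `entry` ran; both arms carry the same handle.
pub enum SetEntry<'a, K> {
    WasVacant(Entry<'a, K>),
    WasOccupied(Entry<'a, K>),
}

impl<K: Ord> VecSet<K> {
    pub fn new() -> Self {
        VecSet { keys: Vec::new() }
    }

    /// Single lookup: insert `key` if absent, otherwise keep the stored key and drop `key`.
    pub fn entry(&mut self, key: K) -> SetEntry<'_, K> {
        match self.keys.binary_search(&key) {
            Ok(index) => SetEntry::WasOccupied(Entry { keys: &mut self.keys, index }),
            Err(index) => {
                self.keys.insert(index, key);
                SetEntry::WasVacant(Entry { keys: &mut self.keys, index })
            }
        }
    }
}

impl<'a, K> Entry<'a, K> {
    /// Borrow the stored key for as long as the entry is borrowed.
    pub fn get(&self) -> &K {
        &self.keys[self.index]
    }

    /// Consume the entry and borrow the stored key for the set's borrow lifetime.
    pub fn into_ref(self) -> &'a K {
        let Entry { keys, index } = self;
        &keys[index]
    }

    /// Take the stored key back out of the set.
    pub fn remove(self) -> K {
        self.keys.remove(self.index)
    }
}

impl<'a, K> SetEntry<'a, K> {
    /// Forget whether the key was new and keep only the handle.
    pub fn into_entry(self) -> Entry<'a, K> {
        match self {
            SetEntry::WasVacant(entry) | SetEntry::WasOccupied(entry) => entry,
        }
    }

    pub fn into_ref(self) -> &'a K {
        self.into_entry().into_ref()
    }
}

/// The "optimal" helper from above, written against the toy set.
pub fn get_or_insert<K: Ord>(set: &mut VecSet<K>, key: K) -> &K {
    set.entry(key).into_ref()
}
```

Note that in the occupied case this sketch silently drops the key that was passed in; whether it should be handed back or used by a replace() is exactly the open question above.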
:+1: Needed this today.
+1
+1 and thanks apasel422 for linking my PR and pointing me to this RFC! ;)
Also needed this today, specifically:
let mut set: HashSet<String> = HashSet::new();
let ... = set.entry("the_key").or_insert_with(|| String::from("the_key"));
I had a discussion about this on Reddit today, and assumed that because replace is a thing, insert was meant not to replace an existing key. The current implementation, AFAICT, doesn't replace the key, as expected. However, given the "best we can do" scenario @Gankro wrote, I'm now unsure about this. Is key replacement in insert deliberately left unspecified? Or is there something else I am missing that makes the "best we can do" code behave differently from the following:
fn get_or_insert(set: &mut HashSet<Key>, key: Key) -> &Key {
    let dupe = key.clone();
    set.insert(key);
    set.get(&dupe).unwrap()
}
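For anyone wanting to check this locally, here is a small experiment, assuming the behaviour described in the std docs for HashSet::insert (the stored value is not replaced and the argument is dropped) and HashSet::replace. The Key type, its tag field, and the two helper functions are made up for the test; equality and hashing deliberately ignore tag so that two "equal" keys can still be told apart.

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical key: equality and hashing look at `id` only, so `tag` tells equal keys apart.
#[derive(Debug, Clone)]
struct Key {
    id: u32,
    tag: &'static str,
}

impl PartialEq for Key {
    fn eq(&self, other: &Self) -> bool {
        self.id == other.id
    }
}

impl Eq for Key {}

impl Hash for Key {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.id.hash(state);
    }
}

/// Insert two equal keys and report which one the set ends up holding.
fn stored_tag_after_insert() -> &'static str {
    let mut set = HashSet::new();
    set.insert(Key { id: 1, tag: "first" });
    // The second insert reports `false` because an equal key is already present.
    let newly_inserted = set.insert(Key { id: 1, tag: "second" });
    assert!(!newly_inserted);
    set.get(&Key { id: 1, tag: "probe" }).unwrap().tag
}

/// Same experiment, but overwriting with `replace`, which hands back the old key.
fn stored_tag_after_replace() -> &'static str {
    let mut set = HashSet::new();
    set.insert(Key { id: 1, tag: "first" });
    let old = set.replace(Key { id: 1, tag: "second" });
    assert_eq!(old.map(|key| key.tag), Some("first"));
    set.get(&Key { id: 1, tag: "probe" }).unwrap().tag
}
```

If the docs hold, the first helper returns the tag of the originally stored key and the second returns the tag of the replacement, which would mean the contains check in the "best we can do" version is not needed to preserve the old key.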
+1
I needed this today. It's a shame Rust doesn't have this. Please add it to sets.
+1, this would allow a safe zero-copy implementation of my makeuniq.rs script.
For string interning, this is a very useful feature, and I ran into the lack of it today.
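As an illustration of the interning case, here is roughly what it looks like without an entry API. The Interner type and its intern method are hypothetical names, and storing Rc<str> is just one possible design. On a miss the string is hashed and probed twice, once by get and once by insert.

```rust
use std::collections::HashSet;
use std::rc::Rc;

/// Hypothetical interner; the name and shape are illustrative only.
#[derive(Default)]
struct Interner {
    strings: HashSet<Rc<str>>,
}

impl Interner {
    /// Return the shared copy of `s`, allocating one only if it is not stored yet.
    fn intern(&mut self, s: &str) -> Rc<str> {
        // First lookup: is an equal string already stored?
        if let Some(existing) = self.strings.get(s) {
            return Rc::clone(existing);
        }
        // Miss: allocate, then insert, which hashes and probes a second time.
        let fresh: Rc<str> = Rc::from(s);
        self.strings.insert(Rc::clone(&fresh));
        fresh
    }
}
```

With a set entry API along the lines proposed above, the miss path could presumably reuse the slot found by the first lookup instead of searching again.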
Would definitely like to see this! I have a use case where even if the keys don't "matter", it's still useful to "insert a value and get a reference to either the existing value or inserted value". I'm working on an iterator adapter that filters out duplicates, and without an entry API, there's either an unnecessary lookup or an unnecessary clone:
use std::collections::HashSet;
use std::hash::Hash;

struct Dedupe<I: Iterator>
where I::Item: Eq + Hash + Clone {
    iter: I,
    seen: HashSet<I::Item>
}
impl<I: Iterator> Iterator for Dedupe<I>
where I::Item: Eq + Hash + Clone {
    type Item = I::Item;

    fn next(&mut self) -> Option<Self::Item> {
        loop {
            let item = self.iter.next()?;
            // Alternatively, do a contains() followed by insert()
            if self.seen.insert(item.clone()) {
                break Some(item);
            }
        }
    }
}
With the entry API, you could do:
fn next(&mut self) -> Option<Self::Item> {
    loop {
        if let WasVacant(item) = self.seen.entry(self.iter.next()?) {
            break Some(item.clone()); // Clone only on a cache miss
        }
    }
}
Essentially, there's a class of use case where you want to check if a T is present, insert it if not, then continue working with it as an &T without having to duplicate the lookup. This use case exists even if the "matteringness" of a particular key vs an equal key doesn't exist.