Rust: Tracking issue for HashMap::raw_entry

Created on 22 Nov 2018 · 12Comments · Source: rust-lang/rust

Added in #54043.

As of 6ecad338381cc3b8d56e2df22e5971a598eddd6c / 2019-01-09, this feature covers:

impl<K, V, S> HashMap<K, V, S>
    where K: Eq + Hash,
          S: BuildHasher
{
    pub fn raw_entry(&self) -> RawEntryBuilder<K, V, S> {…}
    pub fn raw_entry_mut(&mut self) -> RawEntryBuilderMut<K, V, S> {…}
}

pub struct RawEntryBuilder<'a, K: 'a, V: 'a, S: 'a> {…} // Methods return Option<(&'a K, &'a V)>
pub struct RawEntryBuilderMut<'a, K: 'a, V: 'a, S: 'a> {…} // Methods return RawEntryMut<'a, K, V, S>
pub enum RawEntryMut<'a, K: 'a, V: 'a, S: 'a> {
    Occupied(RawOccupiedEntryMut<'a, K, V>),
    Vacant(RawVacantEntryMut<'a, K, V, S>),
}
pub struct RawOccupiedEntryMut<'a, K: 'a, V: 'a> {…}
pub struct RawVacantEntryMut<'a, K: 'a, V: 'a, S: 'a> {…}

… as well as Debug impls for each 5 new types, and their inherent methods.

A-collections B-unstable C-tracking-issue I-nominated Libs-Tracked T-libs

Source

sfackler

👍5

Most helpful comment

This is a really great API, it's also what keeps crates (hashlink for example) using hashbrown instead of the stdlib hash map -- since hashbrown exposes this.

What could be next steps here towards stabilization?

thomcc on 11 Apr 2020

👍8

All 12 comments

What is the motivation for having separate from_hash and search_bucket methods? It seems that the only difference is whether the hash value is checked before calling is_match. However if the table does not store full hashes (i.e. hashbrown) then there is no difference between these methods.

Could we consider merging these methods into a single one? Or is there some use case where the difference in behavior is useful?

Amanieu on 26 Nov 2018

I am also extremely confused by this distinction, as my original designs didn't include them (I think?) and the documentation that was written is very unclear.

Gankra on 27 Nov 2018

cc @fintelia

Amanieu on 27 Nov 2018

The reason I added search_bucket was because I wanted to be able to delete a random element from a HashMap in O(1) time, without storing an extra copy of all the keys. Basically, instead of doing something like this:

let key = map.iter().nth(rand() % map.len()).0.clone();
map.remove(&key);

I wanted to just be able to pick a random "bucket" and then get an entry/raw entry to the first element in it if any:

loop {
    if let Occupied(o) = map.raw_entry_mut().search_bucket(rand(), || true) {
        o.remove();
        break;
    }
}

(the probabilities aren't uniform in the second version, but close enough for my purposes)

fintelia on 28 Nov 2018

I continue to not want to support the "random deletion" usecase in std's HashMap. You really, really, really, should be using a linked hashmap or otherwise ordered map for that.

Gankra on 28 Nov 2018

👍2

I have removed this method in the hashbrown PR (#56241). Your code snippet for random deletion won't work with hashbrown anyways since it always checks the hash as part of the search process.

Amanieu on 9 Dec 2018

I can avoid unnecessary clones inherent to the original entry API which is nice. But unless I'm mistaken, the current raw_entry API seems to hash the keys twice in this simple use case:

#![feature(hash_raw_entry)]

use std::collections::HashMap;

let mut map = HashMap::new();

map.raw_entry_mut()
   .from_key("poneyland")
   .or_insert("poneyland", 3);

Currently I use the following function to hash once and automatically provide an owned key if necessary (somewhat similar to what was discussed in https://github.com/rust-lang/rfcs/pull/1769):

use std::borrow::Borrow;
use std::collections::hash_map::RawEntryMut;
use std::hash::{BuildHasher, Hash, Hasher};

fn get_mut_or_insert_with<'a, K, V, Q, F>(
    map: &'a mut HashMap<K, V>,
    key: &Q,
    default: F,
) -> &'a mut V
where
    K: Eq + Hash + Borrow<Q>,
    Q: Eq + Hash + ToOwned<Owned = K>,
    F: FnOnce() -> V,
{
    let mut hasher = map.hasher().build_hasher();
    key.hash(&mut hasher);
    let hash = hasher.finish();

    match map.raw_entry_mut().from_key_hashed_nocheck(hash, key) {
        RawEntryMut::Occupied(entry) => entry.into_mut(),
        RawEntryMut::Vacant(entry) => {
            entry
                .insert_hashed_nocheck(hash, key.to_owned(), default())
                .1
        }
    }
}

Given k1 and k2 with the same type K such that hash(k1) != hash(k2), is there a use-case for calling RawEntryBuilderMut::from_key_hashed_nocheck with hash(k1), &k1 and then inserting with RawVacantEntry::or_insert using k2 ?

If there isn't, why not saving the hash in RawVacantEntryMut and using it inside RawVacantEntryMut::insert ? It would even be possible to assert in debug builds that the owned key has indeed the same hash as the borrowed key used to lookup the entry.

gdzx on 1 Mar 2019

👍4

I'm not yet very familiar with this API, but what @gdouezangrard suggested seems like a great idea to me. What even happens currently if the two hashes don't match, is the key-value pair then inserted into the wrong bucket? It's not clear to me from (quickly) reading the source code.

timvermeulen on 13 Apr 2019

I submitted https://github.com/rust-lang/hashbrown/pull/54 to support using a K that doesn't implement Hash via the raw entry API. See https://github.com/rust-lang/hashbrown/issues/44 for the original motivation. Now that hashbrown is merged into std, could we expose this functionality on the std::collections::hash_map types as well?

If so, I'd be happy to submit a PR!

sujayakar on 26 Apr 2019

This is a really great API, it's also what keeps crates (hashlink for example) using hashbrown instead of the stdlib hash map -- since hashbrown exposes this.

What could be next steps here towards stabilization?

thomcc on 11 Apr 2020

👍8

Just gonna add another ping here -- what's blocking this right now?

sanbox-irl on 7 Nov 2020

I see a few things that need to be resolved:

We need proper documentation for the API functions, including examples.
hashbrown has a few extensions to RawEntry, such as https://docs.rs/hashbrown/0.9.1/hashbrown/hash_map/enum.RawEntryMut.html#method.and_replace_entry_with. We should decide whether to port these to std as well.
The name feels a bit wierd since it isn't clear what "raw" means. But I don't have any better ideas.

I would recommend prototyping in the hashbrown crate first, which can then be ported back in the the std HashMap.

Amanieu on 12 Nov 2020

Was this page helpful?

0 / 5 - 0 ratings

Related issues

Unhelpful diagnostic when using proc macros 2.0 without #![feature(use_extern_macros)]

lambda-fairy · 3Comments

Comments for macro expansions

Robbepop · 3Comments

Cannot impl Drop for a generic type with a Higher-Rank Trait Bound

dwrensha · 3Comments

Internal compiler error: cannot relate bound region (likely caused by `conservative_impl_trait`)

jmegaffin · 3Comments

#![allow(unknown_lints)] does not work if there is a unknow lint at the same level since nightly-2017-08-11-x86_64-unknown-linux-gnu

overvenus · 3Comments