C++ Unordered Map Under the Hood

The std::unordered_map in C++ standard library is a really powerful data structure offering insertion, deletion, and lookup in O(1) amortized time. std::unordered_map is implemented as a hashmap or hashtable which are the same thing. This article is going to describe how the C++ standard library implements the unordered_map, how it achieves those time complexities, and when to use it.

Member Variables

class unordered_map{
private:
list<pair<int,int>> *buckets;
int bucket_size;
int total_elements;
float max_load_factor;
}
https://en.wikipedia.org/wiki/Hash_table#Separate_chaining

Functions

int hash(int key) {
int hashed_key = strongHashingFunction(key)
return hashed_key % bucket_size;
}
list<int>::iterator find(int key){
for(auto& i : buckets[hash(key)]) {
if(i->first == key){
return i;
}
}
return buckets[hashed_key].end();
}
void insert(int key, int val){
auto it = find(key);
if(it != buckets[hash(key)].end()){
it->second = val;
return;
}
buckets[hash(key)].push_back({key,val});
total_elements++;
rehashIfNeeded();
}
void delete(int key){
auto it = find(key)
if(it != buckets[hash(key)].end()){
buckets[hash(key)].erase(it);
total_elements--;
}
}
void rehashIfNeeded(){
if(total_elements/bucket_size <= max_load_factor){
return;
}
bucket_size *= 2;
auto new_buckets = new list<pair<int,int>>[bucket_size];
for(int i = 0; i < bucket_size/2; i++){
for(auto& kv_itr: buckets[i]){
new_buckets[hash(kv_itr->first)].push_back(*kv_itr);
}
}
delete[] buckets;
buckets = new_buckets;
}

Complexity Analysis

find hashes in O(1), then traverses the bucket looking for the key. Assuming that we have a good low factor and a hash functions, the number of collisions or elements in this linked list is constant. In fact, on average, the number of elements in each bucket (collisions) should be less than the max_load_factor. Thus, find is O(1) on average.

Using an unordered_map

You should use an unordered_map when you don’t care about keeping a sorted order of the keys. I’d say, you’d use an unordered_map over an (ordered) map in C++ unless you have a reason for wanting to preserve order. map comes with a big complexity hit, so your usually better off with the unordered_map. For the full reference of std::unordered_map visit https://en.cppreference.com/w/cpp/container/unordered_map

C/C++ | Computer Systems | Low Latency | Distributed Systems | Computer Networks | Operating Systems