BhrikutiSoft

How does Hash Map work internally?

This is a widely asked the question how HashMap works internally. In this document, I have explained how this HashMap works internally. Firstly its explains the entryset than the entry object array that Table of Entry objects. Thereafter it discusses the Hashing function. Very important to the put () method to keep Entry objects into the table .later how to get() works. HashCode calculation and index calculation for a key are discussed as well.Please go through it it's easy to understand the basic working function of hashMap.

Entry Class -

HashMap stores the data in the form of key-value pairs. Each key-value pair is stored in an object of Entry<K, V> class

. Entry<K, V> class is the static inner class of HashMap which is defined like below.

static class Entry<K,V> implements Map.Entry<K,V>

{

final K key;

V value;

Entry<K,V> next;

int hash;

//Some methods are defined here

}

As you see, this inner class has four fields. key, value, next and hash.

key : It stores the key of an element and its final.

value : It holds the value of an element.

next : It holds the pointer to next key-value pair. This attribute makes the key-value pairs stored as a linked list.

hash : It holds the hashcode of the key.

Array Table Of Entry Objects

These Entry objects are stored in an array called table[]. This array is initially of size 16. It is defined like below.

/**

* The table, resized as necessary. Length MUST Always be a power of two.

transient Entry<K,V>[] table;

To summarise the whole HashMap structure, each key-value pair is stored in an object of Entry<K, V> class. This class has an attribute called next which holds the pointer to next key-value pair. This makes the key-value pairs stored as a linked list. All these Entry<K, V> objects are stored in an array called table[]. The below image best describes the HashMap structure.

The above image roughly shows how the HashMap stores its elements. Internally it uses an array of Entry<K, V> class called table[] to store the key-value pairs. But how HashMap allocates slot in table[] array to each of its key-value pair is very interesting. It doesn’t inserts the objects as you put them into HashMap i.e first element at index 0, second element at index 1 and so on. Instead it uses the hashcode of the key to decide the index for a particular key-value pair. It is called Hashing.

What Is Hashing?

The whole HashMap data structure is based on the principle of Hashing. Hashing is nothing but the function or algorithm or method which when applied on any object/variable returns an unique integer value representing that object/variable. This unique integer value is called hash code. Hash function or simply hash said to be the best if it returns the same hash code each time it is called on the same object. Two objects can have same hash code.

Whenever you insert new key-value pair using put() method, HashMap blindly doesn’t allocate slot in the table[] array. Instead it calls hash function on the key. HashMap has its own hash function to calculate the hash code of the key. This function is implemented so that it overcomes poorly implemented hashCode() methods. Below is implementation code of hash().

/**

* Retrieve object hash code and applies a supplemental hash function to the

* result hash, which defends against poor quality hash functions. This is

* critical because HashMap uses power-of-two length hash tables, that

* otherwise encounter collisions for hashCodes that do not differ

* in lower bits. Note: Null keys always map to hash 0, thus index 0.

final int hash(Object k) {

int h = 0;

if (useAltHashing) {

if (k instanceof String) {

return sun.misc.Hashing.stringHash32((String) k);

}

h = hashSeed;

}

h ^= k.hashCode();

// This function ensures that hashCodes that differ only by

// constant multiples at each bit position have a bounded

// number of collisions (approximately 8 at default load factor).

h ^= (h >>> 20) ^ (h >>> 12);

return h ^ (h >>> 7) ^ (h >>> 4);

}

After calculating the hash code of the key, it calls indexFor() method by passing the hash code of the key and length of the table[] array. This method returns the index in the table[] array for that particular key-value pair.

Put() Method of HashMap

public V put(K key, V value) {

if (key == null)

return putForNullKey(value);

int hash = hash(key);

int i = indexFor(hash, table.length);

for (Entry<K,V> e = table[i]; e != null; e = e.next) {

Object k;

if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {

V oldValue = e.value;

e.value = value;

e.recordAccess(this);

return oldValue;

}

modCount++;

addEntry(hash, key, value, i);

return null;

}

Let’s see how this code works step by step.

Step 1 : First checks whether the key is null or not. If the key is null, it calls putForNullKey() method. table[0] is always reserved for null key. Because, hash code of null is 0.

Step 2 : If the key is not null, then it calculates the hash code of the key by calling hash() method.

Step 3 : Calls indexFor() method by passing the hash code calculated in step 2 and length of the table[] array. This method returns index in table[] array for the specified key-value pair.

Step 4 : After getting the index, it checks all keys present in the linked list at that index ( or bucket). If the key is already present in the linked list, it replaces the old value with new value.

Step 5 : If the key is not present in the linked list, it appends the specified key-value pair at the end of the linked list.

How get() method Works?

Let’s see how get() method has implemented.

public V get(Object key) {

if (key == null)

return getForNullKey();

int hash = hash(key.hashCode());

for (Entry<K , V> e = table[indexFor(hash, table.length)]; e != null; e = e.next) {

Object k;

if (e.hash == hash && ((k = e.key) == key || key.equals(k)))

return e.value;

}

return null;

}

Step 1 : First checks whether specified key is null or not. If the key is null, it calls getForNullKey() method.

Step 2 : If the key is not null, hash code of the specified key is calculated.

Step 3 : indexFor() method is used to find out the index of the specified key in the table[] array.

Step 4 : After getting index, it will iterate though linked list at that position and checks for the key using equals() method. If the key is found, it returns the value associated with it. otherwise returns null.

public class java.util.HashMap<K, V> extends java.util.AbstractMap<K, V> implements java.util.Map<K, V>, java.lang.Cloneable, java.io.Serializable {

static final int DEFAULT_INITIAL_CAPACITY;

static final int MAXIMUM_CAPACITY;

static final float DEFAULT_LOAD_FACTOR;

static final int TREEIFY_THRESHOLD;

static final int UNTREEIFY_THRESHOLD;

static final int MIN_TREEIFY_CAPACITY;

transient java.util.HashMap$Node<K, V>[] table;

transient java.util.Set<java.util.Map$Entry<K, V>> entrySet;

transient int size;

transient int modCount;

int threshold;

final float loadFactor;

static final int hash(java.lang.Object);

static java.lang.Class<?> comparableClassFor(java.lang.Object);

static int compareComparables(java.lang.Class<?>, java.lang.Object, java.lang.Object);

static final int tableSizeFor(int);

public java.util.HashMap(int, float);

public java.util.HashMap(int);

public java.util.HashMap();

public java.util.HashMap(java.util.Map<? extends K, ? extends V>);

final void putMapEntries(java.util.Map<? extends K, ? extends V>, boolean);

public int size();

public boolean isEmpty();

public V get(java.lang.Object);

final java.util.HashMap$Node<K, V> getNode(int, java.lang.Object);

public boolean containsKey(java.lang.Object);

public V put(K, V);

final V putVal(int, K, V, boolean, boolean);

final java.util.HashMap$Node<K, V>[] resize();

final void treeifyBin(java.util.HashMap$Node<K, V>[], int);

public void putAll(java.util.Map<? extends K, ? extends V>);

public V remove(java.lang.Object);

final java.util.HashMap$Node<K, V> removeNode(int, java.lang.Object, java.lang.Object, boolean, boolean);

public void clear();

public boolean containsValue(java.lang.Object);

public java.util.Set<K> keySet();

public java.util.Collection<V> values();

public java.util.Set<java.util.Map$Entry<K, V>> entrySet();

public V getOrDefault(java.lang.Object, V);

public V putIfAbsent(K, V);

public boolean remove(java.lang.Object, java.lang.Object);

public boolean replace(K, V, V);

public V replace(K, V);

public V computeIfAbsent(K, java.util.function.Function<? super K, ? extends V>);

public V computeIfPresent(K, java.util.function.BiFunction<? super K, ? super V, ? extends V>);

public V compute(K, java.util.function.BiFunction<? super K, ? super V, ? extends V>);

public V merge(K, V, java.util.function.BiFunction<? super V, ? super V, ? extends V>);

public void forEach(java.util.function.BiConsumer<? super K, ? super V>);

public void replaceAll(java.util.function.BiFunction<? super K, ? super V, ? extends V>);

public java.lang.Object clone();

final float loadFactor();

final int capacity();

java.util.HashMap$Node<K, V> newNode(int, K, V, java.util.HashMap$Node<K, V>);

java.util.HashMap$Node<K, V> replacementNode(java.util.HashMap$Node<K, V>, java.util.HashMap$Node<K, V>);

java.util.HashMap$TreeNode<K, V> newTreeNode(int, K, V, java.util.HashMap$Node<K, V>);

java.util.HashMap$TreeNode<K, V> replacementTreeNode(java.util.HashMap$Node<K, V>, java.util.HashMap$Node<K, V>);

void reinitialize();

void afterNodeAccess(java.util.HashMap$Node<K, V>);

void afterNodeInsertion(boolean);

void afterNodeRemoval(java.util.HashMap$Node<K, V>);

void internalWriteEntries(java.io.ObjectOutputStream) throws java.io.IOException;

}

BhrikutiSoft

How HashMap Works Internally

No comments:

Popular Posts

Translate

Pages

Comments

Mukesh Jha

Followers

Copyright

Popular Posts

Recent Post

My Posts