I always wanted to implement the logic behind the famous data structure of Java, HashMap and here it comes. Though it’s not the complete implementation considering all optimization scenarios, but it will highlight the basic algorithm behind the HashMap.
So let’s start, this implementation will use my LinkedList implementation (Reason: for this implementation I thought to write everything from the scratch apart from primitive data types. Sounds weird? May be ;-) ). You may refer my earlier post on LinkedList, as I’m going to use it.
HashMap is a key-value pair data structure, where you retrieve value by passing key. To optimize the search of key, a concept of Bucket (an Array of Java Objects) has been introduced. This is used, so that if search hits this Bucket, corresponding value element is instantly retrieved, otherwise it iterates sequentially through the Linked List associated with each Bucket element. So you can say if all HashMap elements get fit into the Bucket, retrieving elements back, will be super fast and this is the Best Case. But that also means you are keeping everything in an array which is not good in-terms of heap memory storage limitation. So we need to consider both the things Bucket and LinkedList. This is really interesting, so let me know, if you guys want to discuss more on this.
To make sure that all Key_Index (Hashcode of Key Object, calculated by calling Object’s hashCode method) becomes less than Bucket_Size, we will do the mod operation i.e.
Key_Index = key.hashCode() % Bucket_Size
This returns integer value which is always less than Bucket_Size. If multiple key Objects return same Key_Index, then all matching keys will be stored in the associated LinkedList of the Bucket element. So if the bucket size is 10, following figure depicts the behavior of element entry into HashMap-
K0, K1 etc. are elements stored in the bucket,
where as V0, V1, etc. are elements stored in LinkedList of corresponding
bucket elements. So, 3 elements stored in K0’s LinkedList, 1 element
stored in K1’s LinkedList.
Structure of HashMap Node Object
import com.ds.link.LinkedList;
public class HashNode {
/**
* HashMap Key
*/
Object key;
/**
* HashMap Value
*/
Object value;
/**
* List of HashMap nodes which doesn't get fit into bucket
*/
LinkedList list;
/**
* by default the list will be null
*/
public void initList(){
if(list == null)
list = new LinkedList();
}
}
MyHashMap.java
First thing first, define the Bucket (an Object Array)
final int bucket_size = 10;
Object[] bucket;
/**
* @param bucket_size- custom bucket size
*/
public MyHashMap(int bucket_size){
this.bucket_size = bucket_size;
bucket = new Object[bucket_size];
}
/**
* default bucket_size
*/
public MyHashMap(){
bucket = new Object[bucket_size];
}
Add Element
/**
* Add key, value pair in the HashMap
* @param key
* @param value
*/
public void add(Object key, Object value){
int kCode = key.hashCode();
int index = kCode % bucket_size;
System.out.println("->"+ index);
if(bucket[index] == null){
System.out.println("First entry in Bucket: Key="+ key);
HashNode node = new HashNode();
node.key = key;
node.value = value;
bucket[index] = node;
}
else{
// value already exists, so add this element into LinkedList
HashNode node = (HashNode)bucket[index];
System.out.println("Add into List of: "+ node.key +" Key="+ key);
// initiate the list
node.initList();
HashNode newNode = new HashNode();
newNode.key = key;
newNode.value = value;
// add the new element into list
node.list.addToEnd(newNode);
}
}
Get Element
/**
* If the key is not present, it will return null
* @param key
* @return
*/
public Object get(Object key){
int index = key.hashCode() % bucket_size;
System.out.println("Searching for Bucket Index: "+ index);
if(index >= bucket.length)
new ArrayIndexOutOfBoundsException("Invalid Attempt: Index="+ index);
HashNode node = (HashNode)bucket[index];
if(node == null){
// At this position, there is no node in the bucket
return null;
}
if(node.key.hashCode() == key.hashCode()){
// element is found in the bucket
System.out.println("1st Level Match....");
return node.value;
}
// unable to find in the element in the bucket
// find in the associated list
LinkedList list = node.list;
if(list == null){
System.out.println("No associated list");
return null;
}
// pointer to the root node of the linkedlist
Node first = list.getFirst();
while(true){
if(first == null)
break;
HashNode hn = (HashNode)first.data;
if(key.hashCode() == hn.key.hashCode()){
System.out.println("2nd Level Match Found...");
return hn.value;
}
first = first.pointer;
}
// still unable to find the element
System.out.println("Failed to hit bulls eye!! No match found");
return null;
}
Share your comments, I'm listening. For more on my open source work, visit Git Hub.
Hi Prasanta,
ReplyDeleteNice effort to understand the working of a HashMap.
Just to add on to the above algorithm :
Java HashMap supports null keys, and null doesn't have a hash code
So to deal with it and prevent NPE, it just defaults to 0th index if key is null.
That means whenever key is null..bucket 0 is selected and the node is added to the corresponding list.
Similarly while fetching the value of null key , bucket 0 is searched for node whose key is null.
~Sachin
Very Nice Prasanta.
ReplyDeleteI have 2 observations.
bucket_size can not be final & in the add method
node.initList() should be done following a null check .
if(node.list == null){ node.initList() }
Wonderful, too good !
ReplyDeleteThanks, it was helpful, Could you add details on Resizing and Race conditions too ?
ReplyDeleteExcellent explanation,I am still confuse how it handles null key and why we require to add this feature
ReplyDeleteNice explanation. Keep sharing about collections and Threads. Thanks in advance
ReplyDeleteAwesome...!
ReplyDeleteBy
Vasan Rajendran
Good Explanation and thanks!
ReplyDeletebut i am confused in the following code of lines:
if(node.key.hashCode() == key.hashCode()){
// element is found in the bucket
System.out.println("1st Level Match....");
return node.value;
}
Here we are just matching the key hashcode and according to that we are returning value but If more that one keys returned the same hashcode then then what will be returned/
For example if our key was AB and hascode return its length that is bucket is 2 now suppose if we search with key CD then again hashcode is 2 and according to above line of code it will return the value of associated with key AB.
Please let me know
Nice One...
ReplyDeleteBut 1 Observation in 2nd Level Match in place of
if(key.hashCode() == hn.key.hashCode()){ It should be
if(key.equals(hn.key)){}
Please correct me if I am wrong
Thanks
nice explanation
ReplyDeleteyour above mentioned linkedList link is not working.. kindly check and let us know the exact URL for your linkedList post
ReplyDeleteI have changed the URL. Pls check-
ReplyDeletehttp://prasanta-paul.blogspot.in/2010/06/linkedlist-in-java.html
BR,
Prasanta