java.util.HashMap is a basic data structure for
Java Virtual Machine.
It provides functionality to map value by key.
The most important interface implemented is
It is a generic class, so key and value can be of any type.
HashMap will calculate the
hashCode of the object to operate with it.
What interesting is that this implementation provides
constant-time performance for the basic operations (
In this article I will overview the internal implementation and answer the question, How does it provide
There are 3 main constructors for HashMap:
- default one
public HashMap(). It doesn’t have any arguments. By default, it assumes
public HashMap(int initialCapacity)- in this constructor the default
public HashMap(int initialCapacity, float loadFactor)- in this constructor, both parameters can be customized.
HashMap stores elements in so-called
buckets and the number of buckets is called
Every time you insert an element,
integer needs to be calculated.
Integer can store
4,294,967,295 values. For sure we cannot have that amount of buckets.
In real live Java starts with
16 buckets. It needs to map calculated hashCode to a number of buckets.
It uses a simple algorithm:
targetBuckerNumber = hashCode % buckerNumbers;
So what will happen when we have a conflict of hashCodes? It will simply make a list of keys. It can be visualized by the picture below:
Load factor and capacity
When does HashMap know when to increase the number of buckets? The algorithm is pretty easy:
Let’s say our initial capacity of Hashmap is 16. The default load factor is 0.75. Maximum number of elements for HashMap = 16 * 0.75 = 12. So the HashMap can store 12 elements, when 13-th elements occurred, HashMap will increase capacity by a factor of 2, so it will become 32.
As we can see, even with an increasing number of buckets, there is still possible that one bucket will contain a lot of keys. Iteration over elements create means that time complexity is O(n). But Hashmap guarantees that access time would be constant.
For this to work correctly, equal keys must have the same hash, however, different keys can have the same hash. If two different keys have the same hash, the two values belonging to them will be stored in the same bucket. Inside a bucket, values are stored in a list and retrieved by looping over all elements. The cost of this is O(n).
Java 8 came with a solution of JEP 180 the data structure in which the values inside one bucket are stored is changed from a list to a balanced tree if a bucket contains 8 or more values, and it’s changed back to a list if, at some point, only 6 values are left in the bucket. This improves the performance to be O(log n).
In this article, we discovered the internal implementation of HashMap and found an optimization mechanism that keeps this data structure extremely efficient.