Writing It Up Myself: A Brief Analysis of the HashMap Source Code

boy00fly


The forum already has plenty of analyses of the HashMap source code, and they are good articles, but I still want to write down my own notes on it!

III. HashMap

Let's start with the class declaration of HashMap:

public class HashMap<K, V> extends AbstractMap<K, V> implements Map<K, V>, Cloneable, Serializable

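1. HashMap's data storage structure

HashMap stores its data in an array plus linked lists. Let's first look at a diagram to get an intuitive feel for it:

In the diagram, 1, 2, 4 and 15 are values stored in the HashMap. Which slot each one ends up in is determined by hashing its key and then computing an array index from that hash.

Someone will ask: "Can two keys produce the same index?" The answer is yes, that is quite likely to happen. The next question is: "If they collide, where does the value go?"

This is where the linked list comes in; the values 4 and 15 in the diagram are exactly this case. PS: this is covered in detail below.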
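To make the collision case concrete, here is a minimal sketch of how a hash value could be mapped to a slot of the array. It assumes a table length of 16 and simply masks the hash with `length - 1`. The class name `BucketIndexDemo` is hypothetical, and the extra bit-mixing that HashMap applies to `hashCode()` is deliberately left out, so the slots chosen by the real class may differ.

```java
// Hypothetical demo class; not part of the JDK.
class BucketIndexDemo {
    // Map a hash to a slot of a table whose length is a power of two.
    // For such lengths, (hash & (length - 1)) keeps only the low bits,
    // which always yields an index in the range [0, length).
    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16;
        // Integer.hashCode() is the int value itself.
        int a = indexFor(Integer.valueOf(4).hashCode(), length);
        int b = indexFor(Integer.valueOf(20).hashCode(), length);
        // Two different keys, one slot: this is the case the linked list handles.
        System.out.println(a + " " + b);
    }
}
```

Under these assumptions both keys map to slot 4, so the second entry has to be chained onto the first.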

2. Several important member variables

/**                                                                      
 * The default initial capacity - MUST be a power of two.                
 */                                                                      
static final int DEFAULT_INITIAL_CAPACITY = 16;                          
                                                                         
/**                                                                      
 * The maximum capacity, used if a higher value is implicitly specified  
 * by either of the constructors with arguments.                         
 * MUST be a power of two <= 1<<30.                                      
 */                                                                      
static final int MAXIMUM_CAPACITY = 1 << 30;                             
                                                                         
/**                                                                      
 * The load factor used when none specified in constructor.              
 */                                                                      
static final float DEFAULT_LOAD_FACTOR = 0.75f;                          
                                                                         
/**                                                                      
 * The table, resized as necessary. Length MUST Always be a power of two.
 */                                                                      
transient Entry[] table;                                                 
                                                                         
/**                                                                      
 * The number of key-value mappings contained in this map.               
 */                                                                      
transient int size;                                                      
                                                                         
/**                                                                      
 * The next size value at which to resize (capacity * load factor).      
 * @serial                                                               
 */                                                                      
int threshold;                                                           
                                                                         
/**                                                                      
 * The load factor for the hash table.                                   
 *                                                                       
 * @serial                                                               
 */                                                                      
final float loadFactor;

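DEFAULT_INITIAL_CAPACITY: this is not the number of key-value pairs a default HashMap can hold; it is the default length of the table array, and its value must be a power of two.

MAXIMUM_CAPACITY: the maximum length of the table array, 2 to the 30th power.

table: holds all of the key-value mappings!

Let's first look at a fragment of the Entry source code: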

static class Entry<K, V> implements Map.Entry<K, V> // class declaration

// the important fields
final K key;      
                  
V value;          
                  
Entry<K, V> next; 
                  
final int hash;

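Entry is a static nested class of HashMap. These fields should be self-explanatory; next is the following Entry in the same linked list.

For example, in the diagram above, the next of the Entry with value 15 points to the Entry with value 4, while the next of the Entry with value 1 is null because no Entry follows it in its list.

size: the number of key-value mappings currently stored in the HashMap. PS: this is not the length of the table array.

DEFAULT_LOAD_FACTOR: the default load factor, 0.75f.

threshold: the resize threshold. Once the HashMap's size reaches threshold, the table array is expanded, and it grows by 100%, i.e. its length becomes twice what it was.

Someone will now ask: "How is the value of threshold computed?" The source code makes this quite clear; here is a fragment from the constructor: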

// Find a power of 2 >= initialCapacity  
int capacity = 1;                        
while (capacity < initialCapacity)       
    capacity <<= 1;                      
                                         
this.loadFactor = loadFactor;            
threshold = (int)(capacity * loadFactor);

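Here initialCapacity is a constructor parameter meaning the initial capacity. As you can see, initialCapacity is not used directly: it is first rounded so that the initial length of the table array is a power of two and is not smaller than initialCapacity.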
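As a worked example of that rounding, the sketch below repeats the loop from the constructor fragment and computes the resulting threshold. The class and method names (`CapacityDemo`, `roundUpToPowerOfTwo`, `threshold`) are made up for illustration, and the sketch assumes the argument has already been clamped to MAXIMUM_CAPACITY as the constructor does.

```java
// Hypothetical demo class; the two methods mirror the constructor fragment.
class CapacityDemo {
    // Smallest power of two that is >= initialCapacity.
    // Assumes initialCapacity <= 1 << 30, otherwise the shift would overflow.
    static int roundUpToPowerOfTwo(int initialCapacity) {
        int capacity = 1;
        while (capacity < initialCapacity)
            capacity <<= 1;
        return capacity;
    }

    // The size at which the table is resized.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        int capacity = roundUpToPowerOfTwo(10);
        System.out.println(capacity);                   // 16
        System.out.println(threshold(capacity, 0.75f)); // 12
    }
}
```

So a request for an initial capacity of 10 yields a table of length 16, which is resized once the map holds 12 mappings.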

3. Constructor

/**                                                                                        
 * Constructs an empty <tt>HashMap</tt> with the specified initial                         
 * capacity and load factor.                                                               
 *                                                                                         
 * @param  initialCapacity the initial capacity                                            
 * @param  loadFactor      the load factor                                                 
 * @throws IllegalArgumentException if the initial capacity is negative                    
 *         or the load factor is nonpositive                                               
 */                                                                                        
public HashMap(int initialCapacity, float loadFactor)                                      
{                                                                                          
    if (initialCapacity < 0)                                                               
        throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)                                                
        initialCapacity = MAXIMUM_CAPACITY;                                                
    if (loadFactor <= 0 || Float.isNaN(loadFactor))                                        
        throw new IllegalArgumentException("Illegal load factor: " + loadFactor);          
                                                                                           
    // Find a power of 2 >= initialCapacity                                                
    int capacity = 1;                                                                      
    while (capacity < initialCapacity)                                                     
        capacity <<= 1;                                                                    
                                                                                           
    this.loadFactor = loadFactor;                                                          
    threshold = (int)(capacity * loadFactor);                                              
    table = new Entry[capacity];                                                           
    init();                                                                                
}

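The constructor above is the important one; the other constructors all delegate to it. Once you have understood what I described earlier, this constructor is quite easy to follow, so I won't repeat the explanation.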
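The argument checks at the top of the constructor can be observed from the outside. The following is a small sketch against the public `java.util.HashMap` API; the helper name `rejects` is my own.

```java
import java.util.HashMap;

// Hypothetical demo class exercising the two-argument constructor.
class ConstructorDemo {
    // Returns true if the constructor rejects the given arguments.
    static boolean rejects(int initialCapacity, float loadFactor) {
        try {
            new HashMap<String, String>(initialCapacity, loadFactor);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejects(-1, 0.75f));     // negative capacity
        System.out.println(rejects(16, 0f));        // non-positive load factor
        System.out.println(rejects(16, Float.NaN)); // NaN load factor
        System.out.println(rejects(10, 0.75f));     // valid arguments
    }
}
```

Note that an initial capacity of 0 is accepted; only negative values are rejected.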

4. Several important methods

put(K key, V value)

/**                                                                 
 * Associates the specified value with the specified key in this map.
 * If the map previously contained a mapping for the key, the old   
 * value is replaced.                                               
 *                                                                  
 * @param key key with which the specified value is to be associated
 * @param value value to be associated with the specified key       
 * @return the previous value associated with <tt>key</tt>, or      
 *         <tt>null</tt> if there was no mapping for <tt>key</tt>.  
 *         (A <tt>null</tt> return can also indicate that the map   
 *         previously associated <tt>null</tt> with <tt>key</tt>.)  
 */                                                                 
public V put(K key, V value)                                        
{                                                                   
    if (key == null)                                                
        return putForNullKey(value);                                
    int hash = hash(key.hashCode());                                
    int i = indexFor(hash, table.length);                           
    for (Entry<K, V> e = table[i]; e != null; e = e.next)           
    {                                                               
        Object k;                                                   
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
        {                                                           
            V oldValue = e.value;                                   
            e.value = value;                                        
            e.recordAccess(this);                                   
            return oldValue;                                        
        }                                                           
    }                                                               
                                                                    
    modCount++;                                                     
    addEntry(hash, key, value, i);                                  
    return null;                                                    
}

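This method is an important one and deserves a careful look. Let's go through it step by step:

1. When key == null, look at the source of putForNullKey(V value), together with the addEntry method it calls: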

/**                                                      
 * Offloaded version of put for null keys                
 */                                                      
private V putForNullKey(V value)                         
{                                                        
    for (Entry<K, V> e = table[0]; e != null; e = e.next)
    {                                                    
        if (e.key == null)                               
        {                                                
            V oldValue = e.value;                        
            e.value = value;                             
            e.recordAccess(this);                        
            return oldValue;                             
        }                                                
    }                                                    
    modCount++;                                          
    addEntry(0, null, value, 0);                         
    return null;                                         
}

/**                                                               
 * Adds a new entry with the specified key, value and hash code to
 * the specified bucket.  It is the responsibility of this        
 * method to resize the table if appropriate.                     
 *                                                                
 * Subclass overrides this to alter the behavior of put method.   
 */                                                               
void addEntry(int hash, K key, V value, int bucketIndex)          
{                                                                 
    Entry<K, V> e = table[bucketIndex];                           
    table[bucketIndex] = new Entry<K, V>(hash, key, value, e);    
    if (size++ >= threshold)                                      
        resize(2 * table.length);                                 
}

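This first walks the linked list at table[0] to see whether an Entry with a null key has already been stored. If one exists, its value is replaced; otherwise a new Entry is inserted at table[0].

PS: from this we can see that the Entry with a null key is always placed at table[0], and its hash is 0 as well.

2. When key != null, the hash is first computed from the key, and the table index i is then derived from the hash. The linked list at table[i] is traversed to see whether the key is already present. If it is, the value is replaced; if not, a new Entry is created and inserted at the head of the list, with its next pointing to the Entry that was at table[i] before the insertion.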
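The two steps can be condensed into a small, self-contained sketch. `MiniMap`, `Node`, `hashOf` and `headKey` are hypothetical names; the sketch uses `hashCode()` directly without HashMap's extra bit-mixing, keeps a fixed table of length 16, and never resizes, so it illustrates the put logic rather than reproducing the JDK class.

```java
// Hypothetical, simplified map used only to illustrate put(); not the JDK class.
class MiniMap<K, V> {
    // One node of a bucket's singly linked list.
    static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Fixed-size table: this sketch never resizes.
    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];
    int size;

    // Assumption: use hashCode() directly (0 for a null key).
    private static int hashOf(Object key) {
        return key == null ? 0 : key.hashCode();
    }

    // Walk the bucket's list looking for a node with an equal key.
    private Node<K, V> find(Object key) {
        int hash = hashOf(key);
        for (Node<K, V> e = table[hash & (table.length - 1)]; e != null; e = e.next) {
            if (e.hash == hash && (e.key == key || (key != null && key.equals(e.key))))
                return e;
        }
        return null;
    }

    V put(K key, V value) {
        Node<K, V> existing = find(key);
        if (existing != null) {            // key already present: replace the value
            V old = existing.value;
            existing.value = value;
            return old;
        }
        int hash = hashOf(key);
        int i = hash & (table.length - 1);
        table[i] = new Node<>(hash, key, value, table[i]); // new node becomes the head
        size++;
        return null;
    }

    V get(Object key) {
        Node<K, V> e = find(key);
        return e == null ? null : e.value;
    }

    // Key currently at the head of the given bucket, or null if the bucket is empty.
    K headKey(int bucket) {
        return table[bucket] == null ? null : table[bucket].key;
    }

    public static void main(String[] args) {
        MiniMap<Integer, String> map = new MiniMap<>();
        map.put(4, "four");
        map.put(20, "twenty");                  // 20 & 15 == 4, same bucket as key 4
        System.out.println(map.headKey(4));     // the most recently inserted key of bucket 4
        System.out.println(map.put(4, "FOUR")); // previous value stored for key 4
    }
}
```

Inserting at the head keeps put cheap for a new key: no second walk to the tail is needed after the search has failed.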

get(Object key)

/**                                                                                 
 * Returns the value to which the specified key is mapped,                          
 * or {@code null} if this map contains no mapping for the key.                     
 *                                                                                  
 * <p>More formally, if this map contains a mapping from a key                      
 * {@code k} to a value {@code v} such that {@code (key==null ? k==null :           
 * key.equals(k))}, then this method returns {@code v}; otherwise                   
 * it returns {@code null}.  (There can be at most one such mapping.)               
 *                                                                                  
 * <p>A return value of {@code null} does not <i>necessarily</i>                    
 * indicate that the map contains no mapping for the key; it's also                 
 * possible that the map explicitly maps the key to {@code null}.                   
 * The {@link #containsKey containsKey} operation may be used to                    
 * distinguish these two cases.                                                     
 *                                                                                  
 * @see #put(Object, Object)                                                        
 */                                                                                 
public V get(Object key)                                                            
{                                                                                   
    if (key == null)                                                                
        return getForNullKey();                                                     
    int hash = hash(key.hashCode());                                                
    for (Entry<K, V> e = table[indexFor(hash, table.length)]; e != null; e = e.next)
    {                                                                               
        Object k;                                                                   
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))                
            return e.value;                                                         
    }                                                                               
    return null;                                                                    
}

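The get method is also simple. A null key gets special handling: the linked list at table[0] is traversed, returning the value if a null-key Entry is found and null otherwise. Enough said.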
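The Javadoc above points out that a null result from get is ambiguous. A short usage sketch against the public `java.util.HashMap` API shows both the null key and the difference between "mapped to null" and "not mapped"; the class name `NullKeyDemo` and the sample data are made up.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class using only the public Map API.
class NullKeyDemo {
    static Map<String, String> sample() {
        Map<String, String> map = new HashMap<>();
        map.put(null, "value for null key"); // a null key is allowed
        map.put("present", null);            // so is a null value
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = sample();
        System.out.println(map.get(null));              // value for null key
        System.out.println(map.get("present"));         // null (mapped to null)
        System.out.println(map.get("absent"));          // null (no mapping)
        System.out.println(map.containsKey("present")); // true
        System.out.println(map.containsKey("absent"));  // false
    }
}
```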

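containsKey(Object key) and containsValue(Object value)

Just the idea here:

containsKey computes the table index for the key (a null key gets special handling, as you would expect) and then walks the linked list at table[index].

containsValue has no shortcut and needs two nested loops. Here is the source: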

public boolean containsValue(Object value)           
{                                                    
    if (value == null)                               
        return containsNullValue();                  
                                                     
    Entry[] tab = table;                             
    for (int i = 0; i < tab.length; i++)             
        for (Entry e = tab[i]; e != null; e = e.next)
            if (value.equals(e.value))               
                return true;                         
    return false;                                    
}

As you can see, it walks the array and, for each slot, walks the linked list.
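From the caller's side, the behaviour looks like this. The sketch uses only the public `java.util.HashMap` API; `ContainsValueDemo` and its sample data are hypothetical. Values are compared with equals, and a null value is handled by a separate path, as the `value == null` branch in the source shows.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class using only the public Map API.
class ContainsValueDemo {
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<>();
        map.put("one", 1);
        map.put("two", 2);
        map.put("none", null);
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = sample();
        System.out.println(map.containsValue(2));    // true
        System.out.println(map.containsValue(3));    // false
        System.out.println(map.containsValue(null)); // true, "none" maps to null
    }
}
```

Because every slot and every chained entry may have to be visited, the cost of containsValue grows with both the table length and the number of mappings, unlike containsKey, which only inspects one bucket.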

remove(Object key)

Since remove simply delegates to removeEntryForKey, let's look at the source of that method:

/**                                                                                  
 * Removes and returns the entry associated with the specified key                   
 * in the HashMap.  Returns null if the HashMap contains no mapping                  
 * for this key.                                                                     
 */                                                                                  
final Entry<K, V> removeEntryForKey(Object key)                                      
{                                                                                    
    int hash = (key == null) ? 0 : hash(key.hashCode());                             
    int i = indexFor(hash, table.length);                                            
    Entry<K, V> prev = table[i];                                                     
    Entry<K, V> e = prev;                                                            
                                                                                     
    while (e != null)                                                                
    {                                                                                
        Entry<K, V> next = e.next;                                                   
        Object k;                                                                    
        if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))
        {                                                                            
            modCount++;                                                              
            size--;                                                                  
            if (prev == e)                                                           
                table[i] = next;                                                     
            else                                                                     
                prev.next = next;                                                    
            e.recordRemoval(this);                                                   
            return e;                                                                
        }                                                                            
        prev = e;                                                                    
        e = next;                                                                    
    }                                                                                
                                                                                     
    return e;                                                                        
}

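Again, just the idea:

The table index for the key is computed, which identifies the linked list. The list is traversed to find the Entry for the key, that Entry is removed, and the list is joined back together. For the details of the algorithm, read the source directly; I won't go over them here.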
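The "remove and join the list back together" step is the classic unlinking of a node from a singly linked list. The sketch below isolates it, following the same prev/e/next pattern as removeEntryForKey but on a bare list of int keys; `UnlinkDemo`, `Node`, `remove` and `render` are hypothetical names.

```java
// Hypothetical demo class: unlinking a node from a singly linked list.
class UnlinkDemo {
    static final class Node {
        final int key;
        Node next;

        Node(int key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Removes the first node whose key matches and returns the (possibly new) head.
    static Node remove(Node head, int key) {
        Node prev = head;
        Node e = head;
        while (e != null) {
            Node next = e.next;
            if (e.key == key) {
                if (prev == e)
                    head = next;      // the match is the head: the list now starts at next
                else
                    prev.next = next; // bypass e by linking its neighbours
                return head;
            }
            prev = e;
            e = next;
        }
        return head;                  // key not found: list unchanged
    }

    // Renders the list as "a -> b -> c" for display.
    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0)
                sb.append(" -> ");
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node chain = new Node(15, new Node(4, new Node(1, null)));
        chain = remove(chain, 4);
        System.out.println(render(chain)); // 15 -> 1
    }
}
```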

entrySet()

/**                                                                 
 * Returns a {@link Set} view of the mappings contained in this map.
 * The set is backed by the map, so changes to the map are          
 * reflected in the set, and vice-versa.  If the map is modified    
 * while an iteration over the set is in progress (except through   
 * the iterator's own <tt>remove</tt> operation, or through the     
 * <tt>setValue</tt> operation on a map entry returned by the       
 * iterator) the results of the iteration are undefined.  The set   
 * supports element removal, which removes the corresponding        
 * mapping from the map, via the <tt>Iterator.remove</tt>,          
 * <tt>Set.remove</tt>, <tt>removeAll</tt>, <tt>retainAll</tt> and  
 * <tt>clear</tt> operations.  It does not support the              
 * <tt>add</tt> or <tt>addAll</tt> operations.                      
 *                                                                  
 * @return a set view of the mappings contained in this map         
 */                                                                 
public Set<Map.Entry<K, V>> entrySet()                              
{                                                                   
    return entrySet0();                                             
}

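Why bring up this method? There are many forum threads about which way of iterating over a map is the most efficient. Once you understand HashMap's internal data structure, the question becomes simple: iterating over the table array is all it takes, and that is very efficient. entrySet returns exactly that, just presented as a Set.

PS: we won't go into the details of this method here; if you are interested, read and analyze the source yourself.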
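For comparison, here are the two common iteration styles side by side, written against the public `java.util.Map` API. `EntrySetDemo` and its method names are hypothetical. The entrySet loop reads key and value from the entry it is already visiting, while the keySet loop performs an extra get, and therefore an extra hash and bucket walk, for every key.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class comparing two ways of iterating over a map.
class EntrySetDemo {
    // One pass over the entries; the value comes straight from the entry.
    static int sumViaEntrySet(Map<String, Integer> map) {
        int sum = 0;
        for (Map.Entry<String, Integer> e : map.entrySet())
            sum += e.getValue();
        return sum;
    }

    // One pass over the keys plus one lookup per key.
    static int sumViaKeySet(Map<String, Integer> map) {
        int sum = 0;
        for (String key : map.keySet())
            sum += map.get(key);
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        System.out.println(sumViaEntrySet(map)); // 6
        System.out.println(sumViaKeySet(map));   // 6
    }
}
```

Both loops produce the same result; the difference is only in how much work each step does.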

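That's all on HashMap for now. Of course there are many questions we have not discussed, such as the hash computation, which I consider a separate topic. To understand HashMap it can be set aside for the moment; hashing is a large subject in its own right, so I won't go into it here. Interested readers can search for more.

PS: I have uploaded a jar in the attachments that can simulate operations on various data structures, and it is very good. I recommend downloading it. Run it with the command java -jar visualization.jar. It includes a simulation of the hashing process.

See also the article "Java Map 集合类简介" (an introduction to the Java Map collection classes).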