HashMap源码分析
HashMap源码分析
1. HashMap介绍
HashMap是一个Hash表,通过key-value来存储数据,并允许使用 null 值和 null 键。HashMap并不保证映射顺序,而是通过Hash算法将key-value保存到对应的索引位置。
还有一点就是HashMap不是线程安全的,但是可以通过Collections类的静态方法synchronizedMap变成线程安全的Map。HashMap之所以线程不安全是因为多个线程修改map时,可能会导致环状结构,形成死循环。
2. HashMap继承的接口
public class HashMap<K,V> extends AbstractMap<K,V>
implements Map<K,V>, Cloneable, Serializable 3. HashMap属性
/**
* The default initial capacity - MUST be a power of two.
* ---------------------------------------------------------
* 默认的初始容量,必须是2的次幂
*/
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
/**
* The maximum capacity, used if a higher value is implicitly specified
* by either of the constructors with arguments.
* MUST be a power of two <= 1<<30.
* -----------------------------------------
* 最大的容量值,指定的最大容量值也要小于等于2^29
*/
static final int MAXIMUM_CAPACITY = 1 << 30;
/**
* The load factor used when none specified in constructor.
* -------------------------------------------
* 默认的加载因子
*/
static final float DEFAULT_LOAD_FACTOR = 0.75f;
/**
* The bin count threshold for using a tree rather than list for a
* bin. Bins are converted to trees when adding an element to a
* bin with at least this many nodes. The value must be greater
* than 2 and should be at least 8 to mesh with assumptions in
* tree removal about conversion back to plain bins upon
* shrinkage.
* -------------------------------------------
* 树化链表的阈值,如果一个链表长度大于或等于8,扩容或者树化链表
*/
static final int TREEIFY_THRESHOLD = 8;
/**
* The bin count threshold for untreeifying a (split) bin during a
* resize operation. Should be less than TREEIFY_THRESHOLD, and at
* most 6 to mesh with shrinkage detection under removal.
* ------------------------------------------
* 取消树化的阈值
*/
static final int UNTREEIFY_THRESHOLD = 6;
/**
* The smallest table capacity for which bins may be treeified.
* (Otherwise the table is resized if too many nodes in a bin.)
* Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
* between resizing and treeification thresholds.
* -----------------------------------------------
* 当某一位置的链表长度到达8时,就会进行树化。
* 树化的过程 当table小于64,则进行扩容 ; 否则进行树化
* 后面会讲解到
*/
static final int MIN_TREEIFY_CAPACITY = 64;
/**
* The table, initialized on first use, and resized as
* necessary. When allocated, length is always a power of two.
* (We also tolerate length zero in some operations to allow
* bootstrapping mechanics that are currently not needed.)
* --------------------------------------------------------
* 存储数据的数组
*/
transient Node<K,V>[] table;
/**
* Holds cached entrySet(). Note that AbstractMap fields are used
* for keySet() and values().
* --------------------------------------------------------------
* keySet返回的结果集
*/
transient Set<Map.Entry<K,V>> entrySet;
/**
* The number of key-value mappings contained in this map.
* ----------------------------------------------------------------
* Hash表中key-value的个数
*/
transient int size;
/**
* The number of times this HashMap has been structurally modified
* Structural modifications are those that change the number of mappings in
* the HashMap or otherwise modify its internal structure (e.g.,
* rehash). This field is used to make iterators on Collection-views of
* the HashMap fail-fast. (See ConcurrentModificationException).
* ----------------------------------------------------------------
* 修改次数
*/
transient int modCount;
/**
* The next size value at which to resize (capacity * load factor).
* ---------------------------------------------------------
* 下一个要调整的大小值
* @serial
*/
int threshold;
/**
* The load factor for the hash table.
* ----------------------------------------------
* 装载因子
* @serial
*/
final float loadFactor;
4. 内部类
/**
* Basic hash bin node, used for most entries. (See below for
* TreeNode subclass, and in LinkedHashMap for its Entry subclass.)
*/
static class Node<K,V> implements Map.Entry<K,V> {
final int hash;
final K key;
V value;
Node<K,V> next;
Node(int hash, K key, V value, Node<K,V> next) {
this.hash = hash;
this.key = key;
this.value = value;
this.next = next;
}
public final K getKey() { return key; }
public final V getValue() { return value; }
public final String toString() { return key + "=" + value; }
public final int hashCode() {
return Objects.hashCode(key) ^ Objects.hashCode(value);
}
public final V setValue(V newValue) {
V oldValue = value;
value = newValue;
return oldValue;
}
//判断key-value是否相等
//判断相等首先要看是否是地址相同,然后判断equals判断key和value是否相同
public final boolean equals(Object o) {
if (o == this)
return true;
if (o instanceof Map.Entry) {
Map.Entry<?,?> e = (Map.Entry<?,?>)o;
if (Objects.equals(key, e.getKey()) &&
Objects.equals(value, e.getValue()))
return true;
}
return false;
}
} //返回key的集合
public Set<K> keySet() {
Set<K> ks = keySet;
if (ks == null) {
ks = new KeySet();
keySet = ks;
}
return ks;
}
final class KeySet extends AbstractSet<K> {
public final int size() { return size; }
public final void clear() { HashMap.this.clear(); }
public final Iterator<K> iterator() { return new KeyIterator(); }
public final boolean contains(Object o) { return containsKey(o); }
public final boolean remove(Object key) {
return removeNode(hash(key), key, null, false, true) != null;
}
public final Spliterator<K> spliterator() {
return new KeySpliterator<>(HashMap.this, 0, -1, 0, 0);
}
public final void forEach(Consumer<? super K> action) {
Node<K,V>[] tab;
if (action == null)
throw new NullPointerException();
if (size > 0 && (tab = table) != null) {
int mc = modCount;
for (int i = 0; i < tab.length; ++i) {
for (Node<K,V> e = tab[i]; e != null; e = e.next)
action.accept(e.key);
}
if (modCount != mc)
throw new ConcurrentModificationException();
}
}
} //返回value的集合
public Collection<V> values() {
Collection<V> vs = values;
if (vs == null) {
vs = new Values();
values = vs;
}
return vs;
}
final class Values extends AbstractCollection<V> {
public final int size() { return size; }
public final void clear() { HashMap.this.clear(); }
public final Iterator<V> iterator() { return new ValueIterator(); }
public final boolean contains(Object o) { return containsValue(o); }
public final Spliterator<V> spliterator() {
return new ValueSpliterator<>(HashMap.this, 0, -1, 0, 0);
}
public final void forEach(Consumer<? super V> action) {
Node<K,V>[] tab;
if (action == null)
throw new NullPointerException();
if (size > 0 && (tab = table) != null) {
int mc = modCount;
for (int i = 0; i < tab.length; ++i) {
for (Node<K,V> e = tab[i]; e != null; e = e.next)
action.accept(e.value);
}
if (modCount != mc)
throw new ConcurrentModificationException();
}
}
}
/**
* Entry for Tree bins. Extends LinkedHashMap.Entry (which in turn
* extends Node) so can be used as extension of either regular or
* linked node.
*/
static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
TreeNode<K,V> parent; // red-black tree links
TreeNode<K,V> left;
TreeNode<K,V> right;
TreeNode<K,V> prev; // needed to unlink next upon deletion
boolean red;
TreeNode(int hash, K key, V val, Node<K,V> next) {
super(hash, key, val, next);
}
//…………一堆方法
} 5. 构造方法
/*
* 从此方法可以看出,构造函数传进来{初始容量}和{装载因子},通过修改threshold来在resize()方法中
* 改变table数组的大小。
*/
public HashMap(int initialCapacity, float loadFactor) {
if (initialCapacity < 0)
throw new IllegalArgumentException("Illegal initial capacity: " +
initialCapacity);
if (initialCapacity > MAXIMUM_CAPACITY)
initialCapacity = MAXIMUM_CAPACITY;
if (loadFactor <= 0 || Float.isNaN(loadFactor))
throw new IllegalArgumentException("Illegal load factor: " +
loadFactor);
//装载因子
this.loadFactor = loadFactor;
//找到大于initialCapacity最小2的次幂数
this.threshold = tableSizeFor(initialCapacity);
} 6. 重要方法
1. resize
/**
* Initializes or doubles table size. If null, allocates in
* accord with initial capacity target held in field threshold.
* Otherwise, because we are using power-of-two expansion, the
* elements from each bin must either stay at same index, or move
* with a power of two offset in the new table.
*
* @return the table
*/
final Node<K,V>[] resize() {
Node<K,V>[] oldTab = table;
//oldCap为旧数组长度
int oldCap = (oldTab == null) ? 0 : oldTab.length;
//oldThr为旧下次要调整的大小
int oldThr = threshold;
int newCap, newThr = 0;
//主要是给newCap和newThr赋值
if (oldCap > 0) {
//如果数组超过最大值,则将
if (oldCap >= MAXIMUM_CAPACITY) {
threshold = Integer.MAX_VALUE;
return oldTab;
}
else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
oldCap >= DEFAULT_INITIAL_CAPACITY)
newThr = oldThr << 1; // double threshold
}
else if (oldThr > 0) // initial capacity was placed in threshold
newCap = oldThr;
else { // zero initial threshold signifies using defaults
newCap = DEFAULT_INITIAL_CAPACITY;
newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
}
if (newThr == 0) {
float ft = (float)newCap * loadFactor;
newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
(int)ft : Integer.MAX_VALUE);
}
threshold = newThr;
@SuppressWarnings({"rawtypes","unchecked"})
Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
table = newTab;
if (oldTab != null) {
for (int j = 0; j < oldCap; ++j) {
Node<K,V> e;
if ((e = oldTab[j]) != null) {
oldTab[j] = null;
//只有一个节点
if (e.next == null)
newTab[e.hash & (newCap - 1)] = e;
//红黑树
else if (e instanceof TreeNode)
((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
//链表
else { // preserve order
//将每个链分为两份
Node<K,V> loHead = null, loTail = null;
Node<K,V> hiHead = null, hiTail = null;
Node<K,V> next;
//循环,将一个链表分割成两个链表
do {
next = e.next;
if ((e.hash & oldCap) == 0) {
if (loTail == null)
loHead = e;
else
loTail.next = e;
loTail = e;
}
else {
if (hiTail == null)
hiHead = e;
else
hiTail.next = e;
hiTail = e;
}
} while ((e = next) != null);
//loHead链表还在原来位置
if (loTail != null) {
loTail.next = null;
newTab[j] = loHead;
}
//hiTail链表迁移到j+oldCap位置
if (hiTail != null) {
hiTail.next = null;
//非常灵性
//为什么说j+oldCap有灵性呢?
//因为计算机都是二进制存储,扩容之后的新的位置的bit位变成1即可
newTab[j + oldCap] = hiHead;
}
}
}
}
}
return newTab;
} 总结一下
resize方法是HashMap中table的初始化方法和扩容方法,扩容后的大小为threshold。然后分割红黑树,或者分割链表。最后返回一个新数组。
2. hash
计算key的hash值
static final int hash(Object key) {
int h;
return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
} 3. put
put操作是最最常规用的方法之一
put -> putVal -> 链表尾插入 || putTreeVal ->
/**
* Associates the specified value with the specified key in this map.
* If the map previously contained a mapping for the key, the old
* value is replaced.
* ---------------------------------------------------------------------
* 在map里将特殊的key和value联系起来,如果map中包含这个key,则将旧的value替换成新的value
*/
public V put(K key, V value) {
return putVal(hash(key), key, value, false, true);
} put方法调用的是putVal方法
/**
* Implements Map.put and related methods
*
* @param hash hash for key
* @param key the key
* @param value the value to put
* @param onlyIfAbsent if true, don't change existing value
* @param evict if false, the table is in creation mode.
* @return previous value, or null if none
*/
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
boolean evict) {
Node<K,V>[] tab; Node<K,V> p; int n, i;
//判断数组是否需要初始化
if ((tab = table) == null || (n = tab.length) == 0)
n = (tab = resize()).length;
//1.没有hash冲突
if ((p = tab[i = (n - 1) & hash]) == null)
tab[i] = newNode(hash, key, value, null);
//2.有hash冲突,使用拉链法
else {
Node<K,V> e; K k;
//p是一个链表的首元素(头节点)
//1.key已经存在,但是要根据onlyIfAbsent来决定是否更新
if (p.hash == hash &&
((k = p.key) == key || (key != null && key.equals(k))))
e = p;
//2. 红黑树
else if (p instanceof TreeNode)
e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
//3. 链表
else {
//binCount记录链表长度
for (int binCount = 0; ; ++binCount) {
//不存在,插入链表结尾
if ((e = p.next) == null) {
//新节点
p.next = newNode(hash, key, value, null);
//是否树化
if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
treeifyBin(tab, hash);
break;
}
//已存在
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
break;
p = e;
}
}
//存在待更新的值
if (e != null) { // existing mapping for key
V oldValue = e.value;
if (!onlyIfAbsent || oldValue == null)
e.value = value;
afterNodeAccess(e);
//返回替换的旧节点
return oldValue;
}
}
//修改次数+1
++modCount;
//判断是否扩容
if (++size > threshold)
resize();
//LinkedHashMap实现的方法,而HashMap中是空方法
afterNodeInsertion(evict);
return null;
}
再了解一下treeifyBin方法
/**
* Replaces all linked nodes in bin at index for given hash unless
* table is too small, in which case resizes instead.
* ------------------------------------------------------
* 树化方法
*/
final void treeifyBin(Node<K,V>[] tab, int hash) {
int n, index; Node<K,V> e;
//重点
//table长度小于64就会扩容
if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
resize();
//树化
else if ((e = tab[index = (n - 1) & hash]) != null) {
TreeNode<K,V> hd = null, tl = null;
do {
TreeNode<K,V> p = replacementTreeNode(e, null);
if (tl == null)
hd = p;
else {
p.prev = tl;
tl.next = p;
}
tl = p;
} while ((e = e.next) != null);
if ((tab[index] = hd) != null)
hd.treeify(tab);
}
} 总结一下putVal
putVal是HashMap的插入方法,首先table是否初始化。接着判断key值是否出现hash冲突,不冲突直接加入table中;冲突则采用拉链法,一种加入到红黑树,一种加入链表。
- 链表长度大于等于8时,会转成红黑树
- 转化为红黑树的必要条件是table数组长度大于等于64,否则不会转化为红黑树,而是进行扩容
4. get
public V get(Object key) {
Node<K,V> e;
return (e = getNode(hash(key), key)) == null ? null : e.value;
} get方法调用getNode方法
/**
* Implements Map.get and related methods
*
* @param hash hash for key
* @param key the key
* @return the node, or null if none
*/
final Node<K,V> getNode(int hash, Object key) {
Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
if ((tab = table) != null && (n = tab.length) > 0 &&
(first = tab[(n - 1) & hash]) != null) {
//判断首元素
if (first.hash == hash && // always check first node
((k = first.key) == key || (key != null && key.equals(k))))
return first;
if ((e = first.next) != null) {
//树取值
if (first instanceof TreeNode)
return ((TreeNode<K,V>)first).getTreeNode(hash, key);
//链表取值
do {
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
return e;
} while ((e = e.next) != null);
}
}
return null;
} 5. remove
remove为删除方法
/**
* Removes the mapping for the specified key from this map if present.
*/
public V remove(Object key) {
Node<K,V> e;
return (e = removeNode(hash(key), key, null, false, true)) == null ?
null : e.value;
} public boolean remove(Object key, Object value) {
return removeNode(hash(key), key, value, true, true) != null;
} removeNode方法
/**
* Implements Map.remove and related methods
*
* @param hash hash for key
* @param key the key
* @param value the value to match if matchValue, else ignored
* @param matchValue if true only remove if value is equal
* matchValue 如果为true equals(value) 则删除 ; 否则不关心value
* @param movable if false do not move other nodes while removing
* movable 删除后是否移动节点 true 移动 false 不移动
* @return the node, or null if none
*/
final Node<K,V> removeNode(int hash, Object key, Object value,
boolean matchValue, boolean movable) {
Node<K,V>[] tab; Node<K,V> p; int n, index;
if ((tab = table) != null && (n = tab.length) > 0 &&
(p = tab[index = (n - 1) & hash]) != null) {
Node<K,V> node = null, e; K k; V v;
//头节点
if (p.hash == hash &&
((k = p.key) == key || (key != null && key.equals(k))))
node = p;
//在链表或红黑树查找
else if ((e = p.next) != null) {
//红黑树
if (p instanceof TreeNode)
node = ((TreeNode<K,V>)p).getTreeNode(hash, key);
//链表
else {
do {
if (e.hash == hash &&
((k = e.key) == key ||
(key != null && key.equals(k)))) {
node = e;
break;
}
p = e;
} while ((e = e.next) != null);
}
}
//开始移除
//node为空 --- matchValue=false --- 不需要关心value的值
// --- matchValue=true --- key对应的v == value
if (node != null && (!matchValue || (v = node.value) == value ||
(value != null && value.equals(v)))) {
//红黑树移除
if (node instanceof TreeNode)
((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
//链表头节点移除
else if (node == p)
tab[index] = node.next;
//链中移除
else
p.next = node.next;
//修改次数
++modCount;
--size;
//LinkedHashMap实现
afterNodeRemoval(node);
return node;
}
}
return null;
}
6. contains
contains判断map中是否包含key或者value
containsKey
- 如果此映射包含指定键的映射,则返回
true。
/**
* Returns <tt>true</tt> if this map contains a mapping for the
* specified key.
*
* @param key The key whose presence in this map is to be tested
* @return <tt>true</tt> if this map contains a mapping for the specified
* key.
*/
public boolean containsKey(Object key) {
//前面说到getNode方法
return getNode(hash(key), key) != null;
} containsValue
- 如果此地图将一个或多个键映射到指定值,则返回
true。
/**
* Returns <tt>true</tt> if this map maps one or more keys to the
* specified value.
*
* @param value value whose presence in this map is to be tested
* @return <tt>true</tt> if this map maps one or more keys to the
* specified value
*/
public boolean containsValue(Object value) {
Node<K,V>[] tab; V v;
if ((tab = table) != null && size > 0) {
//两层循环判断value是否存在,就像二维数组查找
for (int i = 0; i < tab.length; ++i) {
for (Node<K,V> e = tab[i]; e != null; e = e.next) {
if ((v = e.value) == value ||
(value != null && value.equals(v)))
return true;
}
}
}
return false;
} 7. replace
/**
* 判断oldVaule和key对应的value是否一样,如果一样则替换,返回true,否则返回false
*/
public boolean replace(K key, V oldValue, V newValue) {
Node<K,V> e; V v;
if ((e = getNode(hash(key), key)) != null &&
((v = e.value) == oldValue || (v != null && v.equals(oldValue)))) {
e.value = newValue;
afterNodeAccess(e);
return true;
}
return false;
} /**
* 找到key,直接替换,将替换后的value返回
*/
public V replace(K key, V value) {
Node<K,V> e;
if ((e = getNode(hash(key), key)) != null) {
V oldValue = e.value;
e.value = value;
afterNodeAccess(e);
return oldValue;
}
return null;
} 7. 总结
仅是看了部分的HashMap的源码,从中学习到了很多知识
- 外部public方法调用内部private方法,防止外部直接调用内部逻辑方法
- ()内赋值再判断,减少代码量,但是可读性变差了,小白可能会崩溃的
- 内部类和外部类的完美应用,相互依靠,相得益彰
- 一个方法的设计要尽可能降低耦合度,提高方法的复用
- 方法中可以提供一些功能,让子类来实现
// Callbacks to allow LinkedHashMap post-actions
void afterNodeAccess(Node<K,V> p) { }
void afterNodeInsertion(boolean evict) { }
void afterNodeRemoval(Node<K,V> p) { } 8. 面试
1. HashMap插入数据的流程
2. HashMap初始化
- 默认table长度16,负载因子0.75。
- 链表长度大于8树化。
- 树节点数小于6,链表化。
3. Hash函数设计
key的hashCode是int型32位的值,然后让hashCode的高16位和低16位进行异或操作。这叫做扰动函数,是为了减少hash碰撞,尽量分散;另一方面每次扩容都要进行再hash,也是为了提高效率。
hash值的范围太大,所以需要对其取模,得到的余数作为访问数组的下标。
4. 为什么长度为2的整数次幂
index = hash & (length - 1),与操作使高位全为0,仅保留底为的值用来作为访问数组的下标。
但是只取后几位碰撞概率也很大,这时候扰动函数就体现出来了价值。高16位和低16位进行异或运算能够保证高位和底为的特征保留下来,也降低了随机性。
5. JDK 1.8 HashMap较1.7做了哪些优化
- 数组+链表 改成了 数组+链表+红黑树
- 链表插入方式从头插入变成了尾插入。1.7是直接插入到头部;1.8要遍历整个数组
- 1.7扩容是需要重新hash再插入,1.8采用了将链表分为索引不变、原数组索引+旧容量大小两个部分。
- 1.7是先判断是否需要扩容,然后再插入;1.8是先插入,在判断需不需要扩容
6. 为什么做这些优化
防止hash冲突,链表长度过长,将时间复杂度O(n)降到O(logn)
头插入会使链表发生反转,多线程情况下可能会产生环。
线程A插入B,线程B也插入,但是需要扩容,头插入会导致形成环状结构
7. 扩容的时候为什么不重新计算hash来定位位置
因为之前说过,扩容的大小为原数组的2倍,原数组的大小是2的次幂。所以扩容后对应的位置是index+oldCapacity。也就是掩码高位多出一个1。
8. HashMap是线程安全的吗?
并不是,1.7的时候会出现环状结构、数据丢失、数据覆盖的问题。1.8会有数据覆盖的问题,两个线程同时操作HashMap,put操作相同位置时,会出现数据覆盖。
9. 如何解决HashMap的线程安全问题
使用HashTable、ConcurrentHashMap、Collections.synchronizedMap来实现HashMap线程安全。
10. HashMap树化和去树化的阀值?
树化的阀值是8,去树化的阀值是6。8都是经过概率计算的;而6是为了防止树和链表一直发生转化。
11. HashMap内部节点是有序的吗?
非也,Node是根据hash计算随机插入的,是无序的。有序的是LinkedHashMap和TreeMap。
12. LinkedHashMap如何实现有序的?
内部维持着一个单链表,Node类不管继承了HashMap中Node属性,还增加了before、after用来标识前后节点。可以实现按顺序插入和访问。
13. TreeMap是如何实现有序的?
TreeMap按照key的自然顺序或者Comparable接口和Comparator接口,底层是红黑树实现的。

查看4道真题和解析