Skip to content

ConcurrentHashMap

Pslydhh edited this page Nov 18, 2017 · 15 revisions

The ConcurrentHashMap in Java1.8 which present algorithms for shrinking and expanding allowing relatively efficient concurrency. Let's explore it!


    public V put(K key, V value) {
        return putVal(key, value, false);
    }

The third argument is false.


    final V putVal(K key, V value, boolean onlyIfAbsent) {
        if (key == null || value == null) throw new NullPointerException();
        int hash = spread(key.hashCode());
        int binCount = 0;
        for (Node[] tab = table;;) {
            Node f; int n, i, fh;
            if (tab == null || (n = tab.length) == 0)
                tab = initTable();
            else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
                if (casTabAt(tab, i, null,
                             new Node(hash, key, value, null)))
                    break;                   // no lock when adding to empty bin
            }
            else if ((fh = f.hash) == MOVED)
                tab = helpTransfer(tab, f);
            else {
                V oldVal = null;
                synchronized (f) {
                    if (tabAt(tab, i) == f) {
                        if (fh >= 0) {
                            binCount = 1;
                            for (Node e = f;; ++binCount) {
                                K ek;
                                if (e.hash == hash &&
                                    ((ek = e.key) == key ||
                                     (ek != null && key.equals(ek)))) {
                                    oldVal = e.val;
                                    if (!onlyIfAbsent)
                                        e.val = value;
                                    break;
                                }
                                Node pred = e;
                                if ((e = e.next) == null) {
                                    pred.next = new Node(hash, key,
                                                              value, null);
                                    break;
                                }
                            }
                        }
                        else if (f instanceof TreeBin) {
                            Node p;
                            binCount = 2;
                            if ((p = ((TreeBin)f).putTreeVal(hash, key,
                                                           value)) != null) {
                                oldVal = p.val;
                                if (!onlyIfAbsent)
                                    p.val = value;
                            }
                        }
                    }
                }
                if (binCount != 0) {
                    if (binCount >= TREEIFY_THRESHOLD)
                        treeifyBin(tab, i);
                    if (oldVal != null)
                        return oldVal;
                    break;
                }
            }
        }
        addCount(1L, binCount);
        return null;
    }

As the name implies, if onlyIfAbsent is true, then the operation only works if the value is missing. at first, if key == null or value == null, this operation is over(throw new NullPointerException()). then, the key and value must be nonNull, so get it's hashCode, we see:


    int hash = spread(key.hashCode());
    static final int spread(int h) {
        return (h ^ (h >>> 16)) & HASH_BITS;
    }
    static final int HASH_BITS = 0x7fffffff; 

I think the purpose is to get a hash that can make the distribution evenly, 31 bits.


int binCount = 0;

This variable is used to count the number of elements required for the iteration to complete the operation. The next step is a large for loop!

 
        for (Node[] tab = table;;) {
            Node f; int n, i, fh;
            if (tab == null || (n = tab.length) == 0)
                tab = initTable();
            else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
                if (casTabAt(tab, i, null,
                             new Node(hash, key, value, null)))
                    break;                   // no lock when adding to empty bin
            }
            else if ((fh = f.hash) == MOVED)
                tab = helpTransfer(tab, f);
            else {
                V oldVal = null;
                synchronized (f) {
                    if (tabAt(tab, i) == f) {
                        if (fh >= 0) {
                            binCount = 1;
                            for (Node e = f;; ++binCount) {
                                K ek;
                                if (e.hash == hash &&
                                    ((ek = e.key) == key ||
                                     (ek != null && key.equals(ek)))) {
                                    oldVal = e.val;
                                    if (!onlyIfAbsent)
                                        e.val = value;
                                    break;
                                }
                                Node pred = e;
                                if ((e = e.next) == null) {
                                    pred.next = new Node(hash, key,
                                                              value, null);
                                    break;
                                }
                            }
                        }
                        else if (f instanceof TreeBin) {
                            Node p;
                            binCount = 2;
                            if ((p = ((TreeBin)f).putTreeVal(hash, key,
                                                           value)) != null) {
                                oldVal = p.val;
                                if (!onlyIfAbsent)
                                    p.val = value;
                            }
                        }
                    }
                }
                if (binCount != 0) {
                    if (binCount >= TREEIFY_THRESHOLD)
                        treeifyBin(tab, i);
                    if (oldVal != null)
                        return oldVal;
                    break;
                }
            }
        }

Let's take a closer look at it .


      Node f; int n, i, fh;
            if (tab == null || (n = tab.length) == 0)
                tab = initTable();

first, if tab is not properly initilized. You must first initialize it.

Let's see:

    private final Node[] initTable() {
        Node[] tab; int sc;
        while ((tab = table) == null || tab.length == 0) {
            if ((sc = sizeCtl) < 0)
                Thread.yield(); // lost initialization race; just spin
            else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
                try {
                    if ((tab = table) == null || tab.length == 0) {
                        int n = (sc > 0) ? sc : DEFAULT_CAPACITY;
                        @SuppressWarnings("unchecked")
                        Node[] nt = (Node[])new Node,?>[n];
                        table = tab = nt;
                        sc = n - (n >>> 2);
                    }
                } finally {
                    sizeCtl = sc;
                }
                break;
            }
        }
        return tab;
    }

Place the condition in the while loop, and we take the sizeCtl to control this process. when first thread into initTable,then it convert sizeCtl to a negative number(-1), then carry out its real work. and the second/third/fourth threads..will yield directly or begin to yield after the failure of the competition. And the most important is that other threads will just Waiting for the only thread until it completes the task... the only thread Locked the work of other threads. when the only thread complete its work, it set the table, and set the sizeCtl, note that the value of sizeCtl is three quarters of the current table length. then it back to the outer for loop, and take a loop again.

so we keep looking at this loop:

            else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
                if (casTabAt(tab, i, null, new Node(hash, key, value, null)))
                    break;                   // no lock when adding to empty bin
            }

now, we have initlize the table properly, so we can take the entry from the table array. we just take the hash value & the (length - 1)(a perfect all-bit-in-1 value), thus get the corresponding slot. We first deal with the null logic(After all if it is null, then we have no more things to do, the principle here should be simple things put in front). We put our new key, the new value, the new hash into the table i position through CAS operation(It can correctly handle several CAS competition conditions based on the return value). if success, we done.

else it take a loop again, and it will get a Node(key/value). so let's we look at the third case:

           else if ((fh = f.hash) == MOVED)
                tab = helpTransfer(tab, f);

the helpTransfer logic is a big one, so let's skip it now ,and then back to look at it closely. the fourth case:

            else {
                V oldVal = null;
                synchronized (f) {
                    if (tabAt(tab, i) == f) {
                        if (fh >= 0) {
                            binCount = 1;
                            for (Node e = f;; ++binCount) {
                                K ek;
                                if (e.hash == hash &&
                                    ((ek = e.key) == key ||
                                     (ek != null && key.equals(ek)))) {
                                    oldVal = e.val;
                                    if (!onlyIfAbsent)
                                        e.val = value;
                                    break;
                                }
                                Node pred = e;
                                if ((e = e.next) == null) {
                                    pred.next = new Node(hash, key,
                                                              value, null);
                                    break;
                                }
                            }
                        }
                        else if (f instanceof TreeBin) {
                            Node p;
                            binCount = 2;
                            if ((p = ((TreeBin)f).putTreeVal(hash, key,
                                                           value)) != null) {
                                oldVal = p.val;
                                if (!onlyIfAbsent)
                                    p.val = value;
                            }
                        }
                    }
                }
                if (binCount != 0) {
                    if (binCount >= TREEIFY_THRESHOLD)
                        treeifyBin(tab, i);
                    if (oldVal != null)
                        return oldVal;
                    break;
                }
            }

we first lock the node f, and then verify that the current location is f ? if no, we just exit and take a loop again. if it's true, this node has been locked by us. the next step is we must put the key/value into the slot(datastrucs). ConcurrentHashMap divides the node into two categories, linked list nodes, and non-linked nodes. The linked list nodes' hashCode is greater than or equal to 0(because the value is get from &HASH_BITS(0x7fffffff)). The non-linked list nodes' hashCode is less than 0, and its' value is fixed(MOVED(-1) for forwarding nodes TREEBIN(-2) for roots of trees RESERVED(-3) for transient reservations ).

We track this list to see if there is a match with the entry entry, or to the tail node.


                    if (fh >= 0) {
                            binCount = 1;
                            for (Node e = f;; ++binCount) {
                                K ek;
                                if (e.hash == hash &&
                                    ((ek = e.key) == key ||
                                     (ek != null && key.equals(ek)))) {
                                    oldVal = e.val;
                                    if (!onlyIfAbsent)
                                        e.val = value;
                                    break;
                                }
                                Node pred = e;
                                if ((e = e.next) == null) {
                                    pred.next = new Node(hash, key,
                                                              value, null);
                                    break;
                                }
                            }
                        }

if has a match, wo update the value according to the onlyIfAbsent variable. else we just add a node to the list tail, and it become the tail node.

both case, the putVal operation has done(break).

let's see the other case, the slot node is TREEBIN, this means that the nodes in this slot has been contructed into a tree.

                     else if (f instanceof TreeBin) {
                            Node p;
                            binCount = 2;
                            if ((p = ((TreeBin)f).putTreeVal(hash, key,
                                                           value)) != null) {
                                oldVal = p.val;
                                if (!onlyIfAbsent)
                                    p.val = value;
                            }
                        }

this moment, we just call the putTreeVal method of the TreeBin slot node. And we update the corresponding value according to the onlyIfAbsent variable and the return value of putTreeVal method(if the return value isn't null, means that the entry(Key/Value) already exits before.(For the same reason, we also skipped putTreeVal).

now, There is no third case, why? I means RESERVED(node)(MOVED node has been processed before). Because the RESERVED(node) is reserved for compute interface that related to functional programming. so let's skip it too.

the next step is:


                if (binCount != 0) {
                    if (binCount >= TREEIFY_THRESHOLD)
                        treeifyBin(tab, i);
                    if (oldVal != null)
                        return oldVal;
                    break;
                }

it is follow the both linked nodes and tree nodes logic unit. and we can see binCount is 2 fixed in the tree nodes case, and it is the lookup times in the linked nodes case. so if the binCount is greater than or equals to TREEIFY_THRESHOLD(8), we transform from the list into a redBlack tree! this is the beginning of every tree. A enough lookup times on a single slot is the reason for the existence of the tree

and then, we return the oldVal, and we just exit the outer loop.

then we reach the end of putVal:


        addCount(1L, binCount);
        return null;

this means that if the entry has exits, we won't reach here, and we already return the old value. if we reach here, implies the there is no entry match the key, then add one to the total count of entrys, return null.

thus, the operation putVal has been explored.

then let's explore the addCount function:


    private final void addCount(long x, int check) {
        CounterCell[] as; long b, s;
        if ((as = counterCells) != null ||
            !U.compareAndSwapLong(this, BASECOUNT, b = baseCount, s = b + x)) {
            CounterCell a; long v; int m;
            boolean uncontended = true;
            if (as == null || (m = as.length - 1) < 0 ||
                (a = as[ThreadLocalRandom.getProbe() & m]) == null ||
                !(uncontended =
                  U.compareAndSwapLong(a, CELLVALUE, v = a.value, v + x))) {
                fullAddCount(x, uncontended);
                return;
            }
            if (check <= 1)
                return;
            s = sumCount();
        }
        if (check >= 0) {
            Node[] tab, nt; int n, sc;
            while (s >= (long)(sc = sizeCtl) && (tab = table) != null &&
                   (n = tab.length) < MAXIMUM_CAPACITY) {
                int rs = resizeStamp(n);
                if (sc < 0) {
                    if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
                        sc == rs + MAX_RESIZERS || (nt = nextTable) == null ||
                        transferIndex <= 0)
                        break;
                    if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1))
                        transfer(tab, nt);
                }
                else if (U.compareAndSwapInt(this, SIZECTL, sc,
                                             (rs << RESIZE_STAMP_SHIFT) + 2))
                    transfer(tab, null);
                s = sumCount();
            }
        }
    }

first,we will add the to the baseCount:

        CounterCell[] as; long b, s;
        if ((as = counterCells) != null ||
            !U.compareAndSwapLong(this, BASECOUNT, b = baseCount, s = b + x)) {

look this code, it means that if the counterCells is not null or the atomic operation update failed. so we will update the cell in the counterCells:

        CounterCell a; long v; int m;
            boolean uncontended = true;
            if (as == null || (m = as.length - 1) < 0 ||
                (a = as[ThreadLocalRandom.getProbe() & m]) == null ||
                !(uncontended =
                  U.compareAndSwapLong(a, CELLVALUE, v = a.value, v + x))) {
                fullAddCount(x, uncontended);
                return;
            }

this implies that the counterCells is null or its length is 0 or the got cell is null or we update the cell value failed-we will update the cell in counterCells continuelly, and then we will return immediately. If all of the above conditions is wrong, then means we have add our value to the cell correctly.(its logic is so clear) this case, the code follow is :


            if (check <= 1)
                return;
            s = sumCount();

if we put the entry just through 0 or 1 iterations, just return and no need for sumCount. else we just sum the all value in the counterCells and baseCount.Then we use this variable to count the current total number.

then we look closer at the rest logic(we have update the baseCount or counterCells), we will doubled the table capacity when the size is greater than three quarters of the table capacity:


        if (check >= 0) {
            Node[] tab, nt; int n, sc;
            while (s >= (long)(sc = sizeCtl) && (tab = table) != null &&
                   (n = tab.length) < MAXIMUM_CAPACITY) {
                int rs = resizeStamp(n);
                if (sc < 0) {
                    if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
                        sc == rs + MAX_RESIZERS || (nt = nextTable) == null ||
                        transferIndex <= 0)
                        break;
                    if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1))
                        transfer(tab, nt);
                }
                else if (U.compareAndSwapInt(this, SIZECTL, sc,
                                             (rs << RESIZE_STAMP_SHIFT) + 2))
                    transfer(tab, null);
                s = sumCount();
            }
        }

so, if the count of entrys is greater than or equals to sizeCtl(three quarters of the slots) & the table is not null & the length is less than MAXIMUM_CAPACITY(1<<30). then we compute the rs(resizeStamp).


    int rs = resizeStamp(n);
    static final int resizeStamp(int n) {
        return Integer.numberOfLeadingZeros(n) | (1 << (RESIZE_STAMP_BITS - 1));
    }
    public static int numberOfLeadingZeros(int i) {
        // HD, Figure 5-6
        if (i == 0)
            return 32;
        int n = 1;
        if (i >>> 16 == 0) { n += 16; i <<= 16; }
        if (i >>> 24 == 0) { n +=  8; i <<=  8; }
        if (i >>> 28 == 0) { n +=  4; i <<=  4; }
        if (i >>> 30 == 0) { n +=  2; i <<=  2; }
        n -= i >>> 31;
        return n;
    }

the RESIZE_STAMP_BITS is 16.so here we just got a value the(rs) low 16bit is 1 and add the number(1-27) of zeros of the leading of n. And I found some extra errors in the code!

at first, sc is greater than 0, so it execute this codes:


        else if (U.compareAndSwapInt(this, SIZECTL, sc,(rs << RESIZE_STAMP_SHIFT) + 2))
              transfer(tab, null);

thus, sizeCtl is the low 32bit is 1 and the low 16bits is 2,if we take the low16bits as count of resizers, then it is 2*16 - 1. then, we look at the most interesting, the most difficult, and the most important part.


    /**
     * Moves and/or copies the nodes in each bin to new table. See
     * above for explanation.
     */
    private final void transfer(Node[] tab, Node[] nextTab) {
        int n = tab.length, stride;
        if ((stride = (NCPU > 1) ? (n >>> 3) / NCPU : n) < MIN_TRANSFER_STRIDE)
            stride = MIN_TRANSFER_STRIDE; // subdivide range
        if (nextTab == null) {            // initiating
            try {
                @SuppressWarnings("unchecked")
                Node[] nt = (Node[])new Node,?>[n << 1];
                nextTab = nt;
            } catch (Throwable ex) {      // try to cope with OOME
                sizeCtl = Integer.MAX_VALUE;
                return;
            }
            nextTable = nextTab;
            transferIndex = n;
        }
        int nextn = nextTab.length;
        ForwardingNode fwd = new ForwardingNode(nextTab);
        boolean advance = true;
        boolean finishing = false; // to ensure sweep before committing nextTab
        for (int i = 0, bound = 0;;) {
            Node f; int fh;
            while (advance) {
                int nextIndex, nextBound;
                if (--i >= bound || finishing)
                    advance = false;
                else if ((nextIndex = transferIndex) <= 0) {
                    i = -1;
                    advance = false;
                }
                else if (U.compareAndSwapInt
                         (this, TRANSFERINDEX, nextIndex,
                          nextBound = (nextIndex > stride ?
                                       nextIndex - stride : 0))) {
                    bound = nextBound;
                    i = nextIndex - 1;
                    advance = false;
                }
            }
            if (i < 0 || i >= n || i + n >= nextn) {
                int sc;
                if (finishing) {
                    nextTable = null;
                    table = nextTab;
                    sizeCtl = (n << 1) - (n >>> 1);
                    return;
                }
                if (U.compareAndSwapInt(this, SIZECTL, sc = sizeCtl, sc - 1)) {
                    if ((sc - 2) != resizeStamp(n) << RESIZE_STAMP_SHIFT)
                        return;
                    finishing = advance = true;
                    i = n; // recheck before commit
                }
            }
            else if ((f = tabAt(tab, i)) == null)
                advance = casTabAt(tab, i, null, fwd);
            else if ((fh = f.hash) == MOVED)
                advance = true; // already processed
            else {
                synchronized (f) {
                    if (tabAt(tab, i) == f) {
                        Node ln, hn;
                        if (fh >= 0) {
                            int runBit = fh & n;
                            Node lastRun = f;
                            for (Node p = f.next; p != null; p = p.next) {
                                int b = p.hash & n;
                                if (b != runBit) {
                                    runBit = b;
                                    lastRun = p;
                                }
                            }
                            if (runBit == 0) {
                                ln = lastRun;
                                hn = null;
                            }
                            else {
                                hn = lastRun;
                                ln = null;
                            }
                            for (Node p = f; p != lastRun; p = p.next) {
                                int ph = p.hash; K pk = p.key; V pv = p.val;
                                if ((ph & n) == 0)
                                    ln = new Node(ph, pk, pv, ln);
                                else
                                    hn = new Node(ph, pk, pv, hn);
                            }
                            setTabAt(nextTab, i, ln);
                            setTabAt(nextTab, i + n, hn);
                            setTabAt(tab, i, fwd);
                            advance = true;
                        }
                        else if (f instanceof TreeBin) {
                            TreeBin t = (TreeBin)f;
                            TreeNode lo = null, loTail = null;
                            TreeNode hi = null, hiTail = null;
                            int lc = 0, hc = 0;
                            for (Node e = t.first; e != null; e = e.next) {
                                int h = e.hash;
                                TreeNode p = new TreeNode
                                    (h, e.key, e.val, null, null);
                                if ((h & n) == 0) {
                                    if ((p.prev = loTail) == null)
                                        lo = p;
                                    else
                                        loTail.next = p;
                                    loTail = p;
                                    ++lc;
                                }
                                else {
                                    if ((p.prev = hiTail) == null)
                                        hi = p;
                                    else
                                        hiTail.next = p;
                                    hiTail = p;
                                    ++hc;
                                }
                            }
                            ln = (lc <= UNTREEIFY_THRESHOLD) ? untreeify(lo) :
                                (hc != 0) ? new TreeBin(lo) : t;
                            hn = (hc <= UNTREEIFY_THRESHOLD) ? untreeify(hi) :
                                (lc != 0) ? new TreeBin(hi) : t;
                            setTabAt(nextTab, i, ln);
                            setTabAt(nextTab, i + n, hn);
                            setTabAt(tab, i, fwd);
                            advance = true;
                        }
                    }
                }
            }
        }
    }

This code is extremely elegant in partitioning the entire table, allowing multiple threads to migrate data concurrently to the next table, while ensuring consistency. Those who see the migration data at the moment will join in the assistance, and complete the last piece of data migration thread, which is responsible for reconfiguring table to nextTab.

Clone this wiki locally