
Why is the initialCapacity of Hashtable 11, while the DEFAULT_INITIAL_CAPACITY in HashMap is 16 and required to be a power of 2?


Comparing the HashMap and Hashtable source code in JDK 1.6, I saw the following code inside HashMap:




/**
 * The default initial capacity - MUST be a power of two.
 */
static final int DEFAULT_INITIAL_CAPACITY = 16;

int capacity = 1;
while (capacity < initialCapacity)
    capacity <<= 1;



However, in Hashtable I saw the following code:




table = new Entry[initialCapacity];

public Hashtable() {
    this(11, 0.75f);
}



So my question is: why does HashMap require a power of 2 as its initial capacity, while Hashtable chose 11 as its default initial capacity? I assume this has nothing to do with the fact that Hashtable is thread-safe and does not allow null keys or values.



Thanks.



Comments

  1. The following article addresses this question in some detail: HashMap requires a better hashCode() - JDK 1.4 Part II.

    According to that article, the main reason to move to power-of-two sizes was that bit masking is faster than integer division (a small sketch of the masking and the defensive hash follows this comment). This is not without adverse consequences, which are explained by one of the original authors:


    Joshua Bloch: The downside of using a power-of-two is that the resulting hash table is very sensitive to the quality of the hash function (hashCode). It is imperative that any change in the input must affect the low order bits of the hash value. (Ideally, it should affect all bits of the hash value with equal likelihood.) Because we have no assurance that this is true, we put in a secondary (or "defensive") hash function when we switched to the power-of-two hash table. This hash function is applied to the results of hashCode before masking off the low order bits. Its job is to scatter the information over all the bits, and in particular, into the low order bits. Of course it has to run very fast, or you lose the benefit of switching to the power-of-two-sized table. The original secondary hash function in 1.4 turned out to be insufficient. We knew that this was a theoretical possibility, but we thought that it didn't affect any practical data sets. We were wrong. The replacement secondary hash function (which I developed with the aid of a computer) has strong statistical properties that pretty much guarantee good bucket distribution.

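    A minimal sketch of both points, assuming a power-of-two capacity. The class name and sample keys are made up for illustration; the bit-spreading function mirrors, as far as I recall, the supplemental hash that JDK 1.6's HashMap applies before masking off the low-order bits.

    public class PowerOfTwoIndexing {

        // With capacity == 2^n, (hash & (capacity - 1)) equals (hash % capacity)
        // for non-negative hashes, but costs a single AND instead of a division.
        static int indexFor(int hash, int capacity) {
            return hash & (capacity - 1);
        }

        // "Defensive" hash: spreads the higher-order bits down into the
        // low-order bits so the mask above does not simply ignore them.
        static int supplementalHash(int h) {
            h ^= (h >>> 20) ^ (h >>> 12);
            return h ^ (h >>> 7) ^ (h >>> 4);
        }

        public static void main(String[] args) {
            int capacity = 16;
            for (String s : new String[] {"a", "q", "Aa", "BB"}) {
                int h = supplementalHash(s.hashCode());
                System.out.println(s + " -> bucket " + indexFor(h, capacity));
            }
        }
    }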
  2. This could help:

    http://www.concentric.net/~Ttwang/tech/primehash.htm

    Basically, if I remember correctly, when you have a hash table whose size is a power of 2, it's easy to derive the bucket index from just the least significant bits of the key's hash.

    Using a prime number (such as 11) as the size of the table makes collisions on the table rows less likely, so inserting is "cheaper" (a small sketch follows this comment).

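    A minimal sketch of that point; the class name and keys are invented for illustration. Hashes that share a common factor with the table size pile into a few rows when the size is a power of two, but spread over every row when the size is a prime such as 11.

    public class PrimeVsPowerOfTwo {
        public static void main(String[] args) {
            int[] keys = new int[20];
            for (int i = 0; i < keys.length; i++) {
                keys[i] = i * 4;  // poorly distributed hashes, all multiples of 4
            }
            System.out.println("buckets used with size 16: " + usedBuckets(keys, 16)); // 4
            System.out.println("buckets used with size 11: " + usedBuckets(keys, 11)); // 11
        }

        // Counts how many distinct buckets the keys occupy for a given table size.
        static long usedBuckets(int[] keys, int size) {
            return java.util.Arrays.stream(keys).map(k -> k % size).distinct().count();
        }
    }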
  3. Hashtable uses pseudo-prime table sizes and grows the table relatively slowly, while HashMap uses a power of 2 because the bitwise AND it allows is faster than using the modulus operator (the two growth policies are sketched after this comment).

    Ironically, reducing a hash modulo a power of 2 means a good hashCode() is needed, because the top bits would otherwise be ignored, so HashMap has a method that rearranges the hashCode you supply to avoid this issue, which means it can actually end up slower.

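    A small sketch of the growth policies referred to above, assuming the JDK 1.6 behaviour: Hashtable rehashes to twice the old capacity plus one, keeping sizes odd and "pseudo-prime", while HashMap simply doubles its power-of-two capacity. The class name is made up for illustration.

    public class GrowthPolicies {
        public static void main(String[] args) {
            int hashtableCapacity = 11;  // Hashtable default
            int hashMapCapacity = 16;    // HashMap default
            for (int i = 0; i < 5; i++) {
                System.out.println("Hashtable: " + hashtableCapacity + "  HashMap: " + hashMapCapacity);
                hashtableCapacity = (hashtableCapacity << 1) + 1; // 11, 23, 47, 95, 191
                hashMapCapacity <<= 1;                            // 16, 32, 64, 128, 256
            }
        }
    }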
  4. The requirement for the table size to be a power of two is an implementation detail, not known to the users of the class -- that is why the constructor silently adjusts the value to the next larger power of two instead of flagging an error.

    The Hashtable implementation assumes that the hash may not be evenly distributed, so it tries to use a number of bins that is prime in the hope of avoiding peaks in the frequency distribution of the hash.

    The combination of these two implementation details leads to bad performance.

    (For example, a primitive hash function would be

    int hash(String s, int nBins) {
        return s.charAt(0) % nBins;
    }


    If nBins is 32, 'e' and 'E' end up in the same bin, so the distribution of hash values correlates with the distribution of occurrence of letters, which has distinct peaks -- so the frequency distribution over the bins would have distinct peaks as well. A runnable version follows this comment.)

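    A runnable sketch of the example above; the class name is made up, and the hash is the primitive one from the comment with the String indexing written as charAt.

    public class CaseCollision {
        static int hash(String s, int nBins) {
            return s.charAt(0) % nBins;
        }

        public static void main(String[] args) {
            // 'e' (101) and 'E' (69) differ by exactly 32, so with 32 bins they collide.
            System.out.println("32 bins: e -> " + hash("e", 32) + ", E -> " + hash("E", 32)); // 5 and 5
            // With a prime bin count such as 31 they land in different bins.
            System.out.println("31 bins: e -> " + hash("e", 31) + ", E -> " + hash("E", 31)); // 8 and 7
        }
    }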
