
Is it good practice to use java.lang.String.intern()?



The Javadoc for String.intern() doesn't give much detail. (In a nutshell: it returns a canonical representation of the string, allowing interned strings to be compared using ==.)





  • When would I use this function in favor of String.equals()?



  • Are there side effects not mentioned in the Javadoc, e.g. more or less optimization by the JIT compiler?



  • Are there further uses of String.intern()?




Comments

  1. When would I use this function in favor of String.equals()?


    When you need speed, since you can compare strings by reference (== is faster than equals()).


    Are there side effects not mentioned in the Javadoc?


    The primary disadvantage is that you have to remember to make sure that you actually do intern() all of the strings that you're going to compare. It's easy to forget to intern() all strings and then you can get confusingly incorrect results. Also, for everyone's sake, please be sure to very clearly document that you're relying on the strings being internalized.

    The second disadvantage, if you decide to internalize strings, is that the intern() method is relatively expensive. It has to manage the pool of unique strings, so it does a fair bit of work (even if the string has already been internalized). So design your code carefully: for example, intern() all appropriate strings on input so you don't have to worry about it afterwards.

    (from JGuru)

    EDIT

    As Michael Borgwardt said:
    Third disadvantage: interned strings can't be garbage collected, so there is potential for a memory leak.
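
A minimal sketch of the reference-versus-content comparison described in this comment. The class name InternComparisonDemo and the helper fresh() are made up for illustration; only ==, equals() and intern() come from the discussion.

```java
final class InternComparisonDemo {
    // Hypothetical helper: returns a new String object with the same content,
    // standing in for a string that arrived at runtime (input, parsing, ...).
    static String fresh(String s) {
        return new String(s);
    }

    public static void main(String[] args) {
        String literal = "hello";     // string literals are interned automatically
        String copy = fresh("hello"); // equal content, different object

        System.out.println(literal == copy);          // false: two distinct objects
        System.out.println(literal.equals(copy));     // true: same content
        System.out.println(literal == copy.intern()); // true: intern() returns the canonical instance
    }
}
```

The first comparison is the failure mode the comment warns about: forget intern() on one side and == quietly returns false.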

  2. This has (almost) nothing to do with string comparison. String interning is intended for saving memory if you have many strings with the same content in your application. By using String.intern() the application will only have one instance in the long run, and a side effect is that you can perform fast reference equality comparison instead of ordinary string comparison (but this is usually not advisable, because it is really easy to break by forgetting to intern even a single instance).
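
A rough sketch of the memory argument: many equal strings built at runtime collapse to a single instance once interned. InternDedupDemo, distinctInstances() and the "EUR" value are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class InternDedupDemo {
    // Counts distinct String objects by identity, not by content.
    static int distinctInstances(List<String> strings) {
        Set<String> identities =
                Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        identities.addAll(strings);
        return identities.size();
    }

    public static void main(String[] args) {
        List<String> raw = new ArrayList<>();
        List<String> interned = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            String s = new String("EUR"); // stands in for a value parsed from input
            raw.add(s);
            interned.add(s.intern());
        }
        System.out.println(distinctInstances(raw));      // 1000 separate objects
        System.out.println(distinctInstances(interned)); // 1 shared object
    }
}
```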

  3. I am not aware of any advantages, and if there were any, one would think that equals() would itself use intern() internally (which it doesn't).

    Busting intern() myths

  4. Strings added by String.intern() are definitely garbage collected in modern JVMs.
    The following never runs out of memory, because of GC activity:

    java -cp . -Xmx128m UserOfIntern

    import java.util.Random;

    public class UserOfIntern {
        public static void main(String[] args) {
            Random random = new Random();
            System.out.println(random.nextLong());
            // Loops forever; each pass interns a new, almost certainly unique string.
            while (true) {
                String s = String.valueOf(random.nextLong());
                s = s.intern();
            }
        }
    }


    See more on the myth of non GCed String.intern() allocations here.

  5. A commonly overlooked disadvantage with string interning is that it adds the string object to a "static pool" of strings in non-heap memory (at least, it seems to do that in the Sun VM). Once in there, they don't get garbage collected.

    If your application doesn't deal with an arbitrary number of strings (as it generally would during input data processing), then interning won't cause a problem. However, if you intern every string that comes through the door, then you'll bust your non-heap memory pool. In other words, you get a memory leak.

  6. When would I use this function in favor of String.equals()?


    Given they do different things, probably never.

    Interning strings for performance reasons so that you can compare them for reference equality is only going to be of benefit if you are holding references to the strings for a while - strings coming from user input or IO won't be interned.

    That means in your application you receive input from an external source and process it into an object which has a semantic value - an identifier say - but that object has a type indistinguishable from the raw data, and has different rules as to how the programmer should use it.

    It's almost always better to create a UserId type which is interned (it's easy to create a thread-safe generic interning mechanism) and acts like an open enum, than to overload the java.lang.String type with reference semantics if it happens to be a user ID.

    That way you don't get confusion between whether or not a particular String has been interned, and you can encapsulate any additional behaviour you require in the open enum.
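
One way the interned UserId type from this comment could look, as a sketch only. The comment names the type but gives no implementation, so the pool, the of() factory and the value() accessor are assumptions. Note that this simple pool never evicts entries.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class UserId {
    // Assumed design: one canonical UserId per distinct string value.
    private static final ConcurrentMap<String, UserId> POOL = new ConcurrentHashMap<>();

    private final String value;

    private UserId(String value) {
        this.value = value;
    }

    // Thread-safe: computeIfAbsent creates at most one UserId per key.
    static UserId of(String raw) {
        return POOL.computeIfAbsent(raw, UserId::new);
    }

    String value() {
        return value;
    }

    @Override
    public String toString() {
        return "UserId(" + value + ")";
    }

    public static void main(String[] args) {
        UserId a = UserId.of(new String("alice")); // e.g. parsed from input
        UserId b = UserId.of("alice");
        System.out.println(a == b);                // true: same canonical instance
        System.out.println(a == UserId.of("bob")); // false
    }
}
```

Because the constructor is private, every UserId in the program is canonical, so == is always safe on this type, which cannot be said of an arbitrary String.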

  7. I would consider intern() with == comparison instead of equals() only if equals() is a proven bottleneck across many string comparisons. It is highly unlikely to help with a small number of comparisons, because intern() is not free. After aggressively interning strings, you will find calls to intern() getting slower and slower.

  8. I would vote for it not being worth the maintenance hassle.

    Most of the time there will be no need, and no performance benefit, unless your code does a lot of work with substrings, in which case the String class will use the original string plus an offset to save memory. If your code uses substrings a lot, then I suspect that interning will just cause your memory requirements to explode.

  9. The real reason to use intern() is none of the above.
    You reach for it after you get an out-of-memory error. Many of the strings in a typical program are String.substring() results taken from some other big string (think of extracting a user name from a 100K XML file).
    In the Java implementation, the substring holds a reference to the original string plus the start and end offsets within that huge string. (The idea behind it is to reuse the same big string.)

    After 1000 big files, from which you only save 1000 short names, you will keep all 1000 whole files in memory!
    Solution: in this scenario just use smallsubstring.intern()
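
A hedged sketch of detaching a short substring from a large parent string. SubstringDetachDemo and its method names are invented for illustration. As far as I know, the array sharing described in this comment applies to JVMs older than roughly Java 7 update 6; later versions copy the characters inside substring(), which makes the extra step harmless but unnecessary there.

```java
final class SubstringDetachDemo {
    // Returns the slice [start, end) without relying on the parent's storage.
    // On the older JVMs where substring() shared the parent's char[], the
    // String(String) constructor made a trimmed copy of just the needed chars.
    static String extractDetached(String bigDocument, int start, int end) {
        return new String(bigDocument.substring(start, end));
    }

    // The variant suggested in the comment: the pooled instance does not
    // reference the parent string either.
    static String extractInterned(String bigDocument, int start, int end) {
        return bigDocument.substring(start, end).intern();
    }

    public static void main(String[] args) {
        String doc = "<user>alice</user>"; // stands in for a 100K XML file
        System.out.println(extractDetached(doc, 6, 11)); // alice
        System.out.println(extractInterned(doc, 6, 11)); // alice
    }
}
```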

  10. I am using intern() to save memory. I hold a large amount of String data in memory, and moving to intern() saved a massive amount of it. Unfortunately, although it uses a lot less memory, the memory it does use is in PermGen, not the heap, and it is difficult to explain to customers how to increase the allocation of this type of memory.

    So is there an alternative to intern() for reducing memory consumption? (The == versus equals() performance benefit is not an issue for me.)
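
One possible alternative, sketched under assumptions: a hand-rolled pool that keeps canonical strings on the ordinary heap and lets unused entries be collected. HeapStringPool and canonicalize() are hypothetical names, not a JDK API; the sketch relies only on java.util.WeakHashMap and java.lang.ref.WeakReference.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

final class HeapStringPool {
    // Keys are held weakly, and values are weak references, so the pool
    // itself never keeps a string alive once nothing else refers to it.
    private final Map<String, WeakReference<String>> pool = new WeakHashMap<>();

    // Returns the pooled instance with equal content, adding s if none exists.
    synchronized String canonicalize(String s) {
        WeakReference<String> ref = pool.get(s);
        String existing = (ref == null) ? null : ref.get();
        if (existing != null) {
            return existing;
        }
        pool.put(s, new WeakReference<>(s));
        return s;
    }

    public static void main(String[] args) {
        HeapStringPool pool = new HeapStringPool();
        String first = pool.canonicalize(new String("customer-42"));
        String second = pool.canonicalize(new String("customer-42"));
        System.out.println(first == second); // true: the duplicate was replaced
    }
}
```

The synchronized method keeps the sketch simple. Whether it beats intern() on memory or speed depends on the JVM and the data, so it would need measuring.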

  11. Comparing strings with == is much faster than with equals()

    Five times faster. But since string comparison usually represents only a small percentage of an application's total execution time, the overall gain is much smaller than that, and the final gain will be diluted to a few percent.

    String.intern() pulls the string out of the heap and puts it in PermGen

    Internalized strings are put in a different storage area: the Permanent Generation, an area of the JVM that is reserved for non-user objects such as classes, methods and other internal JVM objects. The size of this area is limited and it is much more precious than the heap. Because this area is smaller than the heap, it is more likely that you use up all the space and get an OutOfMemoryError.

    Strings added by String.intern() are garbage collected

    In newer versions of the JVM, internalized strings are also garbage collected when they are no longer referenced by any object.

    Keeping the above three points in mind, you could deduce that String.intern() is useful only in the few situations where you do a lot of string comparison. However, it is better not to use interned strings if you don't know exactly what you are doing ...

