
Is it good practice to use java.lang.String.intern()?



The Javadoc about String.intern() doesn't give much detail. (In a nutshell: It returns a canonical representation of the string, allowing interned strings to be compared using == )





  • When would I use this function in favor of String.equals()?



  • Are there side effects not mentioned in the Javadoc, e.g. more or less optimization by the JIT compiler?



  • Are there further uses of String.intern() ?




Comments

  1. When would I use this function in favor of String.equals()?


    When you need speed, since you can compare strings by reference (== is faster than equals).


    Are there side effects not mentioned in the Javadoc?


    The primary disadvantage is that you have to remember to make sure that you actually do intern() all of the strings that you're going to compare. It's easy to forget to intern() all strings and then you can get confusingly incorrect results. Also, for everyone's sake, please be sure to very clearly document that you're relying on the strings being internalized.
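The pitfall described above can be sketched as follows. This is a minimal illustration, not code from the original answer; the class `InternPitfallDemo` and the helper `fromInput` are hypothetical names that stand in for strings arriving at run time.

```java
// Hypothetical demo class; the names are illustrative only.
class InternPitfallDemo {

    // Builds a String at run time, so the result is a fresh object
    // rather than a compile-time literal.
    static String fromInput(String prefix, int id) {
        return new StringBuilder(prefix).append(id).toString();
    }

    public static void main(String[] args) {
        String interned = fromInput("user-", 42).intern();
        String forgotten = fromInput("user-", 42); // intern() was forgotten here

        System.out.println(interned == forgotten);          // false: different objects
        System.out.println(interned.equals(forgotten));     // true: same content
        System.out.println(interned == forgotten.intern()); // true: both canonical
    }
}
```

A single un-interned string is enough to make a == comparison silently return false, which is why the reliance on interning needs to be documented.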

    The second disadvantage if you decide to internalize strings is that the intern() method is relatively expensive. It has to manage the pool of unique strings, so it does a fair bit of work (even if the string has already been internalized). So, be careful in your code design: for example, intern() all appropriate strings on input so that you don't have to worry about it anymore.

    (from JGuru)

    EDIT

    As Michael Borgwardt said:
    Third disadvantage: interned strings can't be garbage collected, so there is potential for a memory leak.

  2. This has (almost) nothing to do with string comparison. String interning is intended for saving memory if you have many strings with the same content in your application. By using String.intern() the application will only have one instance in the long run, and a side effect is that you can perform fast reference equality comparison instead of ordinary string comparison (but this is usually not advisable because it is really easy to break by forgetting to intern even a single instance).
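The memory-saving effect can be made visible by counting distinct String objects by identity before and after interning. The sketch below is only an illustration under assumed data (1000 strings with three distinct contents); `InternDedupDemo` and `distinctInstances` are hypothetical names.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical demo class; the names are illustrative only.
class InternDedupDemo {

    // Counts distinct String objects by identity (==), not by content.
    static int distinctInstances(String[] strings) {
        Set<String> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Collections.addAll(seen, strings);
        return seen.size();
    }

    public static void main(String[] args) {
        String[] raw = new String[1000];
        String[] interned = new String[1000];
        for (int i = 0; i < raw.length; i++) {
            // Only three distinct contents, but a new object on every iteration.
            raw[i] = new String("status=" + (i % 3));
            interned[i] = raw[i].intern();
        }
        System.out.println(distinctInstances(raw));      // 1000
        System.out.println(distinctInstances(interned)); // 3
    }
}
```

Once the `raw` array is dropped, only the three canonical instances need to stay reachable.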

  3. I am not aware of any advantages, and if there were any, one would think that equals() would itself use intern() internally (which it doesn't).

    Busting intern() myths

  4. Strings returned by String.intern() are definitely garbage collected in modern JVMs.
    The following NEVER runs out of memory, because of GC activity:

    java -cp . -Xmx128m UserOfIntern

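    import java.util.Random;

    public class UserOfIntern {
        public static void main(String[] args) {
            Random random = new Random();
            System.out.println(random.nextLong());
            // Intern an endless stream of mostly distinct strings; each one
            // becomes unreachable on the next iteration.
            while (true) {
                String s = String.valueOf(random.nextLong());
                s = s.intern();
            }
        }
    }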


    See more on the myth of non GCed String.intern() allocations here.

  5. A commonly overlooked disadvantage with string interning is that it adds the string object to a "static pool" of strings in non-heap memory (at least, it seems to do that in the Sun VM). Once in there, they don't get garbage collected.

    If your application doesn't deal with an arbitrary number of strings (as it generally would during input data processing), then interning won't cause a problem. However, if you intern every string that comes through the door, then you'll bust your non-heap memory pool. In other words, you get a memory leak.

  6. When would I use this function in favor of String.equals()?


    Given they do different things, probably never.

    Interning strings for performance reasons so that you can compare them for reference equality is only going to be of benefit if you are holding references to the strings for a while - strings coming from user input or IO won't be interned.

    That means in your application you receive input from an external source and process it into an object which has a semantic value - an identifier say - but that object has a type indistinguishable from the raw data, and has different rules as to how the programmer should use it.

    It's almost always better to create a UserId type which is interned (it's easy to create a thread-safe generic interning mechanism) and acts like an open enum, than to overload the java.lang.String type with reference semantics if it happens to be a User ID.

    That way you don't get confusion between whether or not a particular String has been interned, and you can encapsulate any additional behaviour you require in the open enum.
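One possible shape for such an interned type is sketched below. It is an assumption about what the comment has in mind, not the commenter's code: the `UserId` class, its `of` factory and the `POOL` map are hypothetical, and thread safety is delegated to `ConcurrentHashMap.computeIfAbsent`.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical value type: one canonical UserId instance per distinct id string.
final class UserId {
    private static final ConcurrentMap<String, UserId> POOL = new ConcurrentHashMap<>();

    private final String value;

    // Private constructor: instances can only be obtained through of().
    private UserId(String value) {
        this.value = value;
    }

    // Returns the canonical instance for this id, creating it on first use.
    static UserId of(String raw) {
        return POOL.computeIfAbsent(raw, UserId::new);
    }

    String value() {
        return value;
    }

    @Override
    public String toString() {
        return "UserId(" + value + ")";
    }

    public static void main(String[] args) {
        UserId a = UserId.of("alice");
        UserId b = UserId.of(new String("alice"));
        System.out.println(a == b);               // true: same canonical instance
        System.out.println(a == UserId.of("bob")); // false: different id
    }
}
```

Because every instance comes from `of`, reference comparison is safe for this type and there is no doubt about whether a given value was interned. Note that this simple pool never releases entries, so it suits a bounded set of ids.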

  7. I would consider intern() and == comparison instead of equals() only if equals() comparison is a proven bottleneck across many string comparisons. It is highly unlikely to help with a small number of comparisons, because intern() is not free. After aggressively interning strings you will find calls to intern() getting slower and slower.

  8. I would vote for it not being worth the maintenance hassle.

    Most of the time there will be no need, and no performance benefit, unless your code does a lot of work with substrings, in which case the String class will use the original string plus an offset to save memory. If your code uses substrings a lot, then I suspect that it'll just cause your memory requirements to explode.

  9. The real reason to use intern is not the above.
    You get to use it after you hit an out-of-memory error. Lots of the strings in a typical program are String.substring() results taken from some other big string (think of taking a user name out of a 100K XML file).
    In the Java implementation, the substring holds a reference to the original string plus the start and end offsets within that huge string. (The thought behind it is to reuse the same big string.)

    After 1000 big files, from which you only save 1000 short names, you will keep all 1000 files in memory!
    Solution: in this scenario just use smallsubstring.intern()

  10. I am using intern to save memory. I hold a large amount of String data in memory, and moving to intern() saved a massive amount of memory. Unfortunately, although it uses a lot less memory, the memory it does use is stored in PermGen memory, not the heap, and it is difficult to explain to customers how to increase the allocation of this type of memory.

    So is there an alternative to intern() for reducing memory consumption? (The == versus equals performance benefit is not an issue for me.)
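One commonly used alternative is an application-level pool that keeps the canonical instances on the ordinary heap. The sketch below is a minimal, hedged illustration rather than a recommendation from the thread; `HeapStringPool` and `canonical` are hypothetical names, and the pool holds strong references, so entries live as long as the pool does.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: a heap-based canonicalizing pool, separate from the JVM's intern table.
class HeapStringPool {
    private final Map<String, String> pool = new HashMap<>();

    // Returns the first instance seen with this content, so later duplicates can be discarded.
    synchronized String canonical(String s) {
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    synchronized int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        HeapStringPool pool = new HeapStringPool();
        String first = pool.canonical(new String("customer-name"));
        String second = pool.canonical(new String("customer-name"));
        System.out.println(first == second); // true: duplicate replaced by the pooled instance
        System.out.println(pool.size());     // 1
    }
}
```

Dropping the pool object makes all of its entries collectable at once, which gives the application control over the lifetime that intern() does not offer.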

  11. Comparing strings with == is much faster than with equals()

    It is about 5 times faster, but since string comparison usually represents only a small percentage of an application's total execution time, the overall gain is much smaller than that: the final gain is diluted to a few percent.

    String.intern() pulls the string away from the heap and puts it in PermGen

    Interned strings are put in a different storage area: the Permanent Generation, an area of the JVM that is reserved for non-user objects such as classes, methods and other internal JVM objects. The size of this area is limited and it is much more precious than the heap. Because this area is smaller than the heap, it is more likely that all of its space gets used up and you get an OutOfMemoryError.

    Strings from String.intern() are garbage collected

    In newer versions of the JVM, interned strings are also garbage collected when they are no longer referenced by any object.

    Keeping the above 3 points in mind, you could deduce that String.intern() is useful only in the few situations where you do a lot of string comparison. However, it is better not to use interned strings if you don't know exactly what you are doing ...

