Skip to main content

If strings are immutable in .NET, then why does Substring take O(n) time?


Given that strings are immutable in .NET, I'm wondering why they have been designed such that string.Substring() takes O( substring.Length ) time, instead of O(1)?



i.e. what were the tradeoffs, if any?


Source: Tips4allCCNA FINAL EXAM

Comments

  1. Precisely because Strings are immutable, .Substring must make a copy of at least a portion of the original string. Making a copy of n bytes should take O(n) time.

    How do you think you would copy a bunch of bytes in constant time?



    EDIT: Mehrdad suggests not copying the string at all, but keeping a reference to a piece of it.

    Consider in .Net, a multi-megabyte string, on which someone calls .SubString(n, n+3) (for any n in the middle of the string).

    Now, the ENTIRE string cannot be Garbage Collected just because one reference is holding on to 4 characters?
    That seems like a ridiculous waste of space.

    Further, tracking references to substrings (which may even be inside substrings), and trying to copy at optimal times to avoid defeating the GC (as described above), makes the concept a nightmare. It is far simpler, and more reliable, to copy on .SubString, and maintain the straightforward immutable model.

    ReplyDelete
  2. Java (as opposed to .NET) provides two ways of doing Substring(), you can consider whether you want to keep just a reference or copy a whole substring to a new memory location.

    The simple .substring(...) shares the internally used char array with the original String object, which you then with new String(...) can copy to a new array, if needed (to avoid hindering garbage collection of the original one).

    I think this kind of flexibility is a best option for a developer.

    ReplyDelete
  3. Keeping sub-strings as references into the original would have some obvious benefit, but you asked for the drawbacks:


    A reference to a (small) substring would block the entire string for GC
    The GC would have to be able to find the mother-string from the substring, potentially expensive
    The start of the reference/pointer to the chars would not always be word-aligned
    The current 'hack' of strings having a Length and a '\0' would have to be abandoned
    The savings would be rare.

    ReplyDelete
  4. Despite what Eric may claim O(n) is always O(n). Big O is a statement about asymptotic complexity, that is the relative growth of two functions. The very definition of O is that:


    f(x) = O(g(x)) iff there exists a values of k and c such that f(x) <
    c|g(x)| for all values of x greater than k.


    Many programmers conflate big O with algorithm performance, but they are very different things.

    The most obvious solution to creating an O(1) algorithm for substring is to have a separate concrete class for the type returned by substring, and to use a string interface (say IString) instead of concrete types. This is a lot of extra complexity for minimal trade-off. See Eric's answer. People generally expect strings to be simple concrete types.

    ReplyDelete

Post a Comment

Popular posts from this blog

Why is this Javascript much *slower* than its jQuery equivalent?

I have a HTML list of about 500 items and a "filter" box above it. I started by using jQuery to filter the list when I typed a letter (timing code added later): $('#filter').keyup( function() { var jqStart = (new Date).getTime(); var search = $(this).val().toLowerCase(); var $list = $('ul.ablist > li'); $list.each( function() { if ( $(this).text().toLowerCase().indexOf(search) === -1 ) $(this).hide(); else $(this).show(); } ); console.log('Time: ' + ((new Date).getTime() - jqStart)); } ); However, there was a couple of seconds delay after typing each letter (particularly the first letter). So I thought it may be slightly quicker if I used plain Javascript (I read recently that jQuery's each function is particularly slow). Here's my JS equivalent: document.getElementById('filter').addEventListener( 'keyup', function () { var jsStart = (new Date).getTime()...

Is it possible to have IF statement in an Echo statement in PHP

Thanks in advance. I did look at the other questions/answers that were similar and didn't find exactly what I was looking for. I'm trying to do this, am I on the right path? echo " <div id='tabs-".$match."'> <textarea id='".$match."' name='".$match."'>". if ($COLUMN_NAME === $match) { echo $FIELD_WITH_COLUMN_NAME; } else { } ."</textarea> <script type='text/javascript'> CKEDITOR.replace( '".$match."' ); </script> </div>"; I am getting the following error message in the browser: Parse error: syntax error, unexpected T_IF Please let me know if this is the right way to go about nesting an IF statement inside an echo. Thank you.