Wednesday, December 10, 2008

Mythbuster: String and StringBuilder in Java

In this new series I will go over some of the surprising facts about Java. These usually start with "You would think..." and end with me losing the argument. So here's the first one:

Let's start with this example of concatenating Strings (see http://stackoverflow.com/questions/47605/java-string-concatenation):

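public class test {

    // Assumption: the loop bound was lost from the original listing;
    // 100000 is only a placeholder so that the code compiles and runs.
    private static final long ITERATIONS = 100000L;

    public static void main(String[] args) {
        String c = "c";
        String a = "a"; // not used in this first example
        String b = "b";

        // 1) String.concat on an ever-growing String
        long start = System.currentTimeMillis();
        for (long i = 0; i < ITERATIONS; i++) {
            c = c.concat(b);
        }
        long end = System.currentTimeMillis();
        System.out.println(end - start);

        // 2) "+=" on the same, still growing String
        start = System.currentTimeMillis();
        for (long i = 0; i < ITERATIONS; i++) {
            c += b;
        }
        end = System.currentTimeMillis();
        System.out.println(end - start);

        // 3) A single StringBuilder, seeded with the String built so far
        start = System.currentTimeMillis();
        StringBuilder sb = new StringBuilder(c);
        for (long i = 0; i < ITERATIONS; i++) {
            sb.append(b);
        }
        int test = sb.toString().length(); // force the final String to be built
        end = System.currentTimeMillis();
        System.out.println(end - start);
    }
}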

The output is the following:

C:\tmp>java test
296
5016
0

Now let's assume that we don't keep appending to an ever-growing String and instead combine the same two short Strings in every iteration, with code more like:

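// Note: this class name shadows java.lang.StringBuffer inside this file.
public class StringBuffer {

    // Assumption: the loop bound was lost from the original listing;
    // 100000 is only a placeholder so that the code compiles and runs.
    private static final long ITERATIONS = 100000L;

    public static void main(String[] args) {
        String c = "c";
        String a = "a";
        String b = "b";

        // 1) String.concat on two short Strings
        long start = System.currentTimeMillis();
        for (long i = 0; i < ITERATIONS; i++) {
            c = a.concat(b);
        }
        long end = System.currentTimeMillis();
        System.out.println(end - start);

        // 2) "+" on the same two Strings
        start = System.currentTimeMillis();
        for (long i = 0; i < ITERATIONS; i++) {
            c = a + b;
        }
        end = System.currentTimeMillis();
        System.out.println(end - start);

        // 3) An explicit StringBuilder created in every iteration
        start = System.currentTimeMillis();
        for (long i = 0; i < ITERATIONS; i++) {
            c = new StringBuilder(a).append(b).toString();
        }
        end = System.currentTimeMillis();
        System.out.println(end - start);
    }
}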

With results:

15
63
15

What should we learn?

The arguments brought forward in (http://saloon.javaranch.com/cgi-bin/ubb/ultimatebb.cgi?ubb=get_topic&f=15&t=001714) don't really hold. And yes, despite being converted internally to a StringBuilder, "+" still sucks.

But what about concat and the StringBuilder being on par in the second example? That's really surprising.
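To make the "internally converted to a StringBuilder" remark concrete, here is a rough sketch of what the "+=" loop amounts to. It assumes an older javac (newer compilers, from Java 9 on, emit an invokedynamic call instead of literal StringBuilder code), and the class and method names are made up for illustration:

```java
public class PlusDesugared {

    // What the first example writes: "+=" inside a loop.
    static String viaPlus(String c, String b, int n) {
        for (int i = 0; i < n; i++) {
            c += b;
        }
        return c;
    }

    // Roughly what an older javac generates for the loop above:
    // a fresh StringBuilder per iteration, two appends, and a
    // toString() that copies everything into a new String.
    static String viaExpandedBuilder(String c, String b, int n) {
        for (int i = 0; i < n; i++) {
            c = new StringBuilder().append(c).append(b).toString();
        }
        return c;
    }

    // The third variant of the first example: one builder for the whole loop.
    static String viaSingleBuilder(String c, String b, int n) {
        StringBuilder sb = new StringBuilder(c);
        for (int i = 0; i < n; i++) {
            sb.append(b);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // All three produce the same text; only the work done per iteration differs.
        System.out.println(viaPlus("c", "b", 3));            // cbbb
        System.out.println(viaExpandedBuilder("c", "b", 3)); // cbbb
        System.out.println(viaSingleBuilder("c", "b", 3));   // cbbb
    }
}
```

Seen this way, "+=" in a loop is not the same thing as the single StringBuilder from the first example: it builds and throws away a builder in every iteration.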

My theory is that the biggest computational cost is instantiating new objects, as a comparison of the two examples suggests. (a += a+b creates a new temporary String holding the result of a+b and then assigns it, whereas a.concat doesn't do that in the second example.) You would think that the compiler/JVM would be able to reuse the objects internally for speed, but it doesn't.

I would really prefer to use Strings in all cases (and I can live with writing concat instead of "+", though it's silly) because the immutability of the String is a huge plus when it comes to concurrent programming. But as long as the performance of instantiating is that bad, we might be stuck with the StringBuilder and a harder time making things concurrent.
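A small check of the "new object" part of this theory, as a sketch only (the class and method names are made up for illustration): comparing references with != shows that both concat and "+" hand back a brand-new String object on every call, even when the contents are identical.

```java
public class NewObjectPerConcat {

    // True if two concat calls with the same inputs return different objects.
    // Assumes b is non-empty; concat with an empty argument returns the receiver itself.
    static boolean concatAllocatesEachTime(String a, String b) {
        return a.concat(b) != a.concat(b);
    }

    // True if two "+" evaluations on the same variables return different objects.
    static boolean plusAllocatesEachTime(String a, String b) {
        String first = a + b;
        String second = a + b;
        return first != second;
    }

    public static void main(String[] args) {
        String a = "a";
        String b = "b";
        System.out.println(a.concat(b).equals(a + b));     // true: same contents
        System.out.println(concatAllocatesEachTime(a, b)); // true: distinct objects
        System.out.println(plusAllocatesEachTime(a, b));   // true: distinct objects
    }
}
```

This only shows that objects are created, not how much each creation costs; the timings above are still the only measurement.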
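One possible middle ground, sketched here under my own assumptions rather than as a recommendation from the measurements above: keep the StringBuilder confined to a single method so that no other thread ever sees it, and hand out only the immutable String. The class name, the repeat helper, and the pool size are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ConfinedBuilder {

    // The mutable builder never leaves this method, so it needs no locking;
    // callers only ever receive the immutable String.
    static String repeat(String piece, int times) {
        StringBuilder sb = new StringBuilder(piece.length() * times);
        for (int i = 0; i < times; i++) {
            sb.append(piece);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<String>> results = new ArrayList<>();
        for (int i = 1; i <= 4; i++) {
            final int times = i;
            // Each task builds its own String with its own private builder.
            Callable<String> task = () -> repeat("b", times);
            results.add(pool.submit(task));
        }
        for (Future<String> result : results) {
            System.out.println(result.get()); // b, bb, bbb, bbbb (one per line)
        }
        pool.shutdown();
    }
}
```

The Strings that cross thread boundaries stay immutable, while the fast append path is still used where the building happens.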
