Thursday, August 25, 2011

Hey .intern() ... get me a String!

A lot of us who work in Java take Strings for granted. Some of us toss them around like primitive types, paying little regard to what they represent or how they should be used. Because of this, some classic problems like "==" vs. ".equals()" arise. Just in case you don't know what I'm talking about here: in Java, using == to test for equality between two objects (let's call them A and B) compares the references held in A and B, NOT the values those references point to. Notice that I said OBJECTS. That's the key word here. I'm not talking about primitive types like int, short, char, boolean, etc. I'm talking about OBJECTS. Sometimes people forget that Strings are objects, so you'll see something like if ( A == B ){ ... I recently ran into a bug in a piece of code that was caused by just that. It actually went a little something like if( "VALUE" == srt ) { ...
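Here's a minimal sketch of that bug. The class and method names are made up for illustration, and new String(...) is just my stand-in for a String that shows up at runtime (user input, a file, a concatenation, whatever):

```java
class EqualsVsIdentity {
    // Compares references: true only when both variables point at the same object.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    // Compares contents, character by character.
    static boolean sameValue(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        // new String(...) forces a separate object with the same characters.
        String srt = new String("VALUE");

        System.out.println(sameReference("VALUE", srt)); // false
        System.out.println(sameValue("VALUE", srt));     // true
    }
}
```

Same characters, two different objects, so the == check quietly fails.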

If you read the above and scratched your head in confusion, I suggest taking some time to learn a little bit more about Strings, comparison operators, and a little bit more about Java in general before continuing onward.

If you've read the above and I've bored you, GOOD! This NEXT section is a little more interesting...


AND NOW, HERE'S WHERE THE WHEELS FALL OFF THE CART!

There are times in Java land, if you REALLY, REALLY, REALLY know what you're doing, when you CAN use == to compare two Strings. I'm going to warn you though - I don't think I'd ever use what I'm about to talk about unless I ABSOLUTELY HAD TO for some reason or another, and ONLY after thorough testing.

And so, I introduce...


String.intern()

Since String objects are immutable in Java land, there are tons of little things you can do to produce all sorts of new String objects during runtime. Take for example, the following code:

String x = "FOO";

...

public String bar(String z)
{
    return z + z;
}

...

String y = bar(x);

So now, our variable y is assigned the value "FOOFOO". What if, however, somewhere out in our program we had another variable w? And let's just pretend that our variable w was also assigned the value "FOOFOO" during runtime. Since Strings are immutable, we could have just pointed y at the same object w refers to, since for all intents and purposes the two are equal in this case.

Enter String.intern(). So, in a nutshell, .intern() is basically your way of saying "Hey... if there's already a String with my value out there in Java land, point me to that guy instead. I know he's not changing. I'll just use him!"

Under the hood, your JVM keeps a little pool of unique Strings floating around and happy for just this occasion.  If you call .intern(), the JVM lifeguard takes a gander into that pool to see if you look like anyone in there. If you do, he points you to that guy. If you don't, he throws you into that pool.
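A quick sketch of the lifeguard at work, reusing bar() from above. The class name is made up, and building w with a StringBuilder is just my assumption about how a second "FOOFOO" might turn up at runtime:

```java
class InternLifeguard {
    // Same as bar() in the post: concatenating at runtime creates a new String object.
    static String bar(String z) {
        return z + z;
    }

    public static void main(String[] args) {
        String y = bar("FOO");
        // A second "FOOFOO", also built at runtime (hypothetical stand-in for w).
        String w = new StringBuilder("FOO").append("FOO").toString();

        System.out.println(y == w);                   // false: two separate objects
        System.out.println(y.equals(w));              // true: same characters
        System.out.println(y.intern() == w.intern()); // true: both resolve to the one pooled String
    }
}
```

Note that .intern() hands back the pooled reference. It doesn't change y or w themselves, so you have to hold on to what it returns.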

All of your String literals and compile time constants are in here. Runtime strings aren't. Not unless you invoke .intern().
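To see who is in the pool and who isn't, here's a small sketch (class and method names are mine, not from any library):

```java
class PoolMembership {
    // Non-constant concatenation, so the result is a brand new object at runtime.
    static String twice(String z) {
        return z + z;
    }

    public static void main(String[] args) {
        String literal = "FOOFOO";        // literal: in the pool
        String constant = "FOO" + "FOO";  // compile time constant: folded to "FOOFOO" by the compiler
        String runtime = twice("FOO");    // runtime string: not the pooled object

        System.out.println(literal == constant);         // true
        System.out.println(literal == runtime);          // false
        System.out.println(literal == runtime.intern()); // true
    }
}
```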

So, that being said, if you use this method consistently throughout your program (meaning EVERY String you compare has been interned), you COULD replace your .equals() evaluations with ==. Why would one do such a thing?
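One way to get that consistency is to intern in exactly one place and never let a non-interned String sneak in. Here's a rough sketch using a made-up Tag class. This is just one possible design, not a recommendation:

```java
final class Tag {
    private final String name;

    Tag(String name) {
        // Intern once, at the boundary, so every Tag holds the pooled String.
        this.name = name.intern();
    }

    boolean sameNameAs(Tag other) {
        // == is safe here ONLY because the constructor is the sole way to set name.
        return this.name == other.name;
    }

    public static void main(String[] args) {
        Tag a = new Tag(new String("FOOFOO"));
        Tag b = new Tag(new StringBuilder("FOO").append("FOO").toString());

        System.out.println(a.sameNameAs(b));              // true
        System.out.println(a.sameNameAs(new Tag("BAR"))); // false
    }
}
```

The moment some other code path stores a String without interning it, the == check starts lying to you.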

Take a look at the String.equals() source (right around line 854). There's actually a LOT going on here. Especially down in that for loop! Using == might be a bit faster for evaluating equality in this case!

Here's the catch - you incur some up-front loss in performance for some gains later on down the road. Calling .intern() isn't free OR cheap. Even so, you might see some people out there writing about how using if ( s0.intern() == s1.intern() ){... is faster than if (s0.equals(s1)){...

I've run a few performance tests. Nothing too in depth (I was just curious to see for myself), but in most simple cases the second evaluation, s0.equals(s1), is your safer bet. I encourage you to try it out for yourself! If you're really curious, take a look at the bytecode! In some not-so-simple cases, however, you MAY achieve some performance gains. I wouldn't recommend s0.intern() == s1.intern() for evaluation, but rather s0 == s1 IF AND ONLY IF you have guaranteed that these Strings have been intern()'d!
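If you want to try it yourself, here's a rough timing sketch. Everything in it (class name, method names, array size, the "key" values) is made up for illustration, and a naive System.nanoTime() loop like this is easily thrown off by JIT warm-up and garbage collection, so treat the numbers as a hint at best. For anything serious, reach for a proper harness like JMH.

```java
class InternTiming {
    // Builds n Strings at runtime; each one is a fresh object, not the pooled copy.
    static String[] runtimeKeys(int n) {
        String[] keys = new String[n];
        for (int i = 0; i < n; i++) {
            keys[i] = "key" + (i % 100);
        }
        return keys;
    }

    // Returns a new array holding the pooled reference for each key.
    static String[] internAll(String[] keys) {
        String[] pooled = new String[keys.length];
        for (int i = 0; i < keys.length; i++) {
            pooled[i] = keys[i].intern();
        }
        return pooled;
    }

    static int countEquals(String[] a, String[] b) {
        int matches = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i].equals(b[i])) matches++;
        }
        return matches;
    }

    static int countInternEachTime(String[] a, String[] b) {
        int matches = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i].intern() == b[i].intern()) matches++;
        }
        return matches;
    }

    // Only meaningful when both arrays have already been interned.
    static int countIdentity(String[] a, String[] b) {
        int matches = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] == b[i]) matches++;
        }
        return matches;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        String[] a = runtimeKeys(n);
        String[] b = runtimeKeys(n);

        long t0 = System.nanoTime();
        int viaEquals = countEquals(a, b);
        long t1 = System.nanoTime();
        int viaInternEachTime = countInternEachTime(a, b);
        long t2 = System.nanoTime();
        String[] pooledA = internAll(a); // the up-front cost
        String[] pooledB = internAll(b);
        long t3 = System.nanoTime();
        int viaIdentity = countIdentity(pooledA, pooledB);
        long t4 = System.nanoTime();

        System.out.println("equals():              " + viaEquals + " matches, " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("intern() == intern():  " + viaInternEachTime + " matches, " + (t2 - t1) / 1_000_000 + " ms");
        System.out.println("interning up front:    " + (t3 - t2) / 1_000_000 + " ms");
        System.out.println("== on interned:        " + viaIdentity + " matches, " + (t4 - t3) / 1_000_000 + " ms");
    }
}
```

All three counts should agree. How the timings compare will depend on your JVM, your Strings, and how often you compare versus how often you intern.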

 
