Popular Posts

Wednesday, February 01, 2012

What is String literal pool, why String is immutable or final .. & other interview questions & concepts.

How to create a String

There are two ways to create a String object in Java:
  • Using the new operator. For example, String str = new String("Hello");.
  • Using a string literal or constant expression). For example,String str="Hello"; (string literal) orString str="Hel" + "lo"; (string constant expression).
What is difference between these String's creations? In Java, the equals method can be considered to perform a deep comparison of the value of an object, whereas the == operator performs a shallow comparison. The equals method compares the content of two objects rather than two objects' references. The == operator with reference types (i.e., Objects) evaluates as true if the references are identical - point to the same object. With value types (i.e., primitives) it evaluates as true if the value is identical. The equals method is to return true if two objects have identical content - however, the equals method in the java.lang.Object class - the default equals method if a class does not override it - returns true only if both references point to the same object.

Let's use the following example to see what difference between these creations of string:

public class DemoStringCreation {
  public static void main (String args[]) {
    String str1 = "Hello";  
    String str2 = "Hello"; 
    System.out.println("str1 and str2 are created by using string literal.");    
    System.out.println("    str1 == str2 is " + (str1 == str2));  
    System.out.println("    str1.equals(str2) is " + str1.equals(str2));  
    
    String str3 = new String("Hello");  
    String str4 = new String("Hello"); 
    System.out.println("str3 and str4 are created by using new operator.");    
    System.out.println("    str3 == str4 is " + (str3 == str4));  
    System.out.println("    str3.equals(str4) is " + str3.equals(str4));  
    
    String str5 = "Hel"+ "lo";  
    String str6 = "He" + "llo"; 
    System.out.println("str5 and str6 are created by using string 
constant expression.");    
    System.out.println("    str5 == str6 is " + (str5 == str6));  
    System.out.println("    str5.equals(str6) is " + str5.equals(str6));
    String s = "lo";
    String str7 = "Hel"+ s;  
    String str8 = "He" + "llo"; 
    System.out.println("str7 is computed at runtime.");         
    System.out.println("str8 is created by using string constant 
expression.");    
    System.out.println("    str7 == str8 is " + (str7 == str8));  
    System.out.println("    str7.equals(str8) is " + str7.equals(str8));
    
  }
}
The output result is:
str1 and str2 are created by using string literal.
    str1 == str2 is true
    str1.equals(str2) is true
str3 and str4 are created by using new operator.
    str3 == str4 is false
    str3.equals(str4) is true
str5 and str6 are created by using string constant expression.
    str5 == str6 is true
    str5.equals(str6) is true
str7 is computed at runtime.
str8 is created by using string constant expression.
    str7 == str8 is false
    str7.equals(str8) is true
The creation of two strings with the same sequence of letters without the use of the new keyword will create pointers to the same String in the Java String literal pool. The String literal pool is a way Java conserves resources.

String Literal Pool

String allocation, like all object allocation, proves costly in both time and memory. The JVM performs some trickery while instantiating string literals to increase performance and decrease memory overhead. To cut down the number of String objects created in the JVM, the String class keeps a pool of strings. Each time your code create a string literal, the JVM checks the string literal pool first. If the string already exists in the pool, a reference to the pooled instance returns. If the string does not exist in the pool, a new String object instantiates, then is placed in the pool. Java can make this optimization since strings are immutable and can be shared without fear of data corruption. For example
public class Program
{
    public static void main(String[] args)
    {
       String str1 = "Hello";  
       String str2 = "Hello"; 
       System.out.print(str1 == str2);
    }
}
The result is
true

Unfortunately, when you use
String a=new String("Hello");
a String object is created out of the String literal pool, even if an equal string already exists in the pool. Considering all that, avoid new String unless you specifically know that you need it! For example
public class Program
{
    public static void main(String[] args)
    {
       String str1 = "Hello";  
       String str2 = new String("Hello");
       System.out.print(str1 == str2 + " ");
       System.out.print(str1.equals(str2));
    }
}
The result is
false true

A JVM has a string pool where it keeps at most one object of any String. String literals always refer to an object in the string pool. String objects created with the new operator do not refer to objects in the string pool but can be made to using String's intern() method. The java.lang.String.intern() returns an interned String, that is, one that has an entry in the global String pool. If the String is not already in the global String pool, then it will be added. For example
public class Program
{
    public static void main(String[] args)
    {
        // Create three strings in three different ways.
        String s1 = "Hello";
        String s2 = new StringBuffer("He").append("llo").toString();
        String s3 = s2.intern();
        // Determine which strings are equivalent using the ==
        // operator
        System.out.println("s1 == s2? " + (s1 == s2));
        System.out.println("s1 == s3? " + (s1 == s3));
    }
}
The output is
s1 == s2? false
s1 == s3? true
There is a table always maintaining a single reference to each unique String object in the global string literal pool ever created by an instance of the runtime in order to optimize space. That means that they always have a reference to String objects in string literal pool, therefore, the string objects in the string literal pool not eligible for garbage collection.

String Literals in the Java Language Specification Third Edition

 Each string literal is a reference to an instance of class String. String objects have a constant value. String literals-or, more generally, strings that are the values of constant expressions-are "interned" so as to share unique instances, using the method String.intern.
Thus, the test program consisting of the compilation unit:
package testPackage;
class Test {
        public static void main(String[] args) {
                String hello = "Hello", lo = "lo";
                System.out.print((hello == "Hello") + " ");
                System.out.print((Other.hello == hello) + " ");
                System.out.print((other.Other.hello == hello) + " ");
                System.out.print((hello == ("Hel"+"lo")) + " ");
                System.out.print((hello == ("Hel"+lo)) + " ");
                System.out.println(hello == ("Hel"+lo).intern());
        }
}
class Other { static String hello = "Hello"; }
and the compilation unit:
package other;
public class Other { static String hello = "Hello"; }
produces the output:
true true true true false true
This example illustrates six points:
  • Literal strings within the same class in the same package represent references to the same String object.
  • Literal strings within different classes in the same package represent references to the same String object.
  • Literal strings within different classes in different packages likewise represent references to the same String object.
  • Strings computed by constant expressions are computed at compile time and then treated as if they were literals.
  • Strings computed by concatenation at run time are newly created and therefore distinct.
The result of explicitly interning a computed string is the same string as any pre-existing literal string with the same contents.



Why String is immutable or final in Java


This is one of the most popular interview question on String in Java which starts with discussion of What is immutable object , what are the benefits of immutable object , why do you use it and which scenarios do you use it. This is some time also asked as "Why String is final in Java" .

It can also come once interviewee answers some preliminarily strings questions e.g. What is String pool ,
What is the difference between String and StringBuffer , What is the difference between StringBuffer and StringBuilder etc.

Though there could be many possible answer for this question and only designer of String class can answer this , I think below two does make sense

1)Imagine StringPool facility without making string immutable , its not possible at all because in case of string pool one string object/literal e.g. "Test" has referenced by many reference variables , so if any one of them change the value others will be automatically gets affected i.e. lets say

String A = "Test"
String B = "Test"

Now String B called "Test".toUpperCase() which change the same object into "TEST" , so A will also be "TEST" which is not desirable.

2)String has been widely used as parameter for many java classes e.g. for opening network connection you can pass hostname and port number as stirng , you can pass database URL as string for opening database connection, you can open any file by passing name of file as argument to File I/O classes.

In case if String is not immutable , this would lead serious security threat , I mean some one can access to any file for which he has authorization and then can change the file name either deliberately or accidentally and gain access of those file.

3)Since String is immutable it can safely shared between many threads ,which is very important for multithreaded programming and to avoid any
synchronization issues in Java.

4) Another reason of
Why String is immutable in Java is to allow String to cache its hashcode , being immutable String in Java caches its hashcode and do not calculate every time we call hashcode method of String, which makes it very fast as hashmap key to be used in hashmap in Java.  This one is also suggested by  Jaroslav Sedlacek in comments below.

5) Another good reason of Why String is immutable in Java suggested by Dan Bergh Johnsson on comments is: The absolutely most important reason that String is immutable is that it is used by the class loading mechanism, and thus have profound and fundamental security aspects.
Had String been mutable, a request to load "java.io.Writer" could have been changed to load "mil.vogoon.DiskErasingWriter"


I believe there could be some more very convincing reasons also , Please post those reasons as comments and I will include those on this post.

I think above reason holds good for another java interview questions
"Why String is final in Java"  also to be immutable you have to be final so that your subclass doesn't break immutability.  what do you guys think ?


String Imutability::::

==> Security, Thread safety, Performance,

One important reason was that the JVM is designed to enable secure "sandbox" operation, meaning that unknown Java code -- like an applet downloaded from the Internet -- can be run safely. Immutable Strings help ensure that safety.

Here's a concrete example: part of that safety is ensuring that an Applet can contact the server it was downloaded from (to download images, data files, etc) and not other machines (so that once you've downloaded it to your browser behind a firewall, it can't connect to your company's internal database server and suck out all your financial records.) Imagine that Strings are mutable. A rogue applet might ask for a connection to "evilserver.com", passing that server name in a String object. The JVM could check that this server name was OK, and get ready to connect to it. The applet, in another thread, could now change the contents of that String object to "databaseserver.yourcompany.com" at just the right moment; the JVM would then return a connection to the database!

You can think of hundreds of scenarios just like that if Strings are mutable; if they're immutable, all the problems go away. Immutable Strings also result in a substantial performance improvement (no copying Strings, ever!) and memory savings (can reuse them whenever you want.)

2 comments:

deepakdverma said...

Very Useful ... really appreciated.
Thanks a lot

rajinder said...

Nice article!
String Pool