Tuesday, February 15, 2011

Common memory leak causes in Java

Java may not have pointers, but memory leaks still happen. You can easily fill most of your heap if you keep holding references to objects you no longer need. The likelihood of a leak seems to go up dramatically as more programmers work on a single project. This is especially true if some of them don't fully understand the Java memory model.

In this post I'm going to cover the most common scenarios likely to cause a memory leak. In my next post I'll go over some powerful tools included in the current JDK which allow you to discover and hopefully fix memory leaks.

Common causes of memory leaks in Java

1) Static variables

Many junior Java programmers do not fully understand what static means. This misunderstanding is perhaps the most common cause of unintended memory leaks. To understand static variables, think about the scope of a variable: where does the variable live? If you declare a variable inside a method, it is a method scoped variable. It only exists while that method is running, and the object it refers to becomes eligible for garbage collection when the method exits, as long as nothing else still references it. If you declare a non-static variable inside a class definition, then it is OBJECT scoped. It is NOT CLASS SCOPED. Remember that a class is a blueprint for an object, and an instance is completely different from the class it was built from. An object scoped variable becomes unreferenced and ready for garbage collection as soon as all references to its object are gone. When you null out the only reference to an object, that object is properly unreferenced.

A STATIC VARIABLE is CLASS SCOPED. A static variable lives on the blueprint of an object, not on the object itself. Classes in Java are loaded when they are first used and are normally not unloaded until the class loader that loaded them is garbage collected, which for most applications means not before you shut down the JVM. If you declare a variable as static, whatever it references will live for the entire lifetime of the JVM, unless the variable is explicitly nulled out or reassigned. If a static variable references non-static objects, it effectively makes those objects static as well. You can clear out a static collection, but depending on how the collection is cleared you may or may not be freeing up memory. Be careful. If you don't understand why something needs to be static, ask someone who does. Static is a surprisingly sticky subject and many career programmers still don't fully understand what it means.
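
Here is a minimal sketch of how a static collection can pin objects on the heap. The RequestLog class and its method names are made up for illustration and do not come from any real library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class for illustration only.
final class RequestLog {
    // Class scoped: stays reachable for as long as RequestLog stays loaded.
    private static final List<byte[]> HISTORY = new ArrayList<>();

    // Every call adds one more buffer that the static list keeps alive.
    static void record(byte[] payload) {
        HISTORY.add(payload);
    }

    static int retained() {
        return HISTORY.size();
    }

    // Dropping the references is what makes the buffers collectable again.
    static void reset() {
        HISTORY.clear();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            record(new byte[1024]); // none of these can be collected yet
        }
        System.out.println("retained buffers: " + retained());
        reset();
        System.out.println("retained after reset: " + retained());
    }
}
```

Note that ArrayList.clear() drops the element references but keeps the backing array at the size it grew to, which is one way a "cleared" static collection can still hold on to memory.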

2) Thread local variables

If you consume libraries from another party (Apache, an internal group, wherever), be warned about these buggers. Java programmers who want to appear more clever than they are sometimes use thread local variables to cache information on a thread, usually to speed up processing in an opaque way.

In the way that variables can be scoped to a method or an object or a class, thread local variables are scoped to a thread. Threads in Java are not always easy to trace and often stick around for the lifetime of the application.

If, say, an XML parser decides to put a ton of cached objects in a thread local variable and forgets to clear its cache when it is done, you will have that now useless cache taking up space in your heap for the lifetime of that thread. For more about discovering thread local variables, please see my previous post.
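
As a rough sketch of that scenario, assuming Java 8 or later: the ParserCache class below is hypothetical and only stands in for the kind of per-thread cache a library might keep. The part that matters is the call to ThreadLocal.remove() in a finally block once the work is done.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-thread cache, for illustration only.
final class ParserCache {
    // One map per thread; it lives as long as the thread unless removed.
    private static final ThreadLocal<Map<String, String>> CACHE =
            ThreadLocal.withInitial(HashMap::new);

    // Pretend "parsing" is just trimming; results are cached on the current thread.
    static String parse(String input) {
        return CACHE.get().computeIfAbsent(input, String::trim);
    }

    static int cachedEntries() {
        return CACHE.get().size();
    }

    // Detaches this thread's map so it can be garbage collected.
    static void release() {
        CACHE.remove();
    }

    public static void main(String[] args) {
        try {
            parse(" a ");
            parse(" b ");
            System.out.println("cached on this thread: " + cachedEntries());
        } finally {
            release(); // without this, a pooled thread keeps the cache alive
        }
    }
}
```

This matters most on pooled threads (an application server's worker threads, for example), because those threads are reused and may never die.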

3) Poorly implemented data structures

Poorly implemented data structures can be just as damaging and confusing as either of the other two common issues. When you store objects in a plain array, you have to null out the array slots if you want the objects they held to become unreferenced. A seemingly simple thing like that is easily obscured and forgotten when implementing a complicated data structure, like a specialized tree or specialized hash table.
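
The array case is easiest to see in a hand-rolled stack. This is only a sketch with made-up names (SimpleStack, slotIsCleared); the line to look at is the one that nulls the slot in pop().

```java
import java.util.Arrays;

// Hypothetical array-backed stack, for illustration only.
final class SimpleStack {
    private Object[] elements = new Object[16];
    private int size;

    void push(Object item) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2); // grow when full
        }
        elements[size++] = item;
    }

    Object pop() {
        if (size == 0) {
            throw new IllegalStateException("stack is empty");
        }
        Object top = elements[--size];
        // Without this line the array still references the popped object,
        // so it stays reachable even though the stack no longer "contains" it.
        elements[size] = null;
        return top;
    }

    // Exposed only so the behaviour can be checked from outside.
    boolean slotIsCleared(int index) {
        return elements[index] == null;
    }
}
```

A caller that pushes an object and pops it again should find slot 0 cleared afterwards. Delete the nulling line and the stack behaves the same from the outside while quietly leaking every object it has ever held above its current size.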

The best thing to do in most scenarios is to encourage the use of standard data structures (usually from Sun or Apache) that have been used by many people, thoroughly tested, and debugged.

You will sometimes run into career developers who have spent too much time implementing their own vector class or their own tree structures to ever consider using the standard tools. In fact, many of these 'blinded by undeserved ego' types will not even know of the common classes developed a decade ago. If you find yourself debugging memory leaks in your co-workers' data structures too often, consider looking for a different development group to be a part of, or a different company. It's good to have a grounded understanding of how all the major data structures work. It's bad to assume that everything you do will somehow be magically better than the work of other (almost always smarter) professionals.

Those are the most common causes of memory leaks I've seen in my career. I've caused some and fixed some. In my next post I'm going to chronicle using the new JDK's exciting version of jvisualvm. The included heap analysis tools have come a long way since jhat, and are now even comparable, in some ways, to expensive products such as JProfiler. Cool beans.



Wednesday, February 9, 2011

Inspecting thread local variables in Java: an example

I ran into an issue with some Thread Local Variables recently and came up with a quick way to actually see the buggers.

Using reflection you can get a good idea of the TLVs on your current thread.

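package threadLocal;

import java.lang.reflect.Field;

// Relies on private JDK internals (Thread.threadLocals, ThreadLocalMap.table, Entry.value).
// On JDK 16 and later, run with: --add-opens java.base/java.lang=ALL-UNNAMED
public class sandbox {

    public static void main(String[] args) throws Exception {

        // Thread.threadLocals holds the current thread's ThreadLocalMap
        Field field1 = Thread.class.getDeclaredField("threadLocals");
        field1.setAccessible(true);
        Object o1 = field1.get(Thread.currentThread());
        if (o1 == null) {
            // The map is created lazily, so no TLVs have been set on this thread yet
            return;
        }
        // ThreadLocalMap.table is the array of entries (the buckets)
        Field field2 = o1.getClass().getDeclaredField("table");
        field2.setAccessible(true);
        Object[] o2 = (Object[]) field2.get(o1);
        for (Object temp : o2) {
            if (temp != null) {
                // Each entry keeps the thread local's value in its "value" field
                Field field3 = temp.getClass().getDeclaredField("value");
                field3.setAccessible(true);
                Object o3 = field3.get(temp);
                System.out.println(o3);
            }
        }
    }
}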
This will print out all the TLVs on your current thread. You can also set them to null or whatever you want with a little more reflection, particularly by nulling out all the buckets in the Object[] o2. Thread Local Variables are rarely used responsibly in Java programming. I advise against using them unless you have a rock solid reason to (and even when you think you do, you probably don't). They are often forgotten about and lead to many unintended memory leaks down the line.