Monday, February 8, 2010

ThreadLocal: history of performance improvements

From "Threading lightly, Part 3: Sometimes it's best not to share. Exploiting ThreadLocal to enhance scalability"

ThreadLocal performance
While the concept of a thread-local variable has been around for a long time and is supported by many threading frameworks, including the POSIX pthreads specification, thread-local support was omitted from the initial Java Threads design and only added in version 1.2 of the Java platform. In many ways, ThreadLocal is still a work in progress; it was rewritten for version 1.3 and again for version 1.4, both times to address performance problems.

In JDK 1.2, ThreadLocal was implemented in a manner very similar to Listing 2, except that a synchronized WeakHashMap was used to store the values instead of a HashMap. (Using WeakHashMap solves the problem of Thread objects not getting garbage collected, at some additional performance cost.) Needless to say, the performance of ThreadLocal was quite poor.
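To make the cost of that design concrete, here is a minimal sketch of a thread-local built on one shared, synchronized WeakHashMap keyed by Thread. It is not the actual JDK 1.2 source; the class name SynchronizedMapThreadLocal is hypothetical, and it uses generics, which did not exist at the time. Every get() and set() from every thread goes through the same lock.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical class: illustrates the shared-map approach described above,
// not the real JDK 1.2 implementation.
class SynchronizedMapThreadLocal<T> {
    // One map shared by all threads. Keys are held weakly, so an entry can be
    // collected once its Thread is no longer reachable. Every access locks.
    private final Map<Thread, T> values =
            Collections.synchronizedMap(new WeakHashMap<Thread, T>());

    public T get() {
        Thread current = Thread.currentThread();
        T value = values.get(current);
        // Only the current thread ever writes its own key, so checking and
        // then inserting in two separate locked calls is safe here.
        if (value == null && !values.containsKey(current)) {
            value = initialValue();
            values.put(current, value);
        }
        return value;
    }

    public void set(T value) {
        values.put(Thread.currentThread(), value);
    }

    // Subclasses may override this to supply a per-thread default.
    protected T initialValue() {
        return null;
    }
}
```

Because the map is shared, two threads that never touch each other's values still contend for the same monitor on every access.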

The version of ThreadLocal provided with version 1.3 of the Java platform is substantially better; it does not use any synchronization and so does not present a scalability problem, and it does not use weak references either. Instead, the Thread class was modified to support ThreadLocal by adding an instance variable to Thread that holds a HashMap mapping thread-local variables to their values for the current thread. Because the process of retrieving or setting a thread-local variable does not involve reading or writing data that might be read or written by another thread, you can implement ThreadLocal.get() and set() without any synchronization. Also, because the references to the per-thread values are stored in the owning Thread object, when the Thread gets garbage collected, so can its per-thread values.
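The per-thread map idea can be sketched as follows. Since application code cannot add a field to java.lang.Thread, this sketch assumes a hypothetical Thread subclass, LocalsThread, that carries the map; PerThreadVariable is likewise a made-up name, and the code is an illustration of the design rather than the JDK 1.3 source.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the modified Thread class: each thread owns a
// plain, unsynchronized HashMap of its thread-local values.
class LocalsThread extends Thread {
    final Map<PerThreadVariable<?>, Object> locals = new HashMap<>();

    LocalsThread(Runnable task) {
        super(task);
    }
}

// Hypothetical thread-local variable that stores its value in the current
// thread's own map, so no locking is needed.
class PerThreadVariable<T> {
    @SuppressWarnings("unchecked")
    public T get() {
        return (T) currentLocals().get(this);
    }

    public void set(T value) {
        currentLocals().put(this, value);
    }

    // Looks up the map owned by the calling thread. Only that thread ever
    // reads or writes it, which is why HashMap is safe without synchronization.
    private static Map<PerThreadVariable<?>, Object> currentLocals() {
        Thread current = Thread.currentThread();
        if (!(current instanceof LocalsThread)) {
            throw new IllegalStateException("sketch only works on LocalsThread");
        }
        return ((LocalsThread) current).locals;
    }
}
```

Each lookup still starts with Thread.currentThread(), and the map becomes unreachable together with its owning thread.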

Unfortunately, even with these improvements, the performance of ThreadLocal under Java 1.3 is still surprisingly slow. My rough benchmarks running the Sun 1.3 JDK on a two-processor Linux system show that a ThreadLocal.get() operation takes about twice as long as an uncontended synchronization. The reason for this poor performance is that the Thread.currentThread() method is quite expensive, accounting for more than two-thirds of the ThreadLocal.get() run time. Even with these weaknesses, the JDK 1.3 ThreadLocal.get() is still much faster than a contended synchronization, so if there is any significant chance of contention at all (perhaps there is a large number of threads, or the synchronized block is executed frequently, or the synchronized block is large), ThreadLocal may still be more efficient overall.
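The two alternatives being compared can be sketched side by side. This example assumes a modern JDK (ThreadLocal.withInitial arrived in Java 8, long after the versions discussed here), and the class DateFormatting and its method names are hypothetical. SimpleDateFormat is used only because it is a familiar class that is not safe to share between threads without locking.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper contrasting a shared, locked object with a
// per-thread copy held in a ThreadLocal.
class DateFormatting {
    // Option 1: one shared formatter, guarded by a lock on every call.
    private static final SimpleDateFormat SHARED = newFormat();

    // Option 2: one formatter per thread, created lazily on first use.
    private static final ThreadLocal<SimpleDateFormat> PER_THREAD =
            ThreadLocal.withInitial(DateFormatting::newFormat);

    private static SimpleDateFormat newFormat() {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format;
    }

    // Threads calling this at the same time contend for SHARED's monitor.
    static String formatSynchronized(Date date) {
        synchronized (SHARED) {
            return SHARED.format(date);
        }
    }

    // No lock: each thread only ever touches its own formatter.
    static String formatThreadLocal(Date date) {
        return PER_THREAD.get().format(date);
    }
}
```

Which variant is faster depends on the JDK version and on how often threads actually collide on the lock, so any choice between them is worth measuring.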

Under the newest version of the Java platform, version 1.4b2, performance of ThreadLocal and Thread.currentThread() has been improved significantly. With these new improvements, ThreadLocal should be faster than other techniques such as pooling. Because it is simpler and often less error-prone than those other techniques, it will eventually be discovered as an effective way to prevent undesired interactions between threads.
