2001-03-28 The Java Specialists' Newsletter [Issue 015] - Implementing a
SoftReference based HashMap
Author: Dr. Heinz M. Kabutz
You can subscribe from our home page:
http://www.javaspecialists.co.za (which also hosts all previous issues,
available free of charge
Welcome to the 15th issue of "The Java(tm) Specialists' Newsletter", this
week we are looking at an extremely useful addition to the Java language, namely
soft and weak references. In a nutshell, the main difference between the two
seems to be that the SoftReference has some notion of remembering when last it
was accessed, which may be used by the GC (if it feels like it) to release
little-used SoftReferences first.
Next week I will be doing some training in Germany, so I will not be able to
send out a newsletter, as I have not decided on a list server yet. Hopefully the
week after that you'll have your normal fix again. Please continue forwarding
this newsletter to your local JUG, friends and family who are interested in
Java. If you do send the newsletter to a closed user list, kindly tell me, I
like having an indication of how many souls are reading my newsletter.
Similarly, if you quote from my newsletter, be so kind as to attribute the
source.
Many thanks to Sydney Redelinghuys for his input and corrections in this
newsletter. (Sydney wants to spend a bit of time working in London, so if you
are in the London area and would like the services of one of the most brilliant
Java programmers I know, please send Sydney an
email.
Once again I have seen the value in actually testing my code, as there was a
serious error in my code which I only found through my "unit" test.
Implementing a SoftReference based HashMap
Java is slow. Java is a memory hog.
But, if I have to write a network application, I would not hesitate for a
second to use Java. Why? Because Java shines in threaded, network-based
applications and because over the years, I have recognised the weaknesses of
Java and have learnt (the hard way) what I should do, or not do, to avoid
running into too serious problems. Recognising the problems takes a while to
learn, which is why most Java jobs require you to have at least 2 years of Java
programming experience.
One of the most common places of memory wastage is in hash maps, so SUN have
provided a WeakHashMap to minimize memory usage in caches implemented using
maps. A WeakHashMap stores the keys using WeakReference objects, which means in
layman's terms that as soon as the key is not referenced from somewhere else in
your program, the entry may be removed and is available for garbage collection.
(Have a look at the JavaDocs for the java.util.WeakHashMap and
java.lang.ref.WeakReference for more information) Say you write a middle tier
that provides an OO view of your database via an application server. What you
really want is for the values to be automatically released, rather than the
keys. If you put all the objects into a normal HashMap, you can easily run out
of memory when you access many different objects from different clients. But if
you store the objects in a WeakHashMap they are cleared as soon as your clients
is not referencing them anymore. What you really want, however, is to only have
the objects released when the VM is running low on memory, since that is when
you get problems.
Enter SoftReferences. As far as I understand, a SoftReference will only be
garbage collected when the VM is running low on memory and the object it is
pointing to is not accessed from a normal (hard) reference. This is probably a
better option to use for the HashMap values than a WeakReference, but the
default JDK collections don't include a GC-friendly HashMap based on values and
neither does it provide a SoftReference based HashMap.
Before I show you a SoftHashMap implementation, based on ideas by Sydney
(what's up doc?) Redelinghuys, I need to explain some ideas which will make our
SoftHashMap more optimal.
- Each time we change the map (put, remove, clear) or ask for its size, we
first want to go through the map and throw away entries for which the values
have been garbage collected. It is quite easy to find out which soft
references have been cleared. You can give the SoftReference a ReferenceQueue
to which it is added when it is garbage collected.
- We don't want our hash map to be bullied by the garbage collector, so we
provide the option of the map itself keeping a hard link to the last couple of
objects (typically 100).
- The SoftHashMap will use a variant of the Decorator pattern to add this
functionality to an internally kept java.util.HashMap. I'm busy working on a
Design Patterns course based on the GOF book, let me know if you want further
information.
Without further ado, here comes the SoftHashMap:
//: SoftHashMap.java
import java.util.*;
import java.lang.ref.*;
public class SoftHashMap extends AbstractMap {
/** The internal HashMap that will hold the SoftReference. */
private final Map hash = new HashMap();
/** The number of "hard" references to hold internally. */
private final int HARD_SIZE;
/** The FIFO list of hard references, order of last access. */
private final LinkedList hardCache = new LinkedList();
/** Reference queue for cleared SoftReference objects. */
private final ReferenceQueue queue = new ReferenceQueue();
public SoftHashMap() { this(100); }
public SoftHashMap(int hardSize) { HARD_SIZE = hardSize; }
public Object get(Object key) {
Object result = null;
// We get the SoftReference represented by that key
SoftReference soft_ref = (SoftReference)hash.get(key);
if (soft_ref != null) {
// From the SoftReference we get the value, which can be
// null if it was not in the map, or it was removed in
// the processQueue() method defined below
result = soft_ref.get();
if (result == null) {
// If the value has been garbage collected, remove the
// entry from the HashMap.
hash.remove(key);
} else {
// We now add this object to the beginning of the hard
// reference queue. One reference can occur more than
// once, because lookups of the FIFO queue are slow, so
// we don't want to search through it each time to remove
// duplicates.
hardCache.addFirst(result);
if (hardCache.size() > HARD_SIZE) {
// Remove the last entry if list longer than HARD_SIZE
hardCache.removeLast();
}
}
}
return result;
}
/** We define our own subclass of SoftReference which contains
not only the value but also the key to make it easier to find
the entry in the HashMap after it's been garbage collected. */
private static class SoftValue extends SoftReference {
private final Object key; // always make data member final
/** Did you know that an outer class can access private data
members and methods of an inner class? I didn't know that!
I thought it was only the inner class who could access the
outer class's private information. An outer class can also
access private members of an inner class inside its inner
class. */
private SoftValue(Object k, Object key, ReferenceQueue q) {
super(k, q);
this.key = key;
}
}
/** Here we go through the ReferenceQueue and remove garbage
collected SoftValue objects from the HashMap by looking them
up using the SoftValue.key data member. */
private void processQueue() {
SoftValue sv;
while ((sv = (SoftValue)queue.poll()) != null) {
hash.remove(sv.key); // we can access private data!
}
}
/** Here we put the key, value pair into the HashMap using
a SoftValue object. */
public Object put(Object key, Object value) {
processQueue(); // throw out garbage collected values first
return hash.put(key, new SoftValue(value, key, queue));
}
public Object remove(Object key) {
processQueue(); // throw out garbage collected values first
return hash.remove(key);
}
public void clear() {
hardCache.clear();
processQueue(); // throw out garbage collected values
hash.clear();
}
public int size() {
processQueue(); // throw out garbage collected values first
return hash.size();
}
public Set entrySet() {
// no, no, you may NOT do that!!! GRRR
throw new UnsupportedOperationException();
}
}
And here comes some test code that demonstrates to a certain degree that I'm
not talking complete bullsh*t. Soft and weak references are quite difficult to
experiment with as there is a lot of freedom left to the writers of the JVM of
how they must implement them. I wish the implementation would hold back longer
before removing these references, that the JVM would wait until it is really
running low on memory before clearing them, but unfortunately I am not the one
who wrote the JVM. I have tried to question the authors of the java.lang.ref
package to find out what the strategy is for references in future versions, but
have not had any response yet.
//: SoftHashMapTest.java
import java.lang.ref.*;
import java.util.*;
public class SoftHashMapTest {
private static void print(Map map) {
System.out.println("One=" + map.get("One"));
System.out.println("Two=" + map.get("Two"));
System.out.println("Three=" + map.get("Three"));
System.out.println("Four=" + map.get("Four"));
System.out.println("Five=" + map.get("Five"));
}
private static void testMap(Map map) {
System.out.println("Testing " + map.getClass());
map.put("One", new Integer(1));
map.put("Two", new Integer(2));
map.put("Three", new Integer(3));
map.put("Four", new Integer(4));
map.put("Five", new Integer(5));
print(map);
byte[] block = new byte[10*1024*1024]; // 10 MB
print(map);
}
public static void main(String[] args) {
testMap(new HashMap());
testMap(new SoftHashMap(2));
}
}
Until next week, and please remember to forward this newsletter and send me
your comments.
Heinz
Copyright 2000-2003 Maximum Solutions, South Africa
Reprint Rights. Copyright subsists in all the material
included in this email, but you may freely share the entire email with anyone
you feel may be interested, and you may reprint excerpts both online and offline
provided that you acknowledge the source as follows: This material from The
Java(tm) Specialists' Newsletter by Maximum Solutions (South Africa). Please
contact Maximum Solutions for
more information.
Java and Sun are trademarks or registered trademarks of Sun Microsystems,
Inc. in the United States and other countries. Maximum Solutions is independent
of Sun Microsystems, Inc.
11728 bytes more | 7 comments | | Story By Dr. Kabutz | Score: 5
|