Android developers must be observing Lint warnings recently to replace some of their HashMaps with SparseArrays with a promise of memory optimization. Good for us ! There are few classes we should learn to use, like ArrayMap and SimpleArrayMap. There are also multiple variants of SparseArrays. This post will describe these classes along with their internals.
Let’s start with some code showing how to use these classes
java.util.HashMap<String, String> hashMap = new java.util.HashMap<String, String>(16); hashMap.put("key", "value"); hashMap.get("key"); hashMap.entrySet().iterator(); android.util.ArrayMap<String, String> arrayMap = new android.util.ArrayMap<String, String>(16); arrayMap.put("key", "value"); arrayMap.get("key"); arrayMap.entrySet().iterator(); android.support.v4.util.ArrayMap<String, String> supportArrayMap = new android.support.v4.util.ArrayMap<String, String>(16); supportArrayMap.put("key", "value"); supportArrayMap.get("key"); supportArrayMap.entrySet().iterator(); android.support.v4.util.SimpleArrayMap<String, String> simpleArrayMap = new android.support.v4.util.SimpleArrayMap<String, String>(16); simpleArrayMap.put("key", "value"); simpleArrayMap.get("key"); //simpleArrayMap.entrySet().iterator(); <- will not compile android.util.SparseArray<String> sparseArray = new android.util.SparseArray<String>(16); sparseArray.put(10, "value"); sparseArray.get(10); android.util.LongSparseArray<String> longSparseArray = new android.util.LongSparseArray<String>(16); longSparseArray.put(10L, "value"); longSparseArray.get(10L); android.util.SparseLongArray sparseLongArray = new android.util.SparseLongArray(16); sparseLongArray.put(10, 100L); sparseLongArray.get(10);
Let’s discuss these classes one by one. All Java collections are primarily based on Arrays. We need to understand how HashMaps work before we look into alternatives.
HashMap is basically an Array of HashMap.Entry objects (Entry is an inner class of HashMap). On a high-level, the instance variables in Entry class are :
- A non-primitive key
- A non-primitive value
- Hashcode of the object
- A pointer to next Entry
Note that keys and values are all non-primitives. This is a design decision made by Java engineers, and we have to live with it. Inserting a primitive comes at a cost of autoboxing.
When an object is inserted in the HashMap :
- Hashcode of the key is calculated and Entry class’ hashCode variable is populated.
- Another method, java.util.HashMap.indexFor() is applied on the hashCode which you can think of as a modulo function using the size of Entry[ ], and determines the index of this Entry in the Entry[ ]. This index is called ‘bucket’.
- If there is a pre-existing element in this bucket, the new element is inserted with the last element pointing to new one – essentially making the bucket a LinkedList.
Queries can now be done with O(1) complexity :
- Input key’s hashCode is calculated
- java.util.HashMap.indexFor() is applied on this hashCode and we get the bucket/index of the Entry object like querying an array.
O(1) time is a magic all developers want. But space is another constraint. Especially on mobile. Drawbacks of HashMaps are :
- Autoboxing means extra objects created with each insertion. This will impact memory usage as well as Garbage Collection.
- The HashMap.Entry objects themselves are an extra layer of objects to be created and garbage collected.
- Buckets are rearranged each time HashMap is compacted or expanded. This is an expensive operation which grows with number of objects.
- Hashing is cool, but if implemented poorly will take us back to O(N).
- A related disadvantage of hashing which most people ignore is that we still need to store both the key and the hashcode. This redundancy helps with tackling collision. Non-hash solutions can help in this regard too.
ArrayMap uses 2 arrays. The instance variables used internally are Object[ ] mArray to store the objects and the int mHashes to store hashCodes. When an key/value is inserted :
- Key/Value is autoboxed.
- Key object is inserted at the next available position in mArray[ ].
- Value object is also inserted in the position next to key’s position in mArray[ ].
- The hashCode of key is calculated and placed in mHashes[ ] at the next available position.
For searching a key :
- Key’s hashCode is calculated
- Binary search is done for this hashCode in the mHashes array. This implies time complexity increases to O(logN).
- Once we get the index of hash, we know that key is at 2*index position in mArray and value is at keyIndex+1 position.
This still does not solve the problem of autoboxing, as put() still takes an Object as input. But it does create one less object (HashMap.Entry). Is it worth trading off O(1) search complexity ? Metrics say yes for most apps.
android.util.ArrayMap works only on API level 19 (Kitkat) onwards. Support library brings the same functionality to older platforms.
As you would have noticed in the code snippet posted earlier (line#21), this class does not offer entrySet() method to iterate. If you go through it’s documentation, it trims off many other methods from standard Java Collections API. So why use SimpleArrayMap ? Use it to trim down your APK size at the cost of losing interoperability with other Java containers. This way, Proguard (code optimization and obfuscation tool, which is likely to be a part of your build generation) can trim off most of those unused Collections API code – and hence making your APK size smaller. The internal working of this class is same as android.util.ArrayMap.
Like ArrayMaps, SparseArrays also use 2 arrays at their core. One is int[ ] called mKeys and the second is Object[ ] called mValues. As the names suggest, one is for keys and another for values 🙂
When a key/value is inserted :
- The int key (and not it’s hash) is stored in the next available position in mKeys[ ]. So no autoboxing of the key anymore.
- The value Object is stored in the next available position in mValues[ ]. Value is still autoboxed.
For a query :
- The key is searched using binary search (refer to android.util.ContainerHelpers.binarySearch() method) in the mKeys array. This means, search complexity is still O(log N).
- The key index is used to retrieve the value from the mValues array.
Compared to HashMap, we got rid of Entry object and the key object. We have given up hashing, and are relying on binary search. On compaction/expansion, there is a lighter overhead now.
Pre-Kitkat (API level >= 19) use android.support.v4.util.SparseArrayCompat
SparseArray accepts only int primitives as keys. With LongSparseArray, we can use long as keys too. Implementation is same as SparseArray.
Pre-Kitkat (API level >= 19) use android.support.v4.util.LongSparseArray
android.util.SparseIntArray, android.util.SparseLongArray and android.util.SparseBooleanArray
For cases where keys are integer primitives and values are either integer, long or boolean primitives; use SparseIntArray, SparseLongArray and SparseBooleanArray respectively. Their implementation is same as SparseArrays, the advantage being the mValues array is a primitive array. Which means neither key, nor value is boxed, and we save 3 objects (Entry, Key and Value) compared to HashMap implementation, while losing search complexity from O(1) to O(log N).
Using the SpareArray and ArrayMap implementations will certainly reduce the number of objects creation. Performance difference should not be significant (less than 50%) for collections with hundreds of items. In summary, it’s good to migrate to ArrayMap and SparseArray implementations for new code; and back-porting should be easy as API signatures match.
N.B : Though they sound like arrays, SparseArrays and ArrayMaps do not guarantee preserving the order of insertion. Be mindful while iterating.