Element data/storage

One of the more exciting hidden features of jQuery is its ability to bind data to any DOM element. What’s special about this is that it’s almost completely unobtrusive – the data is not actually bound to the DOM element at all. A numerical pointer is added to the element as an "expando" property; this number points to that element’s assigned spot within jQuery’s cache. The cache is just a regular object and each element’s data is added as a nested object.

MooTools also offers its own variation of element storage. It uses the same method of creating a numerical pointer to a global hash table. According to this blog post, the Prototype library will also be implementing it soon.

// jQuery:
    // Set data:
    $(elem).data('customProperty', 12345);
    // Get data:
    $(elem).data('customProperty');
 
// MooTools:
    // Set data:
    elem.store('customProperty', 12345);
    // Get data:
    elem.retrieve('customProperty');

The point?

Before all this fancy element storage stuff came along, the widely accepted method of adding data to elements was to literally add each piece of data as a separate expando property. Not only was this obtrusive, but adding a lot of data could massively slow down DOM manipulation (or so I’ve heard). The new method makes much more sense and is apparently much faster.
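To make the contrast concrete, here is a rough sketch of both approaches side by side. It uses plain objects as stand-ins for DOM elements so it runs anywhere, and the names tooltipText, isActive and dataIndex are made up purely for illustration:

```javascript
// A plain object stands in for a DOM element so the sketch runs anywhere
var oldStyleElem = {};

// Old approach: each piece of data is its own expando property
oldStyleElem.customProperty = 12345;
oldStyleElem.tooltipText = 'Hello';   // hypothetical extra data
oldStyleElem.isActive = true;         // hypothetical extra data

// Cache approach: one numeric expando, everything else lives in the cache
var cache = [0];
var newStyleElem = {};
newStyleElem.dataIndex = cache.length;   // hypothetical expando name
cache[newStyleElem.dataIndex] = {
    customProperty: 12345,
    tooltipText: 'Hello',
    isActive: true
};

// Count how many properties each "element" ended up with
var oldCount = Object.keys(oldStyleElem).length;
var newCount = Object.keys(newStyleElem).length;
```

However much data you store, the element in the second approach only ever carries the one pointer property.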

Creating your own

Implementing your own element storage mechanism is very simple. As described above, it’s just a case of creating a cache, adding to it and retrieving from it. Here’s a basic attempt:

// WITH ENCAPSULATION:
(function(){
 
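    // Index 0 is a placeholder so that real cache indexes start at 1
    var cache = [0],
        // Timestamped property name, unlikely to clash with existing properties
        expando = 'data' + +new Date();
 
    function data(elem) {
 
        var cacheIndex = elem[expando],
            nextCacheIndex = cache.length;
 
        // First call for this element: assign it the next free slot
        if(!cacheIndex) {
            cacheIndex = elem[expando] = nextCacheIndex;
            cache[cacheIndex] = {};
        }
 
        return {
            get : function(key) {
                return cache[cacheIndex][key];
            },
            set : function(key, val) {
                cache[cacheIndex][key] = val;
                return val;
            }
        };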
 
    }
 
    window.data = data;
 
})();

Everything is wrapped inside a closure so that the global namespace is not polluted. The cache is only accessible to functions within the same scope. The data function is exposed at the end of the closure so you have access to it. If you’re wondering why the cache is initialized with one value (0), it’s so that no element is ever assigned index 0. Since 0 is a falsey value, the !cacheIndex check could not tell an element stored at index 0 apart from one with no expando property at all, which would make checking for a nonexistent property a tad harder.
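If that falsey-zero point seems abstract, this little sketch shows what goes wrong when the cache starts out empty. The makeData helper is hypothetical (it just lets the starting cache be swapped), and plain objects stand in for DOM elements:

```javascript
// Hypothetical helper: builds a data() function over a given starting cache
function makeData(cache) {
    var expando = 'data' + +new Date();
    return function data(elem) {
        var cacheIndex = elem[expando];
        if (!cacheIndex) {
            cacheIndex = elem[expando] = cache.length;
            cache[cacheIndex] = {};
        }
        return cache[cacheIndex];
    };
}

// Plain objects standing in for DOM elements
var elemA = {},
    elemB = {};

// Empty cache: the first element gets index 0, which later looks like "no index"
var brokenData = makeData([]);
brokenData(elemA).customProperty = 12345;
var lostValue = brokenData(elemA).customProperty; // undefined, a fresh slot was assigned

// Cache seeded with [0]: the first element gets index 1, which is truthy
var workingData = makeData([0]);
workingData(elemB).customProperty = 12345;
var keptValue = workingData(elemB).customProperty; // 12345
```

With the empty cache, the second call sees an index of 0, treats the element as new and hands back an empty object, so the stored value is silently lost.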

Usage is quite simple:

var myElem = document.getElementById('id');
 
// Setting data:
data(myElem).set('customProperty', 12345);
 
// Retrieving data:
data(myElem).get('customProperty'); // returns 12345

If you’re not a fan of encapsulation and would prefer more direct access to each element’s dataset then this probably suits you more:

// WITHOUT ENCAPSULATION:
(function(){
 
    var cache = [0],
        expando = 'data' + +new Date();
 
    function data(elem) {
 
        var cacheIndex = elem[expando],
            nextCacheIndex = cache.length;
 
        if(!cacheIndex) {
            cacheIndex = elem[expando] = nextCacheIndex;
            cache[cacheIndex] = {};
        }
 
        return cache[cacheIndex];
 
    }
 
    window.data = data;
 
})();

It works on exactly the same principle. The only difference is that you have direct access to each element’s data object, so you can read and write its properties using normal dot notation (like any object):

var myElem = document.getElementById('id');
 
// Setting data:
data(myElem).customProperty = 12345;
 
// Retrieving data:
data(myElem).customProperty; // returns 12345
 
// The data can be ANYTHING! -
data(myElem).info = {
    a : 123,
    b : [4,5,6]
};

For me, the second version (without encapsulation) is a little nicer, not only because it’s shorter but also because I have direct access. For something like this there’s no sense in having ‘get’ and ‘set’ methods – it’s just syntactic sugar.

Experimenting…

I was thinking about possible different methods of achieving this without the need for even one expando property – so the DOM element in question does not change at all.

You could have the cache’s indexes correspond to an element’s position within the DOM but this is unreliable because there is always the possibility that the DOM may change. I then tried to think of some way to store a reference to each element and then look it up when needed but that really didn’t work out. Another way would be to identify certain unique properties of each element and use them as pointers – but it’s likely that these unique properties will also be susceptible to change.
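For the curious, the "store a reference and look it up" idea might look something like the sketch below. This is only my rough guess at how it could be written: dataByRef is a made-up name and plain objects stand in for DOM elements. The elements stay untouched, but every lookup is a linear scan through the list, and the list holds on to every element it has ever seen, so I wouldn’t call it practical:

```javascript
// Parallel arrays: elems[i] is an element, datasets[i] is its data object
var elems = [],
    datasets = [];

// Hypothetical function name, for illustration only
function dataByRef(elem) {

    // Linear search for an element we have seen before
    var i = elems.length;
    while (i--) {
        if (elems[i] === elem) {
            return datasets[i];
        }
    }

    // Not found: remember the element and give it an empty data object
    elems.push(elem);
    datasets.push({});
    return datasets[datasets.length - 1];

}

// Plain objects standing in for DOM elements
var firstElem = {},
    secondElem = {};

dataByRef(firstElem).customProperty = 12345;
dataByRef(secondElem).customProperty = 'something else';

var firstValue = dataByRef(firstElem).customProperty; // 12345
var addedProps = Object.keys(firstElem).length;       // 0, the element is unchanged
```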

So, it seems there is no way to avoid the use of at least one expando property per element…
