Deep copying of Objects and Arrays

In my last post, “Really easy speed-testing”, I discussed the fastest method of duplicating an array. However, I mentioned that copying an array that contains complex types is an entirely different matter, requiring a different approach.

By “complex types” I mean things that are accessed through pointers, as opposed to primitive types such as numbers and strings, which are not. When a primitive value is passed to a function it’s copied, so the function has no link back to the original. Complex types include Arrays, Objects and Functions; these are all objects.

In my previous post I mistakenly mentioned “pass-by-reference” which JavaScript doesn’t actually support. Complex types like literal objects and arrays are not “passed-by-reference”; their pointers are passed by value and those pointers point to the respective object; this will hopefully become clearer if you carry on reading.
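Here is a minimal sketch of what “pointers passed by value” means in practice. The function names reassign and mutate, and the variable original, are made up purely for illustration.

```javascript
// Hypothetical helpers, named only for this illustration.
function reassign(list) {
    // Rebinds the local variable only; the caller's variable is untouched.
    list = ['new'];
}

function mutate(list) {
    // Follows the pointer and changes the shared object itself.
    list[0] = 'changed';
}

var original = [1, 2, 3];

reassign(original);
// original is still [1, 2, 3]

mutate(original);
// original is now ['changed', 2, 3]
```

If arrays were truly passed by reference, reassign would have replaced the caller’s array as well; it doesn’t, because only a copy of the pointer is handed to the function.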

Say, for example, I have an array of arrays; something like this:

var arr1 = [ [1,2,3] , [1,2,3] , [1,2,3] ];

There are a total of four Array objects there, each of which contains values; some primitive and some not. In this situation arr1 is merely a pointer to an object which exists somewhere in memory. So, one object exists, but it can have multiple pointers; here’s another “pointer”:

var arr2 = arr1;

arr2 now points to the object, not to arr1 itself but the object that it refers to. So we have two variables pointing to the same thing; this is not possible with primitive types. For example:

var X = 5;
var Y = X;

Changing the value held by either Y or X will not have any effect on the other – there are no pointers involved. If we go back to our array example, we can try changing a value within the array:

// Continuing from where we left off (above)
 
arr2[0] = 'replaced';
// arr2 = [ 'replaced' , [1,2,3] , [1,2,3] ];
// arr1 = [ 'replaced' , [1,2,3] , [1,2,3] ];
// arr2 === arr1; // This is TRUE

Because both arr1 and arr2 point to the same object the above results should make perfect sense. When you do something to arr2[0] you’re also doing something to arr1[0].

Copying the array

So, if we want to create a copy of arr1 we can’t just give it another pointer; we have to create a new array and fill it with the contents of the first array. This sounds quite simple, and it is; have a look:

var theCopy = []; // A new empty array
for (var i = 0, len = arr1.length; i < len; i++) {
    theCopy[i] = arr1[i];
}
 
// "theCopy" = [ 'replaced' , [1,2,3] , [1,2,3] ]

So, we’ve created a new object called theCopy. The first value within this array is the string, ‘replaced’, and this is a totally new string; a copy of the original arr1[0]. But, within theCopy we also have two other values: two more arrays, both of which are objects and exist somewhere in memory (only accessible by pointers). When we attempted to copy arr1 we didn’t successfully copy those two inner arrays; all we did was create two more pointers: theCopy[1] and theCopy[2] now point to the same objects as arr1[1] and arr1[2]. This is not what we want.
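The problem can be checked directly. This is a small sketch using the same kind of one-level loop; the variable names (source, shallow, sameOuter, sameInner, leaked) are assumptions chosen for the example.

```javascript
var source = [ 'text', [1, 2, 3], [1, 2, 3] ];

// Shallow copy: a plain loop that only goes one level deep.
var shallow = [];
for (var i = 0, len = source.length; i < len; i++) {
    shallow[i] = source[i];
}

var sameOuter = (shallow === source);       // false: the outer array is new
var sameInner = (shallow[1] === source[1]); // true: the inner array is shared

// Changing the inner array through the copy also changes the original.
shallow[1][0] = 'changed';
var leaked = source[1][0]; // 'changed'
```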

So, to successfully copy the entire array we’ll have to create new arrays for each of the inner arrays and copy them in the same manner; by looping through all values. This can become a very expensive process if you continue looping through each array or object as it appears. But unfortunately, this is the reality of it; there’s no other way. We can avoid repeating ourselves by using recursion, but the same looping still happens behind the facade of repeated function calls.

function deepCopy(obj) {
    var out, i, len;
    // Arrays: build a new array and deep-copy each element
    if (Object.prototype.toString.call(obj) === '[object Array]') {
        out = [];
        for (i = 0, len = obj.length; i < len; i++) {
            out[i] = deepCopy(obj[i]);
        }
        return out;
    }
    // Other objects: copy own properties; null is excluded
    // because typeof null is also 'object'
    if (obj !== null && typeof obj === 'object') {
        out = {};
        for (i in obj) {
            if (Object.prototype.hasOwnProperty.call(obj, i)) {
                out[i] = deepCopy(obj[i]);
            }
        }
        return out;
    }
    // Primitives, null and functions are returned as they are
    return obj;
}

The first if statement simply checks whether the passed obj is an array, and if it is, the block commences: a new array is created and filled with values from the original array. Before each assignment the value is first passed back to deepCopy itself (this is what recursion is). If the value passed to deepCopy is a primitive type it won’t pass either of the if statements and so it will simply be returned by the final return statement. The same goes for null, which the second check excludes because typeof null is 'object', and for functions, whose typeof is 'function'. If, however, it is an object or an array, then the respective if block executes, thus copying the object; for plain objects only the object’s own properties are copied. Any sub-arrays or sub-objects (or sub-sub-sub-objects) will be treated in the same manner, which is why it’s called a “deep copy”.
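The two checks in deepCopy depend on how JavaScript reports types, so it may help to see those results side by side. The variable names below are invented for this sketch; the values in the comments are what the language specifies.

```javascript
var toStr = Object.prototype.toString;

// Object.prototype.toString can tell arrays and plain objects apart...
var arrayTag  = toStr.call([1, 2, 3]); // '[object Array]'
var objectTag = toStr.call({ a: 1 });  // '[object Object]'

// ...whereas typeof reports 'object' for both, and for null too.
var typeOfArray  = typeof [1, 2, 3];   // 'object'
var typeOfObject = typeof { a: 1 };    // 'object'
var typeOfNull   = typeof null;        // 'object'

// Primitives and functions never report 'object'.
var typeOfNumber   = typeof 5;               // 'number'
var typeOfString   = typeof 'text';          // 'string'
var typeOfFunction = typeof function () {};  // 'function'
```

This is why the array test has to come before the typeof test: an array would also satisfy typeof obj === 'object' and end up copied as a plain object. It also shows that null satisfies that test, which is a case any such check needs to watch for.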

The point?

So, why would you want to copy an array or object; what’s the point!?

Whenever you want to manipulate the data held in a data-set without affecting the original data-set you’ll first want to make a copy of it. This is a common requirement; many jQuery plugins, for example, copy a defaults object and merge the user-defined settings into that copy. There are other situations where copying objects is a requirement but they’re not all that common. This post was just to show you the commonly overlooked details associated with copying objects and arrays in JavaScript; it’s useful to know…
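As a rough sketch of the settings-merging case, without jQuery: the helper extendDeep and the defaults and userOptions objects below are hypothetical, written for this example only, and are not how any particular library implements it.

```javascript
// Hypothetical helper, not jQuery's implementation: deep-copies every own
// property of source into target and returns target.
function extendDeep(target, source) {
    for (var key in source) {
        if (!Object.prototype.hasOwnProperty.call(source, key)) {
            continue;
        }
        var value = source[key];
        if (value !== null && typeof value === 'object') {
            var isArray = Object.prototype.toString.call(value) === '[object Array]';
            var existing = target[key];
            var canMerge = !isArray && existing !== null &&
                typeof existing === 'object' &&
                Object.prototype.toString.call(existing) !== '[object Array]';
            // Merge into an existing nested object, otherwise start fresh.
            target[key] = extendDeep(canMerge ? existing : (isArray ? [] : {}), value);
        } else {
            target[key] = value;
        }
    }
    return target;
}

var defaults = { speed: 400, colors: { text: 'black', background: 'white' } };
var userOptions = { colors: { text: 'red' } };

// Copy the defaults into a new object first, then lay the user's options on top.
var settings = extendDeep(extendDeep({}, defaults), userOptions);

// settings.colors.text is 'red' and settings.colors.background is 'white',
// while defaults.colors.text is still 'black'.
```

Because the defaults are deep-copied before anything is merged in, the nested colors object in settings is a different object from the one in defaults, so the user’s options never leak back into the defaults.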

Thanks for reading! Please share your thoughts with me on Twitter. Have a great day!