Cross-domain requests with JSONP? Safe?

To Mr. Joe Average web developer, JSON may just seem like yet another pointless buzzword, but to the enquiring mind it is so much more!

For me, the opportunities JSON presents are much more appreciated when looking at JSONP, a slight variation which stands for ‘JavaScript Object Notation with Padding’. They both essentially refer to the same thing but, just to be clear, there is a difference:

// JSON (strict JSON requires double-quoted keys and strings):
{
    "name" : "john",
    "age" : 23,
    "hobbies" : ["football","cooking","rock climbing"],
    "temper" : "moody"
}
 
// JSONP:
newPerson({
    "name" : "john",
    "age" : 23,
    "hobbies" : ["football","cooking","rock climbing"],
    "temper" : "moody"
});

As you can see, the only difference is that the JSONP format wraps the object in a call to a callback function, passing the object as a parameter, whereas regular JSON is just the data on its own, written like a JavaScript object literal. The theory behind JSONP is that the callback name is specified by the requester (within the query string), not by the party which receives the request.

So, imagine you have a PHP script on one domain (www.foo.com) which outputs JSONP like that shown above. Now imagine that you link to that file with a script tag on your domain (www.bar.com):

<!-- www.bar.com -->
<script type="text/javascript" src="http://www.foo.com/jsonp.php"></script>

Assuming you have a function called ‘newPerson’ within that document then it will run and you can do what you like with that JSON data. So, you might have previously defined the newPerson function within the page:

function newPerson(o) {
    alert(o.name);
}

The above code will alert the name of the person defined in the JSONP data…

Okay, that’s pretty boring; all we’re doing is requesting some static script when our page loads and then firing a function! But, what if we were to create new script tags on the fly and then relay the data through some type of callback function which we have control over, and even better, what if we could specify the query string which is passed to the other domain’s JSON script? Doing this would create a reliable data interchange medium through which we can send and receive data, a bit like Ajax!
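Here is a minimal sketch of that idea, assuming the remote script reads the callback name from a `callback` query-string parameter. The names `buildJsonpUrl`, `jsonp` and `jsonpCounter` are made up for illustration and don't come from any library:

```javascript
// Hypothetical helper names; none of this comes from a library.
var jsonpCounter = 0;

// Build the request URL, adding our parameters and the callback name
// to the query string.
function buildJsonpUrl(baseUrl, params, callbackName) {
    var pairs = [];
    for (var key in params) {
        if (Object.prototype.hasOwnProperty.call(params, key)) {
            pairs.push(encodeURIComponent(key) + '=' + encodeURIComponent(params[key]));
        }
    }
    pairs.push('callback=' + encodeURIComponent(callbackName));
    return baseUrl + (baseUrl.indexOf('?') === -1 ? '?' : '&') + pairs.join('&');
}

// Browser only: register a one-off global callback, then inject a script tag.
function jsonp(baseUrl, params, onData) {
    var callbackName = 'jsonpCallback' + (jsonpCounter++);
    var script = document.createElement('script');

    window[callbackName] = function (data) {
        // Clean up so repeated requests don't pile up globals or script tags.
        delete window[callbackName];
        script.parentNode.removeChild(script);
        onData(data);
    };

    script.src = buildJsonpUrl(baseUrl, params, callbackName);
    document.getElementsByTagName('head')[0].appendChild(script);
}
```

Calling `jsonp('http://www.foo.com/jsonp.php', { id: 5 }, function (person) { alert(person.name); });` would then request the script with `?id=5&callback=jsonpCallback0` and alert the name once the response arrives (the `id` parameter is just an example; the script on foo.com would have to support it).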

The purpose of JSONP is to overcome the boundaries of the infamous same-origin policy, which restricts XHR requests to the same domain, meaning that you cannot make Ajax requests to other domains. There’s no need to worry about that with JSONP because it doesn’t even require Ajax to work; all it’s doing is using script tags and callbacks…

Problems / solutions:

Unfortunately JSONP has some obvious drawbacks, one of them being security! The real question is, do you trust that other domain? Since the other domain is essentially being given FULL access to your page (via JavaScript) you should be absolutely sure that you trust them. If they wanted they could quite easily vandalize your page, redirect the user, or worse: steal your cookies (which leads to a whole bunch of potential problems).

There are, of course, other ways to emulate the potential power of cross-domain XHR: creating a server-side proxy on your domain would mean you have complete access to any domain/page of your choice (not necessarily JSON/P), and then you can use same-domain (Duh!) Ajax requests to access that proxy.

A very simple version can be created quite easily:

PHP proxy:

// Get content of file specified in URL parameter:
echo file_get_contents($_GET['url']);
 
// e.g. thisScript.php?url=http%3A%2F%2Fwww.google.com 
// ('URL' must be encoded)

We can then simply load this page like any other page on our own domain using our trusty XHR! A jQuery example:

var url = encodeURIComponent('http://www.google.com');
$.ajax({
    url: 'theProxyScript.php?url=' + url,
    success: function(data) {
        // Alert title of google.com:
        alert( $('<div/>').append(data).find('title').text() );
    }
});

The main problem with this is that we don’t sanitise the retrieved content (in the PHP script) – it could contain anything, perhaps malicious JavaScript made to steal our precious cookies! And how do we know the request is from the same domain – by checking the referrer? – it’ll probably do us little good because something like that can easily be spoofed by the client!
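The proxy also never checks which URL it is asked to fetch, and `file_get_contents()` will happily read local file paths as well as remote pages. One way to narrow that down is an allowlist of hosts. The sketch below shows the check in JavaScript, to match the client-side examples; the same logic would need porting to the PHP proxy. The `ALLOWED_HOSTS` list and the `isAllowedProxyTarget` name are assumptions for illustration:

```javascript
// Hypothetical allowlist; a real proxy would keep this in its own config.
var ALLOWED_HOSTS = ['www.google.com', 'www.foo.com'];

// Returns true only for absolute http(s) URLs whose host is on the list.
function isAllowedProxyTarget(rawUrl, allowedHosts) {
    var parsed;
    try {
        parsed = new URL(rawUrl);
    } catch (e) {
        return false; // not an absolute, parseable URL
    }
    if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
        return false; // rejects file:, javascript:, data: and friends
    }
    return allowedHosts.indexOf(parsed.hostname) !== -1;
}
```

This only limits where the proxy can be pointed; it does nothing about what the allowed pages send back, so the content still has to be treated as untrusted.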

No PHP required? (for you)

This was working quite well for me; I could finally make asynchronous cross-domain requests (with the lethal risk of XSS). I can scrape search results from Google, market data from Yahoo and the latest posts from all my favourite blogs!

There are two problems with this though. Firstly, it’s illegal to scrape Google search results without using their APIs (apparently). Secondly, you need to have PHP and be prepared to get double-taxed on bandwidth, since every request has to pass through a local PHP script.

So, why not create a remotely stored PHP script which outputs JSONP, and one that anyone can use?

<?php
 
    $callback = $_GET['callback'];
    $url = $_GET['url'];
 
    if(!preg_match('/^[\w]+$/',$callback)||!preg_match('/^http:\/\/.+$/',$url)) exit;
 
    header('Content-type: text/javascript');
    header("Cache-Control: no-cache, must-revalidate");
 
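    $source = preg_replace('/(\\n|\\r)/', '', file_get_contents($url));
 
    // OUTPUT JSONP:
    // json_encode() escapes quotes and backslashes in $source, which
    // would otherwise break the generated JavaScript.
    echo $callback;
    echo '(';
    echo json_encode(array('source' => $source));
    echo ');';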
 
?>

Problems:

  • Users can spoof headers – i.e. the page could potentially be rendered as regular HTML and thus leave us open to XSS attacks. To be honest though, if you have no cookies then XSS attacks are pretty puerile in what they can achieve. (AFAIK)
  • Nobody would trust a remote script (not managed by them) – well maybe some really naive people.

In fact, this JSONP solution offers no real benefit over the locally stored PHP proxy apart from the saving you’ll make on bandwidth (yeah, right!).

What did we learn from this post?

Not much:

  • Using JSONP / JSONPP is only beneficial when working with trusted parties.
  • There is always a risk of XSS – even if you trust the web service, how can you be assured they won’t be hacked?
  • Server-side proxies are the BEST option, as long as you sanitise the output before showing it, this includes:
    • Wiping all <script> elements from the output.
    • Wiping all element event attributes (onClick / onLoad etc.).
    • Wiping all hrefs prefixed with the ‘javascript’ pseudo protocol (e.g. javascript:void(alert('XSS!!!'))).
    • Actually, XSS is such a nasty thing, let’s just forget this!
    • BEST ALTERNATIVE
      • Use a server-side selector engine to only return the data you require!
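If all you need from the proxied page is a piece of text (like the page title in the earlier example), a simpler route than stripping markup is to escape it before showing it. This is a swapped-in technique rather than an implementation of the wiping rules above, and `escapeHtml` is a made-up helper name:

```javascript
// Escape the characters that let text break out into markup.
// The ampersand must be replaced first so later entities aren't double-escaped.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;');
}
```

Escaped text is displayed literally by the browser, so a title like `<script>alert(1)</script>` shows up as text instead of running. It is only suitable when you want plain text; it is no help if you need to keep some of the remote page’s markup.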

Conclusion?

Don’t trust anyone! Don’t risk XSS! Enjoy server-side validation / sanitisation methods!

Sorry if this post was boring and/or pointless, just a stream of thought / discussion…

Second conclusion:

JSON itself is very boring and IMO a totally uninspired format, it doesn’t really deserve a name! It’s just a weedy alternative to XML! I’m not saying it doesn’t have its uses…

FYI, this is probably the last post of 2008! Merry Christmas and HAPPY NEW YEAR! :-)