You’re not from around here, are you? – The Same Origin Policy

Recently, I’ve been seeing a lot of posts on things like StackOverflow, #jquery IRC, and other places asking about doing things like manipulating iframes. And in each and every case, the petitioning (though rarely aspiring) coder gets an answer about how to get, set, add, or remove some aspect of the iframe document object. And, inevitably, they come back and say that it doesn’t work. Now, at this point, there can be only one of two possible reasons for that.

These two reasons are:

  1. The person giving the advice is an idiot.
  2. The person asking for the advice is an idiot.

Pardon me for being blunt, but I’ve been watching a lot of House recently, and when it comes to diagnosing problems, there are worse role models to have. Now, I consider myself to have some small facility in the realm of JS, jQuery, and the like. And when I review (and/or offer) advice in these situations, it always is sound, concise, elegantly-written, and accurate. So, this tends to eliminate option #1 from above.

So, that leaves option #2, which is not quite as flattering to the person who is asking the question. But, it is universally proven true by some variation of the question ‘Is the iframe on the same domain as your page?’ And inevitably, the infamous rest of the story comes out, and the petitioner reveals that no, the iframe is pulling from another domain entirely, but they were hoping to get the s00p3r s3kr1t mad haxx to be able to defeat the Same Origin Policy.

Yes, I capitalized that. There’s a reason. It’s important. To start off with, let’s get our terminology solidified. What do I mean when I say the Same Origin Policy (SOP)? Well, to quote this page:

Simply stated, the SOP states that JavaScript code running on a web page may not interact with any resource not originating from the same web site. The reason this security policy exists is to prevent malicious web coders from creating pages that steal web users' information or compromise their privacy. 

Of course, it would be grand if you could just load up an iframe with content from anywhere and reach your grubby little fingers in there and change the DOM, grab/modify the content, get the images, whatever you wanted to do. When this happens to someone’s house it’s called either vandalism or larceny, depending on the outcome. This is also a good time to quote the ever-famous ‘Golden Rule’ of ‘Do unto others as you would have them do unto you’. Or something like that.

Let’s get a bit more technical, by going to the w3.org page on SOP:

The same-origin policy restricts which network messages one origin can send to another. For example, the same-origin policy allows inter-origin HTTP requests with GET and POST methods but denies inter-origin PUT and DELETE requests. Additionally, origins can use custom HTTP headers when sending requests to themselves but cannot use custom headers when sending requests to other origins.

What this means is that things like form submissions, good old hyperlinks, and the like can go from one origin (roughly speaking, a domain) to another, but requests that go beyond what a plain link or form can produce, such as PUT and DELETE or anything carrying custom headers, are not allowed across origins. If requests like GET and POST were not allowed, there would be no web, because websites on different domains would not be able to link to each other.

Now, what most people don’t realize is that the Same Origin Policy also applies to two sites on the same root domain but on different subdomains. For example, let’s say I had alpha.donburks.com and omega.donburks.com. These sites could request and display content from one another, but JavaScript on http://alpha.donburks.com/page.html could not reach into http://omega.donburks.com/page.html. If it loaded that page into an iframe, it could not read or change the DOM inside that iframe, due to the SOP.
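To make that concrete, here is a minimal sketch of what a script typically sees when it pokes at an iframe. The names readFrameTitle and FrameLike are made up for illustration; the sketch assumes the usual browser behaviour, where contentDocument comes back as null for a cross-origin frame and some other access paths throw a security error instead.

```typescript
// Hypothetical stand-in for the one part of an <iframe> element this sketch touches.
interface FrameLike {
  contentDocument: { title: string } | null;
}

// Returns the framed document's title, or null when the browser refuses access.
function readFrameTitle(frame: FrameLike): string | null {
  try {
    const doc = frame.contentDocument; // assumed to be null for a cross-origin frame
    return doc ? doc.title : null;
  } catch (_err) {
    // Some access paths throw instead of returning null; treat both the same way.
    return null;
  }
}
```

In a real page you would hand it the result of something like document.querySelector("iframe"). If it returns null for a frame that has visibly loaded, the first question to ask is whether the frame is on the same origin as the page.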

Most of the time, as coders, when we are dealing with the “permissions” on accessing information, we think of the holy trinity of R, W, and RW. These are also known as read, write, and read/write. However, when discussing the SOP, we have to bring a new initialism into the mix: LBDT. Look, But Don’t Touch. You can see the content in the visiting iframe; you just can’t touch it.

Browsers are remarkably adept at enforcing this security policy, and they do it by looking at a few different factors in order to determine whether the SOP applies. These factors are primarily the following:

  1. Protocols must match. This means http vs. https vs. ftp vs. file vs. telnet vs. smtp, etc.
  2. Hostnames must match. This is the FULL hostname, not just the righthand domain name. (alpha.donburks.com is NOT the same as donburks.com, and definitely not the same as metrolyrics.com)
  3. Ports must match (if specified). Web traffic is “assumed” to be on the default port for its protocol (80 for http, 443 for https). But if, for whatever reason, you have chosen to specify a different port (alpha.donburks.com:4201 for example) for this particular web traffic, only something else on that same port (from same protocol and hostname) will be granted access.
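The three checks above can be sketched in a few lines. This is only an approximation of what a browser does, and sameOrigin is a name I made up for it; it leans on the standard URL class, which reports an empty port when the URL uses the default port for its protocol.

```typescript
// Rough model of the three-part comparison: protocol, full hostname, port.
function sameOrigin(a: string, b: string): boolean {
  const ua = new URL(a);
  const ub = new URL(b);
  return (
    ua.protocol === ub.protocol && // "http:" vs. "https:" and so on
    ua.hostname === ub.hostname && // the FULL hostname, subdomain included
    ua.port === ub.port            // "" when the default port is in use
  );
}

const page = "http://alpha.donburks.com/page.html";
const results = {
  samePath: sameOrigin(page, "http://alpha.donburks.com/other.html"),
  subdomain: sameOrigin(page, "http://omega.donburks.com/page.html"),
  protocol: sameOrigin(page, "https://alpha.donburks.com/page.html"),
  port: sameOrigin(page, "http://alpha.donburks.com:4201/page.html"),
};
```

Only the first comparison in results matches; the subdomain, protocol, and port variations each fail one of the three checks.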

The only other condition under which the SOP can grant access is if both documents being compared set their document.domain property to the same value, which has to be a parent domain they share (donburks.com, for the two subdomains above). Rules on protocols and ports still apply.
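As a rough illustration of the document.domain rule, the sketch below models two pages that have each opted in. PageSketch, isAllowedDomainValue, and relaxedMatch are hypothetical names, the port check is left out for brevity, and real browsers apply extra restrictions (for example, they will not accept a bare top-level suffix like "com") that are not modelled here.

```typescript
// Hypothetical description of a page for the purposes of this sketch.
interface PageSketch {
  protocol: string;  // e.g. "http:"
  hostname: string;  // e.g. "alpha.donburks.com"
  domain?: string;   // the value the page assigned to document.domain, if any
}

// A page may only relax to its own hostname or to a parent of it.
function isAllowedDomainValue(hostname: string, value: string): boolean {
  return value === hostname || hostname.endsWith("." + value);
}

// Both pages must have set document.domain, and to the same allowed value.
function relaxedMatch(a: PageSketch, b: PageSketch): boolean {
  if (a.domain === undefined || b.domain === undefined) return false;
  return (
    a.protocol === b.protocol &&
    a.domain === b.domain &&
    isAllowedDomainValue(a.hostname, a.domain) &&
    isAllowedDomainValue(b.hostname, b.domain)
  );
}
```

The point to notice is the first line of relaxedMatch: one page setting document.domain on its own gets you nowhere, because both sides have to opt in.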

There are a number of security nightmares that are connected to the SOP. A lot of them are connected to people trying to get “clever” with document.domain, or using JSONP to pass data back and forth. For example, what if both hosts are IP-based, not domain-name based? Also, what do you do if one domain resolves to multiple IPs? Are they on the same host or not? That particular behavior doesn’t have hard and fast rules, but I can tell you from personal experience that it will totally fubar things like CAPTCHAs and other such tools on a domain that is behind a load balancer, or on a CDN.

Cookies

It should be noted that cookies also fall under a version of the SOP; however, they do NOT have the same protocol and port restrictions that HTTP requests do. This means a cookie set on https://omega.donburks.com can be read on http://omega.donburks.com:12345 without issue, provided it was not set with the Secure flag. Whether it would be relevant or not is a separate question. The facility to read it and write to it is there.
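A simplified sketch of that looser matching might look like the following. CookieSketch and cookieVisibleAt are invented names, and the model deliberately ignores the Domain and Path attributes; the one extra thing it does account for is the Secure flag, which limits a cookie to https.

```typescript
// Hypothetical, stripped-down view of a cookie: just its host and Secure flag.
interface CookieSketch {
  host: string;
  secure: boolean;
}

// True when a page at `url` would be able to see the cookie in this simplified model.
function cookieVisibleAt(cookie: CookieSketch, url: string): boolean {
  const target = new URL(url);
  if (target.hostname !== cookie.host) return false; // the hostname still has to match
  if (cookie.secure && target.protocol !== "https:") return false; // Secure means https only
  return true; // note that the port is never consulted
}

const plainCookie: CookieSketch = { host: "omega.donburks.com", secure: false };
const visibleOnOddPort = cookieVisibleAt(plainCookie, "http://omega.donburks.com:12345/");
```

Here visibleOnOddPort comes out true even though the cookie's protocol and port differ from the page reading it, which is exactly the gap described above.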

XMLHttpRequest

No good discussion of SOP would be complete without discussing XMLHttpRequest traffic, or XHRs. There are a few variations of the SOP for XHR traffic, and these are very important to note.

  1. XHR traffic does not care about document.domain. No really, it doesn’t.
  2. MSIE is different than all other browsers (Yeah, Yeah. I know) in that it rigidly enforces port comparisons on XHR traffic, whereas it may not always be so strict for other “proper” DOM access same-origin checks.
  3. You are not guaranteed a uniform behavior cross-platform when it comes to passing things like headers back and forth.

I hope that this has been illuminating when it comes to the Same Origin Policy. Just like you wouldn’t want someone to be able to reach through your window, grab your laptop, iPhone, cat, husband/wife/girlfriend/brother/turtle and paint them pink before taking them away to their own house or, worse, putting them back where they found them, please do not try to do the same thing to other domains’ web content. Generally, these things are considered unethical at best, and usually regarded as being downright douche-y.

Please feel free to comment below, though let’s keep the flaming to a minimum. For a really in-depth resource on SOP, please check out this page for more details (includes charts!)