Attack of the monster frames (a mini-retrospective)

begood · October 15, 2010

Some 15 years ago, the introduction of HTML frames caused a significant uproar in the (still young) web development community. The outraged purists asserted that frames were bound to ruin everything: incompatible with many of the browsers and search engines of the old; bringing a significant potential to break navigation or printing; unfamiliar and confusing to users; and simply against the original vision supposedly laid out by the founding fathers of the web. Today, these criticisms seem rather arbitrary: although framed navigation had its share of amusing missteps (not any worse than most other HTML features, I'd argue), the frames have become an important and unobtrusive part of the modern web, and a valuable content compartmentalization tool. But shockingly, even if for all the wrong reasons, the original detractors had one thing right: in a sense, they turned out to be our doom.

How so? Recall that framed browsing dates back to the days of the web being a simple tool for distributing static content - and in that context, the technology warranted no special consideration from the security community; but as our browsers morphed into de facto operating systems for increasingly complex, dynamic applications - well, we quickly discovered that the ability to selectively embed fully functional, third-party content on unrelated and potentially malicious websites is pretty bad news.

One of the earliest problems - with early reports dating back to at least 2004, and variants still being discovered several years later - is the realization that frames are implemented using essentially the same model as standalone windows; this model allows any website in possession of window name (or its DOM handle) to navigate it at will. This property is mostly harmless when dealing with proper windows equipped with an address bar - but is a disaster for seamlessly framed regions on trusted websites: if malicious-site.com can open trusted-application.com in a new window, and then navigate that application's frames to any other location - it can, essentially, silently hijack the UI.

Following this discovery, Adam Barth and others spent a fair amount of time proposing a better approach, and convincing several browser vendors to implement it; but even today, certain unavoidable weaknesses in this model prevail.

The next notable milestone: clickjacking - a seemingly obvious threat essentially ignored by the security community (perhaps in hope it disappears), until extravagantly publicized by Jeremiah Grossman and Robert 'RSnake' Hansen in 2008. The idea behind the attack is simple: if a frame containing trusted-application.com is placed on malicious-site.com, and then partly obscured or made transparent - the user can be easily tricked into thinking he is interacting with the UI of malicious-site.com - but end up sending the UI event to trusted-application.com, instead.

As the name implies, their analysis focused on mouse clicks - which in a sense, did the attack some disservice: the reporting led the community to assume that only certain exceedingly simple UI actions (such as the "like" buttons on social networking sites) could be realistically targeted - and that the attacker would still be facing difficulties computing the right alignment of visual elements for all targeted systems, browsers, and screen resolutions. But that's simply not true.

To demonstrate other perils of cross-domain frames, I posted a proof-of-concept exploit for an attack I jokingly dubbed strokejacking - showing that with the use of onkeydown events, selective keystroke redirection across domains can be used to perform very complex UI actions in the targeted application, far beyond what is possible with clickjacking alone. I also discussed reverse strokejacking - an even more depressing variant where evil embeddable gadgets on a targeted site are able silently intercept user input by playing with the focus() method. These reports received very little attention - but given the ridiculous name, that's perhaps for the best.

Since then, the situation with framed content has gotten even worse: not long ago, we witnessed this presentation from Paul Stone. Paul discussed drag-and-drop attacks on third-party frames: text selected in one obscured frame pointing to trusted-application.com could be unintentionally dragged and dropped into the area controlled by malicious-site.com - thus revealing the content across domains. Many researchers and browser vendors summarily dismissed this threat, on the grounds that the necessary interactions must complex and unusual - for example, triple-clicking or pressing Ctrl-A to select text - and therefore, that they are difficult to solicit; but this is incorrect.

What have we missed, then? Paul casually mentioned one special type of a common UI interaction we all frequently engage in on even the least interesting sites: using the scroll bar. Note that the act of grabbing the slider, dragging it down, and releasing it... is eerily similar to the act of selecting text, or dragging and dropping a selection across the page. The attack can be modified thus:

Create a page with an article that spans more than a single screen - or has a TEXTAREA with an EULA that needs to be scrolled to the end before the "I agree" button is enabled, instead.
Have a transparent IFRAME pointing to trusted-application.com that follows the mouse pointer.
As soon as the user clicks the slider and holds the mouse button, reposition the frame up in relation to the cursor. This ensures that the entire framed text is selected, regardless of mouse movement (yes, this works!).
Wait for mouse button to be released.
Reposition the frame so that the next click will begin to drag the selection.
While the user is interacting with the slider, move the frame away, and place a receiving TEXTAREA or contentEditable / designMode container under the mouse pointer.
Steal documents across domains!

There are some technical challenges that make this a bit more complicated than advertised - but these can be worked around in a majority of the browsers on the market. In the end, cross-domain frames proved to be a giant and completely unexpected attack surface; and very depressingly, we still have no idea how to properly address the problem once and for all. There simply are no simple and elegant solutions compatible with the modern web; and rest assured, browser vendors are extremely hesitant to experiment with complex heuristics instead. The only thing we decided to do to tackle the general threat is plastering the gaping holes over with X-Frame-Options - a naive opt-in mechanism that allows websites to refuse being framed across domains. Alas, this mechanism will never be used by all the sites that actually need it - and it offers no protection in more complex cases, such as the increasingly prevalent embeddable gadgets.

The history of information security is littered with disturbingly similar cases of browser features colliding with each other, or being incompatible with the natural evolution of the web. If you need another example, just look at the profound problems caused by differences between same-origin policies for JavaScript, cookies, plugins (Java in particular) - and for peripheral browser features, such as password managers.

Because of this, I often fear that we are bound to repeat the painful security lessons of framed browsing very soon; for example, I am simply intimidated by the rush to deploy some of the more complex and at times exotic features as a part of HTML5 - web sockets, workers, sandboxing, storage, application caches, notifications, CORS, UMP, and countless other new HTML, CSS, and JS extensions added there every other week.

Yes, it's called "job security". But at times, it tends to suck.

lcamtuf's blog: Attack of the monster frames (a mini-retrospective)

Sign In

Attack of the monster frames (a mini-retrospective)

Recommended Posts

begood

Join the conversation

Browse

Activity

Pages