Security: Block worms from abusing your web apps

I came across this fascinating piece about a worm that spread quickly through the MySpace web site. What's remarkable is that the web site had pretty good security. The person who did it used some pretty clever techniques to make it work. I highly recommend reading the write-up, and especially his detailed explanation for how he worked around each security measure.

Briefly, since the site lets you enter HTML in your profile, he wrote very clever HTML that would get executed in other people's browsers whenever they viewed his profile. The executed code would do various background posts (using Ajax techniques) to submit requests and confirmation responses on the viewer's behalf. It would add the author as a "friend" to the viewer's account, and then add the same HTML code to the viewer's profile, such that anyone viewing that profile would become "infected" in the same way. The worm spread to a million users in a matter of hours!

MySpace already had various security measures in place, such as blocking out dangerous tags, and removing strings such as "javascript" and "onreadystate" completely. He got around that by taking advantage of browser bugs. For example, even though MySpace removed references to javascript, it would not remove java\nscript, and it turns out IE will still recognize this as a javascript URL, even with embedded newlines in the name! As another example, even though onreadystate is blocked out, he could use his ability to execute JavaScript to reconstruct it with string concatenation:

 eval('xmlhttp.onread' + 'ystatechange = callback');
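
To see why deleting banned strings is so easy to get around, here's a minimal sketch of that kind of filter in Java. This is not MySpace's actual code: the class name, the strip method, and the two-entry blacklist are made up for illustration, using only the strings mentioned above.

```java
final class BlacklistDemo {
    // Hypothetical blacklist, limited to the two strings named in the text above.
    private static final String[] BANNED = { "javascript", "onreadystate" };

    // Naive filter: delete every literal occurrence of each banned string.
    static String strip(String input) {
        String out = input;
        for (String banned : BANNED) {
            out = out.replace(banned, "");
        }
        return out;
    }

    public static void main(String[] args) {
        // A plain javascript: URL is caught by the filter.
        System.out.println(strip("<div style=\"background:url('javascript:run()')\">"));
        // An embedded newline means the banned string never appears literally.
        System.out.println(strip("<div style=\"background:url('java\nscript:run()')\">"));
        // The banned name is only assembled at runtime, so there is nothing to delete.
        System.out.println(strip("eval('xmlhttp.onread' + 'ystatechange = callback');"));
    }
}
```

The first input loses its javascript prefix, but the other two come back unchanged, since neither one contains a banned string literally.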

Fascinating. This got me thinking. What if you wanted to have a Comments feature on your web app written with Creator? You want to allow the user to enter some HTML, such that they can add emphasis, use paragraphs, etc. But how do you prevent HTML that compromises your web site?

It seems like a really safe way would be to prevent any HTML attributes from being entered! As long as you don't allow attributes, and you don't allow the style or script tags, you should be safe. This may be overly restrictive, especially for a site like MySpace which tries to let users customize their pages visually. But at least for a comments feature it should be viable.

I took a quick stab at this with Creator. What we'll do is this: Use a TextArea component for the comment. We'll add a Validation event on the text area, and the validation will check the string for suspicious text. If it sees a problem, it will raise a validation error, and the message will be displayed in the Message component associated with the text area. I drop the components, then right click on the text area and choose Add Event Handler | validate.

Now we'll just need to write the code. Once you add the event handler, you're placed in the source editor and get to edit your code. By default you get a simple comment telling you roughly how to write a validate event handler:

    public void textField2_validate(FacesContext context, UIComponent component, Object value) {
        // TODO: Check the value parameter here, and if not valid, do something like this:
        // throw new ValidatorException(new FacesMessage("Not a valid value!"));
    }


Here's a simple implementation - pretty naive. It doesn't try to properly parse the HTML, it just looks for occurrences of <, and when found makes sure that it's properly matched with nothing other than an approved tag and an optional / at the front or end. Am I forgetting to handle some other tricks worm developers can take advantage of?

Here's the event handler - shown as an image since many news readers completely butcher formatted text in their attempts to make blog entries conform to their site style; click the link for the text version.
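
Since the listing itself was an image, here's a rough sketch of a check along the lines described above. It is a reconstruction under assumptions, not the original handler: the class name, the findProblem method, and the list of approved tags are all hypothetical. It scans for each <, requires a matching >, and accepts only an approved tag name with an optional / at the front or end, so anything with attributes is rejected.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class CommentTagCheck {
    // Hypothetical whitelist; the post does not say which tags were approved.
    private static final Set<String> ALLOWED = new HashSet<>(
            Arrays.asList("b", "i", "em", "strong", "p", "br", "pre", "code", "blockquote"));

    /** Returns null if the text looks safe, otherwise a message about the first problem found. */
    static String findProblem(String text) {
        int pos = 0;
        while (true) {
            int open = text.indexOf('<', pos);
            if (open == -1) {
                return null; // no further '<', nothing left to check
            }
            int close = text.indexOf('>', open + 1);
            if (close == -1) {
                return "Unmatched '<' at position " + open;
            }
            String inner = text.substring(open + 1, close);
            String name = inner;
            if (name.startsWith("/")) {
                name = name.substring(1);                     // closing tag such as </b>
            } else if (name.endsWith("/")) {
                name = name.substring(0, name.length() - 1);  // empty tag such as <br/>
            }
            name = name.trim().toLowerCase(Locale.ROOT);
            // Anything besides a bare approved name (attributes, comments, nested '<') fails here.
            if (!ALLOWED.contains(name)) {
                return "Tag not allowed: <" + inner + ">";
            }
            pos = close + 1;
        }
    }
}
```

In the validate handler you would call findProblem on the value, and if it returns something other than null, throw new ValidatorException(new FacesMessage(problem)) as the default comment suggests. Note that a stray < in ordinary text (as in "a < b") is rejected too, which is conservative but keeps the check simple.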

Here's how this looks at runtime:

(No, the error text didn't magically move from the top of the text area to the bottom; I moved it between taking the first screenshot and taking the deployment screenshot.)

P.S. Don't forget to hook up the Message component to the text area, such that it displays errors raised in the text area's validator. Do that by dropping it, then following the advice in the message area's default on-screen text: Ctrl-Shift-drag from the message area to the text area (or vice versa). If you use a Message Group component, you don't have to "bind" it to any components; it will display error messages from any and all components on the page. It's a good habit to always have one during development.


Things must be terribly wrong with XML APIs. Why do people always consider string manipulation easier than using a parser?

Posted by Matthias on October 19, 2005 at 08:45 PM PDT #

I thought of doing that, but the problem of course is that HTML is not XML. And requiring people leaving comments to write well-formed XML is probably expecting a bit much.

Certainly for a production quality app you would use something like JTidy, or perhaps NekoHTML as a front end to Xerces to preprocess the input into an XML compatible form, then do simple DOM iteration over it.

But I wanted to keep the blog entry short and simple. The goal wasn't to be comprehensive (I wrote this code in 5 minutes) - but to generate ideas.

Posted by guest on October 20, 2005 at 02:07 AM PDT #

Have you ever tried validating hex-encoded inputs? If every ASCII character is converted to hex, browsers will still interpret the hex-encoded code successfully, while some code validators check inputs only as plain ASCII. An example for user@domain.test is in the next line, but the same is possible for any other string.


Posted by hk on October 21, 2005 at 08:05 PM PDT #

My preferred approach would be to use HTML Tidy to convert to XML, and then extract the tags I'd permit. In addition, I might use an XML parser to re-encode the string, cleaning out entity equivalents. Stripping out attributes sounds like a fine idea, although it means that you can't style your DIV. The problem that MySpace had was that it used simple string checking, when HTML is so relaxed and complex (in the sense that so many variations are equivalent) that a lot of broken fragments could pass through. In addition, MySpace did not check for CSS validity. If the CSS had been parsed and output in a standard form, it would have been easy to pick up attempts to circumvent the javascript checking.

Posted by Chui Tey on November 07, 2005 at 08:54 AM PST #


Tor Norbye

