XSS
Update (5/25/2007):
Sam Ruby ported this Sanitizer to HTML5lib. For most purposes, that’s a much more robust foundation, so all my future efforts will be devoted to the HTML5lib version.
Rudimentary documentation is available.
The original version of the Sanitizer, described in this post, can be found here.
What free time might otherwise have been devoted to blogging last week was devoted to another matter.
On Monday, I discovered that Instiki, including my MathML-enabled branch, was vulnerable to Cross-Site Scripting (XSS). That is, visitors to an Instiki Wiki could inject malicious JavaScript code into your pages.
Rails has a built-in sanitization filter, but this was not being applied. So, my first impulse was to apply Rails’s built-in sanitization filter. Unfortunately, it suffers from two defects:
- It has trouble with malformed HTML. This was not a problem for me, as I intended to apply it to the well-formed XHTML output by Maruku.
- Even on well-formed XHTML, it doesn’t actually work worth a damn. All but the lamest of script-injection tricks sail right past it.
I turned to Google, and found a Rails plugin that is supposed to improve upon it. It works considerably better, but was still not adequate to my needs.
Finally, I turned to Sam Ruby for advice. Sam pointed me to the sanitization code he wrote for the Universal FeedParser.
So I sat down and wrote a sanitization function[1] for Instiki based, in part, on Sam’s Python code.
- It applies a white-list of XHTML+MathML+SVG elements, allowing through only the (extensive list of) known safe ones
- It applies a similar white-list of XHTML+MathML+SVG attributes.
- For attributes whose values are URIs (e.g. href, src, xlink:href, …), it applies a white-list of safe URI schemes (after normalizing, to foil attempts at obfuscating the URI).
- Inline style attributes are parsed, and only a known safe (but still extensive) set of CSS properties and values are allowed.
- It handles case-sensitive element and attribute names, which is important for SVG, which uses camel-cased names.
- It comes with unit tests, lots of unit tests.
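To make the URI-scheme step concrete, here is a minimal sketch of the idea in Ruby. It is not the code that ships in lib/sanitize.rb: the names ALLOWED_SCHEMES, normalize_uri and safe_uri? are made up for illustration, the scheme list is a short stand-in for the real one, and the normalization shown (decoding numeric character references, stripping whitespace and control characters, lower-casing) is only an assumption about what "normalizing" needs to cover.

```ruby
# Hypothetical, deliberately short white-list; the real list is longer.
ALLOWED_SCHEMES = %w[http https ftp mailto].freeze

# Undo the usual obfuscation tricks before looking at the scheme:
# numeric character references, embedded whitespace/control
# characters, and mixed case.
def normalize_uri(value)
  decoded = value.gsub(/&#(x[0-9a-f]+|[0-9]+);?/i) do
    ref  = Regexp.last_match(1)
    code = ref[0].downcase == 'x' ? ref[1..-1].hex : ref.to_i
    # Only ASCII matters for a scheme name; drop anything else.
    code < 0x80 ? code.chr : ''
  end
  decoded.gsub(/[\x00-\x20\x7f]+/, '').downcase
end

# A URI passes if it has no scheme at all (a relative link) or if
# its scheme is on the white-list.
def safe_uri?(value)
  match = normalize_uri(value).match(/\A([a-z0-9][-+.a-z0-9]*):/)
  match.nil? || ALLOWED_SCHEMES.include?(match[1])
end
```

With this sketch, safe_uri?("javascript:alert('bar');") is false, and so are the obfuscated variants "JaVaScRiPt:alert(1)", "jav\tascript:alert(1)" and "&#106;avascript:alert(1)", while ordinary http links and relative paths pass. The point of checking against a white-list after normalizing, rather than hunting for "javascript:", is that anything you did not anticipate is rejected by default.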
On learning of the vulnerability, over a week ago, I immediately emailed the maintainer of the main Instiki branch, to tell him about the vulnerability and of my intention to provide a fix. Three days later, after coding up my fix, I sent that to him, as well.
Eventually, after much pestering, my changes were committed to the Instiki SVN repository. But I still don’t know when Matthias is going to get around to releasing a new version. He indicated that he’s very busy, and I think I have failed to convince him of the urgency of the matter. Rather than waiting around for a new version, Instiki 0.11 users should fix their installations now. Doing so is easy enough. In fact, it’s almost surely easier and faster than installing a whole new version of Instiki (whenever that should happen to appear).
Update: Instiki 0.11pl1 has been released. It contains the fix for this XSS attack, as well as some other miscellaneous fixes.
If you’re using my distribution of Instiki, you should download the latest version and follow the instructions to upgrade your Instiki installation.
If you’re using Instiki 0.11.0, you should download the latest release. If, for some reason, you don’t want to upgrade, then at a minimum:
- Download the following files from the Instiki SVN repository, placing them in the corresponding directories of your Instiki installation:
  - lib/chunks/engines.rb (replacing the file that’s currently there)
  - lib/sanitize.rb
  - lib/node.rb
  - test/unit/sanitize_test.rb
- Finally, restart Instiki.
If you’re using my branch of Instiki, please don’t use the above lib/chunks/engines.rb file. It’s 0.11.0-specific. The file you want is in the distribution or in my BZR repository.
If you want to test whether your Instiki installation is vulnerable, try typing
[foo](javascript:alert\('bar'\);)
on a page which uses the Markdown (or Markdown+itex2MML) filter, or
"foo":javascript:alert('bar');
on a page which uses the Textile filter. Or try
<a href="bar" onclick="alert('fubar');return false;">foo</a>
or
<p style="-moz-binding:url('http://golem.ph.utexas.edu/~distler/blog/files/warning.xml#xss')">fubar</p>
(that’s
p{-moz-binding:url('http://golem.ph.utexas.edu/~distler/blog/files/warning.xml#xss')}. fubar
for you Textile users) or any one of the myriad of other script-injection tricks.
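The -moz-binding payload is the sort of thing a style white-list is meant to stop. As a rough sketch of that idea (again, not the actual Instiki code: sanitize_style and the tiny ALLOWED_CSS_PROPERTIES list are hypothetical, and the value check is far cruder than a real one would be), one can throw away every url(...) value, then keep only declarations whose property is on the list and whose value consists of harmless characters:

```ruby
# Hypothetical, deliberately tiny white-list of CSS properties.
ALLOWED_CSS_PROPERTIES = %w[
  color background-color font-size font-weight text-align
].freeze

# Return a cleaned-up style string containing only white-listed
# declarations. Anything unrecognized is silently dropped.
def sanitize_style(style)
  # Remove url(...) values outright; they can pull in remote content.
  style = style.gsub(/url\s*\(\s*[^\s)]+?\s*\)\s*/i, ' ')

  clean = []
  style.scan(/([-\w]+)\s*:\s*([^:;]*)/) do |prop, val|
    prop = prop.downcase
    val  = val.strip
    next if val.empty?
    next unless ALLOWED_CSS_PROPERTIES.include?(prop)
    # Crude value check: words, numbers, units and #colours only,
    # so anything with parentheses (e.g. expression(...)) is out.
    next unless val =~ /\A[-#.%\w\s]+\z/
    clean << "#{prop}: #{val};"
  end
  clean.join(' ')
end
```

Fed the style attribute from the test above, this returns an empty string, since -moz-binding is not on the list and its url(...) value has already been removed. A harmless "font-size: 12px; color: #ff0000" comes back as "font-size: 12px; color: #ff0000;".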
Rails
More generally, it’s a huge disappointment that Rails does not ship with a decent XSS-sanitization function built-in and enabled by default. I suppose that, if one is building a Rails app which doesn’t accept any user-input content, or which aggressively strips out all vestiges of HTML from that content, then one might not really need one. But people are building Wikis and Blogs and all kinds of “Web 2.0” applications using Rails, many of which either accept HTML, or accept some pseudo-markup that gets translated into HTML.
The fact that the built-in sanitization function
- is not enabled, by default
- is largely ineffective, when it is enabled
is a huge, potential security hole in each of those Rails applications.
This is not unknown. The Rails bug tracker is filled with open tickets suggesting that TextHelper#sanitize is broken and needs to be fixed. Nothing in this blog post should come as a surprise to anyone in the Rails community. Well, OK, it was a little surprising that Instiki didn’t even avail itself of TextHelper#sanitize. But, given the weakness of the latter, it hardly would have made much difference if it had.
I was, initially, somewhat torn as to whether to publicize this issue on my blog. But, given both the seriousness and the widespread nature of the problem, there’s really no alternative. I don’t think that I could even enumerate all the vulnerable Rails applications, let alone track down and contact their developers about implementing a fix.
At least this way, I’m coming to the table with code, sanitize_html(string), that can be used to fix those applications which turn out to be vulnerable.
1 Since I used the same HTML tokenizer that Rails’s built-in sanitizer does, my code probably also misbehaves on sufficiently malformed HTML. If you are trying to sanitize tag-soup, you need to parse it, using the same error-corrections that browsers do. Your only real hope is to use HTML5lib, of which a Ruby version doesn’t yet exist.
Re: XSS
I have a white_list helper plugin that I’ve been using in Beast/Mephisto/* that works great. I wrote it with the intention of replacing #sanitize with it in core, and it is currently a candidate for Rails 2.0.
Here’s the plugin if you want to check it out: http://svn.techno-weenie.net/projects/plugins/white_list/