Bulletholes
This blog, the String Coffee Table and the n-Category Café are all served as application/xhtml+xml to compatible browsers. They therefore need to be well-formed at all times. Otherwise, visitors will see a “yellow screen-of-death” instead of the desired content.
In order to ensure well-formedness, user input is validated before it can be posted. A local copy of the W3C Validator is hooked into the “preview” function for comments and entries. And, in the case of comments, we rigorously enforce that they validate before they can be posted.
That sounds great in theory. And, in practice, it seems to have worked quite well. One might even be forgiven for complacently thinking the arrangement bulletproof.
But then Henri Sivonen came along [1] to point out that one has been living in a fool’s paradise. The W3C Validator fails to even enforce well-formedness. Actually, the fault is not in the software written by the W3C, but in the onsgmls SGML parser, which has only limited support for XML.
Far from being bulletproof, it was quite trivial to introduce non-well-formed content onto these blogs. That none of the previous six thousand or so comments have done so can be attributed either to dumb luck, or to the essential goodness of humanity. Needless to say, neither can be counted upon.
So, as a quick and dirty hack, if the W3C Validator says your comment is valid, I run it through a real XML parser, just to be sure. It seems a bit redundant, and the XML parser bails at the first well-formedness error (so it could take several passes to catch all the well-formedness errors missed by the W3C Validator). A better solution would be for someone to fix OpenSP 1.5.2, to ensure that onsgmls actually checks for well-formedness when operating in XML mode.
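To make the second pass concrete, here is a minimal sketch of that kind of check. It is not the code running on this blog (the plugin uses the XML::LibXML Perl module); it uses Python's standard-library parser instead, and the function name and the wrapper element are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(fragment):
    """Return None if `fragment` is well-formed XML, else the first error message.

    Hypothetical helper: it mimics the "run it through a real XML parser"
    step, using the Expat-based parser from the Python standard library.
    """
    # Wrap the comment in a dummy root element so that a comment made of
    # several top-level elements (e.g. two paragraphs) still parses as
    # a single document.
    wrapped = "<div xmlns='http://www.w3.org/1999/xhtml'>" + fragment + "</div>"
    try:
        ET.fromstring(wrapped)
    except ET.ParseError as err:
        # The parser stops at the FIRST well-formedness error, so only
        # one problem is reported per pass.
        return str(err)
    return None


if __name__ == "__main__":
    for sample in ("<p>Fine.</p>", "<p>Line one<br>line two</p>"):
        print(repr(sample), "->", well_formedness_error(sample))
```

As in the hack described above, the parser gives up at the first error, so a comment with several problems needs several passes. Note also that this sketch loads no DTD, so named entities such as `&nbsp;` are rejected as undefined; a production check would have to decide how to handle those.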
Update (11/27/2006):
It seems to me that there are only about 3 people in the world using it, but I might as well release an updated version of the MTValidate plugin.
Version 0.4 of the plugin incorporates a new configuration option in /plugins/validator/config/validator.conf. Setting
XHTML_Check = 1
runs ostensibly “valid” comments through a real XML parser, ensuring that they really are well-formed. To use this option, you’ll need the XML::LibXML Perl module.
The new version also incorporates yet more user-friendly error messages from version 0.74 of the W3C Validator.
[1] In response to a bit of flamebait from Anne van Kesteren.
Re: Bulletholes
There are several other limitations as well