XHTML and error handling
What I want to touch with this post is how errors are handled when XHTML is served the way it should be. Let’s, for the sake of argument, say that we want to write and deliver XHTML (not wanting to turn this into a discussion whether we should write HTML or XHTML).
First, some general background information about how to send documents to the requesting web browser. It’s all about the media type type, described in XHTML Media Types:
- HTML
- Should be sent with the
text/htmlMIME type. - XHTML 1.0
- All flavors of XHTML 1.0,
strict,transitionalandframeset, should be sent with theapplication/xhtml+XMLMIME type, but may be sent astext/htmlwhen it conforms to Appendix C of the XHTML specification. - XHTML 1.1
- Should be sent with the
application/xhtml+XMLMIME type; Should not be sent with thetext/htmlMIME type.
So, what’s the difference? It’s that web pages sent as text/html is interpreted as HTML while those sent as application/xhtml+XML is received as a form of XML. However, this does not apply to IE, because it doesn’t even understand the application/xhtml+XML MIME type to begin with, but instead tries do download it as a file. So, no application/xhtml+XML for IE.
Aside from IE’s lack of support for it, and for what you need to consider described by Mark Pilgrim in his The Road to XHTML 2.0: MIME Types article, it means that when a web page is sent as application/xhtml+XML while containing an well-formedness error, the page won’t render at all.
The only thing displayed will be an error message when such an error occurs. This is usually referred to as draconian error handling, and its history is told in The history of draconian error handling in XML.
My thoughts about this started partly by seeing many web developers writing XHTML 1.1 web pages and then send them as text/html, and they were only using it because it was the latest thing, not for any features that XHTML 1.1 offers (this also goes for some CMS companies that have invalid XHTML 1.1 sent as text/html as default in their page templates for customers to take after). Sigh…
It is also partly inspired by an e-mail that I got a couple of months ago, when Anne was kind enough to bring an error on my web site to my attention, with the hilarious subject line:
dude, someone fucked up your XHTML
What had happened was that Faruk Ates had a entered a comment to one of my posts where his XHTML had been messed up (probably because of some misinterpretation by my WordPress system), hence ending up breaking the well-formedness of my web site so it didn’t render at all.
Because of that, and when using it for major public web sites, I really wonder if that’s the optimal way to handle an error. Such a small thing as an unencoded ampersand (example: & instead of &) in a link’s href attribute will result in the page not being well-formed, thus not rendered. Given the low quality of the CMSs out there, terrible output from many WYSIWYG editors, the “risk” (read:chance) of the code being valid and well-formed is smaller than of the code being incorrect. Many, many web sites out there don’t deliver well-formed code.
Personally, I agree with what Ben de Groot writes in his Markup in the Real World post. I prefer the advantages of XHTML when it comes to its syntax and what will be correct within it. However, Tommy once said to me that if you can’t guarantee valid XHTML you shouldn’t use. Generally, I see his point and think he’s right, but to strike the note Ben does, I can guarantee my part of it but there will always be factors like third party content providers, such as ad providers, sub-par tools for the web site’s administrator and so on. And for the reasons Ben mention, I’d still go for XHTML.
So, conclusively, I have to ask: do you think XHTML sent as text/html is ok, when it follows the Appendix C of the XHTML specification? Do you agree with me that having a web site break and show nothing but an error if something’s not well-formed isn’t good business practice?




