Why XHTML?

Published on Monday, April 4th, 2005

This is a well-discussed and very important topic. Personally, I currently write XHTML for my web interface code, but lately I’ve started to waver in my standpoint. For normal, general web page design, what’s the gain? If you don’t extend the code with namespaces, use MathML, have your own DTDs and so on, why would you want to use XHTML?

Many people answer that question with: “It makes me write leaner code, code that has to validate and be more semantically correct”. Martin recently wrote a post about why he uses XHTML (unfortunately, it’s in Swedish).

But I don’t agree with the argument that it has to be XHTML to achieve that particular goal. I think it’s more about the developer’s attitude than about using XHTML. If you’re really dedicated to what you do, you use the correct tags for the correct purposes (H1 for headings etc), you write as lean and minimal code as possible and you close all optional tags like LI, P and so on.
Basically, you can live up to that by using the HTML 4 Strict Doctype, and separating content (HTML) from layout/look (CSS) and interactivity (JavaScript).

Another reason people use it is that they might think it makes them better programmers, that they code ‘the real deal’. It might also serve as a selling point, for the project manager or the like, to tell the customer that: “Yes, we know what we do, we code XHTML”.
But, unfortunately, very few do it all the way. As mentioned in Anne’s Quick guide to XHTML about Evan Goer’s test, 89% of the web sites tested didn’t validate and 99% used the incorrect MIME type!

Which leads to the MIME type issue. In a very talked-about piece, Ian Hickson writes that if you use XHTML it should be delivered as application/xhtml+xml (which, surprise, Internet Explorer doesn’t support) to the web browser, otherwise it will be perceived as ‘tag soup’ (however, not everyone agrees that it should be called ‘tag soup’). But what has happened is that people have gone to such lengths that they use something called content negotiation, which basically boils down to serving XHTML as application/xhtml+xml to those web browsers that support it, and HTML 4 as text/html to those that don’t.
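
As a rough illustration of the negotiation described above, here is a minimal Python sketch that picks a MIME type from the browser's Accept header. The function names (parse_accept, choose_content_type) are hypothetical, and the rule used (send application/xhtml+xml only when the browser lists it explicitly and does not rank it below text/html) is one assumed policy, not the method of any particular site. It only chooses the type; real content negotiation would also have to produce HTML 4 markup for the text/html case.

```python
def parse_accept(header):
    """Map each media type in an Accept header to its q-value (default 1.0)."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs[media_type] = q
    return prefs


def choose_content_type(accept_header):
    """Return the MIME type to send, falling back to text/html."""
    prefs = parse_accept(accept_header)
    xhtml_q = prefs.get("application/xhtml+xml", 0.0)
    html_q = prefs.get("text/html", 0.0)
    # Wildcards such as */* are deliberately ignored: a browser that only
    # sends */* has not said that it can handle XHTML.
    if xhtml_q > 0.0 and xhtml_q >= html_q:
        return "application/xhtml+xml"
    return "text/html"
```

A browser that sends only wildcards, or no Accept header at all, gets text/html under this policy.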

When you deliver XHTML as application/xhtml+xml to a browser that supports it, it won’t even render the page if it’s incorrect, but instead throw an error. Generally, I think this is a good thing that forces the developer to write correct code. Alas, speaking from my point of view, working in projects where Content Management Systems don’t deliver correct XHTML, where the .NET Framework doesn’t deliver correct XHTML etc, serving it as application/xhtml+xml is impossible for me.
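
The all-or-nothing behaviour can be approximated before publishing by running the markup through any XML parser. A minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name is_well_formed is hypothetical. It checks well-formedness only, not validity against a DTD, and because this parser does not load the XHTML DTD, named entities such as &nbsp; would likely be reported as errors too.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


good = ('<html xmlns="http://www.w3.org/1999/xhtml"><head><title>t</title></head>'
        '<body><p>Hello<br/></p></body></html>')
# Same page with an unclosed <br>: fine as HTML, fatal as XML.
unclosed = good.replace("<br/>", "<br>")
# An unescaped ampersand in a URL is another common way to break a page.
bad_ampersand = '<p><a href="page?a=1&b=2">x</a></p>'

print(is_well_formed(good), is_well_formed(unclosed), is_well_formed(bad_ampersand))
```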

Yes, Content Management System manufacturers are getting more aware of this, and ASP.NET 2.0 is supposed to deliver XHTML the way it should be delivered, but that’s still a long way off.
So what are my (and many other developers’) options? Should we deliver XHTML that doesn’t validate (although the errors might be minor and make no difference to how it will be rendered) as text/html, or should we deliver plain old HTML?

One point is also that this will not affect the end users, as long as you write valid code in its context, be it XHTML or HTML.
An interesting sidenote to this is that MSN Search serves XHTML 1.0 Strict that validates, while Google serves a page without a Doctype that generates 242 errors.

In conclusion, to go back to my initial question: why use XHTML if I only use it for standard HTML? Why go through the hassle if I don’t use any of the XHTML-specific features?
Anne brought up an interesting thing from Mozilla’s Web Author FAQ about how to serve HTML (and the fact that neither Mozilla nor any other browser loads XHTML incrementally, as opposed to HTML), and he comes to the conclusion that one should switch to HTML.

Also, Tommy made the statement a couple of weeks ago that XHTML Is Dead.

And these are two very talented people who just don’t believe in XHTML anymore, and that saddens me and makes me think:
Are they right? Should I go back to writing HTML?

PS. Let me know what you think! Is XHTML or HTML the way to go? Write a comment and I’ll send you an invitation to use Google’s Gmail service (2 GB inbox(!), POP3 access etc). DS.

Posted in HTML/XHTML | 86 comments

86 comments

  • Tommy Olsson
    April 10th, 2005 at 0:24

    I can see a possible advantage of using XHTML if you have dynamically generated pages. Outputting well-formed X(HT)ML is slightly easier than outputting valid HTML, since you don’t have to make exceptions for those few element types that cannot have an end tag in HTML. Very minor point, though. :)

    Content negotiation (and, worse, the much more common content type negotiation) is silly, really. I say that although I do it myself, at the moment. If the content can be converted to, or at least interpreted as, HTML, you could probably use HTML in the first place. If you really want to use XHTML, though, and still cater to old browsers that don’t support application/xhtml+xml, it’s one way of doing it.

    One thing I don’t approve of, however, is what you said about the .NET framework not generating well-formed XHTML, forcing you to serve it as text/html. If you cannot guarantee well-formed markup, you must not use X(HT)ML! HTML 4.01 is the most recent standard that is well supported by browsers. There’s nothing wrong with using that until real XHTML can replace it. :) In my oh-so-humble opinion, of course.
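
The point above about dynamically generated pages can be made concrete with a small Python sketch. The two helper functions are hypothetical and ignore attributes and escaping; they only show that an HTML 4 serializer needs a special case for the element types declared EMPTY in HTML 4.01, while an XML serializer applies one rule to every element.

```python
# The element types that HTML 4.01 declares EMPTY: they must not have an end tag.
HTML4_EMPTY_ELEMENTS = frozenset({
    "area", "base", "basefont", "br", "col", "frame", "hr",
    "img", "input", "isindex", "link", "meta", "param",
})


def html_element(tag, content=""):
    """Serialize for HTML 4: empty element types get a start tag only."""
    if tag in HTML4_EMPTY_ELEMENTS:
        return "<%s>" % tag
    return "<%s>%s</%s>" % (tag, content, tag)


def xml_element(tag, content=""):
    """Serialize for XML: the same rule for every element type."""
    if content == "":
        return "<%s/>" % tag
    return "<%s>%s</%s>" % (tag, content, tag)
```

Note that markup meant to be served as text/html under Appendix C would need extra care here, for instance avoiding the minimized form for elements that are not declared EMPTY.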

  • Robert Nyman - author
    April 10th, 2005 at 0:25

    Tommy,

    Thanks for your insightful comment!
    The key phrase in your comment is “If you cannot guarantee well-formed markup, you must not use X(HT)ML!”.

    That’s what I’m going for. Unfortunately, usually I can’t do that in my day to day work. And even if I could, why use XHTML if I don’t make usage of the extra features it offers?

  • kristin vågberg
    April 10th, 2005 at 0:26

    I vote for XHTML even though the page is served with the MIME-type text/html or even if the validation has some errors. Why?
    The reason is that the applications/web pages we develop need to be forward compatible. A web page is not rewritten every second year, and the move to the right MIME type, when it’s time, will be easier. The developers and companies you work for will also absorb an understanding of XHTML’s nature, and it will not be a shock for them when it’s time for the change.

  • Robert - author
    April 10th, 2005 at 0:29

    First, I’d like to say that I agree that XHTML is a good thing in the sense that developers need to learn how to code (X)HTML properly (or should I say strict?).

    But I’m not sure if I’d call XHTML 1 forward compatible, since XHTML 2 will NOT be backwards compatible with XHTML 1.

    Sending XHTML as text/html, especially with validation errors, is not an option for me. If I did that, I’d feel like a fake in my profession: that I claim to do the right thing, but then I don’t go all the way.
    Kind of like trying to do something that is over my head and that I can’t do 100%.

    Just to even out the discussion, Tantek is pro-XHTML, and actually wrote a post a bit more than a year ago, also titled Why XHTML? (I didn’t know about this post before I posted mine).

    What to code, what to code…
    XHTML or HTML?

  • kristin vågberg
    April 10th, 2005 at 0:33

    I understand what you mean, but it will be several years from now until enough browsers understand XHTML 2.0. As IE is so slow to implement W3C standards, I suppose there will be a gap of about 5 years where we will have IE understanding XHTML documents with the right MIME type, until XHTML 2.0 is understood.

    I don’t think it is “fake” to send XHTML as text/html. There will be a time when we can send the right MIME type, and let us code for the future. It gives the companies and their developers the knowledge about where the Internet is heading -> XML.

  • Robert - author
    April 10th, 2005 at 0:34

    Well, it’s a tough question…
    For people that feel that way, coding XHTML, using content negotiation etc is the way to go.

    But for me, sending (in some cases) invalid XHTML as text/html doesn’t sound like a good option.

  • Tommy Olsson
    April 10th, 2005 at 0:34

    Sending XHTML as text/html is a temporary workaround for older browsers. If you can’t (or don’t want to) do content negotiation or, at least, content type negotiation, you can still serve it as text/html to all browsers as long as you follow the compatibility guidelines in Appendix C of the XHTML 1.0 specification.

    However … even if you send it as text/html for now, it must still be correct XHTML. You should, at any time, be able to switch the media type over to application/xhtml+xml and everything should still work. If you cannot do that, you are not ready for XHTML, but should stick to HTML.

    Using XHTML along with HTML-only practices that require that it be served as text/html is outright silly. I’ve seen people who use an XHTML doctype, serve it as text/html, and use document.write() etc., and still claim that they are “stricter” than HTML and that they are “future compatible”.

    I’ll say this again: while serving XHTML as text/html may be acceptable during a transitional period, it must still work when served as real XHTML. If not, there is absolutely no point in using an XHTML doctype. None.

  • Robert - author
    April 10th, 2005 at 0:35

    Tommy,

    I’ve also read the Appendix C of the XHTML 1.0 specification, and I totally agree with you.
    For now, I think I can overlook sending XHTML as text/html as long as I follow above mentioned guidelines.

    However, if using XHTML, it has to validate. What’s the point otherwise?
    To have almost-validating XHTML just for the purpose of using it and for educating other developers is, for me, not acceptable.

  • Jarvklo
    April 10th, 2005 at 0:36

    Well, I’m not trying to start any flame wars here or anything - but… ;)

    I would like to pitch in my two favourite cents for y’all to ponder for a while…

    Why not start asking the question the other way around - i.e. ask yourselves the question: Why not XHTML - and see where you’ll end up when you objectively analyse the answers you get?

    I mean - Call up http://w3.org/TR/html and you get a fresh copy of the XHTML 1.0 recommendation…

    Appendix C is alive and kicking and people seem to slowly get the message…

    XHTML roughly seems to equal web standards compliance in many corporate minds (i.e. amongst the moneypeople) if you believe some of the growing amount of hype that surrounds each new commercial “CSS-redesign”

    Support for “XML features” seems to be growing with each new browser release.

    and so on…

    Think about it ;)

  • Jeroen Mulder
    April 10th, 2005 at 0:38

    I recently wrote about it as well. I agreed with the fact that from a technological point of view XHTML isn’t very promising these days, but was disagreeing with the advocacy of not using XHTML.

    XHTML as a brand serves a ‘divine’ (if you’d like to call it that) purpose. It brings people closer to the wild and woolly world of semantics, CSS and improved accessibility.

  • Robert - author
    April 10th, 2005 at 0:38

    jarvklo,

    Not taken as flaming at all, which I’m not really interested in. I prefer a mature discussion and reasoned argument without things getting overly heated.

    > ask yourselves the question: Why not XHTML
    Well, that’s the question, isn’t it…? :-)
    I needed to sum it up and get some knowing people to comment it, to get a perspective of it!

    > XHTML roughly seems to equal web standards compliance in many corporate minds
    Oh, definitely! I’ve noticed this seems to spread more and more too.

  • Robert - author
    April 10th, 2005 at 0:40

    jeroen,

    I just read your post about XHTML and the thing that got to me was the last paragraph: “You and me know how to write good markup in HTML4. However, the inexperienced authors often do not…”

    I think that is a very important point, because if we were to avoid XHTML and code everything in HTML 4 it would probably open up a floodgate of nasty HTML, since more things would pass the validation.

    So, to sum all of this up:

    While we might not really take advantage of everything XHTML has to offer, while we might not always serve it as application/xhtml+xml, as long as we follow the Appendix C HTML Compatibility guidelines when serving it as text/html and make sure our code validates, it’s better from a semantic, correct coding, future XML thinking and business standpoint.

    Is this the general consensus?

  • Rimantas
    April 10th, 2005 at 0:40

    Why not XHTML?
    Because of SHORTTAG YES (http://www.w3.org/TR/REC-html40/sgml/sgmldecl.html)
    The fact that browsers did not implement it is a poor excuse.

  • Milan Negovan
    April 10th, 2005 at 1:44

    I’ve always been strongly against serving content as application/xhtml+xml on a business site. The margin for error is too big and the price of errors is not justifiable.

    Personally, I prefer XHTML. I try to get my code to validate as much as possible, but if it fails to validate 100% I know why it is so, and I just move on; I don’t get stuck on it.

  • Anonymous
    April 10th, 2005 at 1:45

    >Why not XHTML

    Really the question is why not XHTML served as HTML.

    Because it is no better in any way whatsoever than HTML 4 coded as described in this article. Anything about “temporary”, “forward-compatible”, “better standards”, “better structured”, etc., is an excuse or a lie.

    Because it breaks many basic javascripts.

    Because it is apparently slower to display in Mozilla.

    Because it has a few annoying restrictions (target and such).

    Because I personally can’t figure out how to get my editor (homesite+) to do XHTML tags unless the page I’m working on actually has the doctype on it, which usually isn’t the case with dynamic sites.

    The only reason to use XHTML served as XHTML is if it is displayed faster than HTML 4, but no one to my knowledge has yet demonstrated that to be the case. But I still would not use XHTML served as XHTML because one error shuts down your website.

  • Devon
    April 10th, 2005 at 1:45

    I use XHTML simply because then any ol’ HTML and XML parser can read my files/website. There’s more and more XML parsers out there and it’s relatively easy for someone to make a new one.

    If I made HTML pages, it would automatically be unreadable by every XML parser out there. Whereas XHTML (tho it’s non-proper HTML) is actually readable by a huge majority of HTML parsers because they’re all so soft on errors.

    I cannot and will not count on an XML parser to be soft on errors. It would be silly to.

    So the question I answer to myself is… do I want to cut out a load of modern day parsers (and future ones) that could crawl my site or do I just want to limit myself to a select group of parsers?

    It’s similar to browser checking or object checking. Which is smarter? Object checking. Why? Because otherwise you’re always updating, to keep up with the changes.

  • Robert - author
    April 10th, 2005 at 1:47

    Milan,

    Good to see you here!
    One important thing you mention in your article is:
    “ASP.NET isn’t ready yet to produce markup that can be served as XML with the application/xhtml+xml content type.”

    And this is a pretty big issue, lots of people develop with the .NET Framework (I was musing a little about this in XHTML and its value. The comments are in Swedish, I’m afraid).

    But with how to serve XHTML aside, what you mention is a big question: whether it’s ok or not to serve XHTML with a few (minor) validation errors, if you know about them. If you write XHTML and serve it as text/html, should you automatically be able to switch to serving it as application/xhtml+xml and it will work (as Tommy thinks, stated above), or is it ok to have minor validation errors until .NET/your Content Management System is ready for it, but still deliver XHTML?

  • Robert - author
    April 10th, 2005 at 1:48

    > The only reason to use XHTML served as XHTML is if it is displayed faster than HTML 4, but no one to my knowledge has yet demonstrated that to be the case. But I still would not use XHTML served as XHTML because one error shuts down your website.

    One reason might be that you want to be so strict about your code that it has to validate, that it shouldn’t be allowed to be rendered if it’s not valid.

  • Robert - author
    April 10th, 2005 at 2:17

    > So the question I answer to myself is… do I want to cut out a load of modern day parsers (and future ones) that could crawl my site or do I just want to limit myself to a select group of parsers?

    Of course you want to be as future compatible as possible, but it might also be a business decision. If you have a web site, intranet etc and your target audience will only use web browsers on a computer (PC, mac and so on), taking the extra time to deliver XHTML the right way may not be economically ok within your project.

    > It’s similar to browser checking or object checking. Which is smarter? Object checking. Why? Because otherwise you’re always updating, to keep up with the changes.

    Object checking is definitely the way you want to go, but there are always cases where you need to cater to browser specific bugs as well (for instance, where it claims to support something according to object checking, but then doesn’t/has buggy support for it).

  • Tommy Olsson
    April 10th, 2005 at 2:17

    > Why not XHTML
    Unless you can guarantee well-formed markup, you must not use XHTML. It doesn’t have to be valid, necessarily, but it absolutely must be well-formed.

    As Rimantas said, the fact that HTML specifies SHORTTAG YES should be a good reason to stay away from XHTML-P (XHTML-pretend, served as text/html) as well.

    I’m astonished when people say that they serve their “XHTML” as text/html, because the error handling is too strict when served as application/xhtml+XML. If you produce sloppy code that doesn’t even pass a simple well-formedness check, you are definitely not ready to use X(HT)ML.

    Why are people so hell-bent on using XHTML markup, but so reluctant to fulfill all the requirements?

  • Anonymous
    April 10th, 2005 at 2:18

    You should have a FxCop tool for best practice analysis of UI coding! :)

  • Robert - author
    April 10th, 2005 at 2:19

    Tommy,

    > If you produce sloppy code that doesn’t even pass a simple well-formedness check, you are definitely not ready to use X(HT)ML.

    I agree that you preferably shouldn’t use XHTML if you can’t have it well-formed.
    However, like you mentioned “It doesn’t have to be valid…”, does that mean that you think it’s ok to serve an XHTML page as text/html (following above-mentioned Appendix C guidelines, of course) with minor validation errors such as a name attribute on a FORM tag, an input type="hidden" and a language attribute on a script tag (these examples are the most common automatically generated from the .NET Framework)?

    > Why are people so hell-bent on using XHTML markup, but so reluctant to fulfill all the requirements?

    This is just my perception, but I don’t think people are intentionally reluctant to fulfill the requirements, I think that circumstances they can’t control might do that.
    But they still feel that the advantages of using XHTML outweigh minor validation errors (that don’t affect the well-formedness), and that it is ok for some refactoring to take place if/when, later on, switching to serving it as application/xhtml+xml.

    Also, it might be important from a business point of view: “XHTML is what is getting companies to become aware of Web Standards. Not HTML.” from Faruk Ates’ “The case for XHTML”.

  • Jeroen Mulder
    April 10th, 2005 at 2:20

    Robert,

    I think that sums it up very well! Not sure if it is the general consensus, but it is my consensus. ;-)

    As I described in my original entry — I’ll never drop XHTML as a brand, even though I am barely/not using XHTML’s technological advantages at all.

    Perhaps XHTML as the brand is deceiving and all, but right now it seems to be the lowest level of entry to ‘the other side’ for the lesser informed authors. They all know HTML (sort of)..

  • Robert Wellock
    April 10th, 2005 at 2:20

    As we know an XHTML document must be a well-formed XML document but not necessarily Valid.

    For a brief period of time, i.e. a couple of days I had about three occurrences of some files I had edited that were XHTML served as application/xhtml+xml that were well-formed though not validated - as I forgot - which was semi-embarrassing even though obviously they displayed correctly.

    The reason quite a few people don’t make use of the eXtensibility is that, even if they did, you end up having to compromise for the general public who use MS Explorer.

  • Matthijs
    April 10th, 2005 at 2:21

    From a practical point of view: what’s the difference? I mean, I learned how to build sites from sites like alistapart and Zeldman’s book DWWS. I know how to use CSS and separate content from presentation. The only thing I knew about doctypes was that that was the stuff that goes at the top of the page, so to speak. Only lately it seems everybody is screaming it’s soo bad and evil to put an XHTML doctype (served as text) at the top of your pages. But the only thing I know is that it doesn’t make any difference for my websites if I put an HTML or XHTML type up there. The only thing I must do if I would like to change the XHTML to HTML is get rid of the closing slashes in the img and br tags, isn’t it? So, I understand XHTML served as text isn’t real XHTML, but for a ‘normal’ (quotes!) website, does it matter?

  • Robert - author
    April 10th, 2005 at 2:22

    Well, that’s what the whole discussion here is about. :-) Is it worth it? What are the advantages of XHTML over HTML etc?

    Doctypes matter in the sense that they trigger different rendering modes in web browsers: a strict HTML 4, strict XHTML 1.0 or XHTML 1.1 doctype triggers strict rendering, XHTML 1.0 Transitional triggers Almost Standards Mode in Mozilla and the other ones (or the lack of one) trigger Quirks mode. Read more about that here.

    XHTML is a little more than just closing every tag; it also comes down to allowed attributes etc (and, of course, the possibility for other usage/extensions such as namespaces, MathML and so on).
    There’s an excellent article about developing with web standards over at 456bereastreet.com.

  • Matthijs
    April 10th, 2005 at 2:22

    Robert, yes I understand it’s a bit more than just closing tags! (ok, my comment was a little oversimplified ;) What I was trying to say is that I don’t know any different than to code in XHTML. If I wanted to change the doctype to HTML, I would have to go to (w3)school again to learn how to code properly in HTML, so to speak. Or could I just change my XHTML-strict for HTML-strict without touching my code? And then, what would be the difference?

    The rendering you mention is indeed something I have experienced. With websites I made I noticed that my CSS worked best if I used the xhtml1.0 strict type.
    But, this is an interesting discussion, and as I am no expert on this area, very educative.

    I can understand the question: “is it worth it, what are the advantages of XHTML over HTML?”
    But for me, and maybe for a lot of others, the question is also: is it worth the effort to change back XHTML to HTML?

  • Robert - author
    April 10th, 2005 at 2:23

    Matthijs,

    I didn’t want to come off as condescending, just maybe over-explanatory. :-)

    You raise some interesting questions in your comment:

    > Or could I just change my XHTML-strict for HTML-strict without touching my code?

    I may go out on a limb here, but to achieve that, all you (should) have to do is to remove the ‘/’ closing of tags, as for such tags as LINK, META, BR etc.
    Remove the namespaces in the HTML element.
    This assumes that you’ve only used XHTML as normal HTML for layout purposes.

    > is it worth the effort to change back XHTML to HTML?

    This situation is a bit different, it’s normally teaching how to do it the other way around! :-)
    However, from my point of view, as long as your XHTML validates, follows the above-mentioned Appendix C and works fine for you, I see no need to switch back just for the sake of it.
    But if it doesn’t validate or if you have other problems, switching back to strict HTML 4 and doing what I mentioned in the answer above should do the trick.

    > But, this is an interesting discussion, and as I am no expert on this area, very educative.
    Thank you, I hope it’s a giving topic and discussion!

    Also, send me an e-mail at robnyman@gmail.com so I can send you an invitation to Gmail, as promised in the post.

  • Mojo Jojo
    April 10th, 2005 at 2:23

    One of the advantages of validated mark-up is that you’re not relying on browser error correction. If you send XHTML as text/html (ensuring that the code will be parsed as HTML rather than as XHTML) you eliminate that advantage, you’re sending invalid HTML (since valid XHTML is *not* valid HTML) to browsers and thus are relying on their error correction to sort it out.

    You could send the XHTML as application/xhtml+xml but that has its own problems (lack of incremental loading in Gecko browsers is a pretty major one).

    So I’d say there is no real world advantage to XHTML but plenty of disadvantages.

  • Tommy Olsson
    April 10th, 2005 at 2:24

    > does that mean that you think it’s ok to serve…
    I don’t think I’d use the word “OK”, but at least it’s “acceptable”. As Robert Wellock said, the only requirement for XML is that it’s well-formed. Validation is optional. Of course, you lose the right to be upset if things don’t work if you don’t have a valid document. :)

    > I don’t think people are intentionally reluctant to fulfill the requirements…
    There are quite a few who complain that Mozilla et al throw up the Yellow Screen of Death just because they’ve forgotten to close one of their nested TABLEs, or because they can’t be bothered to encode ampersands properly in their URLs. So they send it as text/html and think that they are still using XHTML and are really future compatible. :)

    > From a practical point of view: what’s the difference?
    Matthijs, there are some major differences between HTML and XHTML served as application/xhtml+xml. The latter enforces well-formedness requirements; it requires the right XML namespace for the root element; tags, attributes and CSS selectors become case-sensitive; you cannot hide script code or CSS rules within SGML comments anymore; you must use DOM functions (the namespace-aware versions, like createElementNS) instead of document.write() or document.myElement.innerHTML.

    XHTML may look a lot like HTML, but it’s really XML with some built-in semantics that browsers are familiar with. Don’t be fooled by the similarities; they are very different beasts. Also, with properly-served XHTML you can use XML features like incorporating markup from other XML namespaces (e.g. SVG or MathML).
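
The unencoded-ampersand problem mentioned above is usually easiest to avoid at the point where markup is generated. A minimal Python sketch, assuming the standard xml.sax.saxutils helpers; the link function is hypothetical and only illustrates escaping attribute values and text before they are written into the page.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def link(href, text):
    """Build an <a> element with the URL and the link text XML-escaped."""
    # quoteattr escapes &, < and > and wraps the value in quotes;
    # escape does the same for character data, without the quotes.
    return "<a href=%s>%s</a>" % (quoteattr(href), escape(text))


markup = link("page.html?a=1&b=2", "Q&A")
print(markup)

# An XML parser now accepts the fragment and hands back the original URL.
print(ET.fromstring(markup).get("href"))
```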

  • Milan Negovan
    April 10th, 2005 at 2:25

    Robert, this has been an awesome discussion so far! I appreciate insightful comments.

    I’m a very pragmatic person. I write code each and every day. To me the question of whether to engage in content negotiation boils down to the question: “Is it going to affect my business?” I’m not trying to be selfish here, really. I’m trying to be reasonable. I’m a very meticulous person, but I have my sane limits.

    My preference is XHTML 1.0 served as text/html.

    I choose XHTML because it introduces at least a little bit of discipline to this HTML chaos. I’d choose a strongly-typed language over a loosely-typed one any day of the week. XHTML is somewhat closer to this paradigm than plain vanilla HTML with a very lax spec.

    I also choose the text/html MIME type because the true XML ones aren’t supported that well. User clients (browsers) can’t deal with even slight parsing slips in a nice enough way. This is where I see pragmatism: your business gets hurt… for a noble cause of code purity?

    Psssttt, Anne, feel free to disagree. :) I know you do, but I develop enterprise software and there’s no way in hell all 100% elements in a large product close or nest properly. That’s just the circle of life.

  • Jarvklo
    April 10th, 2005 at 2:25

    Oh well - I still don’t get it…

    Yes - I know the academiae.
    Yes - I know all the implications of sending XHTML as text/html to ancient browsers.
    Yes - I am fully aware of the SGML versus XML implications.
    Yes - I know all of the above, et cetera, et cetera, and I’ve heard the reasoning for not doing XHTML over and over again ad nauseam

    But I still don’t see how dropping the habit of using validating XHTML served as whatever MIME-type that is deemed appropriate by the W3C (which IMHO includes text/html following the guidelines in Appendix C and the (infamous?) media type note) in favour of HTML will help the web evolve… :p

  • Robert - author
    April 10th, 2005 at 2:27

    Mojo Jojo,

    I agree that it is a shame that most developers don’t/can’t take advantage of sending XHTML as application/xhtml+xml, to ensure that it is well-formed, but I totally understand the business reasons why they don’t. Who dares to take the risk that a page won’t render at all if someone adds something invalid to it? And, of course, we have the incremental loading thing…

    But I think it’s a bit too harsh to say that “So I’d say there is no real world advantage to XHTML but plenty of disadvantages”.
    The advantages, to me, are mentioned above with helping the web to evolve, to educate programmers in coding as strict as possible etc.
    Not to downplay these two, but what major disadvantages do you see except for lack of incremental loading and being regarded as tag soup when sent as text/html?

    (And as I wrote to Matthijs, send me an e-mail at robnyman@gmail.com so I can send you an invitation to Gmail, if you’re interested.)

    Tommy,

    I think it’s good that you clarified the difference between being well-formed and validating. Does that mean that, in such a case I mentioned above with those errors, you might yourself deliver something with such validation errors in one of your projects? Or would you without a question go for strict HTML 4 in such a case (sorry for just throwing questions back at you every time you comment :-))?

    > or because they can’t be bothered to encode ampersands
    I really hope that this problem doesn’t originate in lazy developers, but instead a CMS, commenting function on a web site or similar that delivers such code.

    Also, thanks for explaining even more to Matthijs about the differences.

    Milan,

    I think you’re the one whose situation is closest to mine. If it were solely up to me, I’d code perfectly well-formed and valid strict XHTML delivered as application/xhtml+xml.
    However, circumstances (depending on what project it is) pose problems to me such as the validation errors in the .NET Framework that you discuss in your article, CMS systems might spit out weird code and so on.
    And, from a business point of view when it comes to serving XHTML as application/xhtml+xml, as I wrote above to Mojo Jojo: “Who dares to take the risk that a page won’t render at all if someone adds something invalid to it?”.

    > Robert, this has been an awesome discussion so far!
    I bow my head in gratitude for your nice comment!

    Jarvklo,

    I share your opinion that XHTML has ignited a spark for the web to evolve, which is great! But the price for evolving is too high if the code isn’t even well-formed (as in, would break if served as application/xhtml+XML).

  • Robert - author
    April 10th, 2005 at 2:27

    To sum it up, there seem to be two camps (and then I don’t mean the obvious pro- and con XHTML ones):

    One camp consists of people that are leaning more towards being purists and want to serve XHTML correctly and have it well-formed and validating, no matter the cost. If one can’t live up to that, one shouldn’t use XHTML.

    The other camp comes more from a practical business perspective (with this, not saying that the first camp is all about theory). They want coding to evolve and be as strict as possible, but given (most of) the tools available on the market they’re aware that serving it as application/xhtml+XML is not an option for them, and that things might contain validation errors (but hopefully not well-formedness errors).

    I’m not interested in having a heated debate where people fight for their particular standpoint. How do we make these two camps meet? Is it even possible? I want to reach a middle-ground, what’s acceptable, where can we set the bar so it suits the majority?
    Is it, for the time being, serving XHTML as text/html (according to Appendix C), perhaps having validation errors of smaller significance (such as invalid attributes) but keeping it well-formed?

    I think it’s really important that we, instead of whining about it, try to find some common ground, for the sake of the web’s future.

  • Jewel
    April 10th, 2005 at 2:28

    Chiming in with a newbie’s point of view if that is ok :-) When I first began making websites a few years ago, I used WYSIWYG editors and never really got to grips with correct HTML, doctypes or anything like that. If the site displayed in IE, that was all I knew or cared about. Last year I started to learn about CSS and web standards, and understood that this required the use of XHTML. I then built my site using XHTML, and actually learned how to code properly. I am now reading many articles questioning the use of XHTML not served as XML, but it must be said that I expect there are quite a few newbies like me who only began serving acceptable web pages because we learned XHTML. To go “back” to HTML4 is not an option for us as we would have to learn it afresh. You really wouldn’t want to see the sort of websites we used to build before… As someone once said, “The road to hell is paved with nested tables and spacer gifs.” Our sites certainly qualified for that description!

  • Robert - author
    April 10th, 2005 at 2:28

    Jewel,

    I’m interested in hearing everyone’s view (even though I don’t regard you as a newbie)!

    Good on you for learning how to code correctly! And judging by your web site, you’ve come a long way.

    My personal opinion is as I said to Matthijs:
    “…as long as your XHTML validates, is according to the above-mentioned Appendix C and works fine for you, I see no need to switching back just for the sake of it.”.

    What you mention is interesting, because I’ve heard of a lot of people who got into correct coding through XHTML, and then started using CSS more, separating looks from HTML and so on, as kind of a bundle with learning how to do things right.
    And this is important, because it opens up the eyes of developers of how to actually do things the way they’re supposed to be done.

  • Jewel
    April 10th, 2005 at 2:29

    Well, although I have learned an enormous amount in the last 12 months, I still feel like a newbie in the blogging world (or blogosphere as I have heard it called). I am, however, beginning to feel confident enough to start joining in by adding comments here and there, so that is progress, isn’t it?
    Thanks for a very interesting discussion.

  • Daniel
    April 10th, 2005 at 2:30

    Why XHTML?

    I agree with Milan that it brings order to chaos. Just think where the web might be had we required well-formed XHTML from the beginning. I would bet that browsers would be a lot further along, with slimmer code-bases not needing all of their quirks-mode conditionals.

    It is sad that we are allowed Appendix C. It rewards lazy attitudes (to write invalid code) that have plagued the web forever. That said, content-negotiation becomes a necessary evil, at least for a few more years. However, this shouldn’t harm anything.

  • Jones
    April 10th, 2005 at 2:30

    My dear friends…

    THE HORSE IS DEAD.

  • Robert - author
    April 10th, 2005 at 2:30

    Jewel,

    Feeling ready to take part by commenting is definitely progress! :-)

    > Thanks for a very interesting discussion.
    Thank you for reading it and participating in it!

    Daniel,

    It would’ve been an interesting situation if the Mozilla family, IE 6 and Safari, for instance, only had supported well-formed and validating XHTML served as application/xhtml+XML.
    How the market would’ve had to change their products, how developers would’ve had to code correctly and so on.

    A Brave New World!

  • Matthijs
    April 10th, 2005 at 2:31

    First of all, I’m sorry but I only have more questions than answers here.
    Mojo Jojo said:
    “If you send XHTML as text/html (ensuring that the code will be parsed as HTML rather than as XHTML) you eliminate that advantage, you’re sending invalid HTML (since valid XHTML is *not* valid HTML) to browsers and thus are relying on their error correction to sort it out”

    To get things clear: does this mean the w3validator is nonsense? That I shouldn’t have to bother to validate my webpages (XHTML served as text), as it doesn’t matter anyway? I might be a bit confused here…

    @Tommy, thanks for your explanation. It’s getting clearer bit by bit. However, could someone fill in the gap here: what is the practical difference between placing a XHTML strict or HTML strict doctype at the top of my webpages? (that is, assuming the already mentioned differences in coding are dealt with). For example: what if I downloaded a copy of wordpress and use it for a weblog. Should I bother to change the template to HTML?

    I’m not trying to go against the arguments used against serving XHTML as text here, please let that be clear. I’m just trying to learn things and make a point from a practical point of view. That is, I want to make websites and make sure they are coded as best as possible, separating content, presentation and behavior, being accessible, etc etc. I think in this discussion one must not lose sight of the fact (?) that still a lot (most?) of webdesigners/agencies haven’t even heard of doctypes, let alone use them. If I look around at what code is produced for websites even by big companies…

    I agree with Robert here, that it’s important to find some common ground, some consensus, and bring the message out there, to improve the web.

  • Mojo Jojo
    April 10th, 2005 at 2:31

    @Robert

    Some other disadvantages of XHTML would be: the need to implement content negotiation, missing features such as document.write() (though it could be argued that losing document.write() is an advantage…), newbie confusion with things like name attributes and background-color styles on the body element, case-sensitivity (try validating this page :)), etc.

    Perhaps the phrase “no real world advantages” was going a little far, as you point out (and Jewel confirmed) there are valid “marketing” reasons for jumping on the XML “bandwagon”. Perhaps I should modify it to say there are “no real world advantages to individual web masters/designers/developers/etc”. If the XHTML blurb encourages more people to embrace standards then I can live with it.

    @Matthijs

    In one sense, yes: since you’re validating against an XHTML doctype but telling browsers to treat your code as HTML, what the validator says could be considered irrelevant. However, you should still validate your pages even if you send XHTML as text/html. Doing so will massively increase the chances of browsers correctly understanding your page (after all, HTML and XHTML, while different, are still fairly similar), it will mean that your pages won’t break if they’re ever sent with the correct MIME type (for example, imagine if the next version of Apache were to automatically do XHTML content negotiation by default) and it will catch any silly mark-up mistakes you make.
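
    To make the "won’t break if sent with the correct MIME type" point concrete, here is a minimal sketch of a well-formedness check in Python, using the standard library’s XML parser. It is only an approximation of what a browser’s XML parser does: it checks well-formedness, not validity against the XHTML DTD, and the function name `is_well_formed` is just an illustrative choice.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    samples = [
        "<p>Hello<br /></p>",   # every element is closed
        "<p>Hello<br></p>",     # <br> is never closed
        "<P>Hello</p>",         # start and end tag differ in case
        "<p>Tom & Jerry</p>",   # unencoded ampersand
    ]
    for sample in samples:
        print(sample, "->", is_well_formed(sample))
```

    Only the first sample passes. Note that this sketch does not load a DTD, so named entities such as &nbsp; would also be reported as errors here even though they are legal in XHTML.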

  • Robert - author
    April 10th, 2005 at 2:32

    Matthijs,

    The validator is not nonsense, you definitely should make sure that your code validates, for the reasons Mojo Jojo mentions above.

    > what is the practical difference between placing a XHTML strict or HTML strict doctype at the top of my webpages…
    Basically, the practical difference (if you’re only using XHTML as HTML, i.e. none of the extra functionality it offers) is non-existent. Both of them will trigger the standards mode layout. The difference of the content, however, is that XHTML served as text/html is regarded as tag soup by the web browsers, but accepted by them (due to their error correction, as Mojo Jojo states).

    > …wordpress and use it for a weblog. Should I bother to change the template to HTML?
    I know nothing about the code that WordPress generates so I’ll leave this question to someone else. But to me, if it generates valid XHTML, I see no need to recode it to HTML.

    > webdesigners/agencies haven’t even heard of doctypes, let alone use them
    This, to me, is an even bigger reason to get people to write correct code, be it strict HTML 4 or XHTML. Sorry for the cliché now, but the web will only be as good as we developers make it. There’s a huge difference between doing it correctly and doing it so it might, hopefully work, if one is lucky.

    We need to spread the knowledge how to do things right. And agree what is right enough. :-)

    Mojo Jojo,

    Regarding the disadvantages you mention:

    - Content negotiation:
    The question is if this is necessary, or if we can be content with sending it as text/html to all web browsers (to avoid the incremental loading problem etc), for the moment.

    - Newbie confusion with things like background-color styles on the body element, case-sensitivity
    The other way around, I see this as a reason to use it, to teach people how to code properly!

    - try validating this page :)
    I know… :-(
    I had some minor errors in the template I use, but they should be corrected now. The Blogger commenting functionality, for some reason, generates upper-case tags and allows people to use deprecated elements. It saddens me, but is something that I can’t affect for now.

    > If the XHTML blurb encourages more people to embrace standards then I can live with it.
    That’s the hope!

  • Tommy Olsson
    April 10th, 2005 at 2:33

    Wow, this debate is really raging on, huh? The poor horse must be mincemeat by now :)

    “But I still don’t see how […] will help the web evolve…”
    It won’t. But I still haven’t seen a satisfactory explanation of how using an XHTML doctype on old-skool tag soup that would crash and burn if served as real XHTML will help the web evolve, either. ;)

    Robert: Would I serve invalid, but well-formed XHTML? I don’t know, to be perfectly honest. It would be an indication that something is not quite right somewhere in the publishing process. I think I’d start with attempting to rectify the problem, i.e. remove the invalid stuff. But if all else failed, yeah, I might consider that.

    “Last year I started to learn about CSS and web standards, and understood that this required the use of XHTML.”
    jewel: I’m sorry to hear that you were lied to. Web standards and CSS do in no way whatsoever require the use of XHTML. CSS works perfectly well with HTML, since it was designed for just that. XHTML came along a few years later.

    Many people seem to think that HTML must be a horrid soup of uppercase tags, omitted end tags, and presentational markup. They also seem to believe that XHTML somehow prevents this, even when sent as text/html. XHTML 1.0 is a reformulation of HTML 4.01 as an XML application. It contains nothing more, nothing less than HTML 4.01. It’s not more semantic. It’s a little bit more strict, as it requires well-formedness while HTML allows some end tags to be omitted, but only when served as real XHTML.

    XHTML served as text/html is, as has been mentioned before in this discussion, nothing more than badly written HTML. You can take any old tag soup HTML document from 1995 and slap an XHTML doctype on it, and it will look exactly the same.

    “what is the practical difference between placing a XHTML strict or HTML strict doctype at the top of my webpages?”
    Matthijs: The doctype declaration affects validation (and in many modern browsers also the rendering mode). Using an XHTML doctype means it should be validated as XHTML, so the W3C validator is not wrong. However, it’s not the doctype declaration but the media type (a.k.a. content type or MIME type) that determines how a user agent should interpret the document. A media type of text/html requires a user agent to interpret it as HTML. The doctype declaration has absolutely nothing to do with it.

    If you validate your XHTML, you make sure that your markup adheres to the syntactical rules of XHTML. When you serve that as text/html, the user agent interprets it as HTML and will have to rely on its error handling to fix the things that differ between the two.
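
    A deliberately simplified model of that rule, in Python: the parsing mode is picked from the Content-Type header alone, and the doctype never enters into it. The function name `parsing_mode` and the three return labels are invented for this sketch; real browsers have more cases than this.

```python
# Media types that a user agent hands to its XML parser (simplified list).
XML_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def parsing_mode(content_type: str) -> str:
    """Return "xml", "html" or "other" based only on the Content-Type header."""
    # Drop parameters such as "; charset=utf-8" and normalise case,
    # since media type names are case-insensitive.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in XML_TYPES:
        return "xml"
    if media_type == "text/html":
        return "html"
    return "other"


if __name__ == "__main__":
    # The same XHTML document is treated differently depending on the header.
    print(parsing_mode("text/html; charset=utf-8"))
    print(parsing_mode("application/xhtml+xml"))
```

    The document itself is not even an argument to the function, which is the whole point: an XHTML doctype on a text/html response changes nothing about which parser is used.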

    Separation between structure, presentation and behaviour is something to strive for, definitely. It has nothing, however, to do with XHTML vs HTML. It has a lot to do with Strict vs Transitional DTD, though.

    I know some people think I’m a sad old reactionary who wants evolution to stop with HTML 4.01. That is not quite true. I would love to see XHTML come to its full potential, but few of its contemporary proponents are probably prepared for the changes that would incur.

    Personally, I think it’s much more important, for the evolution of the web as we know it, to convince people to switch from a Transitional DTD to a Strict DTD. Whether they use HTML 4.01 Strict or XHTML 1.0 Strict is of far lesser importance, although I’ll stand by my earlier statement: if you use an XHTML doctype declaration, the document must work if sent as application/xhtml+XML. Even if you serve it as text/html due to the lack of browser support.

  • Robert - author
    April 10th, 2005 at 2:33

    Tommy,

    > Wow, this debate is really raging on, huh? The poor horse must be mincemeat by now :)
    Well, I guess… :-)
    I just hope that we (all of us) will at least get closer to each other and understand that we face totally different situations and circumstances.

    > …But if all else failed, yeah, I might consider that
    I think this is the case for many developers who aren’t lazy and try to get it to be correct, but face circumstances that they can’t control. Of course every developing environment (e.g. .NET Framework), CMS etc should be able to deliver valid strict XHTML. Unfortunately, this is not the case, hence things might not be valid even if the developer had the best intentions.

    > jewel: I’m sorry to hear that you were lied to
    There’s no connection, but it certainly didn’t harm coders and their learning that an XHTML hype seemed to coincide/get blended with a separation/CSS hype.

    > Many people seem to think that HTML must be a horrid soup of uppercase tags, omitted end tags, and presentational markup.
    > XHTML… It’s a little bit more strict
    But don’t you think that using HTML opens up for more sloppiness in the coding, especially when many developers work on the same code and not all developers are that experienced?
    Then all they can rely on is the validation, where they can get away with more bad habits in HTML.
    It’s easier to teach people XHTML in that sense that it has to be well-formed, that every tag has to be closed, as opposed to HTML where most tags have to be closed but there are exceptions to the rule, which will most probably lead to that developers start getting sloppy about closing tags and eventually stop closing some of the tags that have to be closed.

    > You can take any old tag soup HTML document from 1995 and slap an XHTML doctype on it, and it will look exactly the same.
    True, but I think/hope people don’t realize this and try to do it! :-)

    > I know some people think I’m a sad old reactionary who wants evolution to stop with HTML 4.01…
    I don’t think so! I think it’s good that you aim for the best and valid code possible, but (as written above) I just want to bring up more practical scenarios where one might not have full control etc, but really see an advantage with it anyway.

    > to convince people to switch from a Transitional DTD to a Strict DTD
    This is the least we have to do! This, together with the separation of content, presentation and behavior, is the most important thing we have to do and inform others about. But XHTML is in a close second place… :-)

    > if you use an XHTML doctype declaration, the document must work if sent as application/xhtml+XML
    I agree.

    To sum it up, I think I have to say my current standpoint is this:

    The most desirable is, of course, to write well-formed and valid XHTML and deliver it with the XHTML MIME type.

    If using the XHTML MIME type isn’t an option (for whatever the reason, except lazy developers), one can either do content negotiation or serve it as text/html to all web browsers. Still ok, for now, since the current web browsers’ error handling has no problems with it.

    And, in a worst-case scenario, if it’s well-formed but has some minor validation errors (but it would still work when sent with the XHTML MIME type), it is ok.

    So, conclusively, I think the lowest I can go to use XHTML and get some of the benefits/avoidance of problems mentioned above is:
    Served as text/html, minor validation errors (such as an incorrect attribute) but still well-formed.

  • Daniel
    April 10th, 2005 at 2:34

    Tommy, you’ve said it well.

    I completely agree that a Strict DTD is the way to go. If you write XHTML it is easily converted to HTML. Personally, I write strict XHTML 1.1, and do content-negotiation and convert to HTML on-the-fly as necessary (I’ve even thought of converting my utf-8 data to latin-1, but I’m holding out).

    This may seem like an unneeded extra step, but it assures that I’m doing the right thing by all browsers.
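
    For readers who haven’t seen content negotiation in practice, here is a small Python sketch of the media type decision only (it does not do the XHTML-to-HTML conversion, and it is not Daniel’s actual setup). The function names are made up for illustration. It assumes the server should send application/xhtml+xml only when the browser lists that type explicitly in its Accept header with a quality value at least as high as text/html; a bare */* is not treated as XHTML support.

```python
def parse_accept(header: str) -> dict:
    """Parse an HTTP Accept header into {media_type: q_value}."""
    prefs = {}
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        quality = 1.0  # q defaults to 1 when omitted
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q as "not acceptable"
        prefs[media_type] = quality
    return prefs


def choose_media_type(accept_header: str) -> str:
    """Pick the media type to serve an XHTML document with."""
    prefs = parse_accept(accept_header or "")
    xhtml_q = prefs.get("application/xhtml+xml", 0.0)
    html_q = prefs.get("text/html", 0.0)
    if xhtml_q > 0 and xhtml_q >= html_q:
        return "application/xhtml+xml"
    return "text/html"


if __name__ == "__main__":
    print(choose_media_type("application/xhtml+xml,text/html;q=0.9,*/*;q=0.5"))
    print(choose_media_type("*/*"))
```

    When the answer depends on the Accept header like this, the response should also carry a "Vary: Accept" header so that caches don’t hand the XHTML version to a browser that can’t handle it.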

    This is why I hate Appendix C. We should teach people that the ONLY way to do XHTML is as application/xhtml+XML and promote content-negotiation/conversion.

    There’s really no way to enforce this (old browsers will still try to render), but allowing Appendix C muddies the water.

  • Robert - author
    April 10th, 2005 at 2:34

    Daniel,

    As stated above, I agree 100% about going with strict.
    But regarding your hate for Appendix C, is it mainly because you don’t want to deliver tag soup (i.e. XHTML as text/html) or that you think developers will be too lazy and take the easy way out?

  • Tommy Olsson
    April 10th, 2005 at 2:35

    > it certainly didn’t harm coders and their learning that an XHTML hype seemed to coincide/get blended with a separation/CSS hype
    Maybe it didn’t harm the coders, but it did irreparable harm to XHTML as a concept. :(

    > But don’t you think that using HTML opens up for more sloppiness in the coding
    I’m sure that sloppy developers will use it as an excuse for sloppy markup. I don’t argue with that, but I want to point out that you can write HTML 4.01 that is virtually indistinguishable from Appendix-C-style XHTML 1.0 (minus a few slash characters). Just because HTML allows some shortcuts, for historical reasons, doesn’t mean that you must or even should be sloppy.

    >> You can take any old tag soup HTML document from 1995 and slap an XHTML doctype on it, and it will look exactly the same.
    > True, but I think/hope people don’t realize this and try to do it! :-)
    Just take a look at any so-called XHTML site served as text/html out there. Like http://www.spv.se/ for instance. Look at that markup and tell me how it helps the web to evolve. Tell me how that is stricter than HTML. Tell me how XHTML really forces developers to separate structure from presentation and behaviour. Tell me how it is more future-proof than HTML. XHTML is no panacea. It can be abused just like the HTML it is, as long as it’s served as text/html. And if people get away with something easy, they will use it rather than doing it the right way if that’s harder. Unfortunately they are not exactly unique. I’d guess that 99.9% of all purported XHTML sites on the web are like that. That’s why I wrote the ‘XHTML is dead’ article a while back. (In that I vowed to stay out of this sort of discussions, too, but somehow I still find myself getting dragged into them. :))

    Daniel: Thanks. I use the same sort of content negotiation on my blog, at the moment. The next incarnation will probably use only HTML 4.01 Strict, though. I don’t use SVG or MathML or anything else that requires XHTML, so using it is just plain silly. There, I’ve admitted it. :)

  • Jarvklo
    April 10th, 2005 at 2:35

    Well..
    I had written a long comment here, but I decided against submitting it after a quick preview since I really would like to see Robert’s wish for constructively trying to find a middle ground here come true…

    So I’ll say this instead:
    Do you remember how “impossible for commercial use” and “just plain impossible” CSS layouts were considered before sites like the CSS Zen Garden? Why don’t we try to come up with an abundance of positive examples of XHTML “done just right” instead of just fighting the same flame war over and over as soon as someone tries to write something on the current subject?

    Who’ll be the first to create a “30 days to a perfectly built and served XHTML 1.0 Strict based site” series?

  • Drew Decker
    April 10th, 2005 at 2:36

    Hey good writeup. I wrote an introduction to XHTML and stylesheets myself on my site to bring the beginner in…

    http://www.dev-news.com/index.php?p=30

    Drew

  • Robert - author
    April 10th, 2005 at 2:36

    Tommy,

    > Just because HTML allows some shortcuts, for historical reasons, doesn’t mean that you must or even should be sloppy.

    Of course not. But in most real-life scenarios (at least the ones that I seem to run into), people less skilled in HTML are also involved (read: system developers with lack of respect for/interest in interface code knowledge). In those cases, I believe that giving them the option to code HTML is a bigger risk for getting it more sloppy than telling them to code XHTML and close all tags.

    > http://www.spv.se/
    It’s terrible.

    But I think we do agree on the most common denominator: That the XHTML has to be well-formed and, in that, being correct enough to be able to serve it as application/xhtml+XML. Regarding minor validation errors, I’m willing to overlook them if it lives up to the previous sentence.
    I still think writing XHTML well-formed (with possible minor validation errors) helps the web to evolve and will be stricter than with HTML.

    However, it won’t help with this: “Tell me how XHTML really forces developers to separate structure from presentation and behaviour”.

    > I don’t use SVG or MathML or anything else that requires XHTML, so using it is just plain silly. There, I’ve admitted it. :)
    I don’t agree that the only case where XHTML is motivated is when using features that require XHTML (see above motivation). :-)

    Jarvklo,

    Thank you for the respect to keep it on a constructive level.

    > Who’ll be the first to create a “30 days to a perfectly built and served XHTML 1.0 Strict based site” series?
    This would be great! I thought Zeldman’s book Designing with Web Standards would be this, but I heard that he unfortunately uses the XHTML Transitional Doctype in his examples… :-(

    Drew,

    > Hey good writeup.
    Thank you! And thanks for the link.

  • Tommy Olsson
    April 10th, 2005 at 2:37

    It’s funny, but in a project where you have developers who are “less skilled in HTML”, I see things completely opposite to your point of view. Unskilled developers should absolutely not work with XHTML, because even the slightest well-formedness error will kill the page when served properly. Unskilled developers should be educated, but until they are proficient, they should use HTML where the error handling is less draconic.

    The notion of allowing the “XHTML” from such developers to be served as text/html only, to avoid the Yellow Screen of Death for well-formedness errors, is something I consider very detrimental to the future of web standards. This is exactly the category that should use application/xhtml+XML, so that their errors are caught quickly.

  • Robert - author
    April 10th, 2005 at 2:37

    > Unskilled developers should absolutely not work with XHTML

    Unfortunately I don’t get to choose all the members and specify their skills in a project group.

    > This is exactly the category that should use application/xhtml+XML

    Yes, and that’s what I’m going for, that would be ideal. But i