
Misunderstanding markup

The W3C announced last week that the XHTML 2 Working Group will wrap up at the end of this year. This should have been a straightforward, welcome announcement. Instead it has confused a lot of people who believe that it heralds the end of XHTML—see, for example, the comments on Zeldman’s blog post.

This confusion is understandable given the lamentable names that have been assigned to different technologies. This isn’t the first time this has happened…

JavaScript sounds like it has something to do with Java. It doesn’t. Apart from some superficial syntactical similarities, they have nothing in common. Java is to JavaScript as ham is to hamster.

DHTML sounds like it has something to do with HTML. It doesn’t. DHTML is a catch-all term to describe the action of updating the CSS properties of HTML elements using JavaScript. I have my own catch-all term for the combination of HTML, CSS, and JavaScript; I call it web development.

And so to XHTML 2. You’d be forgiven for thinking it has something to do with XHTML 1.0 or XHTML 1.1. It doesn’t.

XHTML 1.0 is simply a reformulation of HTML 4 with XML syntax:

  • lowercase tag and attribute names,
  • quoted attribute values,
  • mandatory closing tags for p and li elements,
  • a slash at the end of standalone elements like img, br, and meta.
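
Those four rules can be sketched as a before-and-after fragment. The content, class name, and file name are made up for illustration; only the syntax matters here.

```html
<!-- HTML 4 style: uppercase names, an unquoted attribute value,
     and the optional closing tags for li and p left out -->
<UL CLASS=gallery>
  <LI>First item
  <LI><IMG SRC="photo.jpg" ALT="A photo"><BR>Second item
</UL>
<P>A paragraph with no closing tag

<!-- The same content written with XHTML 1.0 syntax -->
<ul class="gallery">
  <li>First item</li>
  <li><img src="photo.jpg" alt="A photo" /><br />Second item</li>
</ul>
<p>A paragraph with a closing tag</p>
```

Both fragments describe the same elements; the difference is purely in how they are written down.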

XHTML 1.1 is the same reformulation but with the added unrealistic demand that documents must be served with an XML MIME type.

XHTML 2, by contrast, has very, very little in common with HTML 4. It was an attempt at a fresh start, to create a theoretically “pure” vocabulary with little concern for backwards compatibility. It was, of course, doomed to failure:

This was a philosophically pure specification that was so backwardly incompatible that it nearly deprecated the img element.

Now that XHTML 2 is dead, some people think that this means XHTML is dead. It doesn’t.

Henri Sivonen apparently attempts to clear up this confusion by writing An Unofficial Q and A about the Discontinuation of the XHTML2 WG which, alas, is not very clearly written. Also, Henri, as pointed out by John Allsopp, your snark is showing:

There are two meanings to XHTML: technical and marketing. The technical kind (XHTML served using the application/xhtml+xml MIME type) is a formulation of HTML as an XML vocabulary. The marketing kind (XHTML served using the text/html MIME type) is processed just like HTML by browsers but the authors attempt to observe slightly different syntax rules in order to make it seem that they are doing something newer and shinier compared to HTML.

Belittling authors who prefer a stricter syntax is no way to explain technical differences between formats.

There are perfectly good reasons for choosing to use the XHTML syntax. Take, for example, Drew’s comment:

Whenever this argument surfaces, there seems to be the assumption that loose syntax is easier for beginners. This baffles me. In my experience simple, strict rules are much easier to learn and code to than loose rules with multiple shortcuts. I like XHTML because attributes must always be quoted. Tags must always be closed. These are simple rules that require no thought, and result in uniform, predictable markup.

I’m not saying that XHTML syntax is better or worse than HTML syntax. I’m saying it’s a personal choice. If you prefer a different syntax to me, that doesn’t mean that one of us is wrong. If I like Thai food and you prefer Italian, neither of us is wrong.

The death of XHTML 2 does not mean the death of XHTML syntax. If you want to continue to close all tags and quote all attributes, you can do so. You can either use the existing XHTML 1 spec or you can use HTML 5.

That’s right; HTML 5 allows you to use whichever syntax you are most comfortable with. Doctor Bruce has the diagnosis:

I like the XHTML syntax. It’s how I learned. I’m used to lowercase code, quoted attributes and trailing slashes on elements like br and img. They make me feel nice and comfy, like a cup of Ovaltine and The Evil Dead on the telly.

But you might not. You might want SHOUTY UPPERCASE tags, no trailing slashes and attribute minimisation. And, in HTML 5 you can choose.

Thanks to the “pave the cowpaths” principle, it’s up to you. As you like it. What you will. Whatever you want, whatever you like.

If you want, you can even serve your documents as application/xhtml+xml, instantly transforming them from HTML 5 into XHTML 5 …yes, another confusing name.
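
As a rough sketch of what that involves: when a document is served as XML, the XHTML namespace on the root element is what tells the browser these are HTML elements, and the markup has to be well-formed. The title and text below are placeholders, and the header in the comment assumes you can configure your server to send it.

```html
<!-- Assumes the server sends: Content-Type: application/xhtml+xml -->
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
  <title>Placeholder title</title>
</head>
<body>
  <p>Every element here is closed and every attribute is quoted.<br />
  An XML parser expects nothing less.</p>
</body>
</html>
```

Served as text/html instead, the same markup is simply treated as HTML.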

Just remember, XHTML 2, the spec, has nothing to do with XHTML, the syntax. XHTML lives on in HTML 5.

But, but, but…, I hear you cry, surely that does us no good because HTML 5 isn’t supported yet, right?

Define support. HTML 5, unlike XHTML 2, is designed to be backwards compatible. So here’s how you can take an existing XHTML 1 document and convert it to HTML 5…

Take this line:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

Replace it with:

<!DOCTYPE html>

Done.
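
For illustration, here is a minimal sketch of a whole page after that swap. The title, text, and image file name are hypothetical; the point is that everything below the first line is unchanged XHTML 1 markup, and it should still be acceptable under HTML 5.

```html
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <title>Example page</title>
</head>
<body>
  <!-- Lowercase names, quoted attributes, closed elements, trailing slashes -->
  <p>
    <img src="photo.jpg" alt="A placeholder image" /><br />
    Nothing else had to change.
  </p>
</body>
</html>
```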

XHTML 2 is dead. Long live XHTML …as HTML 5.

Update: What Zeldman said.