There is once again talk about versioned web pages. Unfortunately there is also the same continuing confusion between theory and reality.

A List Apart: Articles: Beyond DOCTYPE: Web Standards, Forward Compatibility, and IE8

The DOCTYPE switch is broken

Back in 1998, Todd Fahrner came up with a toggle that would allow a browser to offer two rendering modes: one for developers wishing to follow standards, and another for everyone else. The concept was brilliantly simple. When the user agent encountered a document with a well-formed DOCTYPE declaration of a current HTML standard (i.e. HTML 2.0 wouldn’t cut it), it would assume that the author knew what she was doing and render the page in “standards” mode (laying out elements using the W3C’s box model). But when no DOCTYPE or a malformed DOCTYPE was encountered, the document would be rendered in “quirks” mode, i.e., laying out elements using the non-standard box model of IE5.x/Windows.

The article referenced above goes on to describe a hideously complex new versioning mechanism which, if adopted, is pretty much guaranteed to cause grief. We do not need anything so complex. DOCTYPE works, but what DOCTYPE means may not be what you expect.

This topic is very old and very familiar when developing distributed applications. If you have two independent machines with a network in the middle, sooner or later you are going to be running different software versions on the two (or more) machines. This forces you to think through the issues, and after twenty-odd years working on distributed applications, on this topic I have no doubt as to what works.

What you have to do is version the data. In a distributed application, this is the data that goes across the web. A single version number is sufficient. Note that this version number is not the version of the application.
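As a minimal sketch of what versioning the data can look like, assume a hypothetical JSON message format in which every message carries a single integer `version` field. The field names, the two payload layouts, and the `decode` helper are all made up for illustration. The receiver picks a decoder based on the data version alone and never asks what version of the application sent the message.

```python
import json


def decode_v1(payload):
    # Version 1 of the data: the name is a single string.
    return {"name": payload["name"]}


def decode_v2(payload):
    # Version 2 of the data: the name is split into two fields.
    return {"name": payload["first"] + " " + payload["last"]}


# One decoder per data version. Old entries stay in place for as long
# as old data may still arrive.
DECODERS = {1: decode_v1, 2: decode_v2}


def decode(raw):
    """Decode a raw JSON message according to its declared data version."""
    message = json.loads(raw)
    version = message["version"]
    decoder = DECODERS.get(version)
    if decoder is None:
        raise ValueError(f"unsupported data version: {version}")
    return decoder(message["payload"])
```

With this shape, a sender running old software and a sender running new software can both talk to the same receiver, because each message says how it is meant to be read.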

The DOCTYPE is exactly the version number we need, and it is all we need. In theory the DOCTYPE was meant to indicate exact compliance with a particular W3C specification. In reality the DOCTYPE means something else.

Up until the release of IE7, a web page that declared <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"> had, with very high probability, been tested and found to work under IE6, and may have been tested on other then-current browsers. The DOCTYPE does NOT indicate that the page works in an HTML 4.01 compliant web browser. There was (and is?) no browser that exactly implemented HTML 4.01, so exact 4.01 compliance in the web page is impossible to assert. Lacking a reference implementation of the HTML standard, web developers can only test against the most widely-used web browser(s).

In effect the DOCTYPE string is the version of the data, but it is only vaguely related to the W3C specification.
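To make the idea concrete, here is a rough sketch that treats the DOCTYPE public identifier as nothing more than a version key. The regular expression handles only the simple `<!DOCTYPE HTML PUBLIC "...">` form, and the lookup table and its labels are hypothetical. No browser is claimed to work this way.

```python
import re

# Matches only the simple form: <!DOCTYPE HTML PUBLIC "identifier" ...>
DOCTYPE_RE = re.compile(r'<!DOCTYPE\s+html\s+PUBLIC\s+"([^"]+)"', re.IGNORECASE)


def data_version(document):
    """Return the DOCTYPE public identifier, or None if there is none."""
    match = DOCTYPE_RE.search(document)
    return match.group(1) if match else None


# Hypothetical table: each known version string maps to one fixed
# interpretation of the page. The labels are illustrative only.
INTERPRETATIONS = {
    "-//W3C//DTD HTML 4.01//EN": "html-4.01-as-deployed",
    None: "quirks",
}


def choose_interpretation(document):
    """Pick an interpretation from the version string alone."""
    return INTERPRETATIONS.get(data_version(document), "unrecognized")
```

The point of the table is that an entry, once published, keeps meaning what it meant when pages were tested against it. A different interpretation would get a different key.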

On the release of IE7, the Microsoft folk fumbled. In changing the interpretation of web pages (no matter how well-intentioned) they changed the meaning of the data. When you change the meaning of the data, you MUST bump the version number.

If you lose the confusion between theory and reality, the DOCTYPE is all we need to exactly version the data (HTML).