Clarifying HTML

2007-03-14

HTML and the DOM as currently defined are quite a hash. Given the history of web browsers, this is understandable, if unfortunate. Browser makers are not the only culprits, as the W3C HTML standards are ... a little funky.

It sounds like the W3C HTML working group is opening up to more general participation. The charter seems to have a more pragmatic flavor. This could be a good thing. I followed the steps for joining (as an individual).

Possible goals for reinventing HTML:

Do no harm (or at least as little as possible) : This means HTML5 documents should render in existing mainstream browsers (IE6/7, Firefox, Safari) as much as is practical.

Allow for extensibility : HTML in combination with JavaScript is a programming platform. Each site is an application. The generated HTML is in effect a dialect, potentially with node and attribute names meaningful only to that site's application. Draw on Lisp/Scheme as a model (not for specifics).

Provide a clear/simple/minimal model : HTML and the DOM are at present quite a hash. There should be one clear model for a programmer to follow. This will mean removing some of the left-over debris. When in doubt, leave it out.

Digesting HTML, CSS, the DOM, JavaScript, and some of the (many) variations tried in building web applications took a while. With any new area I need to grok the current state before I feel comfortable working. The end result is a personal internal model of how things work and could or should be used. In the end I believe the model for HTML programming could and should be both much simpler and more flexible. As a model I am back to drawing on an approach probably very familiar to Lisp/Scheme programmers.

An HTML document is simply a tree of typed nodes. With each node there are per-node attributes, and per-node-type common properties. Default behaviors and style are associated with each node type - some of which can be overridden (with varying degrees of success).
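As a rough sketch of that model (not how any browser actually implements it), the tree can be written as plain JavaScript data. The names typeDefaults, node, and effective are made up for illustration, and the default values are placeholders rather than anything taken from a spec. Folding attributes and style-like properties into one object is a simplifying assumption.

```javascript
// Hypothetical per-node-type common properties; values are illustrative only.
const typeDefaults = {
  p:  { display: "block",  onActivate: null },
  a:  { display: "inline", onActivate: "follow-link" },
  em: { display: "inline", onActivate: null },
};

// A node is a type, its own attributes, and an ordered list of children
// (child nodes or plain text strings).
function node(type, attrs = {}, ...children) {
  return { type, attrs, children };
}

// Effective properties: start from the node type's defaults,
// then let per-node attributes override them.
function effective(n) {
  return { ...(typeDefaults[n.type] || {}), ...n.attrs };
}

// <p>See <a href="/spec">the spec</a>.</p>, with the link overriding its default display.
const doc = node("p", {}, "See ", node("a", { href: "/spec", display: "block" }, "the spec"), ".");
const link = doc.children[1];

console.log(effective(link));
```

Here the link keeps the default behavior of its type (onActivate) while overriding one default (display). Unknown node types simply get no defaults, which is the kind of exception-free rule the model is after.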

As a programmer, I can (to some degree) re-invent the presentation and behavior of existing HTML nodes to meet the needs of my application ... with lots of little exceptions. I would like to simplify the model by removing the exceptions. As a programmer, I could achieve the greatest clarity of expression by adding named attributes and nodes as needed for my application. An HTML page is a data structure, very much like a Lisp S-expression - and I think there is a benefit in treating it as such, and in making this treatment explicit.
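To make the S-expression comparison concrete, here is a minimal sketch that prints a node tree in a Lisp-like notation. The cart-item node and its sku and qty attributes are invented application-specific names, and the (@name value) form for attributes is just one possible convention, not an established syntax.

```javascript
// Render a node tree as an S-expression string.
// Text children become quoted strings; attributes become (@name "value") pairs.
function toSexpr(n) {
  if (typeof n === "string") return JSON.stringify(n);
  const attrs = Object.entries(n.attrs).map(([k, v]) => `(@${k} ${JSON.stringify(v)})`);
  const kids = n.children.map(toSexpr);
  return `(${[n.type, ...attrs, ...kids].join(" ")})`;
}

// A hypothetical application dialect: a node and attributes
// that are meaningful only to this site's application.
const cart = {
  type: "cart-item",
  attrs: { sku: "A-100", qty: "2" },
  children: [{ type: "em", attrs: {}, children: ["Widget"] }],
};

console.log(toSexpr(cart));
// prints: (cart-item (@sku "A-100") (@qty "2") (em "Widget"))
```

The point of the sketch is that application-defined names need no special handling: they sit in the same structure as the built-in ones.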