TokenLists: Missing Web DNA

A long long time ago, in a browser far far away, Brendan Eich introduced what would become known as “DOM Level 0” – basically, simple reflective properties that allowed you to access useful bits of what would later become the “DOM” and twiddle with them.  It looked something like this…

document.forms[0].firstName.value = "Brian";

However, there is a long, complex and twisted history that led us to where we are today (see my A Brief(-ish) History of the Web Universe series of posts).  To sum up some key bits: CSS and the actual DOM were conceived of separately from thoughts of DOM0 and JavaScript.  Unlike its predecessor, the DOM was intended to be generic – in trying to bring a lot of people together and “fix” the Web, there was a lot of focus on trying to address the problems of SGML that made HTML so appealing in the first place – so we got DTDs for HTML and work began on all things XML.

The DOM was intended to serve all masters, and as such it dealt with basic attributes in a tree which could be serialized, manipulated, parsed and rewritten in any language with a common interface.  This meant that authors would use getAttribute(attributeName) and setAttribute(attributeName, value) to get and set attribute values respectively.  It seemed absolutely logical to those spec writers, then, to create an attribute called “class” and allow a user to type:

element.setAttribute("class", "intro")

This was problematic in the browser, though, because DOM Level 0 was not only better known, but far more terse and convenient for authors who really just wanted to deal with reflective properties and type something like:

element.class = "intro";

What’s more, not all attributes were reflected by properties.  Properties were just easier to deal with – they made sense for runtime state, while attributes seemed less suited to it.  To resolve this we got the .className property, which reflects the class attribute.  Problem solved… Except, not.

Simple Enough?

CSS says that any element can specify 0…N classes, not 0 or 1, in a space separated list.  In SGML/XML terms these were “NMTokens”.  It sounds quite simple – a space separated list of values with some simple constraints should work everywhere, and it does… kind of.

In the browser world, however, where we were messing with classes at runtime all over the place that needed to be reflected back (CSS wasn’t based on runtime properties, it was based on attributes) we began facing issues.  Someone would come along and write code like the above example, which assumed that it was a single value:

element.className = "intro";

The net result was that any existing class names at the time of execution were replaced with just one.  Someone else would assume they wanted to toggle a value and write something like:

// Toggle the 'selected' class
element.className = (element.className === "selected") ? "" : "selected";

The problem is two-fold: first, it assumes it can === a single value; second, it can overwrite all the others.  We had problems adding classes, removing them, toggling them, and finding out whether an element already had one.  It sounds trivial but it turns out that it wasn’t: each time you wanted to touch the className you had to deal with deserializing the string, doing your work and re-serializing it without stepping on any of a number of landmines.  As one might expect, we came up with libraries to help with this; however, they varied in quality and assumptions.  It was still a mess.
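To make the deserialize, work, re-serialize dance concrete, here is a minimal sketch of the kind of helper those libraries had to provide.  The function names are hypothetical and not taken from any particular library, and a plain string stands in for element.className so that it runs anywhere:

```javascript
// Hypothetical helpers: the names are illustrative, not from any real library.

// Deserialize: split on runs of whitespace and drop empty strings, so that
// leading, trailing and doubled spaces don't produce bogus tokens.
function parseTokens(value) {
  return value.split(/\s+/).filter(Boolean);
}

// Toggle one token without disturbing the others, then re-serialize.
function toggleToken(value, token) {
  const tokens = parseTokens(value);
  const next = tokens.includes(token)
    ? tokens.filter((t) => t !== token) // drop every copy of the token
    : tokens.concat(token);             // append it once
  return next.join(" ");
}

// Usage, with a plain string standing in for element.className:
let className = "intro  highlighted";
className = toggleToken(className, "selected"); // "intro highlighted selected"
className = toggleToken(className, "selected"); // "intro highlighted"
```

Even this small version has to make decisions (what counts as whitespace, what to do about duplicates) that different libraries answered differently.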

When jQuery joined the W3C after becoming the most widely used solution, they lobbied to improve this situation (disclaimer, I represent jQuery in several W3C groups).  It wasn’t long before we had the .classList interface.  The world is much better with .classList at our disposal – finally we can be rid of the above problem.  Now users can write:

element.classList.add("intro");

It’s the missing interface developers always needed.

Problem Solved?

Sadly, I think not quite.  While it’s a major improvement, the trouble is that the NMTokens issue does not solely affect the class attribute, or even just JavaScript.  It’s quite possible that you are thinking “Well, this probably isn’t something I need to worry about because I’ve never come across it”.  However, I think you will, and that’s the problem.  With efforts like WAI-ARIA and others, there are other NMTokens issues that you’ve probably not thought about before but eventually will have to.  The aria-describedby attribute is one example – a control can be described by multiple elements for different purposes.  For example, an <input> element may have associated helpful advice that appears in a tooltip popup and associated constraint validation errors.  Further, it works a lot like the class attribute and poses a similar challenge in JavaScript in that it frequently has to be actively maintained, not just written in markup, and that’s deceptively hard.  For example, an author should not associate an input with an errors collection until there are actually errors.  This sucks.  ARIA is hard enough without simple challenges like the one we faced prior to having .classList.
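As a rough illustration of that maintenance burden, here is a sketch of what keeping aria-describedby in sync by hand might look like.  The setDescribedBy helper, the fakeElement stand-in and the ids are all hypothetical; the stand-in exists only so the example runs outside a browser, and in a page you would pass a real element instead:

```javascript
// Minimal stand-in for an element (hypothetical, not a real DOM object).
// It only mimics getAttribute/setAttribute/removeAttribute.
function fakeElement(attrs = {}) {
  const map = new Map(Object.entries(attrs));
  return {
    getAttribute: (name) => (map.has(name) ? map.get(name) : null),
    setAttribute: (name, value) => { map.set(name, String(value)); },
    removeAttribute: (name) => { map.delete(name); },
  };
}

// Hypothetical helper: add or remove one id from aria-describedby while
// leaving any other ids (such as the tooltip's) untouched.
function setDescribedBy(input, id, shouldDescribe) {
  const current = (input.getAttribute("aria-describedby") || "")
    .split(/\s+/)
    .filter(Boolean);
  const next = shouldDescribe
    ? (current.includes(id) ? current : current.concat(id))
    : current.filter((t) => t !== id);
  if (next.length === 0) {
    // Nothing describes the input any more, so drop the attribute entirely.
    input.removeAttribute("aria-describedby");
  } else {
    input.setAttribute("aria-describedby", next.join(" "));
  }
}

// Usage: the input starts out described only by its tooltip.
const input = fakeElement({ "aria-describedby": "name-tip" });
setDescribedBy(input, "name-errors", true);  // validation failed
// aria-describedby is now "name-tip name-errors"
setDescribedBy(input, "name-errors", false); // errors were fixed
// aria-describedby is back to "name-tip"
```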

Good News and Bad

The good news is that standards makers had the foresight to create an interface for this type of problem called DOMTokenList with all the useful methods and properties that .classList exposes.  The .classList property holds a DOMTokenList.

The bad news is that it’s pretty much locked away and there’s no way to easily re-apply it to new things as they emerge.  We could continue to identify spec properties and create new things like .classList each time we find them.  For example, we could expose .ariaDescribedByList – and we might want to occasionally do that – but it’s not great.  It’s just additive.  Each time we do, the API surface to learn gets bigger.  It also doesn’t expose these abilities to custom elements, and it doesn’t help with anything that isn’t specifically HTML (if you care about that sort of thing).

Alternatively, however, we could define a single new DOM method to expose any attribute this way.  This is actually pretty easy to do, adds minimal new API to learn, and should reasonably work for everyone.  Jonathan Neal and I are providing a prollyfill for this (public domain) which allows people to ask for an attribute as a DOMTokenList and deal with it the same as they would .classList.  Because it’s a proposal, and a standard may differ should one ultimately arrive, we’ve underscored the method name, but here’s an example of its use…

element._asTokenList("aria-describedby").add("foo-help-text");

In Extensible Web terms, this isn’t asking for new additive functionality at all – it is explaining magic that already exists, but lies mostly dormant and unexposed in the bowels of the platform.  Given this interface, the .classList property, for example, is then merely legacy sugar for its equivalent .asTokenList(attr) accessor (which doesn’t require ‘name’ distinction either and deals with dasherized attributes just fine too):

element._asTokenList("class").add("intro");
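To give a feel for how little machinery such an accessor needs, here is a minimal sketch of the idea.  It is not the prollyfill’s actual implementation: tokenListFor is a hypothetical free function standing in for the proposed method, it covers only contains, add, remove and toggle, and it skips the token validation that a real DOMTokenList performs.  A plain object stands in for the element so that it runs outside a browser:

```javascript
// A sketch only: NOT the prollyfill's implementation. `tokenListFor` is a
// hypothetical stand-in for the proposed element._asTokenList(attr).
function tokenListFor(element, attr) {
  // Read the attribute fresh on every call so the view never goes stale.
  const read = () =>
    (element.getAttribute(attr) || "").split(/\s+/).filter(Boolean);
  const write = (tokens) => element.setAttribute(attr, tokens.join(" "));

  return {
    contains(token) {
      return read().includes(token);
    },
    add(token) {
      const tokens = read();
      if (!tokens.includes(token)) write(tokens.concat(token));
    },
    remove(token) {
      const tokens = read();
      if (tokens.includes(token)) write(tokens.filter((t) => t !== token));
    },
    // Returns true if the token is present after the call.
    toggle(token) {
      const present = this.contains(token);
      if (present) this.remove(token);
      else this.add(token);
      return !present;
    },
  };
}

// Stand-in element (hypothetical) backed by a Map of attributes.
const attrs = new Map([["class", "intro"]]);
const element = {
  getAttribute: (name) => (attrs.has(name) ? attrs.get(name) : null),
  setAttribute: (name, value) => { attrs.set(name, String(value)); },
};

tokenListFor(element, "aria-describedby").add("foo-help-text");
tokenListFor(element, "class").toggle("selected");
// class is now "intro selected"; aria-describedby is "foo-help-text"
```

The same four methods work for any space separated attribute, which is the whole point: nothing here is specific to class.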

I’d like to know what you think about this proposal – please provide comments here, find me on twitter (@briankardell) or find the topic “asTokenList” on Specifiction and let us know.

Update: Given that this has now gotten some discussion I’ve created a repository for it so that individuals can send pull requests and track issues – from this new location I have also changed the name of the method based on feedback to _tokenListFor(attr).

Thanks to the many people who proofread, looked at demos, discussed or gave thoughts on this as it developed, including Jonathan Neal, Bruce Lawson, Mathias Bynens, Simon St Laurent, Jake Archibald, and Alice Boxhall.
