Journal tags: data

Tracking

I’ve been reading the excellent Design For Safety by Eva PenzeyMoog. There was a line that really stood out to me:

The idea that it’s alright to do whatever unethical thing is currently the industry norm is widespread in tech, and dangerous.

It stood out to me because I had been thinking about certain practices that are widespread, accepted, and yet strike me as deeply problematic. These practices involve tracking users.

The first problem is that even the terminology I’m using would be rejected. When you track users on your website, it’s called analytics. Or maybe it’s stats. If you track users on a large enough scale, I guess you get to just call it data.

Those words—“analytics”, “stats”, and “data”—are often used when the more accurate word would be “tracking.”

Or to put it another way: analytics, stats, data, numbers …these are all outputs. But what produced these outputs? Tracking.

Here’s a concrete example: email newsletters.

Do you have numbers on how many people opened a particular newsletter? Do you have numbers on how many people clicked a particular link?

You can call it data, or stats, or analytics, but make no mistake, that’s tracking.

Follow-on question: do you honestly think that everyone who opens a newsletter or clicks on a link in a newsletter has given their informed consent to be tracked by you?

You may well answer that this is a widespread—nay, universal—practice. Well yes, but a) that’s not what I asked, and b) see the above quote from Design For Safety.

You could quite correctly point out that this tracking is out of your hands. Your newsletter provider—probably Mailchimp—does this by default. So if the tracking is happening anyway, why not take a look at those numbers?

But that’s like saying it’s okay to eat battery-farmed chicken as long as you’re not breeding the chickens yourself.

When I try to argue against this kind of tracking from an ethical standpoint, I get a frosty reception. I might have better luck battling numbers with numbers. Increasing numbers of users are taking steps to prevent tracking. I had a plug-in installed in my mail client—Apple Mail—to prevent tracking. Now I don’t even need the plug-in. Apple have built it into the app. That should tell you something. It reminds me of when browsers had to introduce pop-up blocking.

If the outputs generated by tracking turn out to be inaccurate, then shouldn’t they lose their status?

But that line of reasoning shouldn’t even be necessary. We shouldn’t stop tracking users because it’s inaccurate. We should stop tracking users because it’s wrong.

Safari 15

If you download Safari Technology Preview you can test drive features that are on their way in Safari 15. One of those features, announced at Apple’s Worldwide Developer Conference, is coloured browser chrome via support for the meta value of “theme-color.” Chrome on Android has supported this for a while but I believe Safari is the first desktop browser to add support. They’ve also added support for the media attribute on that meta element to handle “prefers-color-scheme.”
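
Here’s what that looks like in practice (these colour values are just for illustration):

<meta name="theme-color" content="#d63e2d">
<meta name="theme-color" content="#1a0c0a" media="(prefers-color-scheme: dark)">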

This is all very welcome, although it does remind me a bit of when Internet Explorer came out with the ability to make coloured scrollbars. I mean, they’re nice features’n’all, but maybe not the most pressing? Safari is still refusing to acknowledge progressive web apps.

That’s not quite true. In her WWDC video Jen demonstrates how you can add a progressive web app like Resilient Web Design to your home screen. I’m chuffed that my little web book made an appearance, but when you see how you add a site to your home screen in iOS, it’s somewhat depressing.

The steps to add a website to your home screen are:

  1. Tap the “share” icon. It’s not labelled “share.” It’s a square with an arrow coming out of the top of it.
  2. A drawer pops up. The option to “add to home screen” is nowhere to be seen. You have to pull the drawer up further to see the hidden options.
  3. Now you must find “add to home screen” in the list:
  • Copy
  • Add to Reading List
  • Add Bookmark
  • Add to Favourites
  • Find on Page
  • Add to Home Screen
  • Markup
  • Print

It reminds me of this exchange in The Hitchhiker’s Guide To The Galaxy:

“You hadn’t exactly gone out of your way to call attention to them had you? I mean like actually telling anyone or anything.”

“But the plans were on display…”

“On display? I eventually had to go down to the cellar to find them.”

“That’s the display department.”

“With a torch.”

“Ah, well the lights had probably gone.”

“So had the stairs.”

“But look you found the notice didn’t you?”

“Yes,” said Arthur, “yes I did. It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying ‘Beware of The Leopard.’”

Safari’s current “support” for adding progressive web apps to the home screen feels like the minimum possible …just enough to use it as a legal argument if you happen to be litigated against for having a monopoly on app distribution. “Hey, you can always make a web app!” It’s true in theory. In practice it’s …suboptimal, to put it mildly.

Still, those coloured tab bars are very nice.

It’s a little bit weird that this stylistic information is handled by HTML rather than CSS. It’s similar to the meta viewport value in that sense. I always thought that the plan was to migrate that to CSS at some point, but here we are a decade later and it’s still very much part of our boilerplate markup.
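
For reference, here’s the meta viewport boilerplate alongside the @viewport at-rule that was mooted as its CSS successor—a proposal that has since been abandoned, so treat the second half as a historical curiosity:

<meta name="viewport" content="width=device-width, initial-scale=1">

@viewport {
  width: device-width;
}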

Some people have remarked that the coloured browser chrome can make the URL bar look like part of the site so people might expect it to operate like a site-specific search.

I also wonder if it might blur “the line of death”; that point in the UI where the browser chrome ends and the website begins. Does the unified colour make it easier to spoof browser UI?

Probably not. You can already kind of spoof browser UI by using the right shade of grey. Although the removal of any kind of actual line in Safari does give me pause for thought.

I tend not to think of security implications like this by default. My first thought tends to be more about how I can use the feature. It’s only after a while that I think about how bad actors might abuse the same feature. I should probably try to narrow the gap between those thoughts.

Submitting a form with datalist

I’m a big fan of HTML5’s datalist element and its elegant design. It’s a way to progressively enhance any input element into a combobox.

You use the list attribute on the input element to point to the ID of the associated datalist element.

<label for="homeworld">Your home planet</label>
<input type="text" name="homeworld" id="homeworld" list="planets">
<datalist id="planets">
 <option value="Mercury">
 <option value="Venus">
 <option value="Earth">
 <option value="Mars">
 <option value="Jupiter">
 <option value="Saturn">
 <option value="Uranus">
 <option value="Neptune">
</datalist>

It even works on input type="color", which is pretty cool!

The most common use case is as an autocomplete widget. That’s how I’m using it over on The Session, where the datalist is updated via Ajax every time the input is updated.
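
Sketched out, the Ajax part might look something like this—assuming formfield is the input and datalist is its associated datalist element, with a made-up endpoint standing in for the real one:

formfield.addEventListener('input', function () {
  // hypothetical endpoint—the real one on The Session is different
  fetch('/autocomplete?q=' + encodeURIComponent(this.value))
  .then( function (response) {
    return response.json();
  })
  .then( function (results) {
    // rebuild the datalist options from the server's suggestions
    datalist.innerHTML = '';
    results.forEach( function (result) {
      var option = document.createElement('option');
      option.value = result;
      datalist.appendChild(option);
    });
  });
}, false);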

But let’s stick with a simple example, like the list of planets above. Suppose the user types “jup” …the datalist will show “Jupiter” as an option. The user can click on that option to automatically complete their input.

It would be handy if you could automatically submit the form when the user chooses a datalist option like this.

Well, tough luck.

The datalist element emits no events. There’s no way of telling if it has been clicked. This is something I’ve been trying to find a workaround for.

I got my hopes up when I read Amber’s excellent article about document.activeElement. But no, the focus stays on the input when the user clicks on an option in a datalist.

So if I can’t detect whether a datalist has been used, the best I can do is try to infer it. I know it’s not exactly the same thing, and it won’t be as reliable as true detection, but here’s my logic:

  • Keep track of the character count in the input element.
  • Every time the input is updated in any way, check the current character count against the last character count.
  • If the difference is greater than one, something interesting happened! Maybe the user pasted a value in …or maybe they used the datalist.
  • Loop through each of the options in the datalist.
  • If there’s an exact match with the current value of the input element, chances are the user chose that option from the datalist.
  • So submit the form!

Here’s how that translates into DOM scripting code:

document.querySelectorAll('input[list]').forEach( function (formfield) {
  var datalist = document.getElementById(formfield.getAttribute('list'));
  // keep track of the character count in the input element
  var lastlength = formfield.value.length;
  var checkInputValue = function (inputValue) {
    // a jump of more than one character means a paste …or a datalist selection
    if (inputValue.length - lastlength > 1) {
      datalist.querySelectorAll('option').forEach( function (item) {
        // an exact match with an option suggests the datalist was used
        if (item.value === inputValue) {
          formfield.form.submit();
        }
      });
    }
    lastlength = inputValue.length;
  };
  formfield.addEventListener('input', function () {
    checkInputValue(this.value);
  }, false);
});

I’ve made a gist with some added feature detection and mustard-cutting at the start. You should be able to drop it into just about any page that’s using datalist. It works even if the options in the datalist are dynamically updated, like the example on The Session.
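
The mustard-cutting is a simple guard along these lines (the exact checks in the gist may differ slightly):

// only proceed if the browser supports datalist and the methods we rely on
if ('HTMLDataListElement' in window && 'forEach' in NodeList.prototype) {
  // …attach the input listeners as above
}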

It’s not foolproof. The inference relies on the difference between what was previously typed and what’s autocompleted being more than one character. So in the planets example, if someone has typed “Jupite” and then they choose “Jupiter” from the datalist, the form won’t automatically submit.

But still, I reckon it covers most common use cases. And like the datalist element itself, you can consider this functionality a progressive enhancement.

Reading

At the beginning of the year, Remy wrote about extracting Goodreads metadata so he could create his end-of-year reading list. More recently, Mark Llobrera wrote about how he created a visualisation of his reading history. In his case, he’s using JSON to store the information.

This kind of JSON storage is exactly what Tom Critchlow proposes in his post, Library JSON - A Proposal for a Decentralized Goodreads:

Thinking through building some kind of “web of books” I realized that we could use something similar to RSS to build a kind of decentralized GoodReads powered by indie sites and an underlying easy to parse format.

His proposal looks kind of similar to what Mark came up with. There’s a title, an author, an image, and some kind of date for when you started and/or finished reading the book.

Matt then points out that RSS gets close to the data format being suggested and asks how about using RSS?:

Rather than inventing a new format, my suggestion is that this is RSS plus an extension to deal with books. This is analogous to how the podcast feeds are specified: they are RSS plus custom tags.

Like Matt, I’m in favour of re-using existing wheels rather than inventing new ones, mostly to avoid a 927 situation.

But all of these proposals—whether JSON or RSS—involve the creation of a separate file, and yet the information is originally published in HTML. Along the lines of Matt’s idea, I could imagine extending the h-entry collection of class names to allow for books (or films, or other media). It already handles images (with u-photo). I think the missing fields are the date-related ones: when you start and finish reading. Those fields are present in a different microformat, h-event in the form of dt-start and dt-end. Maybe they could be combined:


<article class="h-entry h-event h-review">
<h1 class="p-name p-item">Book title</h1>
<img class="u-photo" src="image.jpg" alt="Book cover.">
<p class="p-summary h-card">Book author</p>
<time class="dt-start" datetime="YYYY-MM-DD">Start date</time>
<time class="dt-end" datetime="YYYY-MM-DD">End date</time>
<div class="e-content">Remarks</div>
<data class="p-rating" value="5">★★★★★</data>
<time class="dt-published" datetime="YYYY-MM-DDThh:mm">Date of this post</time>
</article>

That markup is simultaneously a post (h-entry) and an event (h-event) and you can even throw in h-card for the book author (as well as h-review if you like to rate the books you read). It can be converted to RSS and also converted to .ics for calendars—those parsers are already out there. It’s ready for aggregation and it’s ready for visualisation.

I publish very minimal reading posts here on adactio.com. What little data is there isn’t very structured—I don’t even separate the book title from the author. But maybe I’ll have a little play around with turning these h-entries into combined h-entry/event posts.

Request mapping

The Request Map Generator is a terrific tool. It’s made by Simon Hearne and uses the WebPageTest API.

You pop in a URL, it fetches the page and maps out all the subsequent requests in a nifty interactive diagram of circles, showing how many requests third-party scripts are themselves generating. I’ve found it to be a very effective way of showing the impact of third-party scripts to people who aren’t interested in looking at waterfall diagrams.

I was wondering… Wouldn’t it be great if this were built into browsers?

We already have a “Network” tab in our developer tools. The purpose of this tab is to show requests coming in. The browser already has all the information it needs to make a diagram of requests in the same way that the request map generator does.

In Firefox, there’s a little clock icon in the bottom left corner of the “Network” tab. Clicking that shows a pie-chart view of requests. That’s useful, but I’d love it if there were the option to also see the connected circles that the request map generator shows.

Just a thought.

Server Timing

Harry wrote a really good article all about the performance measurement Time To First Byte. Time To First Byte: What It Is and Why It Matters:

While a good TTFB doesn’t necessarily mean you will have a fast website, a bad TTFB almost certainly guarantees a slow one.

Time To First Byte has been the chink in my armour over at thesession.org, especially on the home page. Every time I ran Lighthouse, or some other performance testing tool, I’d get a high score …with some points deducted for taking too long to get that first byte from the server.

Harry’s proposed solution is to set up some Server Timing headers:

With a little bit of extra work spent implementing the Server Timing API, we can begin to measure and surface intricate timings to the front-end, allowing web developers to identify and debug potential bottlenecks previously obscured from view.

I remembered that Drew wrote an excellent article on Smashing Magazine last year called Measuring Performance With Server Timing:

The job of Server Timing is not to help you actually time activity on your server. You’ll need to do the timing yourself using whatever toolset your backend platform makes available to you. Rather, the purpose of Server Timing is to specify how those measurements can be communicated to the browser.

He even provides some PHP code, which I was able to take wholesale and drop into the codebase for thesession.org. Then I was able to put start/stop points in my code for measuring how long some operations were taking. Then I could output the results of these measurements into Server Timing headers that I could inspect in the “Network” tab of a browser’s dev tools (Chrome is particularly good for displaying Server Timing, so I used that while I was conducting this experiment).
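
The header format itself is pleasingly simple—each measurement gets a name, a duration in milliseconds, and an optional description (these values are invented for illustration):

Server-Timing: db;dur=53.2;desc="Database", templates;dur=12.4;desc="Template rendering"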

I started with overall database requests. Sure enough, that was where most of the time in time-to-first-byte was being spent.

Then I got more granular. I put start/stop points around specific database calls. By doing this, I was able to zero in on which operations were particularly costly. Once I had done that, I had to figure out how to make the database calls go faster.

Spoiler: I did it by adding an extra index on one particular table. It’s almost always indexes, in my experience, that make the biggest difference to database performance.

I don’t know why it took me so long to get around to messing with Server Timing headers. It has paid off in spades. I wish I had done it sooner.

And now thesession.org is positively zipping along!

Timelines of the web

Recreating the original WorldWideWeb browser was an exercise in digital archeology. With a working NeXT machine in the room, Kimberly was able to examine the source code for the first ever browser and discover a treasure trove within. Like this gem in HTUtils.h:

#define TCP_PORT 80 /* Allocated to http by Jon Postel/ISI 24-Jan-92 */

Sure enough, by June of 1992 port 80 was documented as being officially assigned to the World Wide Web (Gopher got port 70). Jean-François Groff—who worked on the World Wide Web project with Tim Berners-Lee—told us that this was a moment they were very pleased about. It felt like this project of theirs was going places.

Jean-François also told us that the WorldWideWeb browser/editor was kind of like an advanced prototype. The idea was to get something up and running as quickly as possible. Well, the NeXT operating system had a very robust Text Object, so the path of least resistance for Tim Berners-Lee was to take the existing word-processing software and build a hypertext component on top of it. Likewise, instead of creating a brand new format, he used the existing SGML format and added one new piece: linking with A tags.

So the WorldWideWeb application was kind of like a word processor and document viewer mashed up with hypertext. Ted Nelson complains to this day that the original sin of the web was that it borrowed this page-based metaphor. But Nelson’s Project Xanadu, originally proposed in 1974, wouldn’t become a working reality until 2014—a gap of forty years. Whereas Tim Berners-Lee proposed his system in March 1989 and had working code within a year. There’s something to be said for being pragmatic and working with what you’ve got.

The web was also a mashup of ideas. Hypertext existed long before the web—Ted Nelson coined the term in 1963. There were conferences and academic discussions devoted to hypertext and hypermedia. But almost all the existing hypertext systems—including Tim Berners-Lee’s own ENQUIRE system from the early 80s—were confined to a local machine. Meanwhile networked computers were changing everything. First there was the ARPANET, then the internet. Tim Berners-Lee’s ambitious plan was to mash up hypertext with networks.

Going into our recreation of WorldWideWeb at CERN, I knew I wanted to convey this historical context somehow.

The World Wide Web officially celebrates its 30th birthday in March of this year. It’s kind of an arbitrary date: it’s the anniversary of the publication of Information Management: A Proposal. Perhaps a more accurate date would be the day the first website—and first web server—went online. But still. Let’s roll with this date of March 12, 1989. I thought it would be interesting not only to look at what’s happened between 1989 and 2019, but also to look at what happened between 1959 and 1989.

So now I’ve got two time cones that converge in the middle: 1959 – 1989 and 1989 – 2019. For the first time period, I made categories of influences: formats, hypertext, networks, and computing. For the second time period, I catalogued notable results: browsers, servers, and the evolution of HTML.

I did a little bit of sketching and quickly realised that these converging timelines could be represented somewhat like particle collisions. Once I had that idea in my head, I knew how I would be spending my time during the hack week.

Rather than jumping straight into the collider visualisation, I took some time to make a solid foundation to build on. I wanted to be sure that the timeline itself would be understandable even if it were, say, viewed in the first ever web browser.

Progressive enhancement. Marking up (and styling) an interactive timeline that looks good in a modern browser and still works in the first ever web browser.

I marked up each timeline as an ordered list of h-events:

<li class="h-event y1968">
  <a href="https://en.wikipedia.org/wiki/NLS_%28computer_system%29" class="u-url">
    <time class="dt-start" datetime="1968-12-09">1968</time>
    <abbr class="p-name" title="oN-Line System">NLS</abbr>
  </a>
</li>

With the markup in place, I could concentrate on making it look halfway decent. For small screens, the layout is very basic—just a series of lists. When the screen gets wide enough, I lay those lists out horizontally one on top of the other. In this view, you can more easily see when events coincide. For example, ENQUIRE, Usenet, and Smalltalk all happen in 1980. But the real beauty comes when the screen is wide enough to display everything at once. You can see how there was an explosion of activity in the early 90s. In 1994 alone, we get the release of Netscape Navigator, the creation of HTTPS, and the launch of Amazon.com.

The whole thing is powered by CSS transforms and positioning. Each year on a timeline has its own class that gets moved to the correct chronological point using calc(). I wanted to use translateX() but I couldn’t get the maths to work for that, so I had to use plain ol’ left and right:

.y1968 {
  left: calc((1968 - 1959) * (100%/30) - 5em);
}

For events before 1989, it’s the distance of the event from 1959. For events after 1989, it’s the distance of the event from 2019:

.y2014 {
  right: calc((2019 - 2014) * (100%/30) - 5em);
}

(Each h-event has a width of 5em so that’s where the extra bit at the end comes from.)

I had to do some tweaking for legibility: bunches of events happening around the same time period needed to be separated out so that they didn’t overlap too much.

As a finishing touch, I added a few little transitions when the page loaded so that the timeline fans out from its centre point.

Et voilà!

I fiddled with the content a bit after peppering Robert Cailliau with questions over lunch. And I got some very valuable feedback from Jean-François. Some examples he provided:

1971: Unix man pages, one of the first instances of writing documents with a markup language that is interpreted live by a parser before being presented to the user.

1980: Usenet News, because it was THE everyday discussion medium by the time we created the web technology, and the Web first embraced news as a built-in information resource, then various platforms built on the web rendered it obsolete.

1982: Literary Machines, Ted Nelson’s book which was on our desk at all times

I really, really enjoyed building this “collider” timeline. It was a chance for me to smash together my excitement for web history with my enjoyment of using the raw materials of the web: HTML and CSS in this case.

The timeline pales in comparison to the achievement of the rest of the team in recreating the WorldWideWeb application but I was just glad to be able to contribute a little something to the project.

Hello WorldWideWeb.

Ch-ch-ch-changes

It’s browser updatin’ time! Firefox 65 just dropped. So did Chrome 72. Safari 12.1 is shipping with iOS 12.2.

It’s interesting to compare the release notes for each browser and see the different priorities reflected in them (this is another reason why browser diversity is A Good Thing).

A lot of the Firefox changes are updates to dev tools; they just keep getting better and better. In fact, I’m not sure “dev tools” is the right word for them. With their focus on layout, typography, and accessibility, “design tools” might be a better term.

Oh, and Firefox is shipping support for some CSS properties that really help with print style sheets, so I’m disproportionately pleased about that.

In Safari’s changes, I’m pleased to see that the datalist element is finally getting implemented. I’ve been a fan of that element for many years now. (Am I a dork for having favourite HTML elements? Or am I a dork for even having to ask that question?)

And, of course, it wouldn’t be a Safari release without a new made up meta tag. From the people who brought you such hits as viewport and apple-mobile-web-app-capable, comes …supported-color-schemes (Apple likes to make up meta tags almost as much as Google likes to make up rel values).
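
As far as I can tell, usage looks like this:

<meta name="supported-color-schemes" content="light dark">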

There’ll be a whole bunch of improvements in how progressive web apps will behave once they’ve been added to the home screen. We’ll finally get some state persistence if you navigate away from the window!

Updated the behavior of websites saved to the home screen on iOS to pause in the background instead of relaunching each time.

Maximiliano Firtman has a detailed list of the good, the bad, and the “not sure yet if good” for progressive web apps on iOS 12.2 beta. Thomas Steiner has also written up the progress of progressive web apps in iOS 12.2 beta. Both are published on Ev’s blog.

At first glance, the release notes for Chrome 72 are somewhat paltry. The big news doesn’t even seem to be listed there. Maximiliano Firtman again:

Chrome 72 for Android shipped the long-awaited Trusted Web Activity feature, which means we can now distribute PWAs in the Google Play Store!

Very interesting indeed! I’m not sure if I’m ready to face the Kafkaesque process of trying to add something to the Google Play Store just yet, but it’s great to know that I can. Combined with the improvements coming in iOS 12.2, these are exciting times for progressive web apps!

Sonic sparklines

I’ve seen some lovely examples of the Web Audio API recently.

At the Material conference, Halldór Eldjárn demoed his Poco Apollo project. It generates music on the fly in the browser to match a random image from NASA’s Apollo archive on Flickr. Brian Eno, eat your heart out!

At Codebar Brighton a little while back, local developer Luke Twyman demoed some of his audio-visual work, including the gorgeous Solarbeat—an audio orrery.

The latest issue of the Clearleft newsletter has some links on sound design in interfaces.

I saw Ruth give a fantastic talk on the Web Audio API at CSS Day this year. It had just the right mixture of code and inspiration. I decided there and then that I’d have to find some opportunity to play around with web audio.

As ever, my own website is the perfect playground. I added an audio Easter egg to adactio.com a while back, and so far, no one has noticed. That’s good. It’s a very, very silly use of sound.

In her talk, Ruth emphasised that the Web Audio API is basically just about dealing with numbers. Lots of the examples of nice usage are the audio equivalent of data visualisation. Data sonification, if you will.

I’ve got little bits of dataviz on my website: sparklines. Each one is a self-contained SVG file. I added a script element to the SVG with a little bit of JavaScript that converts numbers into sound (I kind of wish that the script were scoped to the containing SVG but that’s not the way JavaScript in SVG works—it’s no different to putting a script element directly in the body). Clicking on the sparkline triggers the sound-playing function.
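
The heart of the sound-playing function is scheduling frequency changes on an oscillator—something roughly like this sketch (the mapping of values to a frequency range is illustrative, not the exact code on my site):

function playSparkline(values) {
  var audio = new (window.AudioContext || window.webkitAudioContext)();
  var maximum = Math.max.apply(null, values);
  var oscillator = audio.createOscillator();
  oscillator.connect(audio.destination);
  oscillator.start(audio.currentTime);
  values.forEach( function (value, index) {
    // map each data point to a frequency between 220Hz and 880Hz
    var frequency = 220 + (660 * value / maximum);
    oscillator.frequency.setValueAtTime(frequency, audio.currentTime + (index / 10));
  });
  oscillator.stop(audio.currentTime + (values.length / 10));
}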

It sounds terrible. It’s like a theremin with hiccups.

Still, I kind of like it. I mean, I wish it sounded nicer (and I’m open to suggestions on how to achieve that—feel free to fork the code), but there’s something endearing about hearing a month’s worth of activity turned into a wobbling wave of sound. And it’s kind of fun to hear how a particular tag is used more frequently over time.

Anyway, it’s just a silly little thing, but anywhere you spot a sparkline on my site, you can tap it to hear it translated into sound.

Month maps

One of the topics I enjoy discussing at Indie Web Camps is how we can use design to display activity over time on personal websites. That’s how I ended up with sparklines on my site—it was a direct result of a discussion at Indie Web Camp Nuremberg a year ago:

During the discussion at Indie Web Camp, we started looking at how silos design their profile pages to see what we could learn from them. Looking at my Twitter profile, my Instagram profile, my Untappd profile, or just about any other profile, it’s a mixture of bio and stream, with the addition of stats showing activity on the site—signs of life.

Perhaps the most interesting visual example of my activity over time is on my Github profile. Halfway down the page there’s a calendar heatmap that uses colour to indicate the amount of activity. What I find interesting is that it’s using two axes of time over a year: days of the month across the X axis and days of the week down the Y axis.

I wanted to try something similar, but showing activity by time of day down the Y axis. A month of activity feels like the right range to display, so I set about adding a calendar heatmap to monthly archives. I already had the data I needed—timestamps of posts. That’s what I was already using to display sparklines. I wrote some code to loop over those timestamps and organise them by day and by hour. Then I spit out a table with days for the columns and clumps of hours for the rows.
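
My actual code runs on the server, but the grouping logic translates to something like this (assuming Unix timestamps and four-hour clumps):

var grid = {};
timestamps.forEach( function (timestamp) {
  var date = new Date(timestamp * 1000);
  var day = date.getDate();
  // six four-hour clumps per day make up the rows
  var clump = Math.floor(date.getHours() / 4);
  grid[day] = grid[day] || [0, 0, 0, 0, 0, 0];
  grid[day][clump]++;
});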

Calendar heatmap on Dribbble

I’m using colour (well, different shades of grey) to indicate the relative amounts of activity, but I decided to use size as well. So it’s also a bubble chart.

It doesn’t work very elegantly on small screens: the table is clipped horizontally and can be swiped left and right. Ideally the visualisation itself would change to accommodate smaller screens.

Still, I kind of like the end result. Here’s last month’s activity on my site. Here’s the same time period ten years ago. I’ve also added month heatmaps to the monthly archives for my journal, links, and notes. They’re kind of like an expanded view of the sparklines that are shown with each month.

From one year ago, here’s the daily distribution of

And then here’s the daily distribution of everything in that month all together.

I realise that the data being displayed is probably only of interest to me, but then, that’s one of the perks of having your own website—you can do whatever you feel like.

Empire State

I’m in New York. Again. This time it’s for Google’s AMP Conf, where I’ll be giving ‘em a piece of my mind on a panel.

The conference starts tomorrow so I’ve had a day or two to acclimatise and explore. Seeing as Google are footing the bill for travel and accommodation, I’m staying at a rather nice hotel close to the conference venue in Tribeca. There’s live jazz in the lounge most evenings, a cinema downstairs, and should I request it, I can even have a goldfish in my room.

Today I realised that my hotel sits in the apex of a triangle of interesting buildings: carrier hotels.

32 Avenue Of The Americas. Telephone wires and radio unite to make neighbors of nations

Looming above my hotel is 32 Avenue of the Americas. On the outside the building looks like your classic Gozer the Gozerian style of New York building. Inside, the lobby features a mosaic on the ceiling, and another on the wall extolling the connective power of radio and telephone.

The same architects also designed 60 Hudson Street, which has a similar Art Deco feel to it. Inside, there’s a cavernous hallway running through the ground floor but I can’t show you a picture of it. A security guard told me I couldn’t take any photos inside …which is a little strange seeing as it’s splashed across the website of the building.

60 Hudson. HEADQUARTERS The Western Union Telegraph Co. and telegraph capitol of the world 1930-1973

I walked around the outside of 60 Hudson, taking more pictures. Another security guard asked me what I was doing. I told her I was interested in the history of the building, which is true; it was the headquarters of Western Union. For much of the twentieth century, it was a world hub of telegraphic communication, in much the same way that a beach hut in Porthcurno was the nexus of the nineteenth century.

For a 21st century hub, there’s the third and final corner of the triangle at 33 Thomas Street. It’s a breathtaking building. It looks like a spaceship from a Chris Foss painting. It was probably designed more like a spacecraft than a traditional building—its primary purpose was to withstand an atomic blast. Gone are niceties like windows. Instead there’s an impenetrable monolith that looks like something straight out of a dystopian sci-fi film.

33 Thomas Street, New York

Brutalist on the outside, its interior is host to even more brutal acts of invasive surveillance. The Snowden papers revealed this AT&T building to be a centrepiece of the Titanpointe programme:

They called it Project X. It was an unusually audacious, highly sensitive assignment: to build a massive skyscraper, capable of withstanding an atomic blast, in the middle of New York City. It would have no windows, 29 floors with three basement levels, and enough food to last 1,500 people two weeks in the event of a catastrophe.

But the building’s primary purpose would not be to protect humans from toxic radiation amid nuclear war. Rather, the fortified skyscraper would safeguard powerful computers, cables, and switchboards. It would house one of the most important telecommunications hubs in the United States…

Looking at the building, it requires very little imagination to picture it as the lair of villainous activity. Laura Poitras’s short film Project X basically consists of a voiceover of someone reading an NSA manual, some ominous background music, and shots of 33 Thomas Street looming in its oh-so-loomy way.

A top-secret handbook takes viewers on an undercover journey to Titanpointe, the site of a hidden partnership. Narrated by Rami Malek and Michelle Williams, and based on classified NSA documents, Project X reveals the inner workings of a windowless skyscraper in downtown Manhattan.

Metadata markup

When something on your website is shared on Twitter or Facebook, you probably want a nice preview to appear with it, right?

For Twitter, you can use Twitter cards—a collection of meta elements you place in the head of your document.

For Facebook, you can use the grandiosely-titled Open Graph protocol—a collection of meta elements you place in the head of your document.

What’s that you say? They sound awfully similar? Why, no! I mean, just look at the difference. Here’s how you’d mark up a blog post for Twitter:

<meta name="twitter:url" content="https://adactio.com/journal/9881">
<meta name="twitter:title" content="Metadata markup">
<meta name="twitter:description" content="So many standards to choose from.">
<meta name="twitter:image" content="https://adactio.com/icon.png">

Whereas here’s how you’d mark up the same blog post for Facebook:

<meta property="og:url" content="https://adactio.com/journal/9881">
<meta property="og:title" content="Metadata markup">
<meta property="og:description" content="So many standards to choose from.">
<meta property="og:image" content="https://adactio.com/icon.png">

See? Completely different.

Okay, I’ll attempt to dial down my sarcasm, but I find this wastage annoying. It adds unnecessary complexity, which in turn, I suspect, puts a lot of people off even trying to implement this stuff. In short: 927.

We’ve seen this kind of waste before. I remember when Netscape and Microsoft were battling it out in the browser wars: Internet Explorer added a proprietary acronym element, while Netscape added the abbr element. They both basically did the same thing. For years, Internet Explorer refused to implement the abbr element out of sheer spite.

A more recent example of the negative effects of competing standards was on display at this year’s Edge conference in London. In a session on front-end data, Nolan Lawson decried the fact that developers weren’t making more use of the client-side storage options available in browsers today. After all, there are so many to choose from: LocalStorage, WebSQL, IndexedDB…

(Hint: if developers aren’t showing much enthusiasm for the latest and greatest API which is sooooo much better than the previous APIs they were also encouraged to use at the time, perhaps their reticence is understandable.)

Anyway, back to metacrap.

Matt has written a guide to what you need to do in order to get a preview of your posts to appear in Slack. Fortunately the answer is not yet another collection of meta elements to place in the head of your document. Instead, Slack piggybacks on the existing combatants: oEmbed, Twitter Cards, and Open Graph.

So to placate both Twitter and Facebook (with Slack thrown in for good measure), your metadata markup is supposed to look something like this:

<meta name="twitter:card" content="summary">
<meta name="twitter:site" content="@adactio">
<meta name="twitter:url" content="https://adactio.com/journal/9881">
<meta name="twitter:title" content="Metadata markup">
<meta name="twitter:description" content="So many standards to choose from.">
<meta name="twitter:image" content="https://adactio.com/icon.png">
<meta property="og:url" content="https://adactio.com/journal/9881">
<meta property="og:title" content="Metadata markup">
<meta property="og:description" content="So many standards to choose from.">
<meta property="og:image" content="https://adactio.com/icon.png">

There are two things on display here: redundancy, and also, redundancy.

Now the eagle-eyed amongst you will have spotted a crucial difference between the Twitter metacrap and the Facebook metacrap. The Twitter metacrap uses the name attribute on the meta element, whereas the Facebook metacrap uses the property attribute. Technically, there is no property attribute in HTML—it’s an RDFa thing. But the fact that they’re using two different attributes means that we can squish the meta elements together like this:

<meta name="twitter:card" content="summary">
<meta name="twitter:site" content="@adactio">
<meta name="twitter:url" property="og:url" content="https://adactio.com/journal/9881">
<meta name="twitter:title" property="og:title" content="Metadata markup">
<meta name="twitter:description" property="og:description" content="So many standards to choose from.">
<meta name="twitter:image" property="og:image" content="https://adactio.com/icon.png">

There. I saved you at least a little bit of typing.

The metacrap situation is even more ridiculous for “add to homescreen”/”pin to start”/whatever else browser makers can’t agree on…

Microsoft:

<meta name="msapplication-starturl" content="https://adactio.com" />
<meta name="msapplication-window" content="width=800;height=600">
<meta name="msapplication-tooltip" content="Kill me now...">

Apple:

<link rel="apple-touch-icon" href="https://adactio.com/icon.png">

(Repeat four or five times with different variations of icon sizes, and be sure to create icons with new sizes after every. single. Apple. keynote.)

Fortunately Google, Opera, and Mozilla appear to be converging on using an external manifest file:

<link rel="manifest" href="https://adactio.com/manifest.json">
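
The manifest itself is one JSON file that gathers up all the metadata—something like this (the format was still being hammered out at the time of writing, so treat this as illustrative):

{
  "name": "adactio",
  "start_url": "https://adactio.com",
  "icons": [
    {
      "src": "https://adactio.com/icon.png",
      "sizes": "144x144",
      "type": "image/png"
    }
  ]
}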

Perhaps our long national nightmare of balkanised metacrap is finally coming to an end, and clearer heads will prevail.

Hope

Cennydd points to an article by Ev Williams about the pendulum swing between open and closed technology stacks, and how that pendulum doesn’t always swing back towards openness. Cennydd writes:

We often hear the idea that “open platforms always win in the end”. I’d like that: the implicit values of the web speak to my own. But I don’t see clear evidence of this inevitable supremacy, only beliefs and proclamations.

It’s true. I catch myself saying things like “I believe the open web will win out.” Statements like that worry my inner empiricist. Faith-based outlooks scare me, and rightly so. I like being able to back up my claims with data.

Only time will tell what data emerges about the eventual fate of the web, open or closed. But we can look to previous technologies and draw comparisons. That’s exactly what Tim Wu did in his book The Master Switch and Jonathan Zittrain did in The Future Of The Internet—And How To Stop It. Both make for uncomfortable reading because they challenge my belief. Wu points to radio and television as examples of systems that began as egalitarian decentralised tools that became locked down over time in ever-constricting cycles. Cennydd adds:

I’d argue this becomes something of a one-way valve: once systems become closed, profit potential tends to grow, and profit is a heavy entropy to reverse.

Of course there is always the possibility that this time is different. It may well be that fundamental architectural decisions in the design of the internet and the workings of the web mean that this particular technology has an inherent bias towards openness. There is some data to support this (and it’s an appealing thought), but again; only time will tell. For now it’s just one more supposition.

The real question—when confronted with uncomfortable ideas that challenge what you’d like to believe is true—is what do you do about it? Do you look for evidence to support your beliefs or do you discard your beliefs entirely? That second option looks like the most logical course of action, and it’s certainly one that I would endorse if there were proven facts to be acknowledged (like gravity, evolution, or vaccination). But I worry about mistaking an argument that is still being discussed for an argument that has already been decided.

When I wrote about the dangers of apparently self-evident truisms, I said:

These statements aren’t true. But they are repeated so often, as if they were truisms, that we run the risk of believing them and thus, fulfilling their promise.

That’s my fear. Only time will tell whether the closed or open forces will win the battle for the soul of the internet. But if we believe that centralised, proprietary, capitalistic forces are inherently unstoppable, then our belief will help make them so.

I hope that openness will prevail. Hope sounds like such a wishy-washy word, like “faith” or “belief”, but it carries with it a seed of resistance. Hope, faith, and belief all carry connotations of optimism, but where faith and belief sound passive, even downright complacent, hope carries the promise of action.

Margaret Atwood was asked about the futility of having hope in the face of climate change. She responded:

If we abandon hope, we’re cooked. If we rely on nothing but hope, we’re cooked. So I would say judicious hope is necessary.

Judicious hope. I like that. It feels like a good phrase to balance empiricism with optimism; data with faith.

The alternative is to give up. And if we give up too soon, we bring into being the very endgame we feared.

Cennydd finishes:

Ultimately, I vote for whichever technology most enriches humanity. If that’s the web, great. A closed OS? Sure, so long as it’s a fair value exchange, genuinely beneficial to company and user alike.

This is where we differ. Today’s fair value exchange is tomorrow’s monopoly, just as today’s revolutionary is tomorrow’s tyrant. I will fight against that future.

To side with whatever’s best for the end user sounds like an eminently sensible metric to judge a technology. But I’ve written before about where that mindset can lead us. I can easily imagine Asimov’s three laws of robotics rewritten to reflect the ethos of user-centred design, especially that first and most important principle:

A robot may not injure a human being or, through inaction, allow a human being to come to harm.

…rephrased as:

A product or interface may not injure a user or, through inaction, allow a user to come to harm.

Whether the technology driving the system behind that interface is open or closed doesn’t come into it. What matters is the interaction.

But in his later years Asimov revealed the zeroth law, overriding even the first:

A robot may not harm humanity, or, by inaction, allow humanity to come to harm.

It may sound grandiose to apply this thinking to the trivial interfaces we’re building with today’s technologies, but I think it’s important to keep drilling down and asking uncomfortable questions (even if they challenge our beliefs).

That’s why I think openness matters. It isn’t enough to use whatever technology works right now to deliver the best user experience. If that short-term gain comes with a long-term price tag for our society, it’s not worth it.

I would much rather have an imperfect open system than a perfect proprietary one.

I have hope in an open web …judicious hope.

August in America, day twelve

Today was a travel day, but it was a short travel day: the flight from Tucson to San Diego takes just an hour. It took longer to make the drive up from Sierra Vista to Tucson airport.

And what a lovely little airport it is. When we showed up, we were literally the only people checking in and the only people going through security. After security is a calm oasis, free of the distracting TV screens that plague most other airports. Also, it has free WiFi, which was most welcome. I’m relying on WiFi, not 3G, to go online on this trip.

I’ve got my iPhone with me but I didn’t do anything to guarantee myself a good data plan while I’m here in the States. Honestly, it’s not that hard to not always be connected to the internet. Here are a few things I’ve learned along the way:

  1. To avoid accidentally using data and getting charged through the nose for it, you can go into the settings of your iPhone and under General -> Cellular, you can switch “Cellular Data” to “off”. Like it says, “Turn off cellular data to restrict all data to Wi-Fi, including email, web browsing, and push notifications.”
  2. If you do that, and you normally use iMessage, make sure to switch iMessage off. Otherwise if someone with an iPhone in the States sends you an SMS, you won’t get it until the next time you connect to a WiFi network. I learned this the hard way: it happened to me twice on this trip before I realised what was going on.
  3. I use Google Maps rather than Apple Maps. It turns out you can get offline maps on iOS (something that’s been available on Android for quite some time). Open the Google Maps app while you’re still connected to a WiFi network; navigate so that the area you want to save is on the screen; type “ok maps” into the search bar; now that map is saved and zoomable for offline browsing.

Battle for the planet of the APIs

Back in 2006, I gave a talk at dConstruct called The Joy Of API. It basically involved me geeking out for 45 minutes about how much fun you could have with APIs. This was the era of the mashup—taking data from different sources and scrunching them together to make something new and interesting. It was a good time to be a geek.

Anil Dash did an excellent job of describing that time period in his post The Web We Lost. It’s well worth a read—and his talk at the Berkman Institute is well worth a listen. He described what the situation was like with APIs:

Five years ago, if you wanted to show content from one site or app on your own site or app, you could use a simple, documented format to do so, without requiring a business-development deal or contractual agreement between the sites. Thus, user experiences weren’t subject to the vagaries of the political battles between different companies, but instead were consistently based on the extensible architecture of the web itself.

Times have changed. These days, instead of seeing themselves as part of a wider web, online services see themselves as standalone entities.

So what happened?

Facebook happened.

I don’t mean that Facebook is the root of all evil. If anything, Facebook—a service that started out being based on exclusivity—has become more open over time. That’s the cause of many of its scandals; the mismatch in mental models that Facebook users have built up about how their data will be used versus Facebook’s plans to make that data more available.

No, I’m talking about Facebook as a role model; the template upon which new startups shape themselves.

In the web’s early days, AOL offered an alternative. “You don’t need that wild, chaotic lawless web”, it proclaimed. “We’ve got everything you need right here within our walled garden.”

Of course it didn’t work out for AOL. That proposition just didn’t scale, just like Yahoo’s initial model of maintaining a directory of websites just didn’t scale. The web grew so fast (and was so damn interesting) that no single company could possibly hope to compete with it. So companies stopped trying to compete with it. Instead they, quite rightly, saw themselves as being part of the web. That meant that they didn’t try to do everything. Instead, you built a service that did one thing really well—sharing photos, managing links, blogging—and if you needed to provide your users with some extra functionality, you used the best service available for that, usually through someone else’s API …just as you provided your API to them.

Then Facebook began to grow and grow. I remember the first time someone was showing me Facebook—it was Tantek of all people—I remember asking “But what is it for?” After all, Flickr was for photos, Delicious was for links, Dopplr was for travel. Facebook was for …everything …and nothing.

I just didn’t get it. It seemed crazy that a social network could grow so big just by offering …well, a big social network.

But it did grow. And grow. And grow. And suddenly the AOL business model didn’t seem so crazy anymore. It seemed ahead of its time.

Once Facebook had proven that it was possible to be the one-stop-shop for your user’s every need, that became the model to emulate. Startups stopped seeing themselves as just one part of a bigger web. Now they wanted to be the only service that their users would ever need …just like Facebook.

Seen from that perspective, the open flow of information via APIs—allowing data to flow porously between services—no longer seemed like such a good idea.

Not only have APIs been shut down—see, for example, Google’s shutdown of their Social Graph API—but even the simplest forms of representing structured data have been slashed and burned.

Twitter and Flickr used to mark up their user profile pages with microformats. Your profile page would be marked up with hCard and if you had a link back to your own site, it included a rel=”me” attribute. Not any more.

Then there’s RSS.

During the Q&A of that 2006 dConstruct talk, somebody asked me about where they should start with providing an API; what’s the baseline? I pointed out that if they were already providing RSS feeds, they already had a kind of simple, read-only API.

Because there’s a standardised format—a list of items, each with a timestamp, a title, a description (maybe), and a link—once you can parse one RSS feed, you can parse them all. It’s kind of remarkable how many mashups can be created simply by using RSS. I remember at the first London Hackday, one of my favourite mashups simply took an RSS feed of the weather forecast for London and combined it with the RSS feed of upcoming ISS flypasts. The result: a Twitter bot that only tweeted when the International Space Station was overhead and the sky was clear. Brilliant!
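
That baseline format really is simple. A single RSS item needs little more than this (values invented for illustration):

<item>
  <title>A list item</title>
  <link>https://example.com/item</link>
  <description>An optional description.</description>
  <pubDate>Mon, 24 Jun 2013 12:00:00 GMT</pubDate>
</item>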

Back then, anywhere you found a web page that listed a series of items, you’d expect to find a corresponding RSS feed: blog posts, uploaded photos, status updates, anything really.

That has changed.

Twitter used to provide an RSS feed that corresponded to my HTML timeline. Then they changed the URL of the RSS feed to make it part of the API (and therefore subject to the terms of use of the API). Then they removed RSS feeds entirely.

On the Salter Cane site, I want to display our band’s latest tweets. I used to be able to do that by just grabbing the corresponding RSS feed. Now I’d have to use the API, which is a lot more complex, involving all sorts of authentication gubbins. Even then, according to the terms of use, I wouldn’t be able to display my tweets the way I want to. Yes, how I want to display my own data on my own site is now dictated by Twitter.

Thanks to Jo Brodie I found an alternative service called Twitter RSS that gives me the RSS feed I need, ‘though it’s probably only a matter of time before that gets shut down by Twitter.

Jo’s feelings about Twitter’s anti-RSS policy mirror my own:

I feel a pang of disappointment at the fact that it was really quite easy to use if you knew little about coding, and now it might be a bit harder to do what you easily did before.

That’s the thing. It’s not like RSS is a great format—it isn’t. But it’s just good enough and just versatile enough to enable non-programmers to make something cool. In that respect, it’s kind of like HTML.

The official line from Twitter is that RSS is “infrequently used today.” That’s the same justification that Google has given for shutting down Google Reader. It reminds me of the joke about the shopkeeper responding to a request for something with “Oh, we don’t stock that—there’s no call for it. It’s funny though, you’re the fifth person to ask today.”

RSS is used a lot …but much of the usage is invisible:

RSS is plumbing. It’s used all over the place but you don’t notice it.

That’s from Brent Simmons, who penned a love letter to RSS:

If you subscribe to any podcasts, you use RSS. Flipboard and Twitter are RSS readers, even if it’s not obvious and they do other things besides.

He points out the many strengths of RSS, including its decentralisation:

It’s anti-monopolist. By design it creates a level playing field.

How foolish of us, therefore, that we ended up using Google Reader exclusively to power all our RSS consumption. We took something that was inherently decentralised and we locked it up into one provider. And now that provider is going to screw us over.

I hope we won’t make that mistake again. Because, believe me, RSS is far from dead just because Google and Twitter are threatened by it.

In a post called The True Web, Robin Sloan reiterates the strength of RSS:

It will dip and diminish, but will RSS ever go away? Nah. One of RSS’s weaknesses in its early days—its chaotic decentralized weirdness—has become, in its dotage, a surprising strength. RSS doesn’t route through a single leviathan’s servers. It lacks a kill switch.

I can understand why that power could be seen as a threat if what you are trying to do is force your users to consume their own data only the way that you see fit (and all in the name of “user experience”, I’m sure).

Returning to Anil’s description of the web we lost:

We get a generation of entrepreneurs encouraged to make more narrow-minded, web-hostile products like these because it continues to make a small number of wealthy people even more wealthy, instead of letting lots of people build innovative new opportunities for themselves on top of the web itself.

I think that the presence or absence of an RSS feed (whether I actually use it or not) is a good litmus test for how a service treats my data.

It might be that RSS is the canary in the coal mine for my data on the web.

If those services don’t trust me enough to give me an RSS feed, why should I trust them with my data?

Canvas sparklines

I like sparklines a lot. Tufte describes a sparkline as:

…a small intense, simple, word-sized graphic with typographic resolution.

Four years ago, I added sparklines to Huffduffer using Google’s chart API. That API comes in two flavours: a JavaScript API for client-side creation of graphs, and image charts for server-side rendering of charts as PNGs.

The image API is really useful: there’s no reliance on JavaScript, it works in every browser capable of displaying images, and it’s really flexible and customisable. Therefore it is, of course, being deprecated.

The death warrant for Google image charts sets the execution date for 2015. Time to start looking for an alternative.

I couldn’t find a direct equivalent to the functionality that Google provides i.e. generating the images dynamically on the server. There are, however, plenty of client-side alternatives, many of them using canvas.

Most of the implementations I found were a little heavy-handed for my taste: they either required jQuery or Processing or both. I just wanted a quick little script for generating sparklines from a dataset of numbers. So I wrote my own.

I’ve put my code up on Github as Canvas Sparkline.

Here’s the JavaScript. You create a canvas element with the dimensions you want for the sparkline, then pass the ID of that element (along with your dataset) into the sparkline function:

sparkline ('canvasID', [12, 18, 13, 12, 11, 15, 17, 20, 15, 12, 8, 7, 9, 11], true);

(the final Boolean value just indicates whether you want a red dot drawn at the end of the sparkline).

The script takes care of normalising the values, so it doesn’t matter how many numbers are in the dataset or whether the range of the numbers is in the tens, hundreds, thousands, or hundreds of thousands.
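The normalisation boils down to something like this (a sketch of the idea, not the actual code from Canvas Sparkline):

// Illustrative only: scale each data point to fit the pixel height of the canvas.
function normalise(data, height) {
    var max = Math.max.apply(null, data);
    var min = Math.min.apply(null, data);
    var range = (max - min) || 1; // guard against a flat dataset
    return data.map(function (value) {
        // Flip the value: the canvas y-axis runs from the top down.
        return height - ((value - min) / range) * height;
    });
}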

There’s plenty of room for improvement:

  • The colour of the sparkline is hardcoded (50% transparent black) but it could be passed in as a value.
  • All the values should probably be passed in as an array of options rather than individual parameters.

Feel free to fork, adapt, and improve.

The sparklines are working quite nicely, but I can’t help but feel that this isn’t the right tool for the job. Ideally, I’d like to keep using a server-side solution like Google’s image charts. But if I am going to use a client-side solution, I’m not sure that canvas is the right element. This should really be SVG: canvas is great for dynamic images and animations that need to update quite quickly, but sparklines are generally pretty static. If anyone fancies making a lightweight SVG solution for sparklines, that would be lovely.
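For what it’s worth, here’s a rough sketch of how an SVG version might work; the svgSparkline function and its details are hypothetical, purely to illustrate the shape of the thing:

// Hypothetical sketch: draw a sparkline as an SVG polyline
// inside an existing inline svg element.
function svgSparkline(id, data) {
    var svg = document.getElementById(id);
    var width = parseFloat(svg.getAttribute('width'));
    var height = parseFloat(svg.getAttribute('height'));
    var max = Math.max.apply(null, data);
    var min = Math.min.apply(null, data);
    var range = (max - min) || 1;
    // Same normalisation idea as the canvas version.
    var points = data.map(function (value, i) {
        var x = (i / (data.length - 1)) * width;
        var y = height - ((value - min) / range) * height;
        return x + ',' + y;
    }).join(' ');
    var line = document.createElementNS('http://www.w3.org/2000/svg', 'polyline');
    line.setAttribute('points', points);
    line.setAttribute('fill', 'none');
    line.setAttribute('stroke', 'rgba(0,0,0,0.5)');
    svg.appendChild(line);
}

Because the output is markup rather than pixels, the result scales cleanly and can be styled with CSS.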

In the meantime, you can see Canvas Sparkline in action on the member profiles at The Session, like here, here, here, or here.

Update: Ask and thou shalt receive. Check out this fantastic lightweight SVG solution from Stuart—bloody brilliant!

Generating placeholders from datalists

Here’s a cute little markup pattern for ya.

Suppose you’ve got an input element that has—by means of a list attribute—an associated datalist. Here’s the example I used in HTML5 For Web Designers:

<label for="homeworld">Your home planet</label>
<input type="text" name="homeworld" id="homeworld" list="planets">
<datalist id="planets">
 <option value="Mercury">
 <option value="Venus">
 <option value="Earth">
 <option value="Mars">
 <option value="Jupiter">
 <option value="Saturn">
 <option value="Uranus">
 <option value="Neptune">
</datalist>

That results in a combo-box control in supporting browsers: as you type in the text field, you are presented with a subset of the options in the datalist that match what you are typing. It’s more powerful than a regular select, because you aren’t limited by the list of options: you’re free to type something that isn’t in the list (like, say, “Pluto”).

I’ve already written about the design of datalist and how you can use a combination of select and input using the same markup to be backward-compatible. I like datalist.

I also like the placeholder attribute. Another recent addition to HTML, this allows you to show an example of the kind of content you’d like the user to enter (note: this is not the same as a label).

It struck me recently that all the options in a datalist are perfectly good candidates for placeholder text. In the example above, I could update the input element to include:

<input type="text" name="homeworld" id="homeworld" list="planets" placeholder="Mars">

or:

<input type="text" name="homeworld" id="homeworld" list="planets" placeholder="Saturn">

I wrote a little piece of JavaScript to do this:

  1. Loop through all the input elements that have a list attribute.
  2. Find the corresponding datalist element (its ID will match the list attribute).
  3. Pick a random option element from that datalist.
  4. Set the placeholder value of the input to that option value.
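
In full, the script might look something like this (a minimal sketch rather than the exact code; the variable names are just for illustration):

// Give each input with a list attribute a random placeholder
// taken from its associated datalist.
var inputs = document.querySelectorAll('input[list]');
for (var i = 0; i < inputs.length; i++) {
    var input = inputs[i];
    // The ID of the datalist matches the list attribute.
    var datalist = document.getElementById(input.getAttribute('list'));
    if (!datalist) continue;
    var options = datalist.getElementsByTagName('option');
    if (!options.length) continue;
    // Pick a random option from the datalist.
    var option = options[Math.floor(Math.random() * options.length)];
    var value = option.getAttribute('value');
    input.setAttribute('placeholder', value);
}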

Put that JavaScript at the end of your document (or link to it from the end of your document) and you’re all set. You might want to tweak it a little: I find it helps to preface placeholder values with “e.g.” to make it clear that this is an example value. You can do that by changing the last line of the script:

input.setAttribute('placeholder','e.g. '+value);

You also might want to show more than one possible value. You might want the placeholder value to read “e.g. Mercury, Venus, Earth, etc.” …I’ll leave that as an exercise for the reader.

Hacking History

I spent the weekend at The Guardian offices in London at History Hack Day. It was rather excellent. You’d think I’d get used to the wonderful nature of these kinds of events, but once again I experienced the same level of amazement that I felt the first time I went to a hack day.

The weekend kicked off in the traditional way with some quickfire talks. Some lovely people from The British Museum, The British Library and The National Archives talked about their datasets, evangelists from Yahoo and Google talked about YQL and Fusion Tables, and Max Gadney and Matthew Sheret got us thinking in the right directions.

Matthew Sheret was particularly inspiring, equating hackers with time travellers, and encouraging us to find and explore the stories within the data of history. The assembled geeks certainly took that message to heart.

Ben Griffiths told the story of his great-uncle, who died returning from a bomber raid on Bremen in 1941. Using data to put the death in context, Ben approached the story of the lost bomber with sensitivity.

Simon created geStation, a timeline of when railway stations opened in the UK. On the face of it, it sounds like just another mashup of datetimes and lat-long coordinates. But when you run it, you can see the story of the industrial revolution emerge on the map.

Similarly, Gareth Lloyd and Tom Martin used Wikipedia data to show the emerging shape of the world over time in their video A History of the World in 100 Seconds, a reference to the BBC’s History of the World in 100 Objects, for which Cristiano built a thoroughly excellent mobile app to help you explore the collection at the British Museum.

Brian used the Tropo API to make a telephone service that will find a passenger on the Titanic who was the same age and sex as you, and then tell you if they made it onto a lifeboat or not. Hearing this over the phone makes the story more personal somehow. Call +1 (804) 316-9215 in the US, +44 2035 142721 in the UK, or +990009369991481398 on Skype to try it for yourself.

Audioboo / did you die on the Titanic? on Huffduffer

I was so impressed with the Tropo API that I spent most of History Hack Day working on a little something for Huffduffer …more on that later.

My contribution to the hack day was very modest, but it was one of the few to involve something non-digital. It’s called London On A Stick.

A pile of USB sticks had been donated to History Hack Day, but nobody was making much use of them so I thought they could be used as fodder for Dead Drops. I took five USB sticks and placed a picture from The National Archives on Flickr Commons on each one. Each picture was taken somewhere in London and has been geotagged.

Zeppelin over St. Paul's

I slapped sticky notes on the USB sticks with the location of the picture. Then I asked for volunteers to go out and place the sticks at the locations of the pictures: Paddington, Trafalgar Square, Upper Lambeth, St. Paul’s and Tower Bridge. Not being a Londoner myself, I’m relying on the natives to take up the challenge. You can find the locations at icanhaz.com/londononastick. I ducked out of History Hack Day a bit early to get back to Brighton so I have no idea if the five sticks were claimed.

Although my contribution to History Hack Day was very modest, I had a really good time. Matt did a great job putting on an excellent event.

It was an eye-opening weekend. This hack day put the “story” back into history.

The design of datalist

One of the many form enhancements provided by HTML5 is the datalist element. It allows you to turn a regular input field into a combo-box.

Using the list attribute on an input, you can connect it to a datalist with the corresponding ID. The datalist itself contains a series of option elements.

<input list="suggestions">
<datalist id="suggestions">
    <option value="foo"></option>
    <option value="bar"></option>
    <option value="baz"></option>
</datalist>

I can imagine a number of use cases for this:

  • “Share this” forms, like the one on Last.fm, that allow you to either select from your contacts on the site, or enter email addresses, separated by commas. Using input type="email" with a multiple attribute, in combination with a datalist, would work nicely.
  • Entering the details for an event, where you can either select from a list of venues or, if the venue is not listed, create a new one.
  • Just about any form that has a selection of choices, of which the last choice is “other”, followed by “If other, please specify…”

You can take something like this:

<label for="source">How did you hear about us?</label>
<select name="source">
    <option>please choose...</option>
    <option value="television">Television</option>
    <option value="radio">Radio</option>
    <option value="newspaper">Newspaper</option>
    <option>Other</option>
</select>
If other, please specify:
<input id="source" name="source">

And replace it with this:

<label for="source">How did you hear about us?</label>
<datalist id="sources">
    <option value="television"></option>
    <option value="radio"></option>
    <option value="newspaper"></option>
</datalist>
<input id="source" name="source" list="sources">

The datalist element has been designed according to one of the design principles driving HTML5—Degrade Gracefully:

On the World Wide Web, authors are often reluctant to use new language features that cause problems in older user agents, or that do not provide some sort of graceful fallback. HTML 5 document conformance requirements should be designed so that Web content can degrade gracefully in older or less capable user agents, even when making use of new elements, attributes, APIs and content models.

Because the datalist element contains a series of option elements with value attributes, it is effectively invisible to user-agents that don’t support the datalist element. That means you can use the datalist element without worrying about it “breaking” in older browsers.

If you wanted, you could include a message for non-supporting browsers:

<datalist id="sources">
    Your browser doesn't support datalist!
    <option value="television"></option>
    <option value="radio"></option>
    <option value="newspaper"></option>
</datalist>

That message—“Your browser doesn’t support datalist!”—will be visible in older browsers, but browsers that support datalist know not to show anything that’s not an option. But displaying a message like this for older browsers is fairly pointless; I certainly wouldn’t consider it graceful degradation.

In my opinion, one of the best aspects of the design of the datalist element is that you can continue to do things the old-fashioned way—using a select and an input—and at the same time start using datalist. Neither approach interferes with the other; you can use the same option elements for the select and the datalist:

<label for="source">How did you hear about us?</label>
<datalist id="sources">
    <select name="source">
        <option>please choose...</option>
        <option value="television">Television</option>
        <option value="radio">Radio</option>
        <option value="newspaper">Newspaper</option>
        <option>Other</option>
    </select>
    If other, please specify:
</datalist>
<input id="source" name="source" list="sources">

Browsers that support datalist will display the label “How did you hear about us?” followed by a combo-box text field that allows the user to select an option, or enter some free text.

Browsers that don’t support datalist will display the label “How did you hear about us?” followed by a drop-down list of options (the last of which is “other”), followed by the text “If other, please specify”, followed by a text field.

Take a look at this example in Opera to see datalist in operation. Take a look at it in any other browser to see the fallback. The source is on Github if you’d like to play around with it.

WebKit’s mistake

If you try that example in any browser, you’ll get something that works; either through datalist support or through the select fallback …unless the browser is using WebKit.

It took me a while to figure out why this would be the case. I didn’t think that Safari or Chrome supported datalist, and a little digging around with object detection in JavaScript confirmed this. So why don’t those browsers follow the standard behaviour, simply ignoring the element they don’t understand and rendering what’s inside it instead?
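
(For the record, the object detection can be as simple as checking for the options property that supporting browsers expose on a freshly created datalist element. A quick sketch:)

// Sketch: feature-test for datalist support.
var supportsDatalist = ('options' in document.createElement('datalist'));

In Safari and Chrome, that test came back false.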

Here’s the problem: line 539 of WebKit’s default CSS:

datalist {
    display: none;
}

This is pretty much the worst possible behaviour for a browser to implement. An element should either be recognised—like p, h1 or img—and parsed accordingly, or unrecognised—like foo or bar—and ignored. WebKit does not support the datalist element (even in the current nightly build), so the element should be ignored.

Fortunately the problem is easily rectified by adding something like this to your own stylesheet:

datalist {
    display: inline-block;
}

I’ve submitted a bug report on the WebKit Bugzilla.

Update: That WebKit bug has now been fixed, so the extra CSS is no longer necessary.

The format of The Long Now

In 01992, Tim Berners-Lee wrote a document called HTML Tags.

In September 02001, I started keeping this online journal. Back then, I was storing my data in XML, using a format of my own invention. The XML was converted using PHP into (X)HTML, RSS, and potentially anything else …although the “anything else” part never really materialised.

In February 02006, I switched over to using a MySQL database to store my data as chunks of markup.

In February 02007, Tess wrote about data longevity:

To me, being able to completely migrate my data — with minimal bit-rot — from system to system is the key in the never-ending and easily-lost fight to keep my data accessible over the entirety of my life.

She’s using non-binary, well-documented standards to store and structure her data: Atom, HTML and microformats.

Meanwhile, the HTML5 spec began defining error-handling for HTML documents. Ian Hickson wrote:

The original reason I got involved in this work is that I realised that the human race has written literally billions of electronic documents, but without ever actually saying how they should be processed.

I decided that for the sake of our future generations we should document exactly how to process today’s documents, so that when they look back, they can still reimplement HTML browsers and get our data back, even if they no longer have access to Microsoft Internet Explorer’s source code.

In August 02008, Ian Hickson mentioned in an interview that the timeline for HTML5 involves having two complete implementations by 02022. Many web developers were disgusted that such a seemingly far-off date was even being mentioned. My reaction was the opposite. I began to pay attention to HTML5.

HTML is starting to look like a relatively safe bet for data longevity and portability. I’m not sure the same can be said for any particular flavour of database. Sooner rather than later, I should remove the unnecessary layer of abstraction that I’m using to store my data.

This would be my third migration of content. I will take care to heed Mark Pilgrim’s advice on data fidelity:

Long-term data preservation is like long-term backup: a series of short-term formats, punctuated by a series of migrations. But migrating between data formats is not like copying raw data from one medium to another.

Fidelity is not a binary thing. Data can gradually degrade with each conversion until you’re left with crap. People think this only affects the analog world, like copying cassette tapes for several generations. But I think digital preservation is actually much harder, in part because people don’t even realize that it has the same issues.

He’s also betting on HTML:

HTML is not an output format. HTML is The Format. Not The Format Of Forever, but damn if it isn’t The Format Of The Now.

I don’t think that any format could ever be The Format Of The Long Now but HTML is the closest we’ve come thus far in the history of computing to having a somewhat stable, human- and machine-readable data format with a decent chance of real longevity.