Journal tags: interface

Performative performance

Web Summer Camp in Croatia finished with an interesting discussion. It was labelled a town-hall meeting, but it was more like an Oxford debating club.

Two speakers had two minutes each to speak for or against a particular statement. Their stances were assigned to them so they didn’t necessarily believe what they said.

One of the propositions was something like:

In the future, sustainable design will be as important as UX or performance.

That’s a tough one to argue against! But that’s what Sophia had to do.

She actually made a fairly compelling argument. She said that real impact isn’t going to come from individual websites changing their colour schemes. Real impact is going to come from making server farms run on renewable energy. She advocated for political action to change the system rather than having the responsibility heaped on the shoulders of the individuals making websites.

It’s a fair point. Much like the concept of a personal carbon footprint started life at BP to distract from corporate responsibility, perhaps we’re going to end up navel-gazing into our individual websites when we should be collectively lobbying for real change.

It’s akin to clicktivism—thinking you’re taking action by sharing something on social media, when real action requires hassling your political representative.

I’ve definitely seen some examples of performative sustainability on websites.

For example, at the start of this particular debate at Web Summer Camp we were shown a screenshot of a municipal website that has a toggle. The toggle supposedly enables a low-carbon mode. High resolution images are removed and for some reason the colour scheme goes grayscale. But even if those measures genuinely reduced energy consumption, it’s a bit late to enact them only after the toggle has been activated. Those hi-res images have already been downloaded by then.

Defaults matter. To be truly effective, the toggle needs to work the other way. Start in low-carbon mode, and only download the hi-res images when someone specifically requests them. (Hopefully browsers will implement prefers-reduced-data soon so that we can have our sustainable cake and eat it.)
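
If a site really wants a toggle, something like the following sketch is closer to the right default (the class name and image files are hypothetical, and prefers-reduced-data doesn’t have much browser support yet): the lightweight image is the baseline for everyone, and the hi-res version is only requested when the user hasn’t asked for reduced data.

/* Low-carbon by default: the small image is the baseline... */
.hero {
  background-image: url("hero-small.jpg");
}
/* ...and the hi-res image is a progressive enhancement for people
   who haven't asked for reduced data. */
@media (prefers-reduced-data: no-preference) {
  .hero {
    background-image: url("hero-large.jpg");
  }
}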

Likewise I’ve seen statistics bandied about regarding the energy savings that could be made if we used dark colour schemes. I’m sure the statistics are correct, but I’d like to see them presented side-by-side with, say, the energy impact of Google Tag Manager or React or any other wasteful dependencies that impact performance invisibly.

That’s the crux. Most of the important work around energy usage on websites is invisible. It’s the work done to not add more images, more JavaScript or more web fonts.

And it’s not just performance. I feel like the more important the work, the more likely it is to be invisible: privacy, security, accessibility …those matter enormously but you can’t see when a website is secure, or accessible, or not tracking you.

I suspect this is why those areas are all frustratingly under-resourced. Why pour time and effort into something you can’t point at?

Now that I think about it, this could explain the rise of web accessibility overlays. If you do the real work of actually making a website accessible, your work will be invisible. But if you slap an overlay on your website, it looks like you’re making a statement about how much you care about accessibility (even though the overlay is total shit and does more harm than good).

I suspect there might be a similar mindset at work when it comes to interface toggles for low-carbon mode. It might make you feel good. It might make you look good. But it’s a poor substitute for making your website carbon-neutral by default.

Guessing

The last talk at the last dConstruct was by local clever clogs Anil Seth. It was called Your Brain Hallucinates Your Conscious Reality. It’s well worth a listen.

Anil covers a lot of the same ground in his excellent book, Being You. He describes a model of consciousness that inverts our intuitive understanding.

We tend to think of our day-to-day reality in a fairly mechanical cybernetic manner; we receive inputs through our senses and then make decisions about reality informed by those inputs.

As another former dConstruct speaker, Adam Buxton, puts it in his interview with Anil, it feels like that old Beano cartoon, the Numskulls, with little decision-making homunculi inside our head.

But Anil posits that it works the other way around. We make a best guess of what the current state of reality is, and then we receive inputs from our senses, and then we adjust our model accordingly. There’s still a feedback loop, but cause and effect are flipped. First we predict or guess what’s happening, then we receive information. Rinse and repeat.

The book goes further and applies this to our very sense of self. We make a best guess of our sense of self and then adjust that model constantly based on our experiences.

There’s a natural tendency for us to balk at this proposition because it doesn’t seem rational. The rational model would be to make informed calculations based on available data …like computers do.

Maybe that’s what sets us apart from computers. Computers can make decisions based on data. But we can make guesses.

Enter machine learning and large language models. Now, for the first time, it appears that computers can make guesses.

The guess-making is not at all like what our brains do—large language models require enormous amounts of inputs before they can make a single guess—but still, this should be the breakthrough to be shouted from the rooftops: we’ve taught machines how to guess!

And yet. Almost every breathless press release touting some revitalised service that uses AI talks instead about accuracy. It would be far more honest to tout the really exceptional new feature: imagination.

Using AI, we will guess who should get a mortgage.

Using AI, we will guess who should get hired.

Using AI, we will guess who should get a strict prison sentence.

Reframed like that, it’s easy to see why technologists want to bury the lede.

Alas, this means that large language models are being put to use for exactly the wrong kind of scenarios.

(This, by the way, is also true of immersive “virtual reality” environments. Instead of trying to accurately recreate real-world places like meeting rooms, we should be leaning into the hallucinatory power of a technology that can generate dream-like situations where the pleasure comes from relinquishing control.)

Take search engines. They’re based entirely on trust and accuracy. Introducing a chatbot that confidently conflates truth and fiction doesn’t bode well for the long-term reputation of that service.

But what if this is an interface problem?

Currently facts and guesses are presented with equal confidence, hence the accurate descriptions of the outputs as bullshit or mansplaining as a service.

What if the more fanciful guesses were marked as such?

As it is, there’s a “temperature” control that can be adjusted when generating these outputs; the more the dial is cranked, the further the outputs will stray from the safest predictions. What if that could be reflected in the output?

I don’t know what that would look like. It could be typographic—some markers to indicate which bits should be taken with pinches of salt. Or it could be through content design—phrases like “Perhaps…”, “Maybe…” or “It’s possible but unlikely that…”

I’m sure you’ve seen the outputs when people request that ChatGPT write their biography. Perfectly accurate statements are generated side-by-side with complete fabrications. This reinforces our scepticism of these tools. But imagine how differently the fabrications would read if they were preceded by some simple caveats.

A little bit of programmed humility could go a long way.

Right now, these chatbots are attempting to appear seamless. If 80% or 90% of their output is accurate, then blustering through the other 10% or 20% should be fine, right? But I think the experience for the end user would be immensely more empowering if these chatbots were designed seamfully. Expose the wires. Show the workings-out.

Mind you, that only works if there is some way to distinguish between fact and fabrication. If there’s no way to tell how much guessing is happening, then that’s a major problem. If you can’t tell me whether something is 50% true or 75% true or 25% true, then the only rational response is to treat the entire output as suspect.

I think there’s a fundamental misunderstanding behind the design of these chatbots that goes all the way back to the Turing test. There’s this idea that the way to make a chatbot believable and trustworthy is to make it appear human, attempting to hide the gears of the machine. But the real way to gain trust is through honesty.

I want a machine to tell me when it’s guessing. That won’t make me trust it less. Quite the opposite.

After all, to guess is human.

Tweaking navigation labelling

I’ve always liked the idea that your website can be your API. Like, you’ve already got URLs to identify resources, so why not make that URL structure predictable and those resources parsable?

That’s why the (read-only) API for The Session doesn’t live at a separate subdomain. It uses the same URL structure as the regular site, but you can request the resources in an alternative format: JSON, XML, RSS.
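
Something along these lines, as a simplified sketch (the tune ID and format parameter are illustrative; check the API documentation for the exact details):

// Request the JSON representation of a resource that also
// exists as a regular web page.
fetch('https://thesession.org/tunes/27?format=json')
  .then((response) => response.json())
  .then((tune) => console.log(tune));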

This works out pretty well, mostly because I put a lot of thought into the URL structure of the site. I’m something of a URL fetishist, but I think that taking a URL-first approach to information architecture can be a good exercise.

Most of the resources on The Session involve nouns like tunes, events, discussions, and so on. There’s a consistent and predictable structure to the URLs for those sections:

  • /things
  • /things/new
  • /things/search

And then an individual item can be found at:

  • /things/ID

That’s all nice and predictable and the naming of the URLs matches what you’d expect to find:

Tunes, events, discussions, sessions. Those are all fine. But there’s one section of the site that has this root URL:

/recordings

When I was coming up with the URL structure twenty years ago, it was clear what you’d find there: track listings for albums of music. No one would’ve expected to find actual recordings of music available to listen to on-demand. The bandwidth constraints and technical limitations of the time made that clear.

Two decades on, the situation has changed. Now someone new to the site might well expect to hit a link called “recordings” and expect to hear actual recordings of music.

So I should probably change the label on the link. I don’t think “albums” is quite right—what even is an album any more? The word “discography” is probably the most appropriate label.

Here’s my dilemma: if I update the label, should I also update the URL structure?

Right now, the section of the site with /tunes URLs is labelled “tunes”. The section of the site with /events URLs is labelled “events”. Currently the section of the site with /recordings URLs is labelled “recordings”, but may soon be labelled “discography”.

If you click on “tunes”, you end up at /tunes. But if you click on “discography”, you end up at /recordings.

Is that okay? Am I the only one that would be bothered by that?

I could update the URLs to match the labelling (with redirects for the old URLs, of course), but I’m not so keen on this URL structure:

  • /discography
  • /discography/new
  • /discography/search
  • /discography/ID

It doesn’t seem as tidy as:

  • /recordings
  • /recordings/new
  • /recordings/search
  • /recordings/ID

But if I don’t update the URLs to match the label, then I’m just going to have to live with the mismatch.

I’m just thinking out loud here. I think I should definitely update the label. I just won’t make any decision on changing URLs for a while yet.

Overloading buttons

It’s been almost two years since I added audio playback on The Session. The interface is quite straightforward. For any tune setting, there’s a button that says “play audio”. When you press that button, audio plays and the button’s text changes to “pause audio.”

By updating the button’s text like this, I’m updating the button’s accessible name. In other situations, where the button text doesn’t change, you can indicate whether a button is active or not by toggling the aria-pressed attribute. I’ve been doing that on the “share” buttons that act as the interface for a progressive disclosure. The label on the button—“share”—doesn’t change when the button is pressed. For that kind of progressive disclosure pattern, the button also has an aria-controls and aria-expanded attribute.
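
Stripped right down, that share-button pattern looks something like this (the ID is illustrative):

<button aria-pressed="false" aria-expanded="false" aria-controls="share-options">
  share
</button>
<div id="share-options" hidden>
  <!-- sharing links go in here -->
</div>

When the button is activated, aria-pressed and aria-expanded both get flipped to "true" and the hidden attribute is removed, but the label stays the same.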

From all the advice I’ve read about button states, you should either update the accessible name or change the aria-pressed attribute, but not both. Doing both would lead to the confusing situation of a button labelled “pause audio” also having a state of “pressed”, when in fact the audio is playing.

That was all fine until I recently added some more functionality to The Session. As well as being able to play back audio, you can now adjust the tempo of the playback speed. The interface element for this is a slider, input type="range".

But this means that the “play audio” button now does two things. It plays the audio, but it also acts as a progressive disclosure control, revealing the tempo slider. The button is simultaneously a push button for playing and pausing music, and a toggle button for showing and hiding another interface element.

So should I be toggling the aria-pressed attribute now, even though the accessible name is changing? Or is it enough to have the relationship defined by aria-controls and the state defined by aria-expanded?

Based on past experience, my gut feeling is that I’m probably using too much ARIA. Maybe it’s an anti-pattern to use both aria-expanded and aria-pressed on a progressive disclosure control.

I’m kind of rubber-ducking here, and now that I’ve written down what I’m thinking, I’m pretty sure I’m going to remove the toggling of aria-pressed in any situation where I’m already toggling aria-expanded.
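
In other words, something like this sketch, where the element IDs are illustrative and the actual audio playback is elided: the accessible name changes, aria-expanded tracks the disclosure, and aria-pressed doesn’t come into it.

<button id="playback" aria-controls="tempo" aria-expanded="false">play audio</button>
<div id="tempo" hidden>
  <!-- the tempo slider goes in here -->
</div>

const button = document.getElementById('playback');
const tempo = document.getElementById('tempo');
button.addEventListener('click', () => {
  const playing = button.getAttribute('aria-expanded') === 'true';
  // Toggle the progressive disclosure...
  button.setAttribute('aria-expanded', String(!playing));
  tempo.hidden = playing;
  // ...and update the accessible name; no aria-pressed required.
  button.textContent = playing ? 'play audio' : 'pause audio';
  // (Starting and stopping the audio itself would happen here too.)
});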

What I really need to do is enlist the help of actual screen reader users. There are a number of members of The Session who use screen readers. I should get in touch and see if the new functionality makes sense to them.

Directory enquiries

I was talking to someone recently about a forgotten battle in the history of the early web. It was a battle between search engines and directories.

These days, when the history of the web is told, a whole bunch of services get lumped into the category of “competitors who lost to Google search”: Altavista, Lycos, Ask Jeeves, Yahoo.

But Yahoo wasn’t a search engine, at least not in the same way that Google was. Yahoo was a directory with a search interface on top. You could find what you were looking for by typing or you could zero in on what you were looking for by drilling down through a directory structure.

Yahoo wasn’t the only directory. DMOZ was an open-source competitor. You can still experience it at DMOZlive.com:

The official DMOZ.com site was closed by AOL on February 17th 2017. DMOZ Live is committed to continuing to make the DMOZ Internet Directory available on the Internet.

Search engines put their money on computation, or to use today’s parlance, algorithms (or if you’re really shameless, AI). Directories put their money on humans. Good ol’ information architecture.

It turned out that computation scaled faster than humans. Search won out over directories.

Now an entire generation has been raised in the aftermath of this battle. Monica Chin wrote about how this generation views the world of information:

Catherine Garland, an astrophysicist, started seeing the problem in 2017. She was teaching an engineering course, and her students were using simulation software to model turbines for jet engines. She’d laid out the assignment clearly, but student after student was calling her over for help. They were all getting the same error message: The program couldn’t find their files.

Garland thought it would be an easy fix. She asked each student where they’d saved their project. Could they be on the desktop? Perhaps in the shared drive? But over and over, she was met with confusion. “What are you talking about?” multiple students inquired. Not only did they not know where their files were saved — they didn’t understand the question.

Gradually, Garland came to the same realization that many of her fellow educators have reached in the past four years: the concept of file folders and directories, essential to previous generations’ understanding of computers, is gibberish to many modern students.

Dr. Saavik Ford confirms:

We are finding a persistent issue with getting (undergrad, new to research) students to understand that a file/directory structure exists, and how it works. After a debrief meeting today we realized it’s at least partly generational.

We live in a world ordered only by search:

While some are quite adept at using labels, tags, and folders to manage their emails, others will claim that there’s no need to do so because you can easily search for whatever you happen to need. Save it all and search for what you want to find. This is, roughly speaking, the hot mess approach to information management. And it appears to arise both because search makes it a good-enough approach to take and because the scale of information we’re trying to manage makes it feel impossible to do otherwise. Who’s got the time or patience?

There are still hold-outs. You can prise files from Scott Jenson’s cold dead hands.

More recently, Linus Lee points out what we’ve lost by giving up on directory structures:

Humans are much better at choosing between a few options than conjuring an answer from scratch. We’re also much better at incrementally approaching the right answer by pointing towards the right direction than nailing the right search term from the beginning. When it’s possible to take a “type in a query” kind of interface and make it more incrementally explorable, I think it’s almost always going to produce a more intuitive and powerful interface.

Directory structures still make sense to me (because I’m old) but I don’t have a problem with search. I do have a problem with systems that try to force me to search when I want to drill down into folders.

I have no idea what Google Drive and Dropbox are doing but I don’t like it. They make me feel like the opposite of a power user. Trying to find a file using their interfaces makes me feel like I’m trying to get a printer to work. Randomly press things until something happens.

Anyway. Enough fist-shaking from me. I’m going to ponder Linus’s closing words. Maybe defaulting to a search interface is a cop-out:

Text search boxes are easy to design and easy to add to apps. But I think their ease on developers may be leading us to ignore potential interface ideas that could let us discover better ideas, faster.

Control

In two of my recent talks—In And Out Of Style and Design Principles For The Web—I finish by looking at three different components:

  1. a button,
  2. a dropdown, and
  3. a datepicker.

In each case you could use native HTML elements:

  1. button,
  2. select, and
  3. input type="date".

Or you could use divs with a whole bunch of JavaScript and ARIA.

In the case of a datepicker, I totally understand why you’d go for writing your own JavaScript and ARIA. The native HTML element is quite restricted, especially when it comes to styling.

In the case of a dropdown, it’s less clear-cut. Personally, I’d use a select element. While it’s currently impossible to style the open state of a select element, you can style the closed state with relative ease. That’s good enough for me.

Still, I can understand why that wouldn’t be good enough for some cases. If pixel-perfect consistency across platforms is a priority, then you’re going to have to break out the JavaScript and ARIA.

Personally, I think chasing pixel-perfect consistency across platforms isn’t even desirable, but I get it. I too would like to have more control over styling select elements. That’s one of the reasons why the work being done by the Open UI group is so important.

But there’s one more component: a button.

Again, you could use the native button element, or you could use a div or a span and add your own JavaScript and ARIA.

Now, in this case, I must admit that I just don’t get it. Why wouldn’t you just use the native button element? It has no styling issues and the browser gives you all the interactivity and accessibility out of the box.

I’ve been trying to understand the mindset of a developer who wouldn’t use a native button element. The easy answer would be to decide that they’re just bad people, and dismiss them. But that would probably be lazy and inaccurate. Nobody sets out to make a website with poor performance or poor accessibility. And yet, by choosing not to use the native HTML element, that’s what’s likely to happen.

I think I might have finally figured out what might be going on in the mind of such a developer. I think the issue is one of control.

When I hear that there’s a native HTML element—like button or select—that comes with built-in behaviours around interaction and accessibility, I think “Great! That’s less work for me. I can just let the browser deal with it.” In other words, I relinquish control to the browser (though not entirely—I still want the styling to be under my control as much as possible).

But I now understand that someone else might hear that there’s a native HTML element—like button or select—that comes with built-in behaviours around interaction and accessibility, and think “Uh-oh! What if there are unexpected side-effects of these built-in behaviours that might bite me on the ass?” In other words, they don’t trust the browsers enough to relinquish control.

I get it. I don’t agree. But I get it.

If your background is in computer science, then the ability to precisely predict how a programme will behave is a virtue. Any potential side-effects that aren’t within your control are undesirable. The only way to ensure that an interface will behave exactly as you want is to write it entirely from scratch, even if that means using more JavaScript and ARIA than is necessary.

But I don’t think it’s a great mindset for the web. The web is filled with uncertainties—browsers, devices, networks. You can’t possibly account for all of the possible variations. On the web, you have to relinquish some control.

Still, I’m glad that I now have a bit more insight into why someone would choose to attempt to retain control by using div, JavaScript and ARIA. It’s not what I would do, but I think I understand the motivation a bit better now.

Scale

A few years back, Jessica got a ceiling fan for our living room. This might seem like a strange decision, considering we live in England. Most of the time, the problem in this country is that it’s too cold.

But then you get situations like this week, when the country experienced the hottest temperatures ever recorded. I was very, very grateful for that ceiling fan. It may not get used for most of the year, but on the occasions when it’s needed, it’s a godsend. And it’s going to get used more and more often, given the inexorable momentum of the climate emergency.

Even with the ceiling fan, it was still very hot in the living room. I keep my musical instruments in that room, and they all responded to the changing temperature. The strings on my mandolin, bouzouki, and guitar went looser in the heat. The tuning dropped by at least a semitone.

I tuned them back up, but then I had to be careful when the extreme heat ended and the temperature began to drop. The strings began to tighten accordingly. My instruments went up a semitone.

I was thinking about this connection between sound and temperature when I was tuning the instruments back down again.

The electronic tuner I use shows the current tone in relation to the desired note: G, D, A, E. If the string is currently producing a tone that’s lower than, say, A, the tuner displays the difference on its little screen as lines behind the ideal A position. If the string is producing a tone higher than A, the lines appear in front of the desired note.

What if we thought about temperature like this? Instead of weather apps showing the absolute temperature in degrees, what if they showed the relative distance from a predefined ideal? Then you could see at a glance whether it’s a little cooler than you’d like, or a little hotter than you’d like.

Perhaps an interface like that would let you see at a glance how out of tune the current temperature is.

Accessibility testing

I was doing some accessibility work with a client a little while back. It was mostly giving their site the once-over, highlighting any issues that we could then discuss. It was an audit of sorts.

While I was doing this I started to realise that not all accessibility issues are created equal. I don’t just mean in their severity. I mean that some issues can—and should—be caught early on, while other issues can only be found later.

Take colour contrast. This is something that should be checked before a line of code is written. When designs are being sketched out and then refined in a graphical editor like Figma, that’s the time to check the ratio between background and foreground colours to make sure there’s enough contrast between them. You can catch this kind of thing later on, but by then it’s likely to come with a higher cost—you might have to literally go back to the drawing board. It’s better to find the issue when you’re at the drawing board the first time.

Then there’s the HTML. Most accessibility issues here can be caught before the site goes live. Usually they’re issues of omission: form fields that don’t have an explicitly associated label element (using the for and id attributes); images that don’t have alt text; pages that don’t have sensible heading levels or landmark regions like main and nav. None of these are particularly onerous to fix and they come with the biggest bang for your buck. If you’ve got sensible forms, sensible headings, alt text on images, and a solid document structure, you’ve already covered the vast majority of accessibility issues with very little overhead. Some of these checks can also be automated: alt text for images; labels for inputs.
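
To put it another way, the baseline is mostly a matter of using what HTML already gives you. A small made-up example:

<nav aria-label="Main">
  <!-- site navigation goes in here -->
</nav>
<main>
  <h1>Concert listings</h1>
  <img src="hall.jpg" alt="The concert hall lit up at night">
  <form>
    <label for="town">Search by town</label>
    <input type="search" id="town" name="town">
    <button>Search</button>
  </form>
</main>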

Then there’s interactive stuff. If you only use native HTML elements you’re probably in the clear, but chances are you’ve got some bespoke interactivity on your site: a carousel; a mega dropdown for navigation; a tabbed interface. HTML doesn’t give you any of those out of the box so you’d need to make your own using a combination of HTML, CSS, JavaScript and ARIA. There’s plenty of testing you can do before launching—I always ask myself “What would Heydon do?”—but these components really benefit from being tested by real screen reader users.

So if you commission an accessibility audit, you should hope to get feedback that’s mostly in that third category—interactive widgets.

If you get feedback on document structure and other semantic issues with the HTML, you should fix those issues, sure, but you should also see what you can do to stop those issues going live again in the future. Perhaps you can add some steps in the build process. Or maybe it’s more about making sure the devs are aware of these low-hanging fruit. Or perhaps there’s a framework or content management system that’s stopping you from improving your HTML. Then you need to execute a plan for ditching that software.

If you get feedback about colour contrast issues, just fixing the immediate problem isn’t going to address the underlying issue. There’s a process problem, or perhaps a communication issue. In that case, don’t look for a technical solution. A design system, for example, will not magically fix a workflow issue or route around the problem of designers and developers not talking to each other.

When you commission an accessibility audit, you want to make sure you’re getting the most out of it. Don’t squander it on issues that you can catch and fix yourself. Make sure that the bulk of the audit is being spent on the specific issues that are unique to your site.

Accent all areas

Whenever a new version of Chrome comes out, there’s an accompanying blog post listing what’s new. Chrome 93 just came out and, sure enough, Pete has written a blog post about it.

But what I think is the most exciting addition to the browser isn’t listed.

What is this feature that’s got me so excited?

Okay, I’ve probably oversold it now because actually, it looks like a rather small trivial addition. It’s the accent-color property in CSS.

Up until now, accent colour was controlled by the operating system. If you’re on a Mac, go to “System Preferences” and then “General”. There you’ll see an option to change your accent colour. Try picking a different colour. You’ll see that change cascade down into the other form fields in that preference pane: checkboxes, radio buttons, and dropdowns.

Your choice will also cascade down into web pages. Any web page that uses native checkboxes, radio buttons and other interface elements will inherit that colour.

This is how interface elements are supposed to work. The browser inherits the look’n’feel of the inputs from the operating system.

That’s the theory anyway. In practice, form elements—such as dropdowns—can look different from browser to browser, something that shouldn’t be happening if the browsers are all inheriting from the operating system.

Anyway, it’s probably this supposed separation of responsibility between browser and operating system which has led to the current situation with form fields and CSS. Authors can style form fields up to a point, but there’s always a line that you don’t get to cross.

The accent colour of a selected radio button or a checkbox has historically been on the other side of that line. You either had to accept that you couldn’t change the colour, or you had to make your own checkbox or radio button interface. You could use CSS to hide the native element and replace it with an image instead.

That feels a bit over-engineered and frankly kind of hacky. It reminds me of the bad old days of image replacement for text before we had web fonts.

Now, with the accent-color property in CSS, authors can over-ride the choice that the user has set at the operating system level.

On the one hand, this doesn’t feel great to me. Who are we to make that decision? Shouldn’t the user’s choice take primacy over our choices?

But then again, where do we draw the line? We’re allowed to over-ride link colours. We’re allowed to over-ride font choices.

Ultimately I think it’s a good thing that authors can now specify an accent colour. What makes me think that is the behaviour that authors have shown if they don’t have this ability—they do it anyway, and in a hackier manner. This is why I think the work of the Open UI group is so important. If developers don’t get a standardised way to customise native form controls, they’ll just recreate their own over-engineered versions.

The purpose of Open UI to the web platform is to allow web developers to style and extend built-in web UI controls, such as select dropdowns, checkboxes, radio buttons, and date/color pickers.

Trying to stop developers from styling checkboxes and radio buttons is like trying to stop teenagers from having sex. You might as well accept that it’s going to happen and give them contraception so they can at least do it safely.

So I welcome this new CSS condom.

You can see accent-color in action in this demo. Change the value of the accent-color property to see the form fields update:

:root {
  accent-color: rebeccapurple;
}

Applying it at the document level like that will make it universal, but you can also use the property on an element-by-element basis using whatever selector you want.
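
For example, to scope it to a single form (the class name here is made up):

.signup-form {
  accent-color: rebeccapurple;
}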

That demo works in Chrome and Edge 93, the current release. It also works in Firefox 92, which literally just landed (like as I was writing this blog post, support for accent-color magically arrived!).

As for Safari, well, who knows? If Apple published a roadmap, then developers would have a clue when to expect a property like this to land. But we mere mortals cannot be trusted with such important hush-hush information.

In the meantime, keep an eye on Can I Use. And lack of support on one browser is no reason not to use accent-color anyway. It’s a progressive enhancement. Add it to your CSS today and it will work in more browsers in the future.

BetrayURL

Back in February, I wrote about an excellent proposal by Jake for how browsers could display URLs in a safer way. Crucially, this involved highlighting the important part of the URL, but didn’t involve hiding any part. It’s a really elegant solution.

Turns out it was a Trojan horse. Chrome are now running an experiment where they will do the exact opposite: they will hide parts of the URL instead of highlighting the important part.

You can change this behaviour if you’re in the less than 1% of people who ever change default settings in browsers.

I’m really disappointed to see that Jake’s proposal isn’t going to be implemented. It was a much, much better solution.

No doubt I will hear rejoinders that the “solution” that Chrome is experimenting with is pretty similar to what Jake proposed. Nothing could be further from the truth. Jake’s solution empowered users with knowledge without taking anything away. What Chrome will be doing is the opposite of that, infantilising users and making decisions for them “for their own good.”

Seeing a complete URL is going to become a power-user feature, like View Source or user style sheets.

I’m really sad about that because, as Jake’s proposal demonstrates, it doesn’t have to be that way.

Overlay gap

I think a lot about Danielle’s talk at Patterns Day last year.

Around about the six minute mark she starts talking about gaps and overlaps.

Gaps are where hidden complexity live. If we don’t have a category to cover it, in effect it becomes invisible. But that doesn’t mean it’s not there. Unidentified gaps cause inconsistency and confusion.

Overlaps occur when two separate categories encompass some of the same areas of responsibility. They cause conflict, duplication of effort, and unnecessary friction.

This is the bit I keep thinking about. It’s such an insightful lens to view things through. On just about any project, tensions are almost always due to either gaps (“I thought someone else was doing that”) or overlaps (“Oh, you’re doing that? I thought we were doing that”).

When I was talking to Gerry on his new podcast recently, we were trying to figure out why web performance is in such a woeful state. I mused that there may be a gap. Perhaps designers think it’s a technical problem and developers think it’s a design problem. I guess you could try to bridge this gap by having someone whose job is to focus entirely on performance. But I suspect the better—but harder—solution is to create a shared culture of performance, of the kind Lara wrote about in her book:

Performance is truly everyone’s responsibility. Anyone who affects the user experience of a site has a relationship to how it performs. While it’s possible for you to single-handedly build and maintain an incredibly fast experience, you’d be constantly fighting an uphill battle when other contributors touch the site and make changes, or as the Web continues to evolve.

I suspect there’s a similar ownership gap at play when it comes to the ubiquitous obtrusive overlays that are plastered on so many websites these days.

Kirill Grouchnikov recently published a gallery of screenshots showcasing the beauty of modern mobile websites:

There are two things common between the websites in these screenshots that I took yesterday.

  1. They are beautifully designed, with great typography, clear branding, all optimized for readability.
  2. I had to install Firefox, Adblock Plus and uBlock Origin, as well as manually select and remove additional elements such as subscription overlays.

The web can be beautiful. Except it’s not right now.

How is this dissonance possible? How can designers and developers who clearly care about the user experience be responsible for unleashing such user-hostile interfaces?

PM/Legal/Marketing made me do it

I get that. But surely the solution can’t be to shrug our shoulders, pass the buck, and say “not my job.” Somebody designed each one of those obtrusive overlays. Somebody coded up each one and pushed them into production.

It’s clear that this is a problem of communication and understanding, rather than a technical problem. As always. We like to talk about how hard and complex our technical work is, but frankly, it’s a lot easier to get a computer to do what you want than to convince a human. Not least because you also need to understand what that other human wants. As Danielle says:

Recognising the gaps and overlaps is only half the battle. If we apply tools to a people problem, we will only end up moving the problem somewhere else.

Some issues can be solved with better tools or better processes. In most of our workplaces, we tend to reach for tools and processes by default, because they feel easier to implement. But as often as not, it’s not a technology problem. It’s a people problem. And the solution actually involves communication skills, or effective dialogue.

So let’s say it is someone in the marketing department who is pushing to have an obtrusive newsletter sign-up form get shoved in the user’s face. Talk to them. Figure out what their goals are—what outcome they’re hoping to get to. If they don’t seem to understand the user-experience implications, talk to them about that. But it needs to be a two-way conversation. You need to understand what they need before you start telling them what you want.

I realise that makes it sound patronisingly simple, and I know that in actuality it’s a sisyphean task. It may be that genuine understanding between people is the wickedest of design problems. But even if this problem seems insurmountable, at least you’d be tackling the right problem.

Because the web can’t survive like this.

Web Share API test

Remember a while back I wrote about some odd behaviour with the Web Share API in Safari on iOS?

When the share() method is triggered, iOS provides multiple ways of sharing: Messages, Airdrop, email, and so on. But the simplest option is the one labelled “copy”, which copies to the clipboard.

Here’s the thing: if you’ve provided a text parameter to the share() method then that’s what’s going to get copied to the clipboard—not the URL.

That’s a shame. Personally, I think the url field should take precedence.

Tess filed a bug soon after, which was very gratifying to see.

Now Phil has put together a test case:

  1. Share URL, title, and text
  2. Share URL and title
  3. Share URL and text

Very handy! The results (using the “copy” to clipboard action) are somewhat like rock, paper, scissors:

  • URL beats title,
  • text beats URL,
  • nothing beats text.

So it’s more like rock, paper, high explosives.

IncrementURL

Last month I wrote some musings on default browser behaviours. When it comes to all the tasks that browsers do for us, the most fundamental is taking a URL, fetching its contents and giving us the results. As part of that process, browsers also show us the URL of the page currently loaded in a tab or window.

But even at this fundamental level, there are some differences from browser to browser.

Safari only shows you the domain name—and any subdomain names—by default. It looks nice and tidy, but it obfuscates what page you’re on (until you click on the domain name). This is bad.

Chrome shows you the full URL, nice and straightforward. This is neutral.

Firefox, like Chrome, shows you the full URL, but with a subtle difference. The important part of the URL—usually the domain name—is subtly highlighted in a darker shade of grey. This is good.

The reason I say that what it highlights is usually the domain name is because what it actually highlights is eTLD+1.

The what now?

Well, if you’re looking at a page on adactio.com, that’s the important bit. But what if you’re looking at a page on adactio.github.io? The domain name is important, but so is the subdomain.

It turns out there’s a list out there of which sites and top level domains allow registrations like this. This is the list that Firefox is using for its shading behaviour in displaying URLs.
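
If you want a feel for what the browser is doing with that list, here’s a toy sketch. It’s nothing like the real implementation, which uses the full public suffix list rather than this tiny hard-coded sample:

// Find the eTLD+1: the longest matching public suffix, plus one more label.
const publicSuffixes = ['com', 'io', 'github.io'];

function getETLDPlusOne(hostname) {
  const labels = hostname.split('.');
  for (let i = 1; i < labels.length; i++) {
    const candidate = labels.slice(i).join('.');
    if (publicSuffixes.includes(candidate)) {
      return labels.slice(i - 1).join('.');
    }
  }
  return hostname;
}

getETLDPlusOne('github.example.com'); // "example.com"
getETLDPlusOne('adactio.github.io');  // "adactio.github.io"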

Safari, by the way, does not use this list. These URLs are displayed identically in Safari, the phisherman’s friend:

  • example.com
  • example.github.io
  • github.example.com

Whereas Firefox highlights the eTLD+1 part of each:

  • example.com (all of it highlighted)
  • example.github.io (all of it highlighted, because github.io is on that list)
  • github.example.com (only example.com highlighted)

I learned all this from Jake on a recent edition of HTTP 203. Nicolas Hoizey has written a nice little summary.

Jake acknowledges that what Apple is doing is shi… , what Firefox is doing is good, and then puts forward an idea for what Chrome could do. (But please note that this is Jake’s personal opinion; not an official proposal from the Chrome team.)

There’s some prior art here. It used to be that, if your SSL certificate included extended validation, the name would be shown in green next to the padlock symbol. So while my website—which uses regular SSL from Let’s Encrypt—would just have a padlock, Medium—which uses EV SSL—would have a padlock and the text “A Medium Corporation”.

Extended validation wasn’t quite the bulletproof verification it was cracked up to be. So browsers don’t use that interface pattern any more.

Jake suggests repurposing this pattern for all URLs. Pull out the important bit—eTLD+1—and show it next to the padlock.

Screenshots of @JaffaTheCake’s idea for separating out the eTLD+1 part of a URL in a browser’s address bar.

I like this. The full URL is still displayed. This proposal is more of an incremental change. An enhancement that is applied progressively, if you will.

I also like that it builds on existing interface patterns—Firefox’s URL treatment and the deprecated treatment of EV certs. In fact, I think the first step for Chrome should be to match Firefox’s current behaviour, and then go further with something like Jake’s proposal.

This kind of gradual change was exactly what Chrome did with displaying https and http domains.

Chrome treatment for HTTPS pages.

Jake mentions this in the video:

We’ve already seen that you have to take small steps here, like we did with the https change.

There’s a fascinating episode of the Freakonomics podcast called In Praise of Incrementalism. I’ve huffduffed it.

I’m a great believer in the HTML design principle, Evolution Not Revolution:

It is better to evolve an existing design rather than throwing it away.

I’d love to see Chrome take the first steps to Jake’s proposal by following Firefox’s lead.

Then again, I’d love it if Chrome followed Firefox’s lead in implementing subgrid.

Install prompt

There’s an interesting thread on Github about the tongue-twistingly named beforeinstallprompt JavaScript event.

Let me back up…

Progressive web apps. You know what they are, right? They’re websites that have taken their vitamins. Specifically, they’re responsive websites that:

  1. are served over HTTPS,
  2. have a web app manifest, and
  3. have a service worker handling the offline scenario.

The web app manifest—a JSON file of metadata—is particularly useful for describing how your site should behave if someone adds it to their home screen. You can specify what icon should be used. You can specify whether the site should launch in a browser or as a standalone app (practically indistinguishable from a native app). You can specify which URL on the site should be used as the starting point when the site is launched from the home screen.
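
A bare-bones manifest covering that kind of metadata might look like this (the name and file paths are placeholders):

{
  "name": "My Progressive Web App",
  "short_name": "MyPWA",
  "start_url": "/",
  "display": "standalone",
  "icons": [
    {
      "src": "/icon-512.png",
      "sizes": "512x512",
      "type": "image/png"
    }
  ]
}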

So progressive web apps work just fine when you visit them in a browser, but they really shine when you add them to your home screen. It seems like pretty much everyone is in agreement that adding a progressive web app to your home screen shouldn’t be an onerous task. But how does the browser let the user know that it might be a good idea to “install” the web site they’re looking at?

The Samsung Internet browser does ambient badging—a + symbol shows up to indicate that a website can be installed. This is a great approach!

I hope that Chrome on Android will also use ambient badging at some point. To start with though, Chrome notified users that a site was installable by popping up a notification at the bottom of the screen. I think these might be called “toasts”.

Screenshots: getting the “add to home screen” prompt for https://huffduffer.com/ and for https://html5forwebdesigners.com/ in Chrome on Android. HTTPS + manifest.json + service worker = “Add to Home Screen” prompt.

Needless to say, the toast notification wasn’t very effective. That’s because we web designers and developers have spent years teaching people to immediately dismiss those notifications without even reading them. Accept our cookies! Sign up to our newsletter! Install our native app! Just about anything that’s user-hostile gets put in a notification (either a toast or an overlay) and shoved straight in the user’s face before they’ve even had time to start reading the content they came for in the first place. Users will then either:

  1. turn around and leave, or
  2. use muscle memory to reach for that X in the corner of the notification.

A tiny fraction of users might actually click on the call to action, possibly by mistake.

Chrome didn’t abandon the toast notification for progressive web apps, but it did change when they would appear. Rather than the browser deciding when to show the prompt—usually when the user has just arrived on the site—a new JavaScript event called beforeinstallprompt can be used.

It’s a bit weird though. You have to “capture” the event that fires when the prompt would have normally been shown, subdue it, hold on to that event, and then re-release it when you think it should be shown (like when the user has completed a transaction, for example, and having your site on the home screen would genuinely be useful). That’s a lot of hoops. Here’s the code I use on The Session to only show the installation prompt to users who are logged in.
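
The gist of that hoop-jumping looks something like this, as a simplified sketch rather than the actual code from The Session:

let deferredPrompt;

window.addEventListener('beforeinstallprompt', (event) => {
  // Stop the prompt from appearing straight away...
  event.preventDefault();
  // ...and hold on to the event for later.
  deferredPrompt = event;
});

// Then, at a moment of your own choosing (after a transaction, say):
function showInstallPrompt() {
  if (deferredPrompt) {
    deferredPrompt.prompt();
    deferredPrompt = null;
  }
}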

The end result is that the user is still shown a toast notification, but at least this time it’s the site owner who has decided when it will be shown. The Chrome team call this notification “the mini-info bar”, and Pete acknowledges that it’s not ideal:

The mini-infobar is an interim experience for Chrome on Android as we work towards creating a consistent experience across all platforms that includes an install button into the omnibox.

I think “an install button in the omnibox” means ambient badging in the browser interface, which would be great!

Anyway, back to that thread on Github. Basically, neither Apple nor Mozilla are going to implement the beforeinstallprompt event (well, technically Mozilla have implemented it but they’re not going to ship it). That’s fair enough. It’s an interim solution that’s not ideal for all the reasons I’ve already covered.

But there’s a lot of pushback. Even if the details of beforeinstallprompt are troublesome, surely there should be some way for site owners to let users know that they can—or should—install a progressive web app? As a site owner, I have a lot of sympathy for that viewpoint. But I also understand the security and usability issues that can arise from bad actors abusing this mechanism.

Still, I have to hand it to Chrome: even if we put the beforeinstallprompt event to one side, the browser still has a mechanism for letting users know that a progressive web app can be installed—the mini info bar. It’s not a great mechanism, but it’s better than nothing. Nothing is precisely what Firefox and Safari currently offer (though Firefox is experimenting with something).

In the case of Safari, not only do they not provide a mechanism for letting the user know that a site can be installed, but since the last iOS update, they’ve buried the “add to home screen” option even deeper in the “sharing sheet” (the list of options that comes up when you press the incomprehensible rectangle-with-arrow-emerging-from-it icon). You now have to scroll below the fold just to find the “add to home screen” option.

So while I totally get the misgivings about beforeinstallprompt, I feel that a constructive alternative wouldn’t go amiss.

And that’s all I have to say about that.

Except… there’s another interesting angle to that Github thread. There’s talk of allowing sites that are launched from the home screen to have access to more features than a site inside a web browser. Usually permissions on the web are explicitly granted or denied on a case-by-case basis: geolocation; notifications; camera access, etc. I think this is the first time I’ve heard of one action—adding to the home screen—being used as a proxy for implicitly granting more access. Very interesting. Although that idea seems to be roundly rejected here:

A key argument for using installation in this manner is that some APIs are simply so powerful that the drive-by web should not be able to ask for them. However, this document takes the position that installation alone as a restriction is undesirable.

Then again:

I understand that Chromium or Google may hold such a position but Apple’s WebKit team may not necessarily agree with such a position.

The Web Share API in Safari on iOS

I implemented the Web Share API over on The Session back when it was first available in Chrome on Android. It’s a nifty and quite straightforward API that allows websites to make use of the “sharing drawer” that mobile operating systems provide from within a web browser.

I already had sharing buttons that popped open links to Twitter, Facebook, and email. You can see these sharing buttons on individual pages for tunes, recordings, sessions, and so on.

I was already intercepting clicks on those buttons. I didn’t have to add too much to also check for support for the Web Share API and trigger that instead:

if (navigator.share) {
  navigator.share(
    {
      title: document.querySelector('title').textContent,
      text: document.querySelector('meta[name="description"]').getAttribute('content'),
      url: document.querySelector('link[rel="canonical"]').getAttribute('href')
    }
  );
}

That worked a treat. As you can see, there are three fields you can pass to the share() method: title, text, and url. You don’t have to provide all three.

Earlier this year, Safari on iOS shipped support for the Web Share API. I didn’t need to do anything. ‘Cause that’s how standards work. You can make use of APIs before every browser supports them, and then your website gets better and better as more and more browsers add support.

But I recently discovered something interesting about the iOS implementation.

When the share() method is triggered, iOS provides multiple ways of sharing: Messages, Airdrop, email, and so on. But the simplest option is the one labelled “copy”, which copies to the clipboard.

Here’s the thing: if you’ve provided a text parameter to the share() method then that’s what’s going to get copied to the clipboard—not the URL.

That’s a shame. Personally, I think the url field should take precedence. But I don’t think this is a bug, per se. There’s nothing in the spec to say how operating systems should handle the data sent via the Web Share API. Still, I think it’s a bit counterintuitive. If I’m looking at a web page, and I opt to share it, then surely the URL is the most important piece of data?

I’m not even sure where to direct this feedback. I guess it’s under the purview of the Safari team, but it also touches on OS-level interactions. Either way, I hope that somebody at Apple will consider changing the current behaviour for copying Web Share data to the clipboard.

In the meantime, I’ve decided to update my code to remove the text parameter:

if (navigator.share) {
  navigator.share(
    {
      title: document.querySelector('title').textContent,
      url: document.querySelector('link[rel="canonical"]').getAttribute('href')
    }
  );
}

If the behaviour of Safari on iOS changes, I’ll reinstate the missing field.

By the way, if you’re making progressive web apps that have display: standalone in the web app manifest, please consider using the Web Share API. When you remove the browser chrome, you’re removing the ability for users to easily share URLs. The Web Share API gives you a way to reinstate that functionality.

Voice User Interface Design by Cheryl Platz

Cheryl Platz is speaking at An Event Apart Chicago. Her inaugural An Event Apart presentation is all about voice interfaces, and I’m going to attempt to liveblog it…

Why make a voice interface?

Successful voice interfaces aren’t necessarily solving new problems. They’re used to solve problems that other devices have already solved. Think about kitchen timers. There are lots of ways to set a timer. Your oven might have one. Your phone has one. Why use a $200 device to solve this mundane problem? Same goes for listening to music, news, and weather.

People are using voice interfaces for solving ordinary problems. Why? Context matters. If you’re carrying a toddler, then setting a kitchen timer can be tricky, so a voice-activated timer is quite appealing. But why is voice happening now?

Humans have been developing the art of conversation for thousands of years. It’s one of the first skills we learn. It’s deeply instinctual. Most humans use speech instinctively every day. You can’t necessarily say that about using a keyboard or a mouse.

Voice-based user interfaces are not new. Not just the idea—which we’ve seen in Star Trek—but the actual implementation. Bell Labs had Audrey back in 1952. It recognised ten words—the digits zero through nine. Why did it take so long to get to Alexa?

In the late 70s, DARPA issued a challenge to create a voice-activated system. Carnegie Mellon came up with Harpy (with a thousand-word grammar). But none of the solutions could respond in real time. In conversation, we expect a break of no more than 200 or 300 milliseconds.

In the 1980s, computing power couldn’t keep up with voice technology, so progress kind of stopped. Time passed. Things finally started to catch up in the 90s with things like Dragon Naturally Speaking. But that was still about vocabulary, not grammar. By the 2000s, small grammars were starting to show up—starting an Xbox or pausing Netflix. In 2008, Google Voice Search arrived on the iPhone and natural language interaction began to arrive.

What makes natural language interactions so special? It requires minimal training because it uses the conversational muscles we’ve been working for a lifetime. It unlocks the ability to have more forgiving, less robotic conversations with devices. There might be ten different ways to set a timer.

Natural language interactions can also free us from “screen magnetism”—that tendency to stay on a device even when our original task is complete. Voice also enables fast and forgiving searches of huge catalogues without time spent typing or browsing. You can pick a needle straight out of a haystack.

Natural language interactions are excellent for older customers. These interfaces don’t intimidate people without dexterity, vision, or digital experience. Voice input often leads to more inclusive experiences. Many customers with visual or physical disabilities can’t use traditional graphical interfaces. Voice experiences throw open the door of opportunity for some people. However, voice experience can exclude people with speech difficulties.

Making the case for voice interfaces

There’s a misconception that you need to work at Amazon, Google, or Apple to work on a voice interface, or at least that you need to have a big product team. But Cheryl was able to make her first Alexa “skill” in a week. If you’re a web developer, you’re good to go. Your voice “interaction model” is just JSON.
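
To give you an idea of how little is involved, a minimal Alexa interaction model might look something like this (the invocation name, intent name, and sample utterances are all made up):

{
  "interactionModel": {
    "languageModel": {
      "invocationName": "kitchen helper",
      "intents": [
        {
          "name": "SetTimerIntent",
          "slots": [
            { "name": "duration", "type": "AMAZON.DURATION" }
          ],
          "samples": [
            "set a timer for {duration}",
            "start a {duration} timer",
            "time {duration} for me"
          ]
        }
      ]
    }
  }
}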

How do you get your product team on board? Find the customers (and situations) you might have excluded with traditional input. Tell the stories of people whose hands are full, or who are vision impaired. You can also point to the adoption rate numbers for smart speakers.

You’ll need to show your scenario in context. Otherwise people will ask, “why can’t we just build an app for this?” Conduct research to demonstrate the appeal of a voice interface. Storyboarding is very useful for visualising the context of use and highlighting existing pain points.

Getting started with voice interfaces

You’ve got to understand how the technology works in order to adapt to how it fails. Here are a few basic concepts.

Utterance. A word, phrase, or sentence spoken by a customer. This is the true form of what the customer provides.

Intent. This is the meaning behind a customer’s request. This is an important distinction because one intent could have thousands of different utterances.

Prompt. The text of a system response that will be provided to a customer. The audio version of a prompt, if needed, is generated separately using text to speech.

Grammar. A finite set of expected utterances. It’s a list. Usually, each entry in a grammar is paired with an intent. Many interfaces start out as simple grammars before moving on to a machine-learning model once the concept has been proven.
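
To make that concrete, a grammar can be as simple as a lookup table from expected utterances to intents. Here’s a minimal sketch (the utterances and intent names are invented for illustration):

var grammar = {
  'set a timer for fifteen minutes': 'SetTimerIntent',
  'start a fifteen minute timer': 'SetTimerIntent',
  'what is the weather today': 'GetWeatherIntent'
};

function matchUtterance(utterance) {
  // Exact matches only. This brittleness is why grammars tend to
  // give way to machine-learning models once the concept is proven.
  return grammar[utterance.toLowerCase().trim()] || null;
}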

Here’s the general idea with “artificial intelligence”…

There’s a human with a core intent to do something in the real world, like knowing when the cookies in the oven are done. That gets expressed as an utterance like, “set a 15 minute timer.” The utterance is translated into a string, but it hasn’t yet been parsed as language. That string is passed into a natural language understanding system. What comes out is a data structure that represents the customer’s goal, e.g. intent=timer; duration=15 minutes. That’s sent to the business logic, where a timer is actually set. For a good voice interface, you also want to send back a response, e.g. “setting timer for 15 minutes starting now.”
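
Sketched out in code, that round trip might look something like this (every function here is a stand-in for a whole layer of the system, not a real API):

// A rough sketch of the pipeline described above. speechToText,
// understand, fulfil, and textToSpeech are all placeholders.
function handleSpeech(audio) {
  var utterance = speechToText(audio);  // "set a 15 minute timer"
  var request = understand(utterance);  // e.g. { intent: 'timer', duration: 15 * 60 }
  fulfil(request);                      // business logic: actually set the timer
  return textToSpeech('Setting timer for 15 minutes starting now.');
}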

That seems simple enough, right? What’s so hard about designing for voice?

Natural language interfaces are a form of artificial intelligence, so they’re not deterministic. There’s a lot of ruling out false positives. Unlike graphical interfaces, voice interfaces are driven by probability.

How do you turn a sound wave into an understandable instruction? It’s a lot like teaching a child. You feed a lot of data into a statistical model. That’s how machine learning works. It’s a probability game. That’s where it gets interesting for design—given a bunch of possible options, we need to use context to zero in on the most correct choice. This is where confidence ratings come in: the system will return the probability that a response is correct. Effectively, the system is telling you how sure or not it is about possible results. If the customer makes a request in an unusual or unexpected way, our system is likely to guess incorrectly. That’s because the system is being given something new.
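
As an invented example, the candidates coming back from the recogniser might look like this, and the design decision is what to do when the top result isn’t confident enough (fulfil and clarify are placeholders):

// Invented numbers: each candidate interpretation comes back
// with a confidence rating between 0 and 1.
var candidates = [
  { intent: 'SetTimerIntent', duration: 900, confidence: 0.92 },
  { intent: 'PlayMusicIntent', confidence: 0.04 }
];

var best = candidates[0];
if (best.confidence > 0.8) {
  fulfil(best); // confident enough to act on it
} else {
  clarify('Did you want to set a timer for fifteen minutes?'); // don't guess; ask
}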

Designing a conversation is relatively straightforward. But 80% of your voice design time will be spent designing for what happens when things go wrong. In voice recognition, edge cases are front and centre.

Here’s another challenge. Interaction with most voice interfaces is part conversation, part performance. Most interactions are not private.

Humans don’t distinguish digital speech from human speech. That means these devices are intrinsically social. Our brains are wired to try to extract social information, even from digital speech. That’s why, for example, the question of what gender to give a voice interface is such a big one.

Delivering a voice interface

Storyboards help depict the context of use. Sample dialogues are your new wireframes. These are little scripts that cover not only the happy path, but also your edge cases. Then you reverse engineer from there.

Flow diagrams communicate customer states, but don’t use the actual text in them.

Prompt lists are your final deliverable.

Functional prototypes are really important for voice interfaces. You’ll learn the real way that customers will ask for things.

If you build a working prototype, you’ll be building two things: a natural language interaction model (often a JSON file) and custom business logic (in a programming language).
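
For the kitchen-timer example, those two pieces might look something like this heavily simplified sketch (a real interaction model has more wrapping around it, and the request shape here is made up):

// 1. The interaction model: intents, sample utterances, and slots.
// Usually authored as a JSON file; shown here as a JavaScript object.
var interactionModel = {
  intents: [{
    name: 'SetTimerIntent',
    samples: [
      'set a timer for {duration}',
      'start a {duration} timer'
    ],
    slots: [{ name: 'duration', type: 'AMAZON.DURATION' }]
  }]
};

// 2. The custom business logic: a handler for that intent.
function setTimerHandler(request) {
  var duration = request.slots.duration; // e.g. 'PT15M'
  startTimer(duration);                  // placeholder for the real work
  return { speech: 'Setting timer for 15 minutes starting now.' };
}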

Eventually voice design will become a core competency, much like mobile, which was once separate.

Ask yourself what tasks your customers complete on your site that feel clunky. Remember that voice design is almost never about new scenarios. Start your journey into voice interfaces by tackling old problems in new, more inclusive ways.

May the voice be with you!

Am I cached or not?

When I was writing about the lie-fi strategy I’ve added to adactio.com, I finished with this thought:

What I’d really like is some way to know—on the client side—whether or not the currently-loaded page came from a cache or from a network. Then I could add some kind of interface element that says, “Hey, this page might be stale—click here if you want to check for a fresher version.”

Trys heard my plea, and came up with a very clever technique to alter the HTML of a page when it’s put into a cache.

It’s a function that reads the response body stream in, returning a new stream. Whilst reading the stream, it searches for the character codes that make up: <html. If it finds them, it tacks on a data-cached attribute.

Nice!
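
The gist does its work on the stream itself, but a simplified, non-streaming sketch of the same idea looks something like this:

// Simplified sketch; not the stream-based code from Trys's gist.
// Before putting a response in the cache, mark the html element.
async function markAsCached(response) {
  var body = await response.clone().text();
  var marked = body.replace('<html', '<html data-cached');
  return new Response(marked, {
    status: response.status,
    statusText: response.statusText,
    headers: response.headers
  });
}

// In the service worker, before caching:
// cache.put(request, await markAsCached(response));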

But then I was discussing this issue with Tantek and Aaron late one night after Indie Web Camp Düsseldorf. I realised that I might have another potential solution that doesn’t involve the service worker at all.

Caveat: this will only work for pages that have some kind of server-side generation. This won’t work for static sites.

In my case, pages are generated by PHP. I’m not doing a database lookup every time you request a page—I’ve got a server-side cache of posts, for example—but there is a little bit of assembly done for every request: get the header from here; get the main content from over there; get the footer; put them all together into a single page and serve that up.

This means I can add a timestamp to the page (using PHP). I can mark the moment that it was served up. Then I can use JavaScript on the client side to compare that timestamp to the current time.

I’ve published the code as a gist.

In a script element on each page, I have this bit of coducken:

var serverTimestamp = <?php echo time(); ?>;

Now the JavaScript variable serverTimestamp holds the timestamp that the page was generated. When the page is put in the cache, this won’t change. This number should be the number of seconds since January 1st, 1970 in the UTC timezone (that’s what my server’s timezone is set to).

Starting with JavaScript’s Date object, I use a caravan of methods like toUTCString() and getTime() to end up with a variable called clientTimestamp. This will give the current number of seconds since January 1st, 1970, regardless of whether the page is coming from the server or from the cache.

var localDate = new Date();
var localUTCString = localDate.toUTCString();
var UTCDate = new Date(localUTCString);
var clientTimestamp = UTCDate.getTime() / 1000;

Then I compare the two and see if there’s a discrepancy greater than five minutes:

if (clientTimestamp - serverTimestamp > (60 * 5))

If there is, then I inject some markup into the page, telling the reader that this page might be stale:

document.querySelector('main').insertAdjacentHTML('afterbegin',`
  <p class="feedback">
    <button onclick="this.parentNode.remove()">dismiss</button>
    This page might be out of date. You can try <a href="javascript:window.location=window.location.href">refreshing</a>.
  </p>
`);

The reader has the option to refresh the page or dismiss the message.


It’s not foolproof by any means. If the visitor’s computer has its clock set weirdly, then the comparison might return a false positive every time. Still, I thought that using UTC might be a safer bet.

All in all, I think this is a pretty good method for detecting if a page is being served from a cache. Remember, the goal here is not to determine if the user is offline—for that, there’s navigator.onLine.

The upshot is this: if you visit my site with a crappy internet connection (lie-fi), then after three seconds you may be served with a cached version of the page you’re requesting (if you visited that page previously). If that happens, you’ll now also be presented with a little message telling you that the page isn’t fresh. Then it’s up to you whether you want to have another go.

I like the way that this puts control back into the hands of the user.

Drag’n’drop revisited

I got a message from a screen-reader user of The Session recently, letting me know of a problem they were having. I love getting any kind of feedback around accessibility, so this was like gold dust to me.

They pointed out that the drag’n’drop interface for rearranging the order of tunes in a set was inaccessible.

Drag and drop

Of course! I slapped my forehead. How could I have missed this?

It had been a while since I had implemented that functionality, so before even looking at the existing code, I started to think about how I could improve the situation. Maybe I could capture keystroke events from the arrow keys and announce changes via ARIA values? That sounded a bit heavy-handed though: mess with people’s native keyboard functionality at your peril.

Then I looked at the code. That was when I realised that the fix was going to be much, much easier than I thought.

I documented my process of adding the drag’n’drop functionality back in 2016. Past me had his progressive enhancement hat on:

One of the interfaces needed for this feature was a form to re-order items in a list. So I thought to myself, “what’s the simplest technology to enable this functionality?” I came up with a series of select elements within a form.

Reordering

The problem was in my feature detection:

There’s a little bit of mustard-cutting going on: does the dragula object exist, and does the browser understand querySelector? If so, the select elements are hidden and the drag’n’drop is enabled.

The logic was fine, but the execution was flawed. I was being lazy and hiding the select elements with display: none. That hides them visually, but it also hides them from screen readers. I swapped out that style declaration for one that visually hides the elements, but keeps them accessible and focusable.
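
In other words, something along these lines (a sketch rather than the actual code on The Session; the selector is hypothetical, and the visually-hidden class is assumed to contain the usual accessible-hiding styles):

// Cut the mustard, then hide the select elements accessibly
// instead of using display: none.
if (window.dragula && document.querySelector) {
  var selects = document.querySelectorAll('form.reorder select');
  Array.prototype.forEach.call(selects, function (select) {
    select.classList.add('visually-hidden'); // hidden visually, still focusable
  });
  // …then initialise the drag'n'drop enhancement as before.
}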

It was a very quick fix. I had the odd sensation of wanting to thank Past Me for making things easy for Present Me. But I don’t want to talk about time travel because if we start talking about it then we’re going to be here all day talking about it, making diagrams with straws.

I pushed the fix, told the screen-reader user who originally contacted me, and got a reply back saying that everything was working great now. Success!

Handing back control

An Event Apart Seattle was most excellent. This year, the AEA team are trying something different and making each event three days long. That’s a lot of mindblowing content!

What always fascinates me at events like these is the way that some themes seem to emerge, without any prior collusion between the speakers. This time, I felt that there was a strong thread of giving control directly to users:

Sarah and Margot both touched on this when talking about authenticity in brand messaging.

Margot described this in terms of vulnerability for the brand, but the kind of vulnerability that leads to trust.

Sarah talked about it in terms of respect—respecting the privacy of users, and respecting the way that they want to use your services. Call it compassion, call it empathy, or call it just good business sense, but providing these kinds of controls in an interface is an excellent long-term strategy.

In Val’s animation talk, she did a deep dive into prefers-reduced-motion, a media query that deliberately hands control back to the user.
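
From JavaScript, respecting that preference can be as simple as this little sketch (runAnimation stands in for whatever effect you’d otherwise run):

// Only animate if the user hasn't asked for reduced motion.
var reduceMotion = window.matchMedia('(prefers-reduced-motion: reduce)');
if (!reduceMotion.matches) {
  runAnimation();
}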

Even in a CSS-heavy talk like Jen’s, she took the time to explain why starting with meaningful markup is so important—it’s because you can’t control how the user will access your content. They may use tools like reader modes, or Pocket, or have web pages read aloud to them. The user has the final say, and rightly so.

In his CSS talk, Eric reminded us that a style sheet is a list of strong suggestions, not instructions.

Beth’s talk was probably the most explicit on the theme of returning control to users. She drew on examples from beyond the world of the web—from architecture, urban planning, and more—to show that the most successful systems are not imposed from the top down, but involve everyone, especially those most marginalised.

And even in my own talk on service workers, I raved about the design pattern of allowing users to save pages offline to read later. Instead of trying to guess what the user wants, give them the means to take control.

I was really encouraged to see this theme emerge. Mind you, when I look at the reality of most web products, it’s easy to get discouraged. Far from providing their users with controls over their own content, Instagram won’t even let their customers have a chronological feed. And Matt recently wrote about how both Twitter and Quora are heading further and further away from giving control to their users in his piece called Optimizing for outrage.

Still, I came away from An Event Apart Seattle with a renewed determination to do my part in giving people more control over the products and services we design and develop.

I spent the first two days of the conference trying to liveblog as much as I could. I find it really focuses my attention, although it’s also quite knackering. I didn’t do too badly; I managed to cover eleven of the talks (out of the conference’s total of seventeen):

  1. Slow Design for an Anxious World by Jeffrey Zeldman
  2. Designing for Trust in an Uncertain World by Margot Bloomstein
  3. Designing for Personalities by Sarah Parmenter
  4. Generation Style by Eric Meyer
  5. Making Things Better: Redefining the Technical Possibilities of CSS by Rachel Andrew
  6. Designing Intrinsic Layouts by Jen Simmons
  7. How to Think Like a Front-End Developer by Chris Coyier
  8. From Ideation to Iteration: Design Thinking for Work and for Life by Una Kravets
  9. Move Fast and Don’t Break Things by Scott Jehl
  10. Mobile Planet by Luke Wroblewski
  11. Unsolved Problems by Beth Dean

Mobile Planet by Luke Wroblewski

It’s the afternoon of day two of An Event Apart in Seattle. The mighty green one, Luke Wroblewski, is here to deliver a talk called Mobile Planet:

With 3.5 billion active smartphones on Earth, we’re now faced with the challenges and opportunities of designing planet-scale software. Through a data-informed, big-picture walk-through of our mobile planet, Luke will dig into how people use computing devices today and how the design of our products needs to adapt to this reality. He’ll cover key issues like app on-boarding and performance in enough detail to give you clear ways to improve first time and subsequent use of your mobile apps and sites.

Luke has been working on figuring out hardware and software for years. He looks at a lot of data. The more we understand how people use technology in their daily lives, the better we can design for them.

Earth is the third planet from the sun, and the only place that we know of that harbours life. Our population is at about 7.7 billion people. There are about 5.6 billion people in our addressable market (people over 14 years old). There are already 5 billion mobile subscribers in there. That’s interesting, but which of those devices are modern smartphones? There are about 3.6 billion active smartphones. Compare that to about 1.3 billion active personal computers—the vast majority of them Windows devices (about 1.2 billion). Over the next four or five years, we’ll have about 5 billion smartphone users and a global population of 8 billion.

The point is that we can reach a significant proportion of the human species. The diversity of our species makes it challenging to design for everyone.

Let’s take a closer look at these 3.6 billion active smartphones. About 25% of them are iOS devices. 75% of them are Android. Bear in mind that these are active devices—what’s actually being used. That’s different to shipping devices. Apple ships 15% of smartphones, and Android ships 85%, but the iOS devices tend to have longer lifespans (around 2 years for Android; around 4 years for iOS).

The UK has 82% smartphone penetration. Compare that to India, where it’s 27%. There’s room to grow.

Everywhere you look, the growth of these devices has led to a shift of digital things overtaking analogue. Shopping, advertising, music, you name it. We’ve seen enough of these transitions happen that we should be prepared for them.

So there are lots of smartphones, with basically two major operating systems. But how are people using these devices?

In the US, adults spend about 2.3 - 3.5 hours per day on their mobile devices. Let’s call it an even 3 hours. That’s a lot of time. Where does that time come from? Interestingly, as time spent on mobile devices has surged, time spent on other media has only slowly declined. So mobile is additive. It’s contributing to more time spent on the internet rather than taking it away from existing screen time.

Next question: what the hell are people doing during those 3 hours per day on smartphones? Native apps get about 169 minutes of time compared to only 11 minutes on the web. There are about 2 million native apps on Apple, and about 2 million native apps on Android. But although people have a lot of apps, they only use about half of them. Remember, folks: downloads do not equal usage. Most apps don’t make it past the first opening. Only a third make it past being opened ten times.

Because people spend so much time and energy on these apps, and given the abysmal abandonment rates, they start freaking out about “engagement.” So what do they reach for? Push notifications. Either that or onboarding.

Push notifications. The worst. I mean, they do succeed in getting your attention: push notifications do increase the amount of time spent in your app …but there’s a human cost.

Let’s look at app onboarding. Take Flickr, for example. It walks through some of the features and benefits of the service. But it doesn’t actually help you much. It’s a list of marketing slogans. So why do people reach for onboarding?

If you just drop people into an interface and talk to them about it, they’ll say things like “I don’t know what to do. I’m lost.” The Intuit team heard this from people using their app. They reached for onboarding to solve the problem. They created guided tutorials and intro tours. Turns out that nobody would read these screens and everyone would try to skip them. What the hell, people!?

So they tried in-context help, with a cute cartoon robot to explain the features, or Einstein’s equations scribbled over the interface. When they tested this, people responded with “Please make it stop.”

They decided to try something simpler: one tip that calls out a good first step. That worked.

Vevo used to have an intro tour. Most people were swiping through without reading. They experimented with not running the tour. They got a 10% increase in log-ins and a 6% increase in sign-ups.

Vevo got rid of their tour, but left the sign-in/registration step. You can’t remove that, right?

Well, Hotel Tonight experimented with not doing registration. Signing up was confusing people—it’s Hotels Tonight, not Accounts Today. When they got rid of accounts, they saw a 15% increase in conversions.

Ruthlessly edit.

Google Photos used to have an in-depth on-boarding experience. First they got rid of the animation. Then the start-up screen. Then the animated tutorial. Each time they removed something, conversion went up. All that was left from the original onboarding was a half screen with one option to turn on auto-backup.

Get to your product value as fast as possible. Of course that requires you to know what your core value is. And that’s not easy to figure out.

Google Maps went through a similar reduction, removing intro screens and explanations. Now they just drop you into the map.

It’s not “get rid of everything”. It’s “get rid of everything that gets in the way of the core user action.”

Going back to the Intuit example, that’s exactly what they did in the end. That one initial tip was for the core action.

But it’s worth discussing how to present this kind of thing. If you have to overlay a tooltip for an important UI feature, maybe that UI feature should have a clearer affordance. People treat overlays as annoyances. People ignore or dismiss overlays when they’re focused on a task. It’s like an instinct to get rid of them. So if you put something useful or valuable there, it’s gone.

The core part of your application should feel like the core part of your application. It’s tough because stakeholders want to make things “pop.” We throw contrast, colour, and animation at things. But when something sticks out from the UI, people ignore it. Integrate the core action into the product UI. When elements feel foreign to a product UI, they are at best ignored, or at worst dismissed.

This is why cohesive design matters. It’s not about consistency. It’s about feeling integrated. In many cases, consistency can be counter-productive.

Some principles for successful onboarding:

  1. Get to the product value as fast as possible. Grubhub needs your address. Pinterest needs your interests.
  2. Get rid of everything that doesn’t lead to that product value. Ruthlessly edit. Remove all friction that distracts the user from experiencing product value.
  3. Don’t be afraid to educate contextually. But do so with integrated UI.

Luke talked a lot about what’s happening in mobile apps, and mentioned that the mobile web only gets 11 minutes to the native’s 169. But let’s dive into this, because people sometimes think that a “mobile strategy” comes down to picking between these two. 50% of those 169 minutes are spent in your most used app (Facebook). 78% of the time is spent in the top 5 apps. Now the mobile web doesn’t look so bad. It turns out you can get people to a mobile web experience much, much faster than to a native one. The audience size is much, much, much higher on the web (although people will do more in a dedicated native app). So strategically both are useful—the web can attract people to native.

Back to our planet, and those 3 hours of usage on smartphones every day. People unlock their phones around 80 times a day. People sleep for about 8 hours, which leaves 16 waking hours, so on average you’re unlocking your phone once every 12 waking minutes. Given this frequency, it’s unsurprising that most sessions are very short—most under 30 seconds.

Given that, if things are slow, you’re going to really, really, really hate it. Waiting for slow pages to load is what really pisses people off.

The cognitive load and stress of waiting for slow pages is worse than waiting in line in a store, or watching a horror movie. That’s an industry that’s all about stressing people out by design! But experiencing mobile delays is more stressful! Probably because people aren’t watching horror movies every 12 minutes.

Because mobile delays are such a big deal, many mobile apps reach for loading spinners. But Luke saw that adding a spinner to his product increased complaints of slow loading times. Of course! The spinner is explicitly telling people, “Hey, we’re slow.”

So they switched to skeleton screens. These make it feel like something is always happening. Focus on the progress, not the progress indicator. Occupied time feels shorter than unoccupied time.

A lot of people have implemented skeleton screens, but without the progressive loading. Swapping out a skeleton screen for a completely different UI all at once doesn’t help. The skeleton screens should represent the real content.

This is a lot of work: figuring out how to prioritise what to load first. Luke isn’t talking about the technical side here, but the user’s experience. Investing in getting this right makes a lot of sense.

Let’s look a little closer at this number: people interacting with their phones 80 times a day. The average user touches the device 2,617 times a day. A power user touches the device over 5,000 times a day. Most touches are within one app.

90% of touches involve just one thumb. Young people tend to operate their phones with one hand; for older people, it’s more like 60%.

This is why your interface targets need to work for the thumb.

On phones, 90% of the time you’re dealing with portrait mode. Things at the top of the screen on larger devices are hard to reach. Core actions gravitate to the bottom of the screen.

Opera Touch is a new browser designed specifically for one-handed use. The Palm Pre’s WebOS was also about one-hand usage. Now that’s how iOS and Android work: swiping up from the bottom.

So mobile usage is:

  • One-handed/thumb.
  • In portrait mode on large screens.
  • Design accordingly.

What’s next? What do we need to be aware of so we don’t get caught with our pants down?

We can use the product lifecycle chart to figure this out. There’s an emergent phase, then a growth phase, then consolidation in a mature market, and then that gets disrupted and becomes a declining market.

  • Mobile devices—hand computers—are in a mature consolidated market.
  • Desktop and laptop computers are in a declining market.
  • Wrist computers and voice computers are in a growth market.

Small screens get used more frequently, but for shorter periods of time than large screens. Wrist and voice computers are figuring out what their core offerings are.

In the emergent category, it’s all about exploration. We have no idea how things will turn out. We just don’t know. But we do know that we are now designing for lots and lots of different devices.

For today, though, focusing on mobile is still a pretty good idea.

To summarise:

  • It’s a mobile planet.
  • Understanding real world usage helps you design.
  • Prep for what’s next.