Journal tags: descript

3

Alt writing

I made the website for this year’s UX London by hand.

Well, that’s not entirely true. There’s exactly one build tool involved. I’m using Sergey to include global elements—the header and footer—something that’s still not possible in HTML.

So it’s minium viable static site generation rather than actual static files. It’s still very hands-on though and I enjoy that a lot; editing HTML and CSS directly without intermediary tools.

When I update the site, it’s usually to add a new speaker to the line-up (well, not any more now that the line up is complete). That involves marking up their bio and talk description. I also create a couple of different sized versions of their headshot to use with srcset. And of course I write an alt attribute to accompany that image.

By the way, Jake has an excellent article on writing alt text that uses the specific example of a conference site. It raises some very thought-provoking questions.

I enjoy writing alt text. I recently described how I updated my posting interface here on my own site to put a textarea for alt text front and centre for my notes with photos. Since then I’ve been enjoying the creative challenge of writing useful—but also evocative—alt text.

Some recent examples:

Time to go play some songs with @SalterCane.
A close-up of a microphone in a practice room. In the background, a guitar player tunes up and a bass player waits to start.
Brighton in the sun.
People sitting around in the dappled sunshine on the green grass in a park with the distinctive Indian-inspired architecture of the Brighton Pavilion in the background, all under a clear blue sky.
Duck leg on white beans with sage, garlic, rosemary and olives.
Looking down on the crispy browned duck leg contrasting with the white beans, all with pieces of green fried herbs scattered throughout.

But when I was writing the alt text for the headshots on the UX London site, I started to feel a little disheartened. The more speakers were added to the line-up, the more I felt like I was repeating myself with the alt text. After a while they all seemed to be some variation on “This person looking at the camera, smiling” with maybe some detail on their hair or clothing.

Videha Sharma
The beaming bearded face of Videha standing in front of the beautiful landscape of a riverbank.
Candi Williams
Candi working on her laptop, looking at the camera with a smile.
Emma Parnell
Emma smiling against a yellow background. She’s wearing glasses and has long straight hair.
John Bevan
A monochrome portrait of John with a wry smile on his face, wearing a black turtleneck in the clichéd design tradition.
Laura Yarrow
Laura smiling, wearing a chartreuse coloured top.
Adekunle Oduye
A profile shot of Adekunle wearing a jacket and baseball cap standing outside.

The more speakers were added to the line-up, the harder I found it not to repeat myself. I wondered if this was all going to sound very same-y to anyone hearing them read aloud.

But then I realised, “Wait …these are kind of same-y images.”

By the very nature of the images—headshots of speakers—there wasn’t ever going to be that much visual variation. The experience of a sighted person looking at a page full of speakers is that after a while the images kind of blend together. So if the alt text also starts to sound a bit repetitive after a while, maybe that’s not such a bad thing. A screen reader user would be getting an equivalent experience.

That doesn’t mean it’s okay to have the same alt text for each image—they are all still different. But after I had that realisation I stopped being too hard on myself if I couldn’t come up with a completely new and original way to write the alt text.

And, I remind myself, writing alt text is like any other kind of writing. The more you do it, the better you get.

In the zone

I went to art college in my younger days. It didn’t take. I wasn’t very good and I didn’t work hard. So I dropped out before they could kick me out.

But I remember one instance where I actually ended up putting in more work than my fellow students—an exceptional situation.

In the first year of art college, we did a foundation course. That’s when you try a bit of everything to help you figure out what you want to concentrate on: painting, sculpture, ceramics, printing, photography, and so on. It was a bit of a whirlwind, which was generally a good thing. If you realised you really didn’t like a subject, you didn’t have to stick it out for long.

One of those subjects was animation—a relatively recent addition to the roster. On the first day, the tutor gave everyone a pack of typing paper: 500 sheets of A4. We were told to use them to make a piece of animation. Put something on the first piece of paper. Take a picture. Now put something slightly different on the second piece of paper. Take a picture of that. Repeat another 498 times. At 24 frames a second, the result would be just over 20 seconds of animation. No computers, no mobile phones. Everything by hand. It was so tedious.

And I loved it. I ended up asking for more paper.

(Actually, this was another reason why I ended up dropping out. I really, really enjoyed animation but I wasn’t able to major in it—I could only take it as a minor.)

I remember getting totally absorbed in the production. It was the perfect mix of tedium and creativity. My mind was simultaneously occupied and wandering free.

Recently I’ve been re-experiencing that same feeling. This time, it’s not in the world of visuals, but of audio. I’m working on season two of the Clearleft podcast.

For both seasons and episodes, this is what the process looks like:

Decide on topics. This will come from a mix of talking to Alex, discussing work with my colleagues, and gut feelings about what might be interesting.
Gather material. This involves arranging interviews with people; sometimes co-workers, sometimes peers in the wider industry. I also trawl through the archives of talks from Clearleft conferences for relevent presentations.
Assemble the material. This is where I’m chipping away at the marble of audio interviews to get at the nuggets within. I play around with the flow of themes, trying different juxtapositions and narrative structures.
Tie everything together. I add my own voice to introduce the topic and segue from point to point.
Release. I upload the audio, update the RSS feed, and publish the transcript.

Lots of podcasts (that I really enjoy) stop at step two: record a conversation and then release it verbatim. Job done.

Being a glutton for punishment, I wanted to do more of an amalgamation for each episode, weaving multiple conversations together.

Right now I’m in step three. That’s where I’ve found the same sweet spot that I had back in my art college days. It’s somewhat mindless work, snipping audio waveforms and adjusting volume levels. At the same time, there’s the creativity of putting those audio snippets into a logical order. I find myself getting into the zone, losing track of time. It’s the same kind of flow state you get from just the right level of coding or design work. Normally this kind of work lends itself to having some background music, but that’s not an option with podcast editing. I’ve got my headphones on, but my ears are busy.

I imagine that is what life is like for an audio engineer or producer.

When I first started the Clearleft podcast, I thought I would need to use GarageBand for this work, arranging multiple tracks on a timeline. Then I discovered Descript. It’s been an enormous time-saver. It’s like having GarageBand and a text editor merged into one. I can see the narrative flow as a text document, as well as looking at the accompanying waveforms.

Descript isn’t perfect. The transcription accuracy is good enough to allow me to search through my corpus of material, but it’s not accurate enough to publish as is. Still, it gives me some nice shortcuts. I can elimate ums and ahs in one stroke, or shorten any gaps that are too long.

But even with all those conveniences, this is still time-consuming work. If I spend three or four hours with my head down sculpting some audio and I get anything close to five minutes worth of usable content, I consider it time well spent.

Sometimes when I’m knee-deep in a piece of audio, trimming and arranging it just so to make a sentence flow just right, there’s a voice in the back of my head that says, “You know that no one is ever going to notice any of this, don’t you?” I try to ignore that voice. I mean, I know the voice is right, but I still think it’s worth doing all this fine tuning. Even if nobody else knows, I’ll have the satisfaction of transforming the raw audio into something a bit more polished.

If you aren’t already subscribed to the RSS feed of the Clearleft podcast, I recommend adding it now. New episodes will start showing up …sometime soon.

Yes, I’m being a little vague on the exact dates. That’s because I’m still in the process of putting the episodes together.

So if you’ll excuse me, I need to put my headphones on and enter the zone.

Web standards, dictionaries, and design systems

Years ago, the world of web standards was split. Two groups—the W3C and the WHATWG—were working on the next iteration of HTML. They had different ideas about the nature of standardisation.

Broadly speaking, the W3C followed a specification-first approach. Figure out what should be implemented first and foremost. From this perspective, specs can be seen as blueprints for browsers to work from.

The WHATWG, by contrast, were implementation led. The way they saw it, there was no point specifying something if browsers weren’t going to implement it. Instead, specs are there to document existing behaviour in browsers.

I’m over-generalising somewhat in my descriptions there, but the point is that there was an ideological difference of opinion around what standards bodies should do.

This always reminded me of a similar ideological conflict when it comes to language usage.

Language prescriptivists attempt to define rules about what’s right or right or wrong in a language. Rules like “never end a sentence with a preposition.” Prescriptivists are generally fighting a losing battle and spend most of their time bemoaning the decline of their language because people aren’t following the rules.

Language descriptivists work the exact opposite way. They see their job as documenting existing language usage instead of defining it. Lexicographers—like Merriam-Webster or the Oxford English Dictionary—receive complaints from angry prescriptivists when dictionaries document usage like “literally” meaning “figuratively”.

Dictionaries are descriptive, not prescriptive.

I’ve seen the prescriptive/descriptive divide somewhere else too. I’ve seen it in the world of design systems.

Jordan Moore talks about intentional and emergent design systems:

There appears to be two competing approaches in designing design systems.

An intentional design system. The flavour and framework may vary, but the approach generally consists of: design system first → design/build solutions.

An emergent design system. This approach is much closer to the user needs end of the scale by beginning with creative solutions before deriving patterns and systems (i.e the system emerges from real, coded scenarios).

An intentional design system is prescriptive. An emergent design system is descriptive.

I think we can learn from the worlds of web standards and dictionaries here. A prescriptive approach might give you a beautiful design system, but if it doesn’t reflect the actual product, it’s fiction. A descriptive approach might give a design system with imperfections and annoying flaws, but at least it will be accurate.

I think it’s more important for a design system to be accurate than beautiful.

As Matthew Ström says, you should start with the design system you already have:

Instead of drawing a whole new set of components, start with the components you already have in production. Document them meticulously. Create a single source of truth for design, warts and all.