Journal tags: abc

4

Speedy tunes

Performance is a high priority for me with The Session. It needs to work for people all over the world using all kinds of devices.

My main strategy for ensuring good performance is to diligently apply progressive enhancement. The core content is available to any device that can render HTML.

To keep things performant, I’ve avoided as many assets (or, more accurately, liabilities) as possible. No uneccessary images. No superfluous JavaScript libraries. Not even any web fonts (gasp!). And definitely no third-party resources.

The pay-off is a speedy site. If you want to see lab data, run a page from The Session through lighthouse. To see field data, take a look at data from Chrome UX Report (Crux).

But the devil is in the details. Even though most pages on The Session are speedy, the outliers have bothered me for a while.

Take a typical tune page on the site. The data is delivered from the server as HTML, which loads nice and quick. That data includes the notes for the tune settings, written in ABC notation, a nice lightweight text format.

Then the enhancement happens. Using Paul Rosen’s brilliant abcjs JavaScript library, those ABCs are converted into SVG sheetmusic.

So on tune pages there’s an additional download for that JavaScript library. That’s not so bad though—I’m using a service worker to cache that file so there’ll only ever be one initial network request.

If a tune has just a few different versions, the page remains nice and zippy. But if a tune has lots of settings, the computation starts to add up. Converting all those settings from ABC to SVG starts to take a cumulative toll on the main thread.

I pondered ways to avoid that conversion step. Was there some way of pre-generating the SVGs on the server rather than doing it all on the client?

In theory, yes. I could spin up a headless browser, run the JavaScript and take a snapshot. But that’s a bit beyond my backend programming skills, so I’ve taken a slightly different approach.

The first time anyone hits a tune page, the ABCs getting converted to SVGs as usual. But now there’s one additional step. I grab the generated markup and send it as an Ajax payload to an endpoint on my server. That endpoint stores the sheetmusic as a file in a cache.

Next time someone hits that page, there’s a server-side check to see if the sheetmusic has been cached. If it has, send that down the wire embedded directly in the HTML.

The idea is that over time, most of the sheetmusic on the site will transition from being generated in the browser to being stored on the server.

So far it’s working out well.

Take a really popular tune like The Lark In The Morning. There are twenty settings, and each one has four parts. Previously that would have resulted in a few seconds of parsing and rendering time on the main thread. Now everything is delivered up-front.

I’m not out of the woods. A page like that with lots of sheetmusic and plenty of comments is going to have a hefty page weight and a large DOM size. I’ve still got a fair bit of main-thread work happening, but now the bulk of it is style and layout, whereas previously I had the JavaScript overhead on top of that.

I’ll keep working on it. But overall, the speed improvement is great. A typical tune page is now very speedy indeed.

It’s like a microcosm of web performance in general: respect your users’ time, network connection and battery life. If that means shifting the work from the browser to the server, do it!

Audio

I spent the last couple of weekends rolling out a new feature on The Session. It involves playing audio in a web page. No big deal these days, right? But the history involves some old file formats…

The first venerable format is ABC notation. File extension: .abc, mime type: text/vnd.abc. It’s an ingenious text format for musical notation using ASCII. The metadata of the piece of music is defined in JSON-like key/value pairs. Then the contents are encoded with letters: A, B, C, etc. Uppercase and lowercase denote different octaves. Numbers can be used for note lengths.

The format was created by Chris Walshaw in 1997 when dial-up was the norm. With ABC, people were able to swap tunes on email lists or bulletin boards without transferring weighty image or sound files. If you had ABC software on your computer, you could convert that lightweight text file into sheet music …or audio.

That brings me to the second old format: midi files. File extension: .mid, mime-type: audio/midi. Like ABC, it’s a lightweight format for encoding the instructions for music instead of the music itself.

Think of it like SVG: instead of storing the final pixels of an image, SVG stores the instructions for drawing the image instead. The instructions in a midi file are like “play this note for this long on this instrument.” Again, as with ABC, you need some software to turn the instructions into sound.

There was a time when lots of software could play midi files. Quicktime on the Mac, for example. You could even embed midi files in web pages. I mean literally embed them …with the embed element. No Geocities page was complete without an autoplaying midi file.

On The Session, people submit tunes in ABC format. Then, using the amazing ABCJS JavaScript library, the ABC is turned into SVG on the fly! For years I’ve also offered midi files, generated on the server from the ABC notation.

But times have changed. These days it’s hard to find software that plays midi files. Quicktime doesn’t do it anymore. And you’d need to go to the app store on iOS to find a midi file player. It’s time to phase out the midi files on The Session.

I still want to provide automatically-generated audio though. Fortunately ABCJS gives me a way to do this. But instead of using the old technology of midi files, it uses a more modern browser feature: the Web Audio API.

The end result sounds like a midi file, but the underlying technique is more like a synthesiser. There’s a separate mp3 file for each note. The JavaScript figures out how long each “sample” needs to be played for, strings them all together, and outputs them with Web Audio. So you’ve got cutting-edge browser technology recreating a much older file format. Paul Rosen—the creator of ABCJS—has a presentation explaining how it all works under the hood.

Not only is there a separate short mp3 file for each note in seven octaves, but if you want the sound of a different instrument, you need samples for all seven octaves in that instrument. They’re called soundfonts.

Paul provides soundfonts for ABCJS. It’s a repo that was forked from this repo from Benjamin Gleitzman. And here’s where it gets small worldy…

The reason why Benjamin has a repo of soundfonts is because he needed to create midi-like audio in the browser. He wanted to do this for a project on September 28th and 29th, 2013 …at Science Hack Day San Francisco!

I was there too—working on my own audio-related hack—and I remember the excellent (and winning) hack that Benjamin worked on. It was called Symphony of Satellites and it’s still online along with the promo video. Here’s Benjamin’s post-hackday write-up from seven years ago.

It’s rare that the worlds of the web and Irish music cross over. When I got to meet Paul—creator of ABCJS—at a web conference a couple of years ago it kind of blew my mind. Last weekend when I set out to dabble with a feature on The Session, I certainly didn’t expect to stumble on a connection to Science Hack Day! (Aside: the first Science Hack Day was ten years ago—yowzers!)

Anyway, I was able to get that audio playback working on The Session. Except for some weirdness on iOS that I had to fix. But that’s a hack for another day.

Teaching in Porto, day five

For the final day of the week-long masterclass, I had no agenda. This was a time for the students to work on their own projects, but I was there to answer any remaining questions they might have.

As I suspected, the people with the most interest and experience in development were the ones with plenty of questions. I was more than happy to answer them. With no specific schedule for the day, we were free to merrily go chasing down rabbit holes.

SVG? Sure, I’d be happy to talk about that. More JavaScript? My pleasure! Databases? Not really my area of expertise, but I’m more than willing to share what I know.

It was a fun day. The centrepiece was a most excellent lunch across the river at a really traditional seafood place.

At the very end of the day, after everyone else had gone, I sat down with Tiago to discuss how the week went. Overall, I was happy. I was nervous going into this masterclass—I had never done a whole week of teaching—but based on the feedback I got, I think I did okay. There were times when I got impatient, and I wish I could turn back the clock and erase those moments. I noticed that those moments tended to occur when it was time for hands-on-keyboards coding: “no, not like that—like this!” I need to get better at handling those situations. But when we working on paper, or having stand-up discussions, or when I was just geeking out on a particular topic, everything felt quite positive.

All in all, this week has been a great experience. I know it sounds like a cliché, but I felt it was a real honour and a privilege to be involved with the New Digital School. I’ve enjoyed doing hands-on teaching, and I’d like to do more of it.

ABC

When I finally unveiled the redesigned and overhauled version of The Session at the end of 2012, it was the culmination of a lot of late nights and weekends. It was also a really great learning experience, one that I subsequently drew on to inform my An Event Apart presentation, The Long Web.

As part of that presentation, I give a little backstory on the ABC format. It’s a way of notating music using nothing more than ASCII text. It begins with some JSON-like metadata about the tune—its title, time signature, and key—followed by the notes of the tune itself—uppercase and lowercase letters denote different octaves, and numbers can be used to denote length:

X: 1
T: Cooley's
R: reel
M: 4/4
L: 1/8
K: Edor
|:D2|EBBA B2 EB|B2 AB dBAG|FDAD BDAD|FDAD dAFD|
EBBA B2 EB|B2 AB defg|afec dBAF|DEFD E2:|
|:gf|eB B2 efge|eB B2 gedB|A2 FA DAFA|A2 FA defg|
eB B2 eBgB|eB B2 defg|afec dBAF|DEFD E2:|

On The Session, a little bit of progressive enhancement produces a nice crisp SVG version of the sheet music at the user’s request (the non-JavaScript fallback is a server-rendered bitmap of the sheet music).

ABC notation dates back to the early nineties, a time of very limited bandwidth. Exchanging audio files or even images would have been prohibitively expensive. Having software installed on your machine that could convert ABC into sheet music or audio meant that people could share and exchange tunes through email, BBS, or even the then-fledgling World Wide Web.

In today’s world of relatively fast connections, ABC’s usefulness might seemed lessened. But in fact, it’s just as popular as it ever was. People have become used to writing (and even sight-reading) the format, and it has all the resilience that comes with being a text format; easily editable, and human-readable. It’s still the format that people use to submit new tune settings to The Session.

A little while back, I came upon another advantage of the ABC format, one that I had never previously thought of…

The Session has a wide range of users, of all ages, from all over the world, from all walks of life, using all sorts of browsers. I do my best to make sure that the site works for just about any kind of user-agent (while still providing plenty of enhancements for the most modern browsers). That includes screen readers. Some active members of The Session happen to be blind.

One of those screen-reader users got in touch with me shortly after joining to ask me to explain what ABC was all about. I pointed them at some explanatory links. Once the format “clicked” with them, they got quite enthused. They pointed out that if the sheet music were only available as an image, it would mean very little to them. But by providing the ABC notation alongside the sheet music, they could read the music note-for-note.

That’s when it struck me that ABC notation is effectively alt text for sheet music!

There’s one little thing that slightly irks me though. The ABC notation should be read out one letter at a time. But screen readers use a kind of fuzzy logic to figure out whether a set of characters should be spoken as a word:

Screen readers try to pronounce acronyms and nonsensical words if they have sufficient vowels/consonants to be pronounceable; otherwise, they spell out the letters. For example, NASA is pronounced as a word, whereas NSF is pronounced as “N. S. F.” The acronym URL is pronounced “earl,” even though most humans say “U. R. L.” The acronym SQL is not pronounced “sequel” by screen readers even though some humans pronounce it that way; screen readers say “S. Q. L.”

It’s not a big deal, and the screen reader user can explicitly request that a word be spoken letter by letter:

Screen reader users can pause if they didn’t understand a word, and go back to listen to it; they can even have the screen reader read words letter by letter. When reading words letter by letter, JAWS distinguishes between upper case and lower case letters by shouting/emphasizing the upper case letters.

But still …I wish there were some way that I could mark up the ABC notation so that a screen reader would know that it should be read letter by letter. I’ve looked into using abbr, but that offers no guarantees: if the string looks like a word, it will still be spoken as a word. It doesn’t look there’s any ARIA settings for this use-case either.

So if any accessibility experts out there know of something I’m missing, please let me know.

Update: I’ve added an aural CSS declaration of speak: spell-out (thanks to Martijn van der Ven for the tip), although I think the browser support is still pretty non-existent. Any other ideas?