Tony Stark’s Web Browser…

--

Hey… Did you know that the browser you’re using right now has APIs that let it talk to you? I mean, I’m not 100% positive about your browser in particular, but if it’s a modern one it’s almost certainly true. These APIs are more widely supported than a whole bunch of stuff we’re all using already and, actually, have been supported for longer than probably anything you’re excited about today. There isn’t an “official standard” for them, but they are pretty nearly standard in the de facto sense (lots of stuff started out like this way back). Further, part of the same proposal deals with allowing you to talk to your browser. Holy shit. That bit is less widely implemented, but it is available in release versions of Chrome and its derivatives, so a shit ton of people can still play around with it.
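If you want to hear it for yourself, here is a minimal sketch of the "browser talks to you" half. It leans on the `speechSynthesis` object and the `SpeechSynthesisUtterance` constructor that browsers expose; the `say` helper, the `SpeechEnv` shape and the small interfaces are hypothetical names made up for this illustration, and the rate and pitch values are arbitrary defaults rather than recommendations.

```typescript
// Minimal structural types so the sketch compiles without DOM typings.
// These interface names are hypothetical; only the members we touch are listed.
interface UtteranceLike {
  text: string;
  rate: number;
  pitch: number;
}

interface SynthLike {
  speak(utterance: UtteranceLike): void;
}

interface SpeechEnv {
  speechSynthesis?: SynthLike;
  SpeechSynthesisUtterance?: new (text: string) => UtteranceLike;
}

// Hypothetical helper: queue `text` to be spoken if the environment supports it.
// Returns true when an utterance was handed to the synthesizer, false otherwise.
function say(
  text: string,
  env: SpeechEnv = globalThis as unknown as SpeechEnv
): boolean {
  const synth = env.speechSynthesis;
  const Utterance = env.SpeechSynthesisUtterance;

  // Feature-detect first: not every environment has speech synthesis.
  if (!synth || !Utterance) {
    return false;
  }

  const utterance = new Utterance(text);
  utterance.rate = 1; // normal speed (assumed default for this sketch)
  utterance.pitch = 1; // normal pitch (assumed default for this sketch)
  synth.speak(utterance);
  return true;
}
```

In a browser console, `say("Greetings, Professor Falken")` should either speak or quietly return `false`. Passing the environment in as a parameter is just a convenience so the same function can be exercised outside a browser with a stand-in object.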

Over the past few months I’ve been toying around with this a lot in my spare time, and I have written several pieces on my own blog. You might have seen that a few weeks back on Smashing Magazine, Tomomi Imura beat me to the punch with a very nice piece on how you can basically start building your very own Jarvis (or Friday if you prefer). It was so good that it caused me to go back and rethink what I was going to post to avoid overlap… So here are my pieces:

The History and State of Speech introduces a brief history of us trying to make machines that speak and listen and explains the efforts to bring these abilities into the browser starting in 1997!

Greetings, Professor Falken introduces and walks you through the Text-To-Speech bits of the APIs and all of their wrinkles and warts. It’s full of fun code samples and runnable demos.

You Don’t Say discusses what I see as design problems with the existing APIs. It covers how and why I’m papering over them myself as I attempt to figure out how I think this really should be explained in Extensible Web form. It also has some fun code samples and demos to illustrate.

Listen Up introduces and walks you through the Voice Recognition bits and also discusses their warts. They’re a little harder to demo, but there are a number of code samples to explain them.

Thoughts on Voice Recognition discusses the design problems in those APIs in similar fashion to the one about TTS. It has lots of hypothetical code examples I’d like to see us make work and explains why.
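For a quick taste of the listening half before diving into those pieces, here is a rough sketch of one-shot recognition. It assumes the `SpeechRecognition` constructor (exposed in Chrome under the prefixed name `webkitSpeechRecognition`) and its `lang`, `interimResults`, `onresult` and `start()` members; the `listenOnce` helper and the interface names are hypothetical, and `"en-US"` is only an example language tag. In a real page the browser will also ask the user for microphone permission.

```typescript
// Minimal structural types so the sketch compiles without DOM typings.
// These interface names are hypothetical; only the members we touch are listed.
interface ResultEventLike {
  results: ArrayLike<ArrayLike<{ transcript: string }>>;
}

interface RecognitionLike {
  lang: string;
  interimResults: boolean;
  onresult: ((event: ResultEventLike) => void) | null;
  start(): void;
}

interface RecognitionEnv {
  SpeechRecognition?: new () => RecognitionLike;
  webkitSpeechRecognition?: new () => RecognitionLike;
}

// Hypothetical helper: listen for one phrase and pass its transcript to `onHeard`.
// Returns true if recognition was started, false if it is unsupported here.
function listenOnce(
  onHeard: (transcript: string) => void,
  env: RecognitionEnv = globalThis as unknown as RecognitionEnv
): boolean {
  // Prefer the unprefixed constructor, fall back to the prefixed one.
  const Recognition = env.SpeechRecognition ?? env.webkitSpeechRecognition;
  if (!Recognition) {
    return false;
  }

  const recognition = new Recognition();
  recognition.lang = "en-US"; // example language tag (an assumption)
  recognition.interimResults = false; // only report final results

  recognition.onresult = (event) => {
    // First alternative of the first result is the best guess.
    onHeard(event.results[0][0].transcript);
  };

  recognition.start();
  return true;
}
```

Something like `listenOnce((words) => console.log(words))` is enough to start experimenting; error handling, continuous listening and the design problems discussed in the pieces above are all deliberately left out.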

If you’re not into all that reading but you still find this interesting, I gave two talks and led some town hall style discussion/Q&A in which I attempted to be moderately entertaining/engaging. You can check those out and let me know what you think: there’s video of the one on TTS along with my slides, and video of the one on Voice Recognition along with my slides for that too (this one includes many beautiful pictures of world famous fashion model/trend-setter Bruce Lawson).

For the next few weeks we’re going to try to get together virtually in the #chapters channel on Code & Supply’s Slack between 6pm and 8pm EST to build some things, play with the APIs, and talk about problems and challenges. Join us there if you’d like.

I’d like to talk about restarting some standards efforts on this with feedback from developers. What do you think? Interesting?

--
