Journal tags: information

6

Directory enquiries

I was having a discussion with some of my peers a little while back. We were collectively commenting on the state of education and documentation for front-end development.

A lot of the old stalwarts have fallen by the wayside of late. CSS Tricks hasn’t been the same since it got bought out by Digital Ocean. A List Apart goes through fallow periods. Even the Mozilla Developer Network is looking to squander its trust by adding inaccurate “content” generated by a large language model.

The most obvious solution is to start up a brand new resource for front-end developers. But there are two probems with that:

It’s really, really, really hard work, and
It feels a bit 927.

I actually think there are plenty of good articles and resources on front-end development being published. But they’re not being published in any one specific place. People are publishing them on their own websites.

Ahmed, Josh, Stephanie, Andy, Lea, Rachel, Robin, Michelle …I could go on, but you get the picture.

All this wonderful stuff is distributed across the web. If you have a well-stocked RSS reader, you’re all set. But if you’re new to front-end development, how do you know where to find this stuff? I don’t think you can rely on search, unless you have a taste for slop.

I think the solution lies not with some hand-wavey “AI” algorithm that burns a forest for every query. I think the solution lies with human curation.

I take inspiration from Phil’s fantastic project, ooh.directory. Imagine taking that idea of categorisation and applying it to front-end dev resources.

Whether it’s a post on web.dev, Smashing Magazine, or someone’s personal site, it could be included and categorised appropriately.

Now, there would still be a lot of work involved, especially in listing and categorising the articles that are already out there, but it wouldn’t be nearly as much work as trying to create those articles from scratch.

I don’t know what the categories should be. Does it make sense to have top-level categories for HTML, CSS, and JavaScript, with sub-directories within them? Or does it make more sense to categorise by topics like accessibility, animation, and so on?

And this being the web, there’s no reason why one article couldn’t be tagged to simultaneously live in multiple categories.

There’s plenty of meaty information architecture work to be done. And there’d be no shortage of ongoing work to handle new submissions.

A stretch goal could be the creation of “playlists” of hand-picked articles. “Want to get started with CSS grid layout? Read that article over there, watch this YouTube video, and study this page on MDN.”

What do you think? Does this one-stop shop of hyperlinks sound like it would be useful? Does it sound feasible?

I’m just throwing this out there. I’d love it if someone were to run with it.

Labels

I love libraries. I think they’re one of humanity’s greatest inventions.

My local library here in Brighton is terrific. It’s well-stocked, it’s got a welcoming atmosphere, and it’s in a great location.

But it has an information architecture problem.

Like most libraries, it’s using the Dewey Decimal system. It’s not a great system, but every classification system is going to have flaws—wherever you draw boundaries, there will be disagreement.

The Dewey Decimal class of 900 is for history and geography. Within that class, those 100 numbers (900 to 999) are further subdivded in groups of 10. For example, everything from 940 to 949 is for the history of Europe.

Dewey Decimal number 941 is for the history of the British Isles. The term “British Isles” is a geographical designation. It’s not a good geographical designation, but technically it’s not a political term. So it’s actually pretty smart to use a geographical rather than a political term for categorisation: geology moves a lot slower than politics.

But the Brighton Library is using the wrong label for their shelves. Everything under 941 is labelled “British History.”

The island of Ireland is part of the British Isles.

The Republic of Ireland is most definitely not part of Britain.

Seeing books about the history of Ireland, including post-colonial history, on a shelf labelled “British History” is …not good. Frankly, it’s offensive.

(I mentioned this situation to an English friend of mine, who said “Well, Ireland was once part of the British Empire”, to which I responded that all the books in the library about India should also be filed under “British History” by that logic.)

Just to be clear, I’m not saying there’s a problem with the library using the Dewey Decimal system. I’m saying they’re technically not using the system. They’ve deviated from the system’s labels by treating “History of the British Isles” and “British History” as synonymous.

I spoke to the library manager. They told me to write an email. I’ve written an email. We’ll see what happens.

You might think I’m being overly pedantic. That’s fair. But the fact this is happening in a library in England adds to the problem. It’s not just technically incorrect, it’s culturally clueless.

Mind you, I have noticed that quite a few English people have a somewhat fuzzy idea about the Republic of Ireland. Like, they understand it’s a different country, but they think it’s a different country in the way that Scotland is a different country, or Wales is a different country. They don’t seem to grasp that Ireland is a different country like France is a different country or Germany is a different country.

It would be charming if not for, y’know, those centuries of subjugation, exploitation, and forced starvation.

British history.

Update: They fixed it!

Tweaking navigation labelling

I’ve always liked the idea that your website can be your API. Like, you’ve already got URLs to identify resources, so why not make that URL structure predictable and those resources parsable?

That’s why the (read-only) API for The Session doesn’t live at a separate subdomain. It uses the same URL structure as the regular site, but you can request the resources in an alternative format: JSON, XML, RSS.

This works out pretty well, mostly because I put a lot of thought into the URL structure of the site. I’m something of a URL fetishist, but I think that taking a URL-first approach to information architecture can be a good exercise.

Most of the resources on The Session involve nouns like tunes, events, discussions, and so on. There’s a consistent and predictable structure to the URLs for those sections:

/things
/things/new
/things/search

And then an idividual item can be found at:

things/ID

That’s all nice and predictable and the naming of the URLs matches what you’d expect to find:

Tunes, events, discussions, sessions. Those are all fine. But there’s one section of the site that has this root URL:

/recordings

When I was coming up with the URL structure twenty years ago, it was clear what you’d find there: track listings for albums of music. No one would’ve expected to find actual recordings of music available to listen to on-demand. The bandwidth constraints and technical limitations of the time made that clear.

Two decades on, the situation has changed. Now someone new to the site might well expect to hit a link called “recordings” and expect to hear actual recordings of music.

So I should probably change the label on the link. I don’t think “albums” is quite right—what even is an album any more? The word “discography” is probably the most appropriate label.

Here’s my dilemma: if I update the label, should I also update the URL structure?

Right now, the section of the site with /tunes URLs is labelled “tunes”. The section of the site with /events URLs is labelled “events”. Currently the section of the site with /recordings URLs is labelled “recordings”, but may soon be labelled “discography”.

If you click on “tunes”, you end up at /tunes. But if you click on “discography”, you end up at /recordings.

Is that okay? Am I the only one that would be bothered by that?

I could update the URLs to match the labelling (with redirects for the old URLs, of course), but I’m not so keen on this URL structure:

/discography
/discography/new
/discography/search
/discography/ID

It doesn’t seem as tidy as:

/recordings
/recordings/new
/recordings/search
/recordings/ID

But if I don’t update the URLs to match the label, then I’m just going to have to live with the mismatch.

I’m just thinking out loud here. I think I should definitely update the label. I just won’t make any decision on changing URLs for a while yet.

Directory enquiries

I was talking to someone recently about a forgotten battle in the history of the early web. It was a battle between search engines and directories.

These days, when the history of the web is told, a whole bunch of services get lumped into the category of “competitors who lost to Google search”: Altavista, Lycos, Ask Jeeves, Yahoo.

But Yahoo wasn’t a search engine, at least not in the same way that Google was. Yahoo was a directory with a search interface on top. You could find what you were looking for by typing or you could zero in on what you were looking for by drilling down through a directory structure.

Yahoo wasn’t the only directory. DMOZ was an open-source competitor. You can still experience it at DMOZlive.com:

The official DMOZ.com site was closed by AOL on February 17th 2017. DMOZ Live is committed to continuing to make the DMOZ Internet Directory available on the Internet.

Search engines put their money on computation, or to use today’s parlance, algorithms (or if you’re really shameless, AI). Directories put their money on humans. Good ol’ information architecture.

It turned out that computation scaled faster than humans. Search won out over directories.

Now an entire generation has been raised in the aftermath of this battle. Monica Chin wrote about how this generation views the world of information:

Catherine Garland, an astrophysicist, started seeing the problem in 2017. She was teaching an engineering course, and her students were using simulation software to model turbines for jet engines. She’d laid out the assignment clearly, but student after student was calling her over for help. They were all getting the same error message: The program couldn’t find their files.

Garland thought it would be an easy fix. She asked each student where they’d saved their project. Could they be on the desktop? Perhaps in the shared drive? But over and over, she was met with confusion. “What are you talking about?” multiple students inquired. Not only did they not know where their files were saved — they didn’t understand the question.

Gradually, Garland came to the same realization that many of her fellow educators have reached in the past four years: the concept of file folders and directories, essential to previous generations’ understanding of computers, is gibberish to many modern students.

Dr. Saavik Ford confirms:

We are finding a persistent issue with getting (undergrad, new to research) students to understand that a file/directory structure exists, and how it works. After a debrief meeting today we realized it’s at least partly generational.

We live in a world ordered only by search:

While some are quite adept at using labels, tags, and folders to manage their emails, others will claim that there’s no need to do because you can easily search for whatever you happen to need. Save it all and search for what you want to find. This is, roughly speaking, the hot mess approach to information management. And it appears to arise both because search makes it a good-enough approach to take and because the scale of information we’re trying to manage makes it feel impossible to do otherwise. Who’s got the time or patience?

There are still hold-outs. You can prise files from Scott Jenson’s cold dead hands.

More recently, Linus Lee points out what we’ve lost by giving up on directory structures:

Humans are much better at choosing between a few options than conjuring an answer from scratch. We’re also much better at incrementally approaching the right answer by pointing towards the right direction than nailing the right search term from the beginning. When it’s possible to take a “type in a query” kind of interface and make it more incrementally explorable, I think it’s almost always going to produce a more intuitive and powerful interface.

Directory structures still make sense to me (because I’m old) but I don’t have a problem with search. I do have a problem with systems that try to force me to search when I want to drill down into folders.

I have no idea what Google Drive and Dropbox are doing but I don’t like it. They make me feel like the opposite of a power user. Trying to find a file using their interfaces makes me feel like I’m trying to get a printer to work. Randomly press things until something happens.

Anyway. Enough fist-shaking from me. I’m going to ponder Linus’s closing words. Maybe defaulting to a search interface is a cop-out:

Text search boxes are easy to design and easy to add to apps. But I think their ease on developers may be leading us to ignore potential interface ideas that could let us discover better ideas, faster.

Defining the damn thang

Chris recently documented the results from his survey which asked:

Is it useful to distinguish between “web apps” and “web sites”?

His conclusion:

There is just nothing but questions, exemptions, and gray area.

This is something I wrote about a while back:

Like obscenity and brunch, web apps can be described but not defined.

The results of Chris’s poll are telling. The majority of people believe there is a difference between sites and apps …but nobody can agree on what it is. The comments make for interesting reading too. The more people chime in an attempt to define exactly what a “web app” is, the more it proves the point that the the term “web app” isn’t a useful word (in the sense that useful words should have an agreed-upon meaning).

Tyler Sticka makes a good point:

By this definition, web apps are just a subset of websites.

I like that. It avoids the false dichotomy that a product is either a site or an app.

But although it seems that the term “web app” can’t be defined, there are a lot of really smart people who still think it has some value.

Having spent years working on both apps & sites, I think there’s a significant difference. Hard to explain doesn’t mean non-existent.
— Cennydd Bowles (@Cennydd) December 6, 2013

I think Cennydd is right. I think the differences exist …but I also think we’re looking for those differences at the wrong scale. Rather than describing an entire product as either a website or an web app, I think it makes much more sense to distinguish between patterns.

Ok, I’ll try. An app’s primary material is behaviour. A website’s primary material is information.
— Cennydd Bowles (@Cennydd) December 6, 2013

Let’s take those two modifiers—behavioural and informational. But let’s apply them at the pattern level.

The “get stuff” sites that Jake describes will have a lot of informational patterns: how best to present a flow of text for reading, for example. Typography, contrast, whitespace; all of those attributes are important for an informational pattern.

The “do stuff” sites will probably have a lot of behavioural patterns: entering information or performing an action. Feedback, animation, speed; these are some of the possible attributes of a behavioural pattern.

But just about every product out there on the web contains a combination of both types of pattern. Like I said:

Is Wikipedia a website up until the point that I start editing an article? Are Twitter and Pinterest websites while I’m browsing through them but then flip into being web apps the moment that I post something?

Now you could make an arbitrary decision that any product with more than 50% informational patterns is a website, and any product with more than 50% behavioural patterns is a web app, but I don’t think that’s very useful.

Take a look at Brad’s collection of responsive patterns. Some of them are clearly informational (tables, images, etc.), while some of them are much more behavioural (carousels, notifications, etc.). But Brad doesn’t divide his collection into two, saying “Here are the patterns for websites” and “Here are the patterns for web apps.” That would be a dumb way to divide up his patterns, and I think it’s an equally dumb way to divide up the whole web.

What I’m getting at here is that, rather than trying to answer the question “what is a web app, anyway?”, I think it’s far more important to answer the other question I posed:

Why?

Why do you want to make that distinction? What benefit do you gain by arbitrarily dividing the entire web into two classes?

I think by making the distinction at the pattern level, that question starts to become a bit easier to answer. One possible answer is to do with the different skills involved.

For example, I know plenty of designers who are really, really good at informational patterns—they can lay out content in a beautiful, clear way. But they are less skilled when it comes to thinking through all the permutations involved in behavioural patterns—the “arrow of time” that’s part of so much interaction design. And vice-versa: a skilled interaction designer isn’t necessarily the best at old-skill knowledge of type, margins, and hierarchy. But both skillsets will be required on an almost every project on the web.

So I do believe there is value in distinguishing between behaviour and information …but I don’t believe there is value in trying to shoehorn entire products into just one of those categories. Making the distinction at the pattern level, though? That I can get behind.

Addendum

Incidentally, some of the respondents to Chris’s poll shared my feeling that the term “web app” was often used from a marketing perspective to make something sound more important and superior:

Perhaps it’s simply fashion. Perhaps “website” just sounds old-fashioned, and “web app” lends your product a more up-to-date, zingy feeling on par with the native apps available from the carefully-curated walled gardens of app stores.

Approaching things from the patterns perspective, I wonder if those same feelings of inferiority and superiority are driving the recent crop of behavioural patterns for informational content: parallaxy, snowfally, animation patterns are being applied on top of traditional informational patterns like hierarchy, measure, and art direction. I’m not sure that the juxtaposition is working that well. Taking the single interaction involved in long-form informational patterns (that interaction would be scrolling) and then using it as a trigger for all kinds of behavioural patterns feels …uncanny.

Battle for the planet of the APIs

Back in 2006, I gave a talk at dConstruct called The Joy Of API. It basically involved me geeking out for 45 minutes about how much fun you could have with APIs. This was the era of the mashup—taking data from different sources and scrunching them together to make something new and interesting. It was a good time to be a geek.

Anil Dash did an excellent job of describing that time period in his post The Web We Lost. It’s well worth a read—and his talk at The Berkman Istitute is well worth a listen. He described what the situation was like with APIs:

Five years ago, if you wanted to show content from one site or app on your own site or app, you could use a simple, documented format to do so, without requiring a business-development deal or contractual agreement between the sites. Thus, user experiences weren’t subject to the vagaries of the political battles between different companies, but instead were consistently based on the extensible architecture of the web itself.

Times have changed. These days, instead of seeing themselves as part of a wider web, online services see themselves as standalone entities.

So what happened?

Facebook happened.

I don’t mean that Facebook is the root of all evil. If anything, Facebook—a service that started out being based on exclusivity—has become more open over time. That’s the cause of many of its scandals; the mismatch in mental models that Facebook users have built up about how their data will be used versus Facebook’s plans to make that data more available.

No, I’m talking about Facebook as a role model; the template upon which new startups shape themselves.

In the web’s early days, AOL offered an alternative. “You don’t need that wild, chaotic lawless web”, it proclaimed. “We’ve got everything you need right here within our walled garden.”

Of course it didn’t work out for AOL. That proposition just didn’t scale, just like Yahoo’s initial model of maintaining a directory of websites just didn’t scale. The web grew so fast (and was so damn interesting) that no single company could possibly hope to compete with it. So companies stopped trying to compete with it. Instead they, quite rightly, saw themselves as being part of the web. That meant that they didn’t try to do everything. Instead, you built a service that did one thing really well—sharing photos, managing links, blogging—and if you needed to provide your users with some extra functionality, you used the best service available for that, usually through someone else’s API …just as you provided your API to them.

Then Facebook began to grow and grow. I remember the first time someone was showing me Facebook—it was Tantek of all people—I remember asking “But what is it for?” After all, Flickr was for photos, Delicious was for links, Dopplr was for travel. Facebook was for …everything …and nothing.

I just didn’t get it. It seemed crazy that a social network could grow so big just by offering …well, a big social network.

But it did grow. And grow. And grow. And suddenly the AOL business model didn’t seem so crazy anymore. It seemed ahead of its time.

Once Facebook had proven that it was possible to be the one-stop-shop for your user’s every need, that became the model to emulate. Startups stopped seeing themselves as just one part of a bigger web. Now they wanted to be the only service that their users would ever need …just like Facebook.

Seen from that perspective, the open flow of information via APIs—allowing data to flow porously between services—no longer seemed like such a good idea.

Not only have APIs been shut down—see, for example, Google’s shutdown of their Social Graph API—but even the simplest forms of representing structured data have been slashed and burned.

Twitter and Flickr used to markup their user profile pages with microformats. Your profile page would be marked up with hCard and if you had a link back to your own site, it include a rel=”me” attribute. Not any more.

Then there’s RSS.

During the Q&A of that 2006 dConstruct talk, somebody asked me about where they should start with providing an API; what’s the baseline? I pointed out that if they were already providing RSS feeds, they already had a kind of simple, read-only API.

Because there’s a standardised format—a list of items, each with a timestamp, a title, a description (maybe), and a link—once you can parse one RSS feed, you can parse them all. It’s kind of remarkable how many mashups can be created simply by using RSS. I remember at the first London Hackday, one of my favourite mashups simply took an RSS feed of the weather forecast for London and combined it with the RSS feed of upcoming ISS flypasts. The result: a Twitter bot that only tweeted when the International Space Station was overhead and the sky was clear. Brilliant!

Back then, anywhere you found a web page that listed a series of items, you’d expect to find a corresponding RSS feed: blog posts, uploaded photos, status updates, anything really.

That has changed.

Twitter used to provide an RSS feed that corresponded to my HTML timeline. Then they changed the URL of the RSS feed to make it part of the API (and therefore subject to the terms of use of the API). Then they removed RSS feeds entirely.

On the Salter Cane site, I want to display our band’s latest tweets. I used to be able to do that by just grabbing the corresponding RSS feed. Now I’d have to use the API, which is a lot more complex, involving all sorts of authentication gubbins. Even then, according to the terms of use, I wouldn’t be able to display my tweets the way I want to. Yes, how I want to display my own data on my own site is now dictated by Twitter.

Thanks to Jo Brodie I found an alternative service called Twitter RSS that gives me the RSS feed I need, ‘though it’s probably only a matter of time before that gets shuts down by Twitter.

Jo’s feelings about Twitter’s anti-RSS policy mirror my own:

I feel a pang of disappointment at the fact that it was really quite easy to use if you knew little about coding, and now it might be a bit harder to do what you easily did before.

That’s the thing. It’s not like RSS is a great format—it isn’t. But it’s just good enough and just versatile enough to enable non-programmers to make something cool. In that respect, it’s kind of like HTML.

The official line from Twitter is that RSS is “infrequently used today.” That’s the same justification that Google has given for shutting down Google Reader. It reminds of the joke about the shopkeeper responding to a request for something with “Oh, we don’t stock that—there’s no call for it. It’s funny though, you’re the fifth person to ask today.”

RSS is used a lot …but much of the usage is invisible:

RSS is plumbing. It’s used all over the place but you don’t notice it.

That’s from Brent Simmons, who penned a love letter to RSS:

If you subscribe to any podcasts, you use RSS. Flipboard and Twitter are RSS readers, even if it’s not obvious and they do other things besides.

He points out the many strengths of RSS, including its decentralisation:

It’s anti-monopolist. By design it creates a level playing field.

How foolish of us, therefore, that we ended up using Google Reader exclusively to power all our RSS consumption. We took something that was inherently decentralised and we locked it up into one provider. And now that provider is going to screw us over.

I hope we won’t make that mistake again. Because, believe me, RSS is far from dead just because Google and Twitter are threatened by it.

In a post called The True Web, Robin Sloan reiterates the strength of RSS:

It will dip and diminish, but will RSS ever go away? Nah. One of RSS’s weaknesses in its early days—its chaotic decentralized weirdness—has become, in its dotage, a surprising strength. RSS doesn’t route through a single leviathan’s servers. It lacks a kill switch.

I can understand why that power could be seen as a threat if what you are trying to do is force your users to consume their own data only the way that you see fit (and all in the name of “user experience”, I’m sure).

Returning to Anil’s description of the web we lost:

We get a generation of entrepreneurs encouraged to make more narrow-minded, web-hostile products like these because it continues to make a small number of wealthy people even more wealthy, instead of letting lots of people build innovative new opportunities for themselves on top of the web itself.

I think that the presence or absence of an RSS feed (whether I actually use it or not) is a good litmus test for how a service treats my data.

Instagram doesn’t provide an RSS feed of my uploaded photos.
Twitter doesn’t provide an RSS feed of my tweets.
Facebook doesn’t provide an RSS feed of my band’s updates

It might be that RSS is the canary in the coal mine for my data on the web.

If those services don’t trust me enough to give me an RSS feed, why should I trust them with my data?