Journal tags: measurements

4

Fidinpamp

If you’re a fan of gratuitous initialisms, you’ll love Google’s core web vitals. Just get a load of the obfuscation in the important-sounding metrics like CLS, FCP, LCP, and more.

To be fair to Google, this is a problem in the web performance world in general. Practioners prefer to talk about TTFB rather than “time to first byte” even though both contain exactly the same number of syllables.

The big news in the web performance community this month is the arrival of a new initialism. INP sounds like one of those pseudo-scientific psychologic profiles but it’s meant to stand for Interaction to Next Paint (even if they were to swear off pointless initialisms, you’d still have to pry Pointless Capitalisation from Google’s cold dead hands).

This new metric is a welcome one. It’s replacing first input delay. Sorry, First Input Delay, or FID, one of the few web vital initialisms that can be spoken as a word, making it a true acronym (fortunately fid’s successor, inp, also works as an acronym).

First Input Delay has long outstayed its welcome. It was always an outlier in the core web vitals. It didn’t seem to measure anything actually useful. I know it sounds like it’s measuring the delay until the user can interact with a web page, but when you dive into what it actually does, it’s a mess:

FID measures the time from when a user first interacts with a page (that is, when they click a link, tap on a button, or use a custom, JavaScript-powered control) to the time when the browser is actually able to begin processing event handlers in response to that interaction.

See that word “begin” in there? It’s doing a lot of work. First Input Delay doesn’t measure the lag between the user interaction and the browser response; it only measures the lag between the user interaction and the browser beginning to respond. The actual response could take ages, but that lag doesn’t get measured. Unlike the other core web vitals, this metric is very far removed from what actually matters to the user’s experience.

What the fid where they thinking? How the fid did this measurement ever get included in core web vitals in the first place?

Well, feel free to take what I’m about to say as pure gossip, but I have my sources, I trust ’em, and no, I’m not going to reveal ’em…

It’s because of AMP.

Remember Google AMP? An acronym so pointless they eventually just forgot it ever stood for anything?

The AMP project ended up doing incredible damage to Google’s developer relations. By colluding with the search team to privilege the appearance of AMP pages in the top news carousel, Google effectively blackmailed the entire publishing industry into using their format.

In the end, it didn’t work. It was a shit format. All they did was foster resentment and animosity:

AMP seems to have faded away. Most publishers have started dropping support, and even Google doesn’t seem to care much anymore.

It turns out that Google search wasn’t the only team infected by AMP. The core web vitals team also had to play ball.

Originally they had a genuinely useful metric for measuring the lag between input and response. But guess which pages did terribly? That’s right: AMP pages.

Rather than ship an actually-useful measurement, the core web vitals team instead had to include the broken First Input Delay, brainchild of a certain someone on the AMP team.

Now it all makes sense.

So good riddance to FID. Welcome to INP. And here’s hoping it won’t be much longer till we’re finally burying AMP.

Work ethics

If you’re travelling around Ireland, you may come across some odd pieces of 19th century architecture—walls, bridges, buildings and roads that serve no purpose. They date back to The Great Hunger of the 1840s. These “famine follies” were the result of a public works scheme.

The thinking went something like this: people are starving so we should feed them but we can’t just give people food for nothing so let’s make people do pointless work in exchange for feeding them (kind of like an early iteration of proof of work for cryptobollocks on blockchains …except with a blockchain, you don’t even get a wall or a road, just ridiculous amounts of wasted energy).

This kind of thinking seems reprehensible from today’s perspective. But I still see its echo in the work ethic espoused by otherwise smart people.

Here’s the thing: there’s good work and there’s working hard. What matters is doing good work. Often, to do good work you need to work hard. And so people naturally conflate the two, thinking that what matters is working hard. But whether you work hard or not isn’t actually what’s important. What’s important is that you do good work.

If you can do good work without working hard, that’s not a bad thing. In fact, it’s great—you’ve managed to do good work and do it efficiently! But often this very efficiency is treated as laziness.

Sensible managers are rightly appalled by so-called productivity tracking because it measures exactly the wrong thing. Those instruments of workplace surveillance measure inputs, not outputs (and even measuring outputs is misguided when what really matters are outcomes).

They can attempt to measure how hard someone is working, but they don’t even attempt to measure whether someone is producing good work. If anything, they actively discourage good work; there’s plenty of evidence to show that more hours equates to less quality.

I used to think that must be some validity to the belief that hard work has intrinsic value. It was a position that was espoused so often by those around me that it seemed a truism.

But after a few decades of experience, I see no evidence for hard work as an intrinsically valuable activity, much less a useful measurement. If anything, I’ve seen the real harm that can be caused by tying your self-worth to how much you’re working. That way lies burnout.

We no longer make people build famine walls or famine roads. But I wonder how many of us are constructing little monuments in our inboxes and calendars, filling those spaces with work to be done in an attempt to chase the rewards we’ve been told will result from hard graft.

I’d rather spend my time pursuing the opposite: the least work for the most people.

Doing the right thing for the wrong reasons

I remember trying to convince people to use semantic markup because it’s good for accessibility. That tactic didn’t always work. When it didn’t, I would add “By the way, Google’s searchbot is indistinguishable from a screen-reader user so semantic markup is good for SEO.”

That usually worked. It always felt unsatisfying though. I don’t know why. It doesn’t matter if people do the right thing for the wrong reasons. The end result is what matters. But still. It never felt great.

It happened with responsive design and progressive enhancement too. If I couldn’t convince people based on user experience benefits, I’d pull up some official pronouncement from Google recommending those techniques.

Even AMP, a dangerously ill-conceived project, has one very handy ace in the hole. You can’t add third-party JavaScript cruft to AMP pages. That’s useful:

Beleaguered developers working for publishers of big bloated web pages have a hard time arguing with their boss when they’re told to add another crappy JavaScript tracking script or bloated library to their pages. But when they’re making AMP pages, they can easily refuse, pointing out that the AMP rules don’t allow it. Google plays the bad cop for us, and it’s a very valuable role.

AMP is currently dying, which is good news. Google have announced that core web vitals will be used to boost ranking instead of requiring you to publish in their proprietary AMP format. The really good news is that the political advantage that came with AMP has also been ported over to core web vitals.

Take user-hostile obtrusive overlays. Perhaps, as a contientious developer, you’ve been arguing for years that they should be removed from the site you work on because they’re so bad for the user experience. Perhaps you have been met with the same indifference that I used to get regarding semantic markup.

Well, now you can point out how those annoying overlays are affecting, for example, the cumulative layout shift for the site. And that number is directly related to SEO. It’s one thing for a department to over-ride UX concerns, but I bet they’d think twice about jeopardising the site’s ranking with Google.

I know it doesn’t feel great. It’s like dealing with a bully by getting an even bigger bully to threaten them. Still. Needs must.

Numbers

Core web vitals from Google are the ingredients for an alphabet soup of exlusionary intialisms. But once you get past the unnecessary jargon, there’s a sensible approach underpinning the measurements.

From May—no, June—these measurements will be a ranking signal for Google search so performance will become more of an SEO issue. This is good news. This is what Google should’ve done years ago instead of pissing up the wall with their dreadful and damaging AMP project that blackmailed publishers into using a proprietary format in exchange for preferential search treatment. It was all done supposedly in the name of performance, but in reality all it did was antagonise users and publishers alike.

Core web vitals are an attempt to put numbers on user experience. This is always a tricky balancing act. You’ve got to watch out for the McNamara fallacy. Harry has already started noticing this:

A new and unusual phenomenon: clients reluctant (even refusing) to fix performance issues unless they directly improve Vitals.

Once you put a measurement on something, there’s a danger of focusing too much on the measurement. Chris is worried that we’re going to see tips’n’tricks for gaming core web vitals:

This feels like the start of a weird new era of web performance where the metrics of web performance have shifted to user-centric measurements, but people are implementing tricky strategies to game those numbers with methods that, if anything, slightly harm user experience.

The map is not the territory. The numbers are a proxy for user experience, but it’s notoriously difficult to measure intangible ideas like pain and frustration. As Laurie says:

This is 100% the downside of automatic tools that give you a “score”. It’s like gameification. It’s about hitting that perfect score instead of the holistic experience.

And Ethan has written about the power imbalance that exists when Google holds all the cards, whether it’s AMP or core web vitals:

Google used its dominant position in the marketplace to force widespread adoption of a largely proprietary technology for creating websites. By switching to Core Web Vitals, those power dynamics haven’t materially changed.

We would do well to remember:

When you measure, include the measurer.

But if we’re going to put numbers to user experience, the core web vitals are a pretty good spread of measurements: largest contentful paint, cumulative layout shift, and first input delay.

(If you prefer using initialisms, remember that CFP is Certified Financial Planner, CLS is Community Legal Services, and FID is Flame Ionization Detector. Together they form CWV, Catholic War Veterans.)