Journal tags: tracking

19

Ad tech

Back when South by Southwest wasn’t terrible, there used to be an annual panel called Browser Wars populated with representatives from the main browser vendors (except for Apple, obviously, who would never venture onto a stage outside of their own events).

I remember getting into a heated debate with the panelists during the 2010 edition. I was mad about web fonts.

Just to set the scene, web fonts didn’t exist back in 2010. That’s what I was mad about.

There was no technical reason why we couldn’t have web fonts. The reason why we didn’t get web fonts for years and years was because browser makers were concerned about piracy and type foundries.

That’s nice and all, but as I said during that panel, I don’t recall any such concerns being raised for photographers when the img element was shipped. Neither was the original text-only web held back by the legimate fear by writers of plagiarism.

My point was not that these concerns weren’t important, but that it wasn’t the job of web browsers to shore up existing business models. To use standards-speak, these concerns are orthogonal.

I’m reminded of this when I see browser makers shoring up the business of behavioural advertising.

I subscribe to the RSS feed of updates to Chrome. Not all of it is necessarily interesting to me, but all of it is supposedly aimed at developers. And yet, in amongst the posts about APIs and features, there’ll be something about the Orwellianly-titled “privacy sandbox”.

This is only of interest to one specific industry: behavioural online advertising driven by surveillance and tracking. I don’t see any similar efforts being made for teachers, cooks, architects, doctors or lawyers.

It’s a ludicrous situation that I put down to the fact that Google, the company that makes Chrome, is also the company that makes its money from targeted advertising.

But then Mozilla started with the same shit.

Now, it’s one thing to roll out a new so-called “feature” to benefit behavioural advertising. It’s quite another to make it enabled by default. That’s a piece of deceptive design that has no place in Firefox. Defaults matter. Browser makers know this. It’s no accident that this “feature” was enabled by default.

This disgusts me.

It disgusts me all the more that it’s all for nothing. Notice that I’ve repeatly referred to behavioural advertising. That’s the kind that relies on tracking and surveillance to work.

There is another kind of advertising. Contextual advertising is when you show an advertisement related to the content of the page the user is currently on. The advertiser doesn’t need to know anything about the user, just the topic of the page.

Conventional wisdom has it that behavioural advertising is much more effective than contextual advertising. After all, why would there be such a huge industry built on tracking and surveillance if it didn’t work? See, for example, this footnote by John Gruber:

So if contextual ads generate, say, one-tenth the revenue of targeted ads, Meta could show 10 times as many ads to users who opt out of targeting. I don’t think 10× is an outlandish multiplier there — given how remarkably profitable Meta’s advertising business is, it might even need to be higher than that.

Seems obvious, right?

But the idea that behavioural advertising works better than contextual advertising has no basis in reality.

If you think you know otherwise, Jon Bradshaw would like to hear from you:

Bradshaw challenges industry to provide proof that data-driven targeting actually makes advertising more effective – or in fact makes it worse. He’s spoiling for a debate – and has three deep, recent studies that show: broad reach beats targeting for incremental growth; that the cost of targeting outweighs the return; and that second and third party data does not outperform a random sample. First party data does beat the random sample – but contextual ads massively outperform even first party data. And they are much, much cheaper. Now, says Bradshaw, let’s see some counter-evidence from those making a killing.

If targeted advertising is going to get preferential treatment from browser makers, I too would like to see some evidence that it actually works.

Responsibility

My colleague Chris has written a terrific post over on the Clearleft blog: Is the planet the missing member of your project team?

Rather than hand-wringing and finger-wagging, it gets down to some practical steps that you—we—can take on every project.

Chris finishes by asking:

Let me know how you design with the environment in mind. What practical advice would you suggest?

Well, here’s something that I keep coming up against…

Chris shows that the environment can be part of project management, specifically the RACI methodology:

We list who is responsible, accountable, consulted, and informed within the project. It’s a simple exercise but the clarity is useful for identifying what expertise and input we should seek from the named individuals.

Having the planet be a proactive partner in your project ensures its needs are considered.

Whenever responsibilities are being assigned there are some things that inevitably fall through the cracks. One I’ve seen over and over again is responsibility for third-party scripts.

On the face of it this seems like another responsibility for developers. We’re talking about code here, right?

But in my experience it is never the developers adding “beacons” and other third-party embedded scripts.

Chris rightly points out:

Development decisions, visual design choices, content approach, and product strategy all contribute to the environmental impact of your website.

But what about sales and marketing? Often they’re the ones who’ll drop in a third-party script to track user journeys. That’s understandable. That’s kind of their job.

Dropping in one line of JavaScript seems like a victimless crime. It’s just one small script, right? But JavaScript can import more JavaScript. Tools like Request Map Generator can show just how much destruction third-party JavaScript can wreak:

You pop in a URL, it fetches the page and maps out all the subsequent requests in a nifty interactive diagram of circles, showing how many requests third-party scripts are themselves generating. I’ve found it to be a very effective way of showing the impact of third-party scripts to people who aren’t interested in looking at waterfall diagrams.

Just to be clear, the people adding third-party scripts to websites usually aren’t doing so maliciously. They often don’t realise the negative effect the scripts will have on performance and the environment.

As is so often the case, this isn’t a technical problem. At root it’s about understanding people’s needs (like “I need a way to see what pages are converting!”) and finding a way to meet those needs without negatively impacting the planet. A good open-minded discussion can go a long way.

So I echo Chris’s call to think about environmental impacts from the very start of a project. Establish early on who will have the ability to add third-party scripts to the site. Do all of those people understand the responsibility that gives them?

I saw this lack of foresight in action on a project recently. The front-end development was going really well and the site was going to be exceptionally performant: green Lighthouse scores across the board. But when the site went live it had tracking scripts. That meant that users needed to consent to being tracked. That meant adding another third-party script to generate a consent banner. It completely tanked the Lighthouse scores.

I’m sure the people who added the tracking scripts and consent banners thought they had no choice. But there are alternatives. There are ways to get the data you need without the intrusive surveillance and performance-wrecking JavaScript.

The problem is that it’s not the norm. “Everyone else is doing it” was the justification for Flash intros two decades ago and it’s the justification for enshittification via third-party scripts now.

It doesn’t have to be this way.

Ad revenue

It’s been dispiriting but unsurprising to see American commentators weigh in on the EU’s Digital Markets Act. I really wish they’d read Baldur’s excellent explainer first.

John has been doing his predictable “leave Britney alone!” schtick with regards to Apple (and in this case, Google and Facebook too). Ian Betteridge does an excellent job of setting him straight:

A lot of commentators seem to have the same issue as John: that it’s weird that a governmental body can or should define how products should be designed.

But governments mandate how products are designed all the time, and not just in the EU. Take another market which is pretty big: cars. All cars have to feature safety equipment, which varies from region to region but will broadly include everything from seatbelts to crumple zones. Cars have rules for emissions, for fuel efficiency, all of which are designing how a car should work.

But there’s one assumption in John’s post that Ian didn’t push back on. John said:

It’s certainly possible that Meta can devise ways to serve non-personalized contextual ads that generate sufficient revenue per user.

That comes with a footnote:

One obvious solution would be to show more ads — a lot more ads — to make up for the difference in revenue. So if contextual ads generate, say, one-tenth the revenue of targeted ads, Meta could show 10 times as many ads to users who opt out of targeting. I don’t think 10× is an outlandish multiplier there — given how remarkably profitable Meta’s advertising business is, it might even need to be higher than that.

It’s almost like an article of faith that behavioural advertising is more effective than contextual advertising. But there’s no data to support this. Quite the opposite. I wrote about this four years ago.

Once again, I urge you to read the excellent analysis by Jesse Frederik and Maurits Martijn.

There’s also Tim Hwang’s book, Subprime Attention Crisis:

From the unreliability of advertising numbers and the unregulated automation of advertising bidding wars, to the simple fact that online ads mostly fail to work, Hwang demonstrates that while consumers’ attention has never been more prized, the true value of that attention itself—much like subprime mortgages—is wildly misrepresented.

More recently Dave Karpf said what we’re all thinking:

The thing I want to stress about microtargeted ads is that the current version is perpetually trash, and we’re always just a few years away from the bugs getting worked out.

The EFF are calling for a ban. Should that happen, the sky would not fall. Contrary to what John thinks, revenue would not plummet. Contextual advertising works just fine …without the need for invasive surveillance and tracking.

Like I said:

Tracker-driven behavioural advertising is bad for users. The advertisements are irrelevant most of the time, and on the few occasions where the advertising hits the mark, it just feels creepy.

Tracker-driven behavioural advertising is bad for advertisers. They spend their hard-earned money on invasive ad tech that results in no more sales or brand recognition than if they had relied on good ol’ contextual advertising.

Tracker-driven behavioural advertising is very bad for the web. Megabytes of third-party JavaScript are injected at exactly the wrong moment to make for the worst possible performance. And if that doesn’t ruin the user experience enough, there are still invasive overlays and consent forms to click through (which, ironically, gets people mad at the legislation—like GDPR—instead of the underlying reason for these annoying overlays: unnecessary surveillance and tracking by the site you’re visiting).

Opportunity

You can split the web in many ways. Lionel Dricot wrote about one of those ways in a blog post called Splitting the Web. In it, he outlines an ever-increasing divide he sees on the web.

On the one hand you’ve got people experiencing the advertising-driven, tracking-addicted big players who provide a bloated and buggy user experience.

On the hand, you’ve got the more tech-savvy users with tracking blockers (misleadingly called ad blockers) using browsers and search engines that value privacy and performance.

It feels like everyone is now choosing its side. You can’t stay in the middle anymore. You are either dedicating all your CPU cycles to run JavaScript tracking you or walking away from the big monopolies. You are either being paid to build huge advertising billboards on top of yet another framework or you are handcrafting HTML.

Maybe the web is not dying. Maybe the web is only splitting itself in two.

This reminded me of a post by Chris. No, not The Great Divide, although that’s obviously relevant here. Chris wrote a post just yesterday called Other People’s Busted Software is an Opportunity:

One way to look at it is opportunity. If you make software that does work reliably, you’ve got a leg up. Even if your customers don’t tell you “I like your software because it always works”, they’ll feel it and make choices around knowing it.

I like that optimistic take. If the majority seems to be doubling down on more tracking, more JavaScript, and more enshittification, then there’s a potential opportunity there (acknowledging that you’ve still got to battle against inertia and sunk cost).

This reminds of a fantastic talk that Stuart gave a few years ago called Privacy could be the next big thing:

How do you end up shaping the world? By inventing a thing that the current incumbents can’t compete against. By making privacy your core goal. Because companies who have built their whole business model on monetising your personal information cannot compete against that. They’d have to give up on everything that they are, which they can’t do. Facebook altering itself to ensure privacy for its users… wouldn’t exist. Can’t exist. That’s how you win.

Tracking

I’ve been reading the excellent Design For Safety by Eva PenzeyMoog. There was a line that really stood out to me:

The idea that it’s alright to do whatever unethical thing is currently the industry norm is widespread in tech, and dangerous.

It stood out to me because I had been thinking about certain practices that are widespread, accepted, and yet strike me as deeply problematic. These practices involve tracking users.

The first problem is that even the terminology I’m using would be rejected. When you track users on your website, it’s called analytics. Or maybe it’s stats. If you track users on a large enough scale, I guess you get to just call it data.

Those words—“analytics”, “stats”, and “data”—are often used when the more accurate word would be “tracking.”

Or to put it another way; analytics, stats, data, numbers …these are all outputs. But what produced these outputs? Tracking.

Here’s a concrete example: email newsletters.

Do you have numbers on how many people opened a particular newsletter? Do you have numbers on how many people clicked a particular link?

You can call it data, or stats, or analytics, but make no mistake, that’s tracking.

Follow-on question: do you honestly think that everyone who opens a newsletter or clicks on a link in a newsletter has given their informed constent to be tracked by you?

You may well answer that this is a widespread—nay, universal—practice. Well yes, but a) that’s not what I asked, and b) see the above quote from Design For Safety.

You could quite correctly point out that this tracking is out of your hands. Your newsletter provider—probably Mailchimp—does this by default. So if the tracking is happening anyway, why not take a look at those numbers?

But that’s like saying it’s okay to eat battery-farmed chicken as long as you’re not breeding the chickens yourself.

When I try to argue against this kind of tracking from an ethical standpoint, I get a frosty reception. I might have better luck battling numbers with numbers. Increasing numbers of users are taking steps to prevent tracking. I had a plug-in installed in my mail client—Apple Mail—to prevent tracking. Now I don’t even need the plug-in. Apple have built it into the app. That should tell you something. It reminds me of when browsers had to introduce pop-up blocking.

If the outputs generated by tracking turn out to be inaccurate, then shouldn’t they lose their status?

But that line of reasoning shouldn’t even by necessary. We shouldn’t stop tracking users because it’s inaccurate. We should stop stop tracking users because it’s wrong.

Writing on web.dev

Chrome Dev Summit kicked off yesterday. The opening keynote had its usual share of announcements.

There was quite a bit of talk about privacy, which sounds good in theory, but then we were told that Google would be partnering with “industry stakeholders.” That’s probably code for the kind of ad-tech sharks that have been making a concerted effort to infest W3C groups. Beware.

But once Una was on-screen, the topics shifted to the kind of design and development updates that don’t have sinister overtones.

My favourite moment was when Una said:

We’re also partnering with Jeremy Keith of Clearleft to launch Learn Responsive Design on web.dev. This is a free online course with everything you need to know about designing for the new responsive web of today.

This is what’s been keeping me busy for the past few months (and for the next month or so too). I’ve been writing fifteen pieces—or “modules”—on modern responsive web design. One third of them are available now at web.dev/learn/design:

The rest are on their way: typography, responsive images, theming, UI patterns, and more.

I’ve been enjoying this process. It’s hard work that requires me to dive deep into the nitty-gritty details of lots of different techniques and technologies, but that can be quite rewarding. As is often said, if you truly want to understand something, teach it.

Oh, and I made one more appearance at the Chrome Dev Summit. During the “Ask Me Anything” section, quizmaster Una asked the panelists a question from me:

Given the court proceedings against AMP, why should anyone trust FLOC or any other Google initiatives ostensibly focused on privacy?

(Thanks to Jake for helping craft the question into a form that could make it past the legal department but still retain its spiciness.)

The question got a response. I wouldn’t say it got an answer. My verdict remains:

I’m not sure that Google Chrome can be considered a user agent.

The fundamental issue is that you’ve got a single company that’s the market leader in web search, the market leader in web advertising, and the market leader in web browsers. I honestly believe all three would function better—and more honestly—if they were separate entities.

Monopolies aren’t just damaging for customers. They’re damaging for the monopoly too. I’d love to see Google Chrome compete on being a great web browser without having to also balance the needs of surveillance-based advertising.

The cage

I subscribe to Peter Gasston’s newsletter, The Tech Landscape. It’s good. Peter’s a smart guy with his finger on the pulse of many technologies that are beyond my ken. I recommend subscribing.

But I was very taken aback by what he wrote in issue 202. It was to do with algorithmic recommendation engines.

This week I want to take a little dump on a tweet I read. I’m not going to link to it (I’m not that person), but it basically said something like: “I’m afraid to Google something because I don’t want the algorithm to think I like it, and I’m afraid to click a link because I don’t want the algorithm to show me more like it… what a cage.”

I saw the same tweet. It resonated with me. I had responded with a link to a post I wrote a while back called Get safe. That post made two points:

GET requests shouldn’t have side effects. Adding to a dossier on someone’s browsing habits definitely counts as a side effect.
It is literally a fundamental principle of the web platform that it should be safe to visit a web page.

But Peter describes ubiquitous surveillance as a feature, not a bug:

It’s observing what someone likes or does, then trying to make recommendations for more things like it—whether that’s books, TV shows, clothes, advertising, or whatever. It works on probability, so it’s going to make better guesses the more it knows you; if you like ten things of type A, then liking one thing of type B shouldn’t be enough to completely change its recommendations. The problem is, we don’t like “the algorithm” if it doesn’t work, and we don’t like it if works too well (“creepy!”). But it’s not sinister, and it’s not a cage.

He would be correct if the balance of power were tipped towards the person actively looking for recommendations. As I said in my earlier post:

Don’t get me wrong: building a profile of someone based on their actions isn’t inherently wrong. If a user taps on “like” or “favourite” or “bookmark”, they are actively telling the server to perform an update (and so those actions should be POST requests). But do you see the difference in where the power lies?

When Peter says “it’s not sinister, and it’s not a cage” that may be true for him, but that is not a shared feeling, as the original tweet demonstrates. I don’t think it’s fair to dismiss someone else’s psychological pain because you don’t think they “get it”. I’m pretty sure everyone “gets” how recommendation engines are supposed to work. That’s not the issue. Trying to provide relevant content isn’t the problem. It’s the unbelievably heavy-handed methods that make it feel like a cage.

Peter uses the metaphor of a record shop:

“The algorithm” is the best way to navigate a world of infinite choice; imagine you went to a record shop (remember them?) which had every recording ever released; how would you find new music? You’d either buy music by bands you know you already liked, or you’d take a pure gamble on something—which most of the time would be a miss. So you’d ask a store worker, and they’d recommend the music they liked—but that’s no guarantee you’d like it. A good worker would ask what type of music you like, and recommend music based on that—you might not like all the recommendations, but there’s more of a chance you’d like some. That’s just what “the algorithm” does.

But that’s not true. You don’t ask “the algorithm” for a recommendation—it foists them on you whether you want them or not. A more apt metaphor would be that you walked by a record shop once and the store worker came out and followed you down the street, into your home, and watched your every move for the rest of your life.

What Peter describes sounds great—a helpful knowledgable software agent that you ask for recommendations. But that’s not what “the algorithm” is. And that’s why it feels like a cage. That’s why it is a cage.

The original tweet was an open, honest, and vulnerable insight into what online recommendation engines feel like. That’s a valuable insight that should be taken on board, not dismissed.

And what a lack of imagination to look at an existing broken system—that doesn’t even provide good recommendations while making people afraid to click on links—and shrug and say that this is the best we can do. If this really is “is the best way to navigate a world of infinite choice” then it’s no wonder that people feel like they need to go on a digital detox and get away from their devices in order to feel normal. It’s like saying that decapitation is the best way of solving headaches.

Imagine living in a surveillance state like East Germany, and saying “Well, how else is the government supposed to make informed decisions without constantly monitoring its citizens?” I think it’s more likely that you’d feel like you’re in a cage.

Apples to oranges? Kind of. But whether it’s surveillance communism or surveillance capitalism, there’s a shared methodology at work. They’re both systems that disempower people for the supposedly greater good of amassing data. Both are built on the false premise that problems can be solved by getting more and more data. If that results in collateral damage to people’s privacy and mental health, well …it’s all for the greater good, right?

It’s fucking bullshit. I don’t want to live in that cage and I don’t want anyone else to have to live in it either. I’m going to do everything I can to tear it down.

Get the FLoC out

I’ve always liked the way that web browsers are called “user agents” in the world of web standards. It’s such a succinct summation of what browsers are for, or more accurately who browsers are for. Users.

The term makes sense when you consider that the internet is for end users. That’s not to be taken for granted. This assertion is now enshrined in the Internet Engineering Task Force’s RFC 8890—like Magna Carta for the network age. It’s also a great example of prioritisation in a design principle:

When there is a conflict between the interests of end users of the Internet and other parties, IETF decisions should favor end users.

So when a web browser—ostensibly an agent for the user—prioritises user-hostile third parties, we get upset.

Google Chrome—ostensibly an agent for the user—is running an origin trial for Federated Learning of Cohorts (FLoC). This is not a technology that serves the end user. It is a technology that serves third parties who want to target end users. The most common use case is behavioural advertising, but targetting could be applied for more nefarious purposes.

The Electronic Frontier Foundation wrote an explainer last month: Google Is Testing Its Controversial New Ad Targeting Tech in Millions of Browsers. Here’s What We Know.

Let’s back up a minute and look at why this is happening. End users are routinely targeted today (for behavioural advertising and other use cases) through third-party cookies. Some user agents like Apple’s Safari and Mozilla’s Firefox are stamping down on this, disabling third party cookies by default.

Seeing which way the wind is blowing, Google’s Chrome browser will also disable third-party cookies at some time in the future (they’re waiting to shut that barn door until the fire is good’n’raging). But Google isn’t just in the browser business. Google is also in the ad tech business. So they still want to advertisers to be able to target end users.

Yes, this is quite the cognitive dissonance: one part of the business is building a user agent while a different part of the company is working on ways of tracking end users. It’s almost as if one company shouldn’t simultaneously be the market leader in three separate industries: search, advertising, and web browsing. (Seriously though, I honestly think Google’s search engine would get better if it were split off from the parent company, and I think that Google’s web browser would also get better if it were a separate enterprise.)

Anyway, one possible way of tracking users without technically tracking individual users is to assign them to buckets, or cohorts of interest based on their browsing habits. Does that make you feel safer? Me neither.

That’s what Google is testing with the origin trial of FLoC.

If you, as an end user, don’t wish to be experimented on like this, there are a few things you can do:

Don’t use Chrome. No other web browser is participating in this experiment. I recommend Firefox.
If you want to continue to use Chrome, install the Duck Duck Go Chrome extension.
Alternatively, if you manually disable third-party cookies, your Chrome browser won’t be included in the experiment.
Or you could move to Europe. The origin trial won’t be enabled for users in the European Union, which is coincidentally where GDPR applies.

That last decision is interesting. On the one hand, the origin trial is supposed to be on a small scale, hence the lack of European countries. On the other hand, the origin trial is “opt out” instead of “opt in” so that they can gather a big enough data set. Weird.

The plan is that if and when FLoC launches, websites would have to opt in to it. And when I say “plan”, I mean “best guess.”

I, for one, am filled with confidence that Google would never pull a bait-and-switch with their technologies.

In the meantime, if you’re a website owner, you have to opt your website out of the origin trial. You can do this by sending a server header. A meta element won’t do the trick, I’m afraid.

I’ve done it for my sites, which are served using Apache. I’ve got this in my .conf file:

<IfModule mod_headers.c>
Header always set Permissions-Policy "interest-cohort=()"
</IfModule>

If you don’t have access to your server, tough luck. But if your site runs on Wordpress, there’s a proposal to opt out of FLoC by default.

Interestingly, none of the Chrome devs that I follow are saying anything about FLoC. They’re usually quite chatty about proposals for potential standards, but I suspect that this one might be embarrassing for them. It was a similar situation with AMP. In that case, Google abused its monopoly position in search to blackmail publishers into using Google’s format. Now Google’s monopoly in advertising is compromising the integrity of its browser. In both cases, it makes it hard for Chrome devs claiming to have the web’s best interests at heart.

But one of the advantages of having a huge share of the browser market is that Chrome can just plough ahead and unilaterily implement whatever it wants even if there’s no consensus from other browser makers. So that’s what Google is doing with FLoC. But their justification for doing this doesn’t really work unless other browsers play along.

Here’s Google’s logic:

Third-party cookies are on their way out so advertisers will no longer be able to use that technology to target users.
If we don’t provide an alternative, advertisers and other third parties will use fingerprinting, which we all agree is very bad.
So let’s implement Federated Learning of Cohorts so that advertisers won’t use fingerprinting.

The problem is with step three. The theory is that if FLoC gives third parties what they need, then they won’t reach for fingerprinting. Even if there were any validity to that hypothesis, the only chance it has of working is if every browser joins in with FLoC. Otherwise ad tech companies are leaving money on the table. Can you seriously imagine third parties deciding that they just won’t target iPhone or iPad users any more? Remember that Safari is the only real browser on iOS so unless FLoC is implemented by Apple, third parties can’t reach those people …unless those third parties use fingerprinting instead.

Google have set up a situation where it looks like FLoC is going head-to-head with fingerprinting. But if FLoC becomes a reality, it won’t be instead of fingerprinting, it will be in addition to fingerprinting.

Google is quite right to point out that fingerprinting is A Very Bad Thing. But their concerns about fingerprinting sound very hollow when you see that Chrome is pushing ahead and implementing a raft of browser APIs that other browser makers quite rightly point out enable more fingerprinting: Battery Status, Proximity Sensor, Ambient Light Sensor and so on.

When it comes to those APIs, the message from Google is that fingerprinting is a solveable problem.

But when it comes to third party tracking, the message from Google is that fingerprinting is inevitable and so we must provide an alternative.

Which one is it?

Google’s flimsy logic for why FLoC is supposedly good for end users just doesn’t hold up. If they were honest and said that it’s to maintain the status quo of the ad tech industry, it would make much more sense.

The flaw in Google’s reasoning is the fundamental idea that tracking is necessary for advertising. That’s simply not true. Sacrificing user privacy is fundamental to behavioural advertising …but behavioural advertising is not the only kind of advertising. It isn’t even a very good kind of advertising.

Marko Saric sums it up:

FLoC seems to be Google’s way of saving a dying business. They are trying to keep targeted ads going by making them more “privacy-friendly” and “anonymous”. But behavioral profiling and targeted advertisement is not compatible with a privacy-respecting web.

What’s striking is that the very monopolies that make Google and Facebook the leaders in behavioural advertising would also make them the leaders in contextual advertising. Almost everyone uses Google’s search engine. Almost everyone uses Facebook’s social network. An advertising model based on what you’re currently looking at would keep Google and Facebook in their dominant positions.

Google made their first many billions exclusively on contextual advertising. Google now prefers to push the message that behavioral advertising based on personal data collection is superior but there is simply no trustworthy evidence to that.

I sincerely hope that Chrome will align with Safari, Firefox, Vivaldi, Brave, Edge and every other web browser. Everyone already agrees that fingerprinting is the real enemy. Imagine the combined brainpower that could be brought to bear on that problem if all browsers made user privacy a priority.

Until that day, I’m not sure that Google Chrome can be considered a user agent.

Prediction

Arthur C. Clarke once said:

Trying to predict the future is a discouraging and hazardous occupation becaue the profit invariably falls into two stools. If his predictions sounded at all reasonable, you can be quite sure that in 20 or most 50 years, the progress of science and technology has made him seem ridiculously conservative. On the other hand, if by some miracle a prophet could describe the future exactly as it was going to take place, his predictions would sound so absurd, so far-fetched, that everybody would laugh him to scorn.

But I couldn’t resist responding to a recent request for augery. Eric asked An Event Apart speakers for their predictions for the coming year. The responses have been gathered together and published, although it’s in the form of a PDF for some reason.

Here’s what I wrote:

This is probably more of a hope than a prediction, but 2021 could be the year that the ponzi scheme of online tracking and surveillance begins to crumble. People are beginning to realize that it’s far too intrusive, that it just doesn’t work most of the time, and that good ol’-fashioned contextual advertising would be better. Right now, it feels similar to the moment before the sub-prime mortgage bubble collapsed (a comparison made in Tim Hwang’s recent book, Subprime Attention Crisis). Back then people thought “Well, these big banks must know what they’re doing,” just as people have thought, “Well, Facebook and Google must know what they’re doing”…but that confidence is crumbling, exposing the shaky stack of cards that props up behavioral advertising. This doesn’t mean that online advertising is coming to an end—far from it. I think we might see a golden age of relevant, content-driven advertising. Laws like Europe’s GDPR will play a part. Apple’s recent changes to highlight privacy-violating apps will play a part. Most of all, I think that people will play a part. They will be increasingly aware that there’s nothing inevitable about tracking and surveillance and that the web works better when it respects people’s right to privacy. The sea change might not happen in 2021 but it feels like the water is beginning to swell.

Still, predicting the future is a mug’s game with as much scientific rigour as astrology, reading tea leaves, or haruspicy.

Much like behavioural advertising.

Get safe

The verbs of the web are GET and POST. In theory there’s also PUT, DELETE, and PATCH but in practice POST often does those jobs.

I’m always surprised when front-end developers don’t think about these verbs (or request methods, to use the technical term). Knowing when to use GET and when to use POST is crucial to having a solid foundation for whatever you’re building on the web.

Luckily it’s not hard to know when to use each one. If the user is requesting something, use GET. If the user is changing something, use POST.

That’s why links are GET requests by default. A link “gets” a resource and delivers it to the user.

<a href="/items/id">

Most forms use the POST method becuase they’re changing something—creating, editing, deleting, updating.

<form method="post" action="/items/id/edit">

But not all forms should use POST. A search form should use GET.

<form method="get" action="/search">
<input type="search" name="term">

When a user performs a search, they’re still requesting a resource (a page of search results). It’s just that they need to provide some specific details for the GET request. Those details get translated into a query string appended to the URL specified in the action attribute.

/search?term=value

I sometimes see the GET method used incorrectly:

“Log out” links that should be forms with a “log out” button—you can always style it to look like a link if you want.
“Unsubscribe” links in emails that immediately trigger the action of unsubscribing instead of going to a form where the POST method does the unsubscribing. I realise that this turns unsubscribing into a two-step process, which is a bit annoying from a usability point of view, but a destructive action should never be baked into a GET request.

When the it was first created, the World Wide Web was stateless by design. If you requested one web page, and then subsequently requested another web page, the server had no way of knowing that the same user was making both requests. After serving up a page in response to a GET request, the server promptly forgot all about it.

That’s how web browsing should still work. In fact, it’s one of the Web Platform Design Principles: It should be safe to visit a web page:

The Web is named for its hyperlinked structure. In order for the web to remain vibrant, users need to be able to expect that merely visiting any given link won’t have implications for the security of their computer, or for any essential aspects of their privacy.

The expectation of safe stateless browsing has been eroded over time. Every time you click on a search result in Google, or you tap on a recommended video in YouTube, or—heaven help us—you actually click on an advertisement, you just know that you’re adding to a dossier of your online profile. That’s not how the web is supposed to work.

Don’t get me wrong: building a profile of someone based on their actions isn’t inherently wrong. If a user taps on “like” or “favourite” or “bookmark”, they are actively telling the server to perform an update (and so those actions should be POST requests). But do you see the difference in where the power lies? With POST actions—fave, rate, save—the user is in charge. With GET requests, no one is supposed to be in charge—it’s meant to be a neutral transaction. Alas, the reality of today’s web is that many GET requests give more power to the dossier-building servers at the expense of the user’s agency.

The very first of the Web Platform Design Principles is Put user needs first :

If a trade-off needs to be made, always put user needs above all.

The current abuse of GET requests is damage that the web needs to route around.

Browsers are helping to a certain extent. Most browsers have the concept of private browsing, allowing you some level of statelessness, or at least time-limited statefulness. But it’s kind of messed up that private browsing is the exception, while surveillance is the default. It should be the other way around.

Firefox and Safari are taking steps to reduce tracking and fingerprinting. Rejecting third-party coookies by default is a good move. I’d love it if third-party JavaScript were also rejected by default:

In retrospect, it seems unbelievable that third-party JavaScript is even possible. I mean, putting arbitrary code—that can then inject even more arbitrary code—onto your website? That seems like a security nightmare!

I imagine if JavaScript were being specced today, it would almost certainly be restricted to the same origin by default.

Chrome has different priorities, which is understandable given that it comes from a company with a business model that is currently tied to tracking and surveillance (though it needn’t remain that way). With anti-trust proceedings rumbling in the background, there’s talk of breaking up Google to avoid monopolistic abuses of power. I honestly think it would be the best thing that could happen to Chrome if it were an independent browser that could fully focus on user needs without having to consider the surveillance needs of an advertising broker.

But we needn’t wait for the browsers to make the web a safer place for users.

Developers write the code that updates those dossiers. Developers add those oh-so-harmless-looking third-party scripts to page templates.

What if we refused?

Front-end developers in particular should be the last line of defence for users. The entire field of front-end devlopment is supposed to be predicated on the prioritisation of user needs.

And if the moral argument isn’t enough, perhaps the technical argument can get through. Tracking users based on their GET requests violates the very bedrock of the web’s architecture. Stop doing that.

Clean advertising

Imagine if you were told that fossil fuels were the only way of extracting energy. It would be an absurd claim. Not only are other energy sources available—solar, wind, geothermal, nuclear—fossil fuels aren’t even the most effecient source of energy. To say that you can’t have energy without burning fossil fuels would be pitifully incorrect.

And yet when it comes to online advertising, we seem to have meekly accepted that you can’t have effective advertising without invasive tracking. But nothing could be further from the truth. Invasive tracking is to online advertising as fossil fuels are to energy production—an outmoded inefficient means of getting substandard results.

Before the onslaught of third party cookies and scripts, online advertising was contextual. If I searched for property insurance, I was likely to see an advertisement for property insurance. If I was reading an article about pet food, I was likely to be served an advertisement for pet food.

Simply put, contextual advertising ensured that the advertising that accompanied content could be relevant and timely. There was no big mystery about it: advertisers just needed to know what the content was about and they could serve up the appropriate advertisement. Nice and straightforward.

Too straightforward.

What if, instead of matching the advertisement to the content, we could match the advertisement to the person? Regardless of what they were searching for or reading, they’d be served advertisements that were relevant to them not just in that moment, but relevant to their lifestyles, thoughts and beliefs? Of course that would require building up dossiers of information about each person so that their profiles could be targeted and constantly updated. That’s where cross-site tracking comes in, with third-party cookies and scripts.

This is behavioural advertising. It has all but elimated contextual advertising. It has become so pervasive that online advertising and behavioural advertising have become synonymous. Contextual advertising is seen as laughably primitive compared with the clairvoyant powers of behavioural advertising.

But there’s a problem with behavioural advertising. A big problem.

It doesn’t work.

First of all, it relies on mind-reading powers by the advertising brokers—Facebook, Google, and the other middlemen of ad tech. For all the apocryphal folk tales of spooky second-guessing in online advertising, it mostly remains rubbish.

Forget privacy: you’re terrible at targeting anyway:

None of this works. They are still trying to sell me car insurance for my subway ride.

Have you actually paid attention to what advertisements you’re served? Maciej did:

I saw a lot of ads for GEICO, a brand of car insurance that I already own.

I saw multiple ads for Red Lobster, a seafood restaurant chain in America. Red Lobster doesn’t have any branches in San Francisco, where I live.

Finally, I saw a ton of ads for Zipcar, which is a car sharing service. These really pissed me off, not because I have a problem with Zipcar, but because they showed me the algorithm wasn’t even trying. It’s one thing to get the targeting wrong, but the ad engine can’t even decide if I have a car or not! You just showed me five ads for car insurance.

And yet in the twisted logic of ad tech, all of this would be seen as evidence that they need to gather even more data with even more invasive tracking and surveillance.

It turns out that bizarre logic is at the very heart of behavioural advertising. I highly recommend reading the in-depth report from The Correspondent called The new dot com bubble is here: it’s called online advertising:

It’s about a market of a quarter of a trillion dollars governed by irrationality.

The benchmarks that advertising companies use – intended to measure the number of clicks, sales and downloads that occur after an ad is viewed – are fundamentally misleading. None of these benchmarks distinguish between the selection effect (clicks, purchases and downloads that are happening anyway) and the advertising effect (clicks, purchases and downloads that would not have happened without ads).

Suppose someone told you that they keep tigers out of their garden by turning on their kitchen light every evening. You might think their logic is flawed, but they’ve been turning on the kitchen light every evening for years and there hasn’t been a single tiger in the garden the whole time. That’s the logic used by ad tech companies to justify trackers.

Tracker-driven behavioural advertising is bad for users. The advertisements are irrelevant most of the time, and on the few occasions where the advertising hits the mark, it just feels creepy.

Tracker-driven behavioural advertising is bad for advertisers. They spend their hard-earned money on invasive ad tech that results in no more sales or brand recognition than if they had relied on good ol’ contextual advertising.

Tracker-driven behavioural advertising is very bad for the web. Megabytes of third-party JavaScript are injected at exactly the wrong moment to make for the worst possible performance. And if that doesn’t ruin the user experience enough, there are still invasive overlays and consent forms to click through (which, ironically, gets people mad at the legislation—like GDPR—instead of the underlying reason for these annoying overlays: unnecessary surveillance and tracking by the site you’re visiting).

Tracker-driven behavioural advertising is good for the middlemen doing the tracking. Facebook and Google are two of the biggest players here. But that doesn’t mean that their business models need to be permanently anchored to surveillance. The very monopolies that make them kings of behavioural advertising—the biggest social network and the biggest search engine—would also make them titans of contextual advertising. They could pivot from an invasive behavioural model of advertising to a privacy-respecting contextual advertising model.

The incumbents will almost certainly resist changing something so fundamental. It would be like expecting an energy company to change their focus from fossil fuels to renewables. It won’t happen quickly. But I think that it may eventually happen …if we demand it.

In the meantime, we can all play our part. Just as we can do our bit for the environment at an individual level by sorting our recycling and making green choices in our day to day lives, we can all do our bit for the web too.

The least we can do is block third-party cookies. Some browsers are now doing this by default. That’s good.

Blocking third-party JavaScript is a bit trickier. That requires a browser extension. Most of these extensions to block third-party tracking are called ad blockers. That’s a shame. The issue is not with advertising. The issue is with tracking.

Alas, because this software is labelled under ad blocking, it has led to the ludicrous situation of an ethical argument being made to allow surveillance and tracking! It goes like this: websites need advertising to survive; if you block the ads, then you are denying these sites revenue. That argument would make sense if we were talking about contextual advertising. But it makes no sense when it comes to behavioural advertising …unless you genuinely believe that online advertising has to be behavioural, which means that online advertising has to track you to be effective. Such a belief would be completely wrong. But that doesn’t stop it being widely held.

To argue that there is a moral argument against blocking trackers is ridiculous. If anything, there’s a moral argument to be made for installing anti-tracking software for yourself, your friends, and your family. Otherwise we are collectively giving up our privacy for a business model that doesn’t even work.

It’s a shame that advertisers will lose out if tracking-blocking software prevents their ads from loading. But that’s only going to happen in the case of behavioural advertising. Contextual advertising won’t be blocked. Contextual advertising is also more lightweight than behavioural advertising. Contextual advertising is far less creepy than behavioural advertising. And crucially, contextual advertising works.

That shouldn’t be a controversial claim: the idea that people would be interested in adverts that are related to the content they’re currently looking at. The greatest trick the ad tech industry has pulled is convincing the world that contextual relevance is somehow less effective than some secret algorithm fed with all our data that’s supposed to be able to practically read our minds and know us better than we know ourselves.

Y’know, if this mind-control ray really could give me timely relevant adverts, I might possibly consider paying the price with my privacy. But as it is, YouTube still hasn’t figured out that I’m not interested in Top Gear or football.

The next time someone is talking about the necessity of advertising on the web as a business model, ask for details. Do they mean contextual or behavioural advertising? They’ll probably laugh at you and say that behavioural advertising is the only thing that works. They’ll be wrong.

I know it’s hard to imagine a future without tracker-driven behavioural advertising. But there are no good business reasons for it to continue. It was once hard to imagine a future without oil or coal. But through collective action, legislation, and smart business decisions, we can make a cleaner future.

Performance and people

I was helping a client with a bit of a performance audit this week. I really, really enjoy this work. It’s such a nice opportunity to get my hands in the soil of a website, so to speak, and suggest changes that will have a measurable effect on the user’s experience.

Not only is web performance a user experience issue, it may well be the user experience issue. Page speed has a proven demonstrable direct effect on user experience (and revenue and customer satisfaction and whatever other metrics you’re using).

It struck me that there’s a continuum of performance challenges. On one end of the continuum, you’ve got technical issues. These can be solved with technical solutions. On the other end of the continuum, you’ve got human issues. These can be solved with discussions, agreement, empathy, and conversations (often dreaded or awkward).

I think that, as developers, we tend to gravitate towards the technical issues. That’s our safe space. But I suspect that bigger gains can be reaped by tackling the uncomfortable human issues.

This week, for example, I uncovered three performance issues. One was definitely technical. One was definitely human. One was halfway between.

The technical issue was with web fonts. It’s a lot of fun to dive into this aspect of web performance because quite often there’s some low-hanging fruit: a relatively simple technical fix that will boost the performance (or perceived performance) of a website. That might be through resource hints (using link rel=“preload” in the HTML) or adjusting the font loading (using font-display in the CSS) or even nerdier stuff like subsetting.

In this case, the issue was with the file format of the font itself. By switching to woff2, there were significant file size savings. And the great thing is that @font-face rules allow you to specify multiple file formats so you can still support older browsers that can’t handle woff2. A win all ‘round!

The performance issue that was right in the middle of the technical/human continuum was with images. At first glance it looked like a similar issue to the fonts. Some images were being served in the wrong formats. When I say “wrong”, I guess I mean inappropriate. A photographic image, for example, is probably going to best served as a JPG rather than a PNG.

But unlike the fonts, the images weren’t in the direct control of the developers. These images were coming from a Content Management System. And while there’s a certain amount of processing you can do on the server, a human still makes the decision about what file format they’re uploading.

I’ve seen this happen at Clearleft. We launched an event site with lean performant code, but then someone uploaded an image that’s megabytes in size. The solution in that case wasn’t technical. We realised there was a knowledge gap around image file formats—which, let’s face it, is kind of a techy topic that most normal people shouldn’t be expected to know.

But it was extremely gratifying to see that people were genuinely interested in knowing a bit more about choosing the right format for the right image. I was able to provide a few rules of thumb and point to free software for converting images. It empowered those people to feel more confident using the Content Management System.

It was a similar situation with the client site I was looking at this week. Nobody is uploading oversized images in order to deliberately make the site slower. They probably don’t realise the difference that image formats can make. By having a discussion and giving them some pointers, they’ll have more knowledge and the site will be faster. Another win all ‘round!

At the other end of continuum was an issue that wasn’t technical. From a technical point of view, there was just one teeny weeny little script. But that little script is Google Tag Manager which then calls many, many other scripts that are not so teeny weeny. Third party scripts …the bane of web performance!

In retrospect, it seems unbelievable that third-party JavaScript is even possible. I mean, putting arbitrary code—that can then inject even more arbitrary code—onto your website? That seems like a security nightmare!

Remember when I did a countdown of the top four web performance challenges? At the number one spot is other people’s JavaScript.

Now one technical solution would be to remove the Google Tag Manager script. But that’s probably not very practical—you’ll probably just piss off some other department. That said, if you can’t find out which department was responsible for adding the Google Tag Manager script in the first place, it might we well be an option to remove it and then wait and see who complains. If no one notices it’s gone, job done!

More realistically, there’s someone who’s added that Google Tag Manager script for their own valid reasons. You’ll need to talk to them and understand their needs.

Again, as with images uploaded in a Content Management System, they may not be aware of the performance problems caused by third-party scripts. You could try throwing numbers at them, but I think you get better results by telling the story of performance.

Use tools like Request Map Generator to help them visualise the impact that third-party scripts are having. Talk to them. More importantly, listen to them. Find out why those scripts are being requested. What are the outcomes they’re working towards? Can you offer an alternative way of providing the data they need?

I think many of us developers are intimidated or apprehensive about approaching people to have those conversations. But it’s necessary. And in its own way, it can be as rewarding as tinkering with code. If the end result is a faster website, then the work is definitely worth doing—whether it’s technical work or people work.

Personally, I just really enjoy working on anything that will end up improving a website’s performance, and by extension, the user experience. If you fancy working with me on your site, you should get in touch with Clearleft.

BetrayURL

Back in February, I wrote about an excellent proposal by Jake for how browsers could display URLs in a safer way. Crucially, this involved highlighting the important part of the URL, but didn’t involve hiding any part. It’s a really elegant solution.

Turns out it was a Trojan horse. Chrome are now running an experiment where they will do the exact opposite: they will hide parts of the URL instead of highlighting the important part.

You can change this behaviour if you’re in the less than 1% of people who ever change default settings in browsers.

I’m really disappointed to see that Jake’s proposal isn’t going to be implemented. It was a much, much better solution.

No doubt I will hear rejoinders that the “solution” that Chrome is experimenting with is pretty similar to what Jake proposed. Nothing could be further from the truth. Jake’s solution empowered users with knowledge without taking anything away. What Chrome will be doing is the opposite of that, infantalising users and making decisions for them “for their own good.”

Seeing a complete URL is going to become a power-user feature, like View Source or user style sheets.

I’m really sad about that because, as Jake’s proposal demonstrates, it doesn’t have to be that way.

Third party

The web turned 30 this year. When I was back at CERN to mark this anniversary, there was a lot of introspection and questioning the direction that the web has taken. Everyone I know that uses the web is in agreement that tracking and surveillance are out of control. It seems only right to question whether the web has lost its way.

But here’s the thing: the technologies that enable tracking and surveillance didn’t exist in the early years of the web—JavaScript and cookies.

Without cookies, the web was stateless. This was by design. Now, I totally understand why cookies—or something like cookies—were needed. Without some way of keeping track of state, there’s no good way for a website to “remember” what’s in your shopping cart, or whether you’ve authenticated yourself.

But why would cookies ever need to work across domains? Authentication, shopping carts and all that good stuff can happen on the same domain. Third-party cookies, on the other hand, seem custom made for tracking and frankly, not much else.

Browsers allow you to disable third-party cookies, though it’s not yet the default. If enough people do it—and complain about the sites that stop working when third-party cookies are disabled—then maybe it can become the default.

Firefox is taking steps in this direction, automatically disabling some third-party cookies—the ones that known trackers. Safari is also taking steps to prevent cross-site tracking. It’s not too late to change the tide of third-party cookies.

Then there’s third-party JavaScript.

In retrospect, it seems unbelievable that third-party JavaScript is even possible. I mean, putting arbitrary code—that can then inject even more arbitrary code—onto your website? That seems like a security nightmare!

I imagine if JavaScript were being specced today, it would almost certainly be restricted to the same origin by default. But I guess the precedent had been set with images and style sheets: they could be embedded regardless of whether their domain names matched yours. Still, this is executable code we’re talking about here: that’s quite a footgun that the web has given site owners. And boy, oh boy, has it been used by the worst people to do the most damage.

Again, as with cookies, if we were to imagine what the web would be like if JavaScript was restricted by a same-domain policy, there are certainly things that would be trickier to do.

Embedding video, audio, and maps would get a lot finickier.
Analytics would need to be self-hosted. I don’t think that would bother any site owners. An analytics platform like Google Analytics that tracks people across domains is doing it for its own benefit rather than that of site owners.
Advertising wouldn’t be creepy and annoying. Instead of what’s so euphemistically called “personalisation”, advertisers would have to rely on serving relevant ads based on the content of the site rather than an invasive psychological profile of the user. (I honestly think that advertisers would benefit from this kind of targetting.)

It’s harder to imagine putting the genie back in the bottle when it comes to third-party JavaScript than it is with third-party cookies. All the same, I wish that browsers made it easier to experiment with it. Just as I can choose to accept all cookies, reject all cookies, or only accept same-origin cookies, I wish I could accept all JavaScript, reject all JavaScript, or only accept same-origin JavaScript.

As it is, browsers are making it harder and harder to exercise any control over JavaScript at all. So we reach for third-party tools. We don’t call them JavaScript managers though. We call them ad blockers. But honestly, most of the ad-blocker users I know—myself included—are not bothered by the advertising; we’re bothered by the tracking. We should really call them surveillance blockers.

If third-party JavaScript weren’t the norm, not only would it make the web more secure, it would make it way more performant. Read the chapter on third parties in this year’s newly-released Web Almanac. The figures are staggering.

93% of pages include at least one third-party resource, 76% of pages issue a request to an analytics domain, the median page requests content from at least 9 unique third-party domains that represent 35% of their total network activity, and the most active 10% of pages issue a whopping 175 third-party requests or more.

I don’t think all the web’s performance ills are due to third-party scripts; developers are doing a bang-up job of making their sites big and bloated with their own self-hosted frameworks and code. But as long as third-party JavaScript is allowed onto a site, there’s a limit to how much good developers can do to improve the performance of their sites.

I go to performance-related conferences and you know who I’ve never seen at those events? The people who write the JavaScript for third-party tracking scripts. Those developers are wielding an outsized influence on the health of the web.

I’m very happy to see the work being done by Mozilla and Apple to normalise the idea of rejecting third-party cookies. I’d love to see the rejection of third-party JavaScript normalised in the same way. I know that it would make my life as a developer harder. But that’s of lesser importance. It would be better for the web.

Alternative analytics

Contrary to the current consensual hallucination, there are alternatives to Google Analytics.

I haven’t tried Open Web Analytics. It looks a bit geeky, but the nice thing about it is that you can set it up to work with JavaScript or PHP (sort of like Mint, which I miss).

Also on the geeky end, there’s GoAccess which provides an interface onto your server logs. You can view the data in a browser or on the command line. I gave this a go on adactio.com and it all worked just fine.

Matomo was previously called Piwik, and it’s the closest to Google Analytics. Chris Ruppel wrote about using it as a drop-in replacement. I gave it a go on adactio.com and it did indeed collect analytics very nicely …but then I deleted it, because it still felt creepy to have any kind of analytics script at all (neither Huffduffer or The Session have any analytics tracking either).

Fathom isn’t out yet, but it looks interesting:

It will track users on a website, the key actions they are taking, and give you a non-nerdy breakdown of their journey. It’ll do so with user-centric rights and privacy, and without selling, sharing or giving away the data you collect.

I don’t think any of these alternatives offer quite the same ease-of-use that you’d get from Google Analytics. But I also don’t think that should be your highest priority. There’s a fundamental difference between doing your own analytics (self-hosted), and outsourcing the job to Google who can then track your site’s visitors across domains.

I was hoping that GDPR would put the squeeze on third-party tracking, but it looks like Google have found a way out. By declaring themselves a data controller (but not a data processor), they pass can pass the buck to the data processors to obtain consent.

If you have Google Analytics on your site, that’s you, that is.

GDPR and Google Analytics

Enforcement of the European Union’s General Data Protection Regulation is coming very, very soon. Look busy. This regulation is not limited to companies based in the EU—it applies to any service anywhere in the world that can be used by citizens of the EU.

It’s less about data protection and more like a user’s bill of rights. That’s good. Cennydd has written a techie’s rough guide to GDPR.

The Open Data Institute’s Jeni Tennison wrote down her thoughts on how it could change data portability in particular. While she welcomes GDPR, she has some misgivings.

Blaine—who really needs to get a blog—shared his concerns in the form of the online equivalent of interpretive dance …a twitter thread (it’s called a thread because it inevitably gets all tangled, and it’s easy to break.)

It’s increasingly looking like GDPR is a massive scaled-up version of the idiotic and horrifically mis-managed “cookie law”.
— Blaine Cook (@blaine) January 28, 2018

The interesting thing about the so-called “cookie law” is that it makes no mention of cookies whatsoever. It doesn’t list any specific technology. Instead it states that any means of tracking or identifying users across websites requires disclosure. So if you’re setting a cookie just to manage state—so that users can log in, or keep items in a shopping basket—the legislation doesn’t apply. But as soon as your site allows a third-party to set a cookie, it’s banner time.

Google Analytics is a classic example of a third-party service that uses cookies to track people across domains. That’s pretty much why it exists. We, as site owners, get to use this incredibly powerful tool, and all we have to do in return is add one little snippet of JavaScript to our pages. In doing so, we’re allowing a third party to read or write a cookie from their domain.

Before Google Analytics, Google—the search engine business—was able to identify and track what users were searching for, and which search results they clicked on. But as soon as the user left google.com, the trail went cold. By creating an enormously useful analytics product that only required site owners to add a single line of JavaScript, Google—the online advertising business—gained the ability to keep track of users across most of the web, whether they were on a site owned by Google or not.

Under the old “cookie law”, using a third-party cookie-setting service like that meant you had to inform any of your users who were citizens of the EU. With GDPR, that changes. Now you have to get consent. A dismissible little overlay isn’t going to cut it any more. Implied consent isn’t enough.

Now this situation raises an interesting question. Who’s responsible for getting consent? Is it the site owner or the third party whose script is the conduit for the tracking?

In the first scenario, you’d need to wait for an explicit agreement from a visitor to your site before triggering the Google Analytics functionality. Suddenly it’s not as simple as adding a single line of JavaScript to your site.

In the second scenario, you don’t do anything differently than before—you just add that single line of JavaScript. But now that script would need to launch the interface for getting consent before doing any tracking. Google Analytics would go from being something invisible to something that directly impacts the user experience of your site.

I’m just using Google Analytics as an example here because it’s so widespread. This also applies to third-party sharing buttons—Twitter, Facebook, etc.—and of course, advertising.

In the case of advertising, it gets even thornier because quite often, the site owner has no idea which third party is about to do the tracking. Many, many sites use intermediary services (y’know, ‘cause bloated ad scripts aren’t slowing down sites enough so let’s throw some just-in-time bidding into the mix too). You could get consent for the intermediary service, but not for the final advert—neither you nor your site’s user would have any idea what they were consenting to.

Interesting times. One way or another, a massive amount of the web—every website using Google Analytics, embedded YouTube videos, Facebook comments, embedded tweets, or third-party advertisements—will be liable under GDPR.

It’s almost as if the ubiquitous surveillance of people’s every move on the web wasn’t a very good idea in the first place.

Heisenberg

I wrote about Google Analytics yesterday. As usual, I syndicated the post to Ev’s blog, and I got an interesting response over there. Kelly Burgett set me straight on some of the finer details of how goals work, and finished with this thought:

You mention “delivering a performant, accessible, responsive, scalable website isn’t enough” as if it should be, and I have to disagree. It’s not enough for a business to simply have a great website if you are unable to understand performance of channel marketing, track user demographics and behavior on-site, and optimize your site/brand based on that data. I’ve seen a lot of ugly sites who have done exceptionally well in terms of ROI, simply because they are getting the data they need from the site in order make better business decisions. If your site cannot do that (ie. through data collection, often third party scripts), then your beautifully-designed site can only take you so far.

That makes an excellent case for having analytics. But that’s not necessarily the same as having Google analytics, or even JavaScript-driven analytics at all.

By far the most useful information you get from analytics is around where people have come from, where did they go next, and what kind of device are they using. None of that information requires JavaScript. It’s all available from your server logs.

I don’t want to come across all old-man-yell-at-cloud here, but I’m trying to remember at what point self-hosted software for analysing your log traffic became not good enough.

Here’s the thing: logging on the server has no effect on the user experience. It’s basically free, in terms of performance. Logging via JavaScript, by its very nature, has some cost. Even if its negligible, that’s one more request, and that’s one more bit of processing for the CPU.

All of the data that you can only get via JavaScript (in-page actions, heat maps, etc.) are, in my experience, better handled by dedicated software. To me, that kind of more precise data feels different to analytics in the sense of funnels, conversions, goals and all that stuff.

So in order to get more fine-grained data to analyse, our analytics software has now doubled down on a technology—JavaScript—that has an impact on the end user, where previously the act of observation could be done at a distance.

There are also blind spots that come with JavaScript-based tracking. According to Google Analytics, 0% of your customers don’t have JavaScript. That’s not necessarily true, but there’s literally no way for Google Analytics—which relies on JavaScript—to even do its job in the absence of JavaScript. That can lead to a dangerous situation where you might be led to think that 100% of your potential customers are getting by, when actually a proportion might be struggling, but you’ll never find out about it.

Related: according to Google Analytics, 0% of your customers are using ad-blockers that block requests to Google’s servers. Again, that’s not necessarily a true fact.

So I completely agree than analytics are a good thing to have for your business. But it does not follow that Google Analytics is a good thing for your business. Other options are available.

I feel like the assumption that “analytics = Google Analytics” is like the slippery slope in reverse. If we’re all agreed that analytics are important, then aren’t we also all agreed that JavaScript-based tracking is important?

In a word, no.

This reminds me of the arguments made in favour of intrusive, bloated advertising scripts. All of the arguments focus on the need for advertising—to stay in business, to pay the writers—which are all great reasons for advertising, but have nothing to do with JavaScript, which is at the root of the problem. Everyone I know who uses an ad-blocker—including me—doesn’t use it to stop seeing adverts, but to stop the performance of the page being degraded (and to avoid being tracked across domains).

So let’s not confuse the means with the ends. If you need to have advertising, that doesn’t mean you need to have horribly bloated JavaScript-based advertising. If you need analytics, that doesn’t mean you need an analytics script on your front end.

On The Verge

Quite a few people have been linking to an article on The Verge with the inflammatory title The Mobile web sucks. In it, Nilay Patel heaps blame upon mobile browsers, Safari in particular:

But man, the web browsers on phones are terrible. They are an abomination of bad user experience, poor performance, and overall disdain for the open web that kicked off the modern tech revolution.

Les Orchard says what we’re all thinking in his detailed response The Verge’s web sucks:

Calling out browser makers for the performance of sites like his? That’s a bit much.

Nilay does acknowledge that the Verge could do better:

Now, I happen to work at a media company, and I happen to run a website that can be bloated and slow. Some of this is our fault: The Verge is ultra-complicated, we have huge images, and we serve ads from our own direct sales and a variety of programmatic networks.

But still, it sounds like the buck is being passed along. The performance issues are being treated as Somebody Else’s Problem …ad networks, trackers, etc.

The developers at Vox Media take a different, and in my opinion, more correct view. They’re declaring performance bankruptcy:

I mean, let’s cut to the chase here… our sites are friggin’ slow, okay!

But I worry about how they can possibly reconcile their desire for a faster website with a culture that accepts enormously bloated ads and trackers as the inevitable price of doing business on the web:

You realize that “bloat" pays the salaries of editorial, product, design, video, etc etc etc, right?
— nilay patel (@reckless) July 20, 2015

I’m hearing an awful lot of false dichotomies here: either you can have a performant website or you have a business model based on advertising. Here’s another false dichotomy:

To be clear: I’d pick a slow open web loaded with trackers and ads over a walled garden 100 percent of the time.
— nilay patel (@reckless) July 21, 2015

If the message coming down from above is that performance concerns and business concerns are fundamentally at odds, then I just don’t know how the developers are ever going to create a culture of performance (which is a real shame, because they sound like a great bunch). It’s a particularly bizarre false dichotomy to be foisting when you consider that all the evidence points to performance as being a key differentiator when it comes to making moolah.

It’s funny, but I take almost the opposite view that Nilay puts forth in his original article. Instead of thinking “Oh, why won’t these awful browsers improve to be better at delivering our websites?”, I tend to think “Oh, why won’t these awful websites improve to be better at taking advantage of our browsers?” After all, it doesn’t seem like that long ago that web browsers on mobile really were awful; incapable of rendering the “real” web, instead only able to deal with WAP.

As Maciej says in his magnificent presentation Web Design: The First 100 Years:

As soon as a system shows signs of performance, developers will add enough abstraction to make it borderline unusable. Software forever remains at the limits of what people will put up with. Developers and designers together create overweight systems in hopes that the hardware will catch up in time and cover their mistakes.

We complained for years that browsers couldn’t do layout and javascript consistently. As soon as that got fixed, we got busy writing libraries that reimplemented the browser within itself, only slower.

I fear that if Nilay got his wish and mobile browsers made a quantum leap in performance tomorrow, the result would be even more bloated JavaScript for even more ads and trackers on websites like The Verge.

If anything, browser makers might have to take more drastic steps to route around the damage of bloated websites with invasive tracking.

We’ve been here before. When JavaScript first landed in web browsers, it was quickly adopted for three primary use cases:

swapping out images when the user moused over a link,
doing really bad client-side form validation, and
spawning pop-up windows.

The first use case was so popular, it was moved from a procedural language (JavaScript) to a declarative language (CSS). The second use case is still with us today. The third use case was solved by browsers. They added a preference to block unwanted pop-ups.

Tracking and advertising scripts are today’s equivalent of pop-up windows. There are already plenty of tools out there to route around their damage: Ghostery, Adblock Plus, etc., along with tools like Instapaper, Readability, and Pocket.

That option is basically stealing. Don’t feel good about that.
— nilay patel (@reckless) July 21, 2015

I’m sure that business owners felt the same way about pop-up ads back in the late ’90s. Just the price of doing business. Shrug shoulders. Just the way things are. Nothing we can do to change that.

For such a young, supposedly-innovative industry, I’m often amazed at what people choose to treat as immovable, unchangeable, carved-in-stone issues. Bloated, invasive ad tracking isn’t a law of nature. It’s a choice. We can choose to change.

Every bloated advertising and tracking script on a website was added by a person. What if that person refused? I guess that person would be fired and another person would be told to add the script. What if that person refused? What if we had a web developer picket line that we collectively refused to cross?

That’s an unrealistic, drastic suggestion. But the way that the web is being destroyed by our collective culpability calls for drastic measures.

By the way, the pop-up ad was first created by Ethan Zuckerman. He has since apologised. What will you be apologising for in decades to come?

Tracking

Ajax was a really big deal six, seven, eight years ago. My second book was all about Ajax. I spoke about Ajax at conferences and gave workshops all about using Ajax and progressive enhancement.

During those workshops, I would often point out that Ajax had the potential to be abused terribly. Until the advent of Ajax, it was very clear to a user when data was being submitted to a server: you’d have to click a link or submit a form. As soon as you introduce asynchronous communication, it’s possible for the server to get information from the client even without a full-page refresh.

Imagine, for example, that you’re typing a message into a textarea. You might begin by typing, “Why, you stuck up, half-witted, scruffy-looking nerf…” before calming down and thinking better of it. Before Ajax, there was no way that what you had typed could ever reach the server. But now, it’s entirely possible to send data via Ajax with every key press.

It was just a thought experiment. I wasn’t actually that worried that anyone would ever do something quite so creepy.

Then I came across this article by Jennifer Golbeck in Slate all about Facebook tracking what’s entered—but then erased—within its status update form:

Unfortunately, the code that powers Facebook still knows what you typed—even if you decide not to publish it. It turns out that the things you explicitly choose not to share aren’t entirely private.

Initially I thought there must have been some mistake. I erronously called out Jen Golbeck when I found the PDF of a paper called The Post that Wasn’t: Exploring Self-Censorship on Facebook. The methodology behind the sample group used for that paper was much more old-fashioned than using Ajax:

First, participants took part in a weeklong diary study during which they used SMS messaging to report all instances of unshared content on Facebook (i.e., content intentionally self-censored). Participants also filled out nightly surveys to further describe unshared content and any shared content they decided to post on Facebook. Next, qualified participants took part in in-lab interviews.

But the Slate article was referencing a different paper that does indeed use Ajax to track instances of deleted text:

This research was conducted at Facebook by Facebook researchers. We collected self-censorship data from a random sample of approximately 5 million English-speaking Facebook users who lived in the U.S. or U.K. over the course of 17 days (July 6-22, 2012).

So what I initially thought was a case of alarmism—conflating something as simple as simple as a client-side character count with actual server-side monitoring—turned out to be a pretty accurate reading of the situation. I originally intended to write a scoffing post about Slate’s linkbaiting alarmism (and call it “The shocking truth behind the latest Facebook revelation”), but it turns out that my scoffing was misplaced.

That said, the article has been updated to reflect that the Ajax requests are only sending information about deleted characters—not the actual content. Still, as we learned very clearly from the NSA revelations, there’s not much practical difference between logging data and logging metadata.

The nerds among us may start firing up our developer tools to keep track of unexpected Ajax requests to the server. But what about everyone else?

This isn’t the first time that the power of JavaScript has been abused. Every browser now ships with an option to block pop-up windows. That’s because the ability to spawn new windows was so horribly misused. Maybe we’re going to see similar preference options to avoid firing Ajax requests on keypress.

It would be depressingly reductionist to conclude that any technology that can be abused will be abused. But as long as there are web developers out there who are willing to spawn pop-up windows or force persistent cookies or use Ajax to track deleted content, the depressingly reductionist conclusion looks like self-fulfilling prophecy.