How We Built The World Wide Web In Five Days

This talk about recreating the first ever web browser was a joint presentation with Remy Sharp, delivered at the Fronteers conference in Amsterdam in October 2019.

Jeremy

Our story begins with the Big Bang.

13.8 billion years ago

This sets a chain of events in motion that gives us elementary particles, then more complex particles like atoms, which form stars and planets, including our own, on which life evolves, which brings us to the recent past when this whole process results in the universe generating a way of looking at itself: physicists.

A physicist is the atom’s way of knowing about atoms.

—George Wald

By the end of World War Two, physicists in Europe were in short supply. If they hadn’t already fled during Hitler’s rise to power, they were now being actively wooed away to the United States.

64 years ago

To counteract this brain drain, a coalition of countries forms the European Organization for Nuclear Research, or to use its French acronym, CERN.

They get some land in a suburb of Geneva on the border between Switzerland and France, where they set about smashing particles together and recreating the conditions that existed at the birth of the universe.

The Synchrocyclotron.

Every year, CERN is host to thousands of scientists who come to run their experiments.

Remy

Fast forward to February 2019, when nine of us were invited to CERN as an elite group of hackers to recreate a different kind of experiment.

The group.

We are there to recreate a piece of software first published 30 years ago. Given this goal, we need to answer some important questions first:

  • How does this software look and feel?
  • How does it work?
  • How do you interact with it?
  • How does it behave?

The software is so old that it doesn’t run on any modern machines, so we have a NeXT machine specially shipped from the nearby museum. This is no ordinary machine. It was one of only two NeXT machines that existed at CERN in the late 80s.

Now we have the machine to run this special software.

By some fluke the good people of the web have captured several different versions of this software and published them on GitHub.

So we select the oldest version we can find and download it from GitHub to our computers. Now we have to transfer it to the NeXT machine.

Except there's no USB drive. USB didn't exist yet. CD-ROM? Floppy drive? The NeXT computer had a "floptical drive"—bespoke to NeXT computers—which is all very well, but in 2019 we don't have those drives.

To transfer the software from our machines to the NeXT machine, we need to use the network.

Jeremy

62 years ago

In 1957, J.C.R. Licklider was the first person to publicly demonstrate the idea of time sharing: linking one computer to another.

56 years ago

Six years later, he expanded on the idea in a memo that described an Intergalactic Computer Network.

By this time, he was working at ARPA: the Department of Defense's Advanced Research Projects Agency. They were very interested in the idea of linking computers together, for very practical reasons.

America’s military communications had a top-down command-and-control structure. That was a single point of failure. One pre-emptive strike and it’s game over.

The solution was to create a decentralised network of computers that used Paul Baran’s brilliant idea of packet switching to move information around the network without any central authority.

This idea led to the creation of the ARPANET. Initially it connected a few universities. The ARPANET grew until it wasn’t just computers at each endpoint; it was entire networks. It was turning into a network of networks …an internetwork, or internet, for short. In order for these networks to play nicely with one another, they needed to agree on using the same set of protocols for packet switching.

Bob Kahn and Vint Cerf crafted the simplest possible set of low-level protocols: the Transmission Control Protocol and the Internet Protocol. TCP/IP.

TCP/IP is deliberately dumb. It doesn’t care about the contents of the packets of data being passed around the internet. People were then free to create more task-specific protocols to sit on top of TCP/IP.

There are protocols specifically for email, for example. Gopher is another example of a bespoke protocol. And there’s the File Transfer Protocol, or FTP.

Remy

Back in our war room in 2019, we finally work out that we can use FTP to get the software across. FTP is an arcane protocol, but we can agree that it will work across the two eras.

We do have to manually install FTP servers onto our machines, though. FTP doesn't ship with new machines because it's generally considered insecure.

Now we finally have the software installed on the NeXT computer and we’re able to run the application.

We double-click the shaded-looking, partly hand-drawn icon with a lightning bolt on it, and we wait…

Once the software's finally running, we're able to see that it looks a bit like an ancient word processor. We can read, edit and open documents. There are some basic styles and lots of heavy margins. There's some super weird menu navigation in place.

But there’s something different about this software. Something that makes this more than just a word processor.

These documents, they have links…

Jeremy

Ted Nelson is fond of coining neologisms. You can thank him for words like “intertwingled” and “teledildonics”.

56 years ago

He also coined the word “hypertext” in 1963. It is defined by what it is not.

Hypertext is text which is not constrained to be linear.

Ever played a “choose your own adventure” book? That’s hypertext. You can jump from one point in the book to a different point that has its own unique identifier.

The idea of hypertext predates the word. In 1945, Vannevar Bush published a visionary article in The Atlantic Monthly called As We May Think.

He imagines a mechanical device built into a desk that can summon reams of information stored on microfilm, allowing the user to create “associative trails” as they make connections between different concepts. He calls it the Memex.

Memex

Also in 1945, a young American named Douglas Engelbart has been drafted into the navy and is shipping out to the Pacific to fight against Japan. Literally as the ship is leaving the harbour, word comes through that the war is over. He still gets shipped out to the Philippines, but now he’s spending his time lounging in a hut reading magazines. That’s how he comes to read Vannevar Bush’s Memex article, which lodges in his brain.

51 years ago

Douglas Engelbart decides to dedicate his life to building the computer equivalent of the Memex.

On December 9th, 1968, he unveils his oNLine System—NLS—in a public demonstration. Not only does he have a working implementation of hypertext, he also shows collaborative real-time editing, windows, graphics, and oh yeah—for this demo, he invents the mouse.

It truly is The Mother of All Demos.

Douglas Engelbart has a posse.

39 years ago

There were a number of other attempts at creating hypertext systems. In 1980, a young computer scientist named Tim Berners-Lee found himself working at CERN, where scientists were having a heck of a time just keeping track of information.

He created a system somewhat like Apple's HyperCard, but with clickable links. He named it ENQUIRE, after a Victorian book of manners called Enquire Within Upon Everything.

ENQUIRE didn’t work out, but Tim Berners-Lee didn’t give up on the problem of managing information at CERN. He thinks about all the work done before: Vannevar Bush’s Memex; Ted Nelson’s Xanadu project; Douglas Engelbart’s oNLine System.

A lot of hypertext ideas really are similar to a choose-your-own-adventure: jumping around from point to point within a book. But what if, instead of imagining a hypertext book, we could have a hypertext library? Then you could jump from one point in a book to a different point in a different book in a completely different part of the library.

The Library Of Babel

In other words, what if you took the world of hypertext and the world of networks, and you smashed them together?

30 years ago

On the 12th of March, 1989, Tim Berners-Lee circulates the first draft of a document titled Information Management: A Proposal.

The diagrams are incomprehensible. But his supervisor at CERN, Mike Sendall, sees the potential. He reads the proposal and scrawls these words across the top: “vague, but exciting.”

Tim Berners-Lee gets the go-ahead to spend some time on this project. And he gets the budget for a nice shiny NeXT machine. With the support of his colleague Robert Cailliau, Berners-Lee sets about making his theoretical project a reality. They kick around a few ideas for the name.

They thought of calling it The Mesh. They thought of calling it The Information Mine, but Tim rejected that, knowing that whatever they called it, the words would be abbreviated to letters, and The Information Mine would’ve seemed quite egotistical.

So, even though it’s only going to exist on one single computer to begin with, and even though the letters of the abbreviation take longer to say than the words being abbreviated, they call it …the World Wide Web.

As Robert Cailliau told us, they were thinking “Well, we can always change it later.”

Tim Berners-Lee brainstorms a new protocol for hypertext called the HyperText Transfer Protocol—HTTP.

He thinks about a format for hypertext called the Hypertext Markup Language—HTML.

He comes up with an addressing scheme that uses Universal Document Identifiers—UDIs, later renamed to URIs, and later renamed again to URLs.

But he needs to put it all together into running code. And so Tim Berners-Lee sets about writing a piece of software…

Remy

At this stage, 30 years ago, Tim Berners-Lee's document is just a proposal. It's just theory. So he needs to build a prototype to actually demonstrate how the World Wide Web would work.

The NeXT computer is the perfect ground for rapid software development because the NeXT operating system ships with a program called NSBuilder.

NeXT

NSBuilder is software to build software. In fact, the “NS” (meaning NeXTSTEP) can be found in existing software today - you’ll find references to NSText in Safari and Mac developer documentation.

Tim Berners-Lee, using NSBuilder, was able to create a working prototype of this software in just six weeks.

He called it: WorldWideWeb.

We finally have the software working the way it ran 30 years ago.

But our project is to replicate this browser so that you can try it out, and see how web pages look through the lens of 1990.

So we enter some of our blog URLs: https://remysharp.com, https://adactio.com

But HTTPS doesn't work. There was no HTTPS. There was no HTTP/2. HTTP/1.0 hadn't even been invented.

So I make a proxy: effectively a monster-in-the-middle attack on all web requests, stripping the SSL layer and then returning the HTML over the HTTP/0.9 protocol.
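
To make that concrete, here's a minimal sketch of that kind of proxy in Node.js. This isn't the actual code from the project, and the target host and port are just placeholders, but it shows the shape of the trick: read the bare HTTP/0.9 request from the old browser, fetch the page over modern HTTPS, and write back nothing but the HTML.

const net = require('net');
const https = require('https');

// Listen for raw connections from the old browser.
net.createServer((socket) => {
  socket.once('data', (data) => {
    // An HTTP/0.9 request is a single line: "GET /some/path"
    const path = data.toString().trim().split(/\s+/)[1] || '/';
    const target = 'https://adactio.com' + path; // illustrative target host

    // Make the modern HTTPS request on the old browser's behalf…
    https.get(target, (upstream) => {
      let body = '';
      upstream.on('data', (chunk) => (body += chunk));
      upstream.on('end', () => {
        // …and reply the HTTP/0.9 way: just the document, no status line,
        // no headers, then close the connection.
        socket.end(body);
      });
    }).on('error', () => socket.end());
  });
}).listen(8080); // illustrative port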

And finally, we see…

We see junk.

We can see the text content of the website, but there’s a lot of HTML junk tags being spat out onto the screen, particularly at the start of the document.

Jeremy

<h1> <h2> <h3> <h4> <h5> <h6> 
<ol> <ul> <li> <p>

These tags are probably very familiar to you. You recognise this language, right?

That’s right. It’s SGML.

SGML is the successor to GML, which supposedly stands for Generalised Markup Language. But that may well be a backronym. The format was created by Goldfarb, Mosher, and Lorie: G, M, L.

SGML is supposed to be short for Standard Generalised Markup Language.

A flavour of SGML was already being used at CERN when Tim Berners-Lee was working on his World Wide Web project. Rather than create a whole new format from scratch, he repurposed what people were already familiar with. This was his HyperText Markup Language, HTML.

One thing he did add was a tag called A for anchor.

Its href attribute is short for “hypertext reference”. Plop a URL in there and you’ve got a link.

The hypertext community thought this was a terrible way to make links.

They believed that two-way linking was vital. With two-way linking, the linked resource connects back to where the link originates. So if the linked resource moved, the link would stay intact.

That’s not the case with the World Wide Web. If the linked resource moves, the link is broken.

Perhaps you’ve experienced broken links?

When Tim Berners-Lee wrote the code for his WorldWideWeb browser, there was a grand total of 26 tags in HTML. I know that we’d refer to them as elements today, but that term wasn’t being used back then.

Now there are well over 100 elements in HTML. The reason why the language has been able to expand so much is down to the way web browsers today treat unknown elements: ignore any opening and closing tags you don’t recognise and only render the text in between them.
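
You can see that rule in action with a couple of lines of JavaScript in any modern browser's console. The made-up foo tag here is just for illustration:

// Parse some markup containing a tag no browser recognises.
const doc = new DOMParser().parseFromString('<p>Hello <foo>world</foo></p>', 'text/html');

// The text in between the unknown tags still renders…
console.log(doc.body.textContent); // "Hello world"

// …and the tag itself is kept in the tree, but with no special behaviour.
console.log(doc.querySelector('foo') instanceof HTMLUnknownElement); // true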

Remy

The parsing algorithm was brittle (when compared to modern parsers). There’s no DOM tree being built up. Indeed, the DOM didn’t exist.

Remember that the WorldWideWeb was a browser that effectively smooshed together a word processor and network requests; the styling method was based (mostly) around adding margins as the tags were parsed.

Kimberly Blessing was digging through the original 7344 lines of code for the WorldWideWeb source. She found the code that could explain why we were seeing junk.

<link rel="..."

In this case, when the parser encountered <link rel="…" it would see the <.

<

“Yes, a tag; let’s slurp it up”.

<li

Then it reads li and the parser is thinking, “This looks like a list item, good stuff.”

<lin

Then it encounters the n (of link) and, excusing the parsing algorithm because it was the first, it aborts the style it was about to apply and promptly spits out the rest of the content on screen, having already swallowed up the first four characters: <lin.

k rel="stylesheet" href="...">

With that, we made the executive design decision to strip out any elements that were unknown to the original WorldWideWeb browser — link, script, video and img — because of course there was no image support in the world's first browser.

This is the first little cheat we applied, so that the page would be more pleasing to you, the visitor of our emulator. Otherwise you’d be presented with a lot of scary looking junk.
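
A rough sketch of that kind of pre-processing step looks something like this. It isn't the emulator's actual code, and the regular expressions are only good enough for a demonstration:

// Strip out elements the 1990 parser can't handle before handing over the HTML.
function stripUnknown(html) {
  return html
    // remove elements with content, like <script>…</script> and <video>…</video>
    .replace(/<(script|video)[\s\S]*?<\/\1\s*>/gi, '')
    // remove void tags like <link …> and <img …>
    .replace(/<(link|img)\b[^>]*>/gi, '');
}

console.log(stripUnknown('<link rel="stylesheet" href="site.css"><p>Hello</p>'));
// "<p>Hello</p>"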

So now we have all the reference material we need to replicate this browser:

  • The machine running the original operating system, which gives us colours, fonts, menus and so on.
  • The browser itself, how windows behave, what’s in the menus, what makes the experience unique to that period of time.
  • And finally how it looks when we visit URLs.

So off we go.

🤯

Jeremy

While Remy set about recreating the functionality of the WorldWideWeb browser, Angela was recreating the user interface using CSS.

Inputs. Buttons. Icons. Menus. All with the exact borders, highlights and shadows used in the UI of the NeXT operating system, including having the scrollbar on the left side of windows.

Meanwhile the rest of us were putting together an explanatory website to give some backstory to what we were doing. I spent most of my time working on a timeline showing thirty years before and thirty years after the original proposal for the web.

Marking up (and styling) an interactive timeline that looks good in a modern browser and still works in the first ever web browser.

The WorldWideWeb browser inherited fonts from the NeXTSTEP operating system. It mostly used Helvetica and a font called Ohlfs (created by Keith Ohlfs). Helvetica is ubiquitous but Ohlfs was never seen outside of a NeXT machine.

Our teammates Mark and Brian were obsessed with accurately recreating the typography. We couldn't use modern fonts, which are vector-based. We needed pixeliness.

So Mark and Brian took a screenshot of the NeXT machine’s alphabet. With help from afar from font designer David Jonathan Ross, they traced each square pixel in a vector program and then exported that as a web font. Now we’ve got a web font that deliberately isn’t anti-aliased. It’s a vector format that recreates the look of a bitmap.

Put the pixelly font together with the CSS interface elements and you’ve got something that really looks like the old WorldWideWeb programme.

Remy

This is the final product of our work at CERN that week: a fully working WorldWideWeb emulator giving a reasonably close experience of what it was like to surf the web as if it were 30 years ago.

This is entirely in the browser and was written using:

  • React,
  • React Draggable for the windows and menus,
  • React Hotkeys for keyboard combo shortcuts (we replicated the original OS as much as we could),
  • idb-keyval for some local storage,
  • Parcel for bundling.

These tools weren't chosen because they were necessarily the best tools for the job, but because they were the tools I knew well enough to speed up my development process.
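
Just to give a flavour of how those pieces fit together (this isn't the emulator's real source, and the component name and storage key are made up for illustration), a draggable window whose position survives a page reload might look something like this:

import React, { useEffect, useState } from 'react';
import Draggable from 'react-draggable';
import { get, set } from 'idb-keyval';

// A toy "window" component: draggable, with its last position remembered
// between visits using idb-keyval.
function BrowserWindow({ children }) {
  const [initial, setInitial] = useState(null);

  // Look up the last saved position before rendering the window.
  useEffect(() => {
    get('window-position').then((saved) => setInitial(saved || { x: 0, y: 0 }));
  }, []);

  if (!initial) return null; // wait until we know where to put it

  return (
    <Draggable
      defaultPosition={initial}
      onStop={(event, data) => set('window-position', { x: data.x, y: data.y })}
    >
      <div className="window">{children}</div>
    </Draggable>
  );
}

export default BrowserWindow;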

We worked hard to replicate the look and feel as much as we could. We even replicated typos found throughout the WorldWideWeb app:

An excercise in global information availability

Why don’t we see how it looks…

Jeremy

There's a kind of irony in this, in that it relies heavily on JavaScript. In fact, there's nothing there other than JavaScript. But of course the WorldWideWeb browser couldn't deal with JavaScript—JavaScript hadn't been invented yet. So the one URL that definitely wouldn't work in this emulator is …the emulator itself.

Remy

(Which Jeremy was blaming me for.)

This is what you see when you visit the WorldWideWeb browser for the first time. We can see we are welcomed by the universe of hypertext. We’ve got these menus over here that you can drag off and open panels (I always thought this was an ordering bug but the operating system actually works like this).

We’ll go ahead and open the Fronteers website. I go to “Document” and then I go to “Open from full document reference” (because the word URL didn’t exist). I’m going to pop the Fronteers URL in here. And there it is. We’ve got the Fronteers website. Looks pretty good. (One of my favourite UI bits is this scrollbar on the left hand side instead of the right.)

We can follow the links. Actually one of my favourite features that was in this original browser that we replicated was this “Navigate” menu. I’ve just opened the first link in the document, but I can click on “Next”, and “Next” a bunch of times and it will cycle through each one of the links on the page that I launched from and let me read all the pages that the Fronteers site links to (which I really like). I can go backwards and forwards, and so on.

One thing you might have already noticed is that there are no URLs here. In fact, viewing source was considered a kind of diagnostic option, and it was very, very tucked away. The reason for this is that URLs—and the source HTML or SGML—were considered ugly and potentially a bad user experience.

But there’s one thing about navigating here that’s different. To open this link, I had to double-click.

Jeremy

The WorldWideWeb browser was more of a prototype than anything else. It demonstrated the potential of the World Wide Web project, but it only worked on NeXT machines.

To show how the World Wide Web could work on any computer, the second ever web browser was the Line Mode Browser, coded by Nicola Pellow. It had a very basic text interface—no clicking on links—but it could be installed anywhere.

Lots of other geeks and nerds were working on their own web browsers, but it was Marc Andreessen's Mosaic browser that really blew the doors open for the web. It had a nice usable interface, and it (unilaterally) introduced the innovation of images on the web.

Andreessen went on to found Netscape. The World Wide Web took off at an unprecedented rate. Microsoft brought out their Internet Explorer browser and started trying to catch up with Netscape. We had the browser wars. Later we got even more browsers, like Safari and Chrome, while Netscape morphed into Firefox and Internet Explorer morphed into Edge. And the rest is history.

But all of these browsers were missing something that was in the original WorldWideWeb browser.

Remy

The reason I have to double-click on these links is that, when I do a single click, it actually places the cursor. The cursor is blinking there on “Fronteers.” And the reason I can place the cursor is because I can edit the document.

I see Fronteers here is missing a heading. We want to welcome you all:

Welkom

We want to make that a heading. Let’s style that. It’s a heading.

So the browser was meant to edit documents. Let’s put a bit of text here:

Great talks from Remy and Jeremy

(forget about everyone else). Now if I want to create a link, I'll go ahead and navigate to Jeremy's site, https://adactio.com. I'm going to do "Link", then "Mark all", which is a way of copying the URL to that window. Then I go back to the Fronteers website, select "Jeremy", and then do "Link to marked." Now I can double-click on Jeremy's name and it will open up his website.

I can save this document as well. I’m going to call it fronteers.html.

Let’s do a hard reboot—a browser refresh. I come back to my machine a couple of days later, “Ah, the Fronteers page!”. I’m going to open that again, and it linked to that really handsome guy in the sprite shirt. And yes, the links still work.

In fact, this documentation that you see when the WorldWideWeb browser launches was written, styled, and linked using the WorldWideWeb browser. The WorldWideWeb browser was for a web that you could read and write.

But this didn't survive. It was a hurdle that was too tricky to propose or implement across the different types of servers that existed and for the upcoming browsers that were on the horizon.

And so it wasn’t standardised and doesn’t exist today.

But this is an important lesson from the time: reducing complexity increases the chances of mass adoption.

In the end, simplicity wins.

Jeremy

I think that’s a pattern we see over and over again, not just in the history of the web, but before the web. Simplicity wins.

Ted Nelson famously thinks, to this day, that the World Wide Web is weak sauce. It didn't try to solve complex problems right out of the gate, like handling micropayments.

As we saw, the hypertext community thought that one-way linking was ridiculous. But simplicity does win out.

Unfortunately that’s why browsers ended up just being browsers. We got some of the functionality back with wikis, content management systems, and social media to a certain extent. But I think it’s still a bit of a shame that when I want to browse a web page, I’m using one piece of software—the browser—but when I want to make a web page, I’m using another piece of software (or multiple pieces of software) to get something on to the web.

I feel like we lost something.

Remy

We head home after a week of hacking.

We were all invited back in March earlier this year for the Web@30 event, which was taking place to celebrate not just the web but also Sir Tim Berners-Lee.

A NeXT machine from 1989 running the WorldWideWeb browser and my laptop in 2019 running https://worldwideweb.cern.ch

A few of us, Jeremy, Martin, and myself, went back to CERN for the first leg of the event. There was even a video showing off our work as part of the main conference. Jeremy and I even chased Tim Berners-Lee back to London, to the Science Museum, like obsessive web fanboys. It was a lot of fun!

The night before, I got a message from Jean-François Groff, pictured here on the right. JF Groff joined Tim Berners-Lee 30 years ago and created libwww (a precursor to libcurl).

The message read:

Sitting with Tim right now. He loves your browser!

Crushed it.

It’s amazing that we were able to pull this off in a week just with text editors and information that’s freely available. It’s mind boggling how much we can do today and how far it can reach. And it all started on that NeXTSTEP machine 30 years ago.

What I really loved about this project was working with this brilliantly old technology, digging around at the birth of browsers and the web.

I wouldn’t be stood here today, if it weren’t for the web.

I wouldn’t even know Jeremy, if it weren’t for the web.

I wouldn’t have a career, if it weren’t for the web.

I loved seeing how such old technology, the original WorldWideWeb browser, was still able to render my blog. Because I put content first and delivered markup from the server, the page rendered. HTML really is backward compatible.

HTML and HTTP are just text. Nothing terribly fancy. Dare I say, beautifully simple, and as we said before, simplicity wins the day.

This same simplicity is what allows us all to have the chance for an equal voice. The web allows us to freely publish our thoughts and experiences. We have to fight to protect that kind of web.

And we’ve got to work at keeping it simple.

Jeremy

When we returned to CERN for the 30th anniversary celebrations, one of the other people there was the journalist Zeynep Tüfekçi.

When @Zeynep met NeXT.

Lou, Zeynep, Tim, Robert, and Jean-François. #Web30

She was on a panel along with Tim Berners-Lee, Robert Cailliau, Jean-François Groff, and Lou Montulli. At the end of the panel discussion, she was asked:

What would you tell the next generation about how to use this wonderful tool?

She replied:

If you have something wonderful, if you do not defend it, you will lose it.

If you do not defend the magic and the things that make it wonderful, it’s just not going to stay magical by itself.

Defend the simplicity and resilience that’s so central to the web.

I don’t know about you, but I often feel that just trying to make a web page has become far too complicated. But this is complexity that we have chosen with our tools, processes, and assumptions. We’ve buried the magic. The magic of linking web pages together. The magic of a working global hypertext system, where nobody needs to ask for permission to publish.

Tim Berners-Lee prototyped the first web browser, but the subsequent world wide web wasn’t created by any one person. It was created by everyone. That. Is. Magical.

I don’t want the web to become a place where only an elite priesthood get to experience the magic of creation. I’m going to fight to defend the openness of the world wide web. This is for everyone. Not just for everyone to use; it’s for everyone to create.
