Websites as graphs
Every day, we look at dozens of websites. The structure of these websites is defined in HTML, the lingua franca for publishing information on the web. Your browser's job is to render the HTML according to the specs (most of the time, at least). You can look at the code behind any website by selecting the "View source" option somewhere in your browser's menu.
HTML consists of so-called tags, like the A tag for links, IMG tag for images and so on. Since tags are nested in other tags, they are arranged in a hierarchical manner, and that hierarchy can be represented as a graph. I've written a little app that visualizes such a graph, and here are some screenshots of websites that I often look at.
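To make the "nested tags form a graph" idea concrete, here is a minimal sketch in Python. It is not the app's source (the app was written in Processing); the names `TagGraphBuilder` and `build_tag_graph` are made up for illustration, and the handling of unclosed tags is a simplifying assumption.

```python
from html.parser import HTMLParser

# Void elements never have a closing tag, so they must not stay "open".
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagGraphBuilder(HTMLParser):
    """Records every tag as a node and every parent/child nesting as an edge."""

    def __init__(self):
        super().__init__()
        self.nodes = []   # nodes[i] is the (lowercased) tag name of node i
        self.edges = []   # (parent_index, child_index) pairs
        self.stack = []   # indices of the tags that are currently open

    def handle_starttag(self, tag, attrs):
        index = len(self.nodes)
        self.nodes.append(tag)
        if self.stack:
            # The innermost open tag is the parent of the new node.
            self.edges.append((self.stack[-1], index))
        if tag not in VOID_TAGS:
            self.stack.append(index)

    def handle_endtag(self, tag):
        # Close the nearest open tag with this name; ignore stray end tags.
        for depth in range(len(self.stack) - 1, -1, -1):
            if self.nodes[self.stack[depth]] == tag:
                del self.stack[depth:]
                break


def build_tag_graph(markup):
    """Return (nodes, edges) for the tag hierarchy of an HTML string."""
    builder = TagGraphBuilder()
    builder.feed(markup)
    builder.close()
    return builder.nodes, builder.edges
```

The length of `nodes` is the tag count of the page, and `edges` is what a graph layout would draw as lines between dots.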
I've used some color to indicate the most used tags in the following way:
blue: for links (the A tag)
red: for tables (TABLE, TR and TD tags)
green: for the DIV tag
violet: for images (the IMG tag)
yellow: for forms (FORM, INPUT, TEXTAREA, SELECT and OPTION tags)
orange: for line breaks, paragraphs and blockquotes (BR, P, and BLOCKQUOTE tags)
black: the HTML tag, the root node
gray: all other tags
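The color scheme above boils down to a lookup table with a default. A small Python sketch follows; the names `TAG_COLORS` and `color_for_tag` are hypothetical and the colors are given as plain words rather than the actual values the app uses.

```python
# Tag name -> color, following the legend above.
TAG_COLORS = {
    "a": "blue",
    "table": "red", "tr": "red", "td": "red",
    "div": "green",
    "img": "violet",
    "form": "yellow", "input": "yellow", "textarea": "yellow",
    "select": "yellow", "option": "yellow",
    "br": "orange", "p": "orange", "blockquote": "orange",
    "html": "black",
}


def color_for_tag(tag):
    """Return the legend color for a tag name; anything unlisted is gray."""
    return TAG_COLORS.get(tag.lower(), "gray")
```

Adding a color for another group of tags is a matter of adding entries to the table.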
Here I post a couple of screenshots, and I plan to make the app available as an applet, so that anybody can look at their websites in a new way.
Update: Here it is: http://www.aharef.info/static/htmlgraph/
CNN has a complicated but typical tag structure of a portal: Lots of links, lots of images. Similar use of divs and tables for layout purposes. (1316 tags)
boingboing, my favorite blog, has a very simple tag structure: there seems to be one essential container that contains all other tags, essentially links (lots!), images, and tags to lay out the text. A typical content-driven website. (1056 tags)
As always, simplicity rules at Apple's website. A few images and links, that's it. Note the large yellow cluster, representing a dropdown menu. (350 tags)
Yahoo seems to be stuck in the old days of HTML style: most of the tags are tables, used for layout - no divs. Very uncommon these days. (952 tags)
The complete opposite of Yahoo - this site uses almost no tables at all, only divs (green). It's nice to see how the div tags are holding the other elements, like links and images, together. (454 tags)
Surprisingly, at least to me, Microsoft's portal is very much div-driven. Also of note is its very sparse use of images. (633 tags)
Today, Google is everywhere, but if somebody had asked me 5 years ago why I was using Google, and wanted a visual answer, here it is (88 tags):
I finish with two of my own projects:
What can I say? I like it ;-) No tables, lots of links, simple structure. A typical Movable Type site, I guess. (372 tags)
My personal art project. Although I programmed the site myself, I'm surprised by the simplicity of its tag structure. It shows that you can make beautiful websites with just a few tags ;-) (88 tags)
That's it. You can play around with the app, and take a fresh look at websites - here's the applet.
And don't forget to support yours truly by checking out onethousandpaintings.com
Comments
verrry cool
Posted by: p-daddy | 26.05.06 00:59
Great work!
Posted by: Douglas | 26.05.06 03:17
Also, I'm wondering if you've thought about making the code open source? Particularly the graphing part, I'd love to try making graphs of other tree structures, particularly programming code.
Posted by: Douglas | 26.05.06 03:19
Really amazing visualizations! It really gives you a great high-level overview of a website.
Posted by: Justin Palmer | 26.05.06 03:45
Amazing work. If not open source, you can still put this app on your website, so that people can at least buy it from you.
Posted by: vinod vv | 26.05.06 07:05
This is the kind of art that I'd be prone to sticking up on my walls. Or even bleeding straight into.
Posted by: Reno | 26.05.06 10:08
@Douglas - Yeah, I am happy to make the code open source - problem is, I'm still trying to get this into an applet. As soon as I'm ready, you can find it here.
I've done the programming in Processing (i.e. Java), and I was using the traer physics library. It's very easy to use for graphing stuff.
Posted by: Sala | 26.05.06 14:11
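For readers wondering how a physics library turns a tag tree into a picture: the usual approach is a force-directed layout, where every edge acts as a spring and every pair of nodes repels. The Python sketch below illustrates that general technique only. It does not use or reproduce the traer physics API, and the function name `layout` and all parameter values are assumptions chosen for the example.

```python
import math


def layout(node_count, edges, iterations=300, spring=0.1, rest=1.0,
           repulsion=0.1, step=0.5):
    """Return a list of [x, y] positions after a simple force-directed pass."""
    # Start on a unit circle so the result is deterministic.
    pos = [[math.cos(2 * math.pi * i / node_count),
            math.sin(2 * math.pi * i / node_count)]
           for i in range(node_count)]
    for _ in range(iterations):
        force = [[0.0, 0.0] for _ in range(node_count)]
        # Every pair of nodes pushes apart (inverse-square repulsion).
        for i in range(node_count):
            for j in range(i + 1, node_count):
                dx = pos[j][0] - pos[i][0]
                dy = pos[j][1] - pos[i][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = repulsion / (dist * dist)
                force[i][0] -= push * dx / dist
                force[i][1] -= push * dy / dist
                force[j][0] += push * dx / dist
                force[j][1] += push * dy / dist
        # Every edge is a spring that pulls its ends towards the rest length.
        for a, b in edges:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            dist = math.hypot(dx, dy) or 1e-9
            pull = spring * (dist - rest)
            force[a][0] += pull * dx / dist
            force[a][1] += pull * dy / dist
            force[b][0] -= pull * dx / dist
            force[b][1] -= pull * dy / dist
        # Move each node a small step along its net force.
        for i in range(node_count):
            pos[i][0] += step * force[i][0]
            pos[i][1] += step * force[i][1]
    return pos
```

Calling `layout(4, [(0, 1), (0, 2), (0, 3)])` places a root with three children. A real implementation would track velocities and damping, which is what produces the animated "unfolding" effect.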
Very good, nice!
GOOD!
Posted by: StephenZhai | 26.05.06 14:17
google does not like divs or what?
Posted by: elvir | 26.05.06 14:50
You probably wanna visualize mine as well? ;)
Nice work.
Posted by: Jens Meiert | 26.05.06 15:05
where is the black dot for the html tag on the google site?
Posted by: tim | 26.05.06 15:49
Tim, have a better look.
Posted by: DXL | 26.05.06 16:08
@tim & DXL: You were right, there was still an old screenshot with another color scheme. Changed now. Thanks!
Posted by: Sala | 26.05.06 16:12
Drop Java and use JavaScript and use SVG. Do that and you'll see your code everywhere on the web.
Posted by: Daniel Glazman | 26.05.06 16:20
This looks really good. Excellent work.
Posted by: jivanov | 26.05.06 16:43
How beautiful. Looking forward to the applet (maybe just release the code if you are comfortable with letting it run free?)
Posted by: Paul Watson | 26.05.06 16:44
Absolutely. I just want to finish that applet, it looks really beautiful when the network unfolds. I will post source code, no problem - I just don't like to put stuff online that doesn't work. But I'm almost there ;-)
Posted by: Sala | 26.05.06 16:53
When I read the title of the article, I thought it would be very dull. Graphs aren't my favorite subject, but this is very interesting and I look forward to the release of the applet!
Posted by: Lawsy | 26.05.06 18:10
Release the code! This is brilliant...open source?
Posted by: Anon | 26.05.06 18:29
This is really twinkle twinkle little star.Good.Can I draw the same ?How please let me know.
Posted by: acmathur | 26.05.06 19:14
Ditto on the JavaScript plus SVG idea. Java applets are a pretty bad idea: they're super slow to start up, lots of people don't have the plugin, and you're going to have to get around the restriction that applets can only access the site they are hosted on. And plus they're so 1990s. ;) Even Flash would be a better idea than Java.
Posted by: sjf | 26.05.06 19:32
I disagree. I'm not very into Flash, and SVG... I've been there (I wrote a book on the topic). One day SVG + JS will rock, but today is not that day.
Posted by: Sala | 26.05.06 19:41
Ok, applet is online: http://www.aharef.info/static/htmlgraph/
Posted by: Sala | 26.05.06 20:07
Brilliant! I'm also very interested in the source if you're up for posting it.
Posted by: Forrest | 26.05.06 20:24
It didn't work with my page :( http://www.aharef.info/static/htmlgraph/?url=http://deejayy.hu/
Posted by: deejayy | 26.05.06 20:34
interesting.
the address http://www.gasztrojob.hu/ does not exist, but this applet draws a very beautiful graph of it :D
Posted by: deejayy | 26.05.06 20:41
@deejayy
1) your site has no html tag. that's needed for the applet to work.
2) the address that does not exist probably causes your browser to display some error message - that's what you see
Cheers,
Sala
Posted by: Sala | 26.05.06 21:33
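A quick way to check the first point before trying a page is to see whether its markup opens with an html tag. The Python sketch below is one possible check; the name `has_html_root` is hypothetical, and what the applet actually tests for is an assumption based only on the reply above.

```python
from html.parser import HTMLParser


class _FirstTagFinder(HTMLParser):
    """Remembers the name of the first start tag in the document."""

    def __init__(self):
        super().__init__()
        self.first_tag = None

    def handle_starttag(self, tag, attrs):
        if self.first_tag is None:
            self.first_tag = tag


def has_html_root(markup):
    """True if the first tag (after any doctype or comments) is <html>."""
    finder = _FirstTagFinder()
    finder.feed(markup)
    finder.close()
    return finder.first_tag == "html"
```

A doctype declaration before the html tag does not count as a tag here, so ordinary pages pass the check.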
Sala, this could almost be a dev tool--a supplement to the Mozilla DOM Inspector. It'd be nice to add the ability to roll over a node to find out what it is (element name, classes, id).
Posted by: Justin Watt | 27.05.06 00:16
Yes, roll-over labels would be _very_ useful. For my needs, page title + url-beyond-basename is enough (e.g. just "opinion/" not "www.nytimes.com/opinion/"). But popping up a tiny page image thumbnail would be nice (fetched via home server so not bound by applet security?)
Posted by: bazzers | 27.05.06 00:36
Howdy. Great tool, really cool. I'm curious about something that may be a bug... When I try to model alistapart.com, after a short bit it spirals right out of the window!
Posted by: anon | 27.05.06 00:40
This is incredibly awesome. Congrats on building something this cool.
I do have one little suggestion, though -- what about a color for list tags (ul, dl, ol) and one for list items (li, dd, dt)? My personal website maps out as mostly gray due to my large use of lists, and I suspect a lot of others do, as well. :)
Posted by: Jeff Croft | 27.05.06 01:26
Hey, great applet! Some guy on IRC posted me the link and when I saw it was Processing based I got pretty excited.
It's a shame you've not made it 'open source' - I'd have gone right ahead and put rollover labels in, and an internal colour chart, and all that stuff.
Amazing results though. I've been running it on all the subsites and pages of my own website and the different patterns are amazing. It's fun interpreting the results visually - it makes you think about the structure of your site.
I was almost sort of proud when I ran http://mkv25.net/USy/ and got a sea of orange come up on the display.
Every page has its own personality, and the applet picks up on that brilliantly.
Posted by: Markavian | 27.05.06 01:33
Another suggestion: make the nodes 'draggable' so you can 'rearrange' the rotation of the image.
Posted by: Markavian | 27.05.06 01:36
yeah, it'd be great if you could package this and offer it as a download so that it could be used on intranets and such.
Posted by: spydrlink | 27.05.06 01:59
I tried this site I know of from a friend: www.sunshineoasistan.com and it starts to appear but then just completely disappears? What's up with that?
Posted by: spydrlink | 27.05.06 02:01
Websites as graphs for analysis. It's very interesting watching the graphs arrange themselves. It's also interesting comparing different site "structures" with your intriguing tool: few commonalities exist between sites.
Very, very cool!
And, Markavian, hit "refresh" and you will have rotation of the image.
Posted by: Sean Fraser | 27.05.06 02:05
Wonderful! I love visual representations of data and structures - I was even more wowed out by the way the applet animates the graph as it processes. Tried http://www.medialens.org/ and a yellow flower blossomed... would agree that a roll-over of the nodes with some tag info would be useful in making this a practical tool.
Posted by: flashparry | 27.05.06 02:27
Bookmarklet: copy the following into a new favourite/bookmark and place it on your browser links bar for a quick button to visualise the page you're currently browsing:
javascript:location.href='http://www.aharef.info/static/htmlgraph/?url='+location.href
And for those using yubnub.org:
vdom URL
Posted by: flashparry | 27.05.06 03:27
Pants. That should've been:
javascript:location.href='http://www.aharef.info/static/htmlgraph/?url='+location.href
Posted by: flashparry | 27.05.06 03:28
OK. so the comments have a line length limit :)
The javascript should end in:
+location.href
Posted by: flashparry | 27.05.06 03:30
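The bookmarklet appends the page address as it is, while other links in this thread pass it percent-encoded (for example `?url=http%3A%2F%2Fau.yahoo.com`). Whether the applet needs the encoding is not stated here, but a page address that carries its own `?` or `&` would otherwise be ambiguous. A small Python sketch of building the encoded form, with the hypothetical helper name `visualizer_link`:

```python
from urllib.parse import quote

# Address of the applet page, as given in the post.
VISUALIZER = "http://www.aharef.info/static/htmlgraph/"


def visualizer_link(page_url):
    """Return the applet link for page_url, with the address percent-encoded."""
    # safe="" also encodes ":" and "/", so the whole address is one parameter.
    return VISUALIZER + "?url=" + quote(page_url, safe="")
```

In a bookmarklet, the same effect would come from wrapping `location.href` in `encodeURIComponent(...)`.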
bazzers: had the same problem with my site, but as it turns out only on the front page. I think it's the Oxford English Dictionary search box there (which I'm going to remove soon anyway). It is that bit that was giving me red dots for tables even though I don't use tables myself, all div/css layout.
The Postcrossing page produces a large red, orange and purple flower :)
Posted by: webchimp | 27.05.06 05:19
Actually, a couple of other pages do it as well. Links, gigs and site map.
Curious, must have a look at the code for those pages to see if there's anything that could be making it go off like that.
Posted by: webchimp | 27.05.06 05:28
This is cool !
Posted by: mcpaige | 27.05.06 07:35
www.bbc.com seems to break the applet - possibly because it's a redirect? - but bbc.co.uk is fine
Posted by: papalaz | 27.05.06 09:25
librarything also seems to break it
Posted by: Anonymous | 27.05.06 09:44
try comparing sites that do similar things - e.g. allmovie and imdb
also wikipedia is sweet
Posted by: papalaz | 27.05.06 09:57
Hey guys - thanks so much for all your feedback. There is indeed some problem with some sites, and I have to look into that. I also think it's a great idea to extend the applet, with rollover and stuff.
I will put the source code online this afternoon (Central European Time). It's great to see all these ideas popping up, starting from such a simple idea - please keep posting!
Thanks
Sala
Posted by: Sala | 27.05.06 11:12
Great work!
Posted by: Cd0MaN | 27.05.06 12:11
Lovely stuff.. works well with my website without problem.
Posted by: Jee | 27.05.06 12:25
is there any way to export the picture so that it can be printed? that would be amazing. thanks.
Posted by: ben | 27.05.06 12:43
Source code is online. And fixed the bug that caused some networks to disappear.
@Ben: Yeah, you can use some processing libraries to export the picture.
Posted by: Sala | 27.05.06 13:19
I've made this the Geek Toy of the Week for tomorrow over at Deep Thought. It'll show up tomorrow morning at http://www.dtgeeks.com/index.php/features/geektoy/websites_as_graphs .
I agree with one of the comments to add more tags like list tags, with their own colors. But otherwise, this is a very nice toy!
Posted by: Arden | 27.05.06 15:33
The source is up, but what is it in? That's not standard Java -- it's not in a class. What does it run under?
Very nice idea -- I was going to try to reduce its overhead, and skip or speed up the animation, since for large sites it can take a *very* long time to finish.
Posted by: Ken Arnold | 27.05.06 20:23
It runs under processing www.processing.org - a special graphics library for Java.
Posted by: Markavian | 27.05.06 21:05
Thanks so much for posting your source! I made some quick hacks that allow you to drag nodes around, as well as colored list and list item tags:
http://www.forreststevens.com/htmlgraph/
Node selection is a little funky due to the centroid recentering...
Posted by: Forrest | 28.05.06 08:59
Forgot to mention that I also added tag names to the nodes. These show up when you hold the mouse button down... Please feel free to snag any of the changes you like (source code is linked to from the above page).
Posted by: Forrest | 28.05.06 19:07
Very cool! I also believe this would be more useful with more detailed information (eg. url, title/alt, anchor text, etc.)
Is there a way we could combine this with that Liquid Browsing stuff that appeared on digg two months ago?
http://www.infoverse.org/l2dsspace/
Posted by: Mike Smullin | 28.05.06 19:24
This is SWEET! Thank you!!
Posted by: Jordan Meeter | 28.05.06 20:13
Very, very cool stuff! Thank you for creating it.
I also would like to second the "tooltip for details" idea for the nodes. Showing the id of an element would make it easy to identify the different parts at least for your own site.
Posted by: Martin | 28.05.06 20:39
Very nice!
This one turned out to be the most insane I have seen so far:
http://flickr.com/photos/eirikso/154991809/
Does not look like anything else...
But then again, the page it maps is completely insane as well.
Posted by: eirikso | 28.05.06 22:13
Thanks - I love the concept of being able to map virtual entities of any kind.
I remember the following UK site from a few years ago. You may find the similarities of theme very interesting. But the result from your code is clearer.
http://www.cybergeography.org/atlas/web_sites.html
Posted by: Jen | 29.05.06 00:01
Great! It's a really new, fresh look at the webdesign structure! Is it hard to add some interactive action to your applet? Maybe just a mouseover with tag details?
Thanks a lot for this nice tool...
Posted by: Dimitri | 29.05.06 00:21
Simple and powerful technique.
Posted by: abi | 29.05.06 02:18
Not all of Yahoo! is so tables based - check out http://www.aharef.info/static/htmlgraph/?url=http%3A%2F%2Fau.yahoo.com (Yahoo!7, Yahoo! in Australia) - just the one table, and for tabular content too.
Posted by: ferret | 29.05.06 02:42
i never thought about this before...
nice info...
very educational...
Posted by: cassandra | 29.05.06 03:14
wow! amazing stuff!
I would like to see it improved by displaying more info on what each dot is.
Posted by: mark | 29.05.06 04:47
very cool!
a tooltip displaying innerHTML or outerHTML would be a very cool feature to add.
..or clicking on a tag to zoom in, rendering the subgraph.. ooww:-P
Posted by: remcoder | 29.05.06 10:31
Thanks for making this a public tool. It's so beautiful!
I've got one question. What is the circle of grey off the black html tag? Is that a sitemap or something else?
Posted by: mlangfeld | 29.05.06 11:43
wow, very cool to visualise the codes of a site! clear and beautiful... thanks for the wonderful tool!
salute!
Posted by: su | 29.05.06 12:30
no, thats the head tag, with all its meta, link, style and script tags.
Posted by: Sala | 29.05.06 12:30
An impressive piece of work. Good job!
Posted by: Hielscher - Ultrasound Technology | 29.05.06 13:38
Some additional info about nodes, like: mouse on node displays URL. The graph shows some problems in our pages, but where they actually are is hard to find out.
Posted by: seppo | 29.05.06 15:25
If only NASA were as simple as nasa.gov... Check out the bone.
Posted by: Justme | 29.05.06 17:15
@Sala
I tested it with 50 pages at least and it was awesome, I love the way it starts to grow on the screen, however it didn't work with my blog =(
So much for fun, but still a great tool to show people the "artistic" side of the webs.
Posted by: Narcoleptic_ll | 29.05.06 19:18
Sorry, I forgot to put the link to my blog. If anyone could tell me what's wrong, that would be great.
http://www.aharef.info/static/htmlgraph/?url=http%3A%2F%2Fintoxicatedwithmadness.blogspot.com
Posted by: Narcoleptic_ll | 29.05.06 20:06
Hey, I made some more updates to my modified version. You can click to drag and view node tags and pan and zoom. Thanks again for the great source code!
http://www.forreststevens.com/htmlgraph/
Posted by: Forrest | 29.05.06 23:05
Hello there - What a great idea.
I was thinking about the way in which all manner of things have different meanings depending on how we view them. I wondered if a site could be made where the site content complements the HTML Graph and vice-versa. My first attempt is a traditional poem, "I have a little nut tree", the graph for which is also a tree, complete with a golden pear!
http://www.robertsharp.co.uk/misc/tree.html
Posted by: Robert | 30.05.06 04:13
Really interesting project!
Posted by: vh | 30.05.06 05:17
Wow.. it is really hard to visualize hundreds of lines of HTML and the general structure of a website. Well, this sure helps! :)
Posted by: Alexander Hanhikoski | 30.05.06 07:42
Here's the new Yahoo! site as a graph. Thanks for the fresh take on structure visualization; this is very cool!
Posted by: Kent Brewster | 31.05.06 01:35
Here's the new Yahoo! site as a graph:
http://www.flickr.com/photos/kentbrew/156766991/
The app won't pull it down live; I had to grab source, post it elsewhere, and run it.
Posted by: Kent Brewster | 31.05.06 01:39
where do I test my URL ??
Posted by: Klueter | 31.05.06 08:40
Forrest: I've gotten your source to compile on my computer and it works great but I can't get your version to work online through your website. I just get a grey applet box.
Posted by: Andreas | 31.05.06 12:18
I use WordPress for a lot of different sites, and it's amazing how different each installation of it can be.
Great tool!
Posted by: thomas | 31.05.06 14:44
Andreas: I emailed you regarding the issue, I'm not sure what might cause the applet to fail, other than an error when pulling the HTML to parse.
Posted by: Forrest | 31.05.06 16:25
wow~ interesting!
so beautiful of boingboing.net's graphic~!
Posted by: rr | 31.05.06 18:38
Whoa, check out suicidegirls.com
Posted by: notsirk | 31.05.06 22:03
qqq
Posted by: ahpaol | 01.06.06 06:16
Great applet, can you make the dots clickable?
Posted by: Jan Kees | 01.06.06 09:27
wow, excellent!
Posted by: china2day | 01.06.06 16:07
Very nice, but this is a graph of one web PAGE, not a web SITE (collection of pages).
Posted by: Chris Grant | 01.06.06 20:34
wow, excellent! work.
Posted by: narendra jamakhandi | 02.06.06 08:39
Very Nice Indeed
Posted by: Keith | 02.06.06 12:11
I think you'd get real value if you differentiated between internal and external links (light and dark blue?).
Then you could tell if a website had lots of internal navigation (like Amazon.com/eBay.com) or they were a portal, directory type (boingboing.com).
What'ya think?
Posted by: supagroova | 02.06.06 14:41
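Telling internal from external links comes down to comparing the host of each link target with the host of the page. A minimal Python sketch of that idea follows; the function name `classify_link` is hypothetical, and treating "same hostname" as "internal" is a simplifying assumption (it counts subdomains as external).

```python
from urllib.parse import urljoin, urlparse


def classify_link(page_url, href):
    """Return "internal", "external", or "other" for a link found on page_url."""
    # Resolve relative links such as "/about" against the page address.
    target = urlparse(urljoin(page_url, href))
    if target.scheme not in ("http", "https"):
        return "other"  # mailto:, javascript:, and similar
    same_host = target.hostname == urlparse(page_url).hostname
    return "internal" if same_host else "external"
```

The result could then pick between two shades of blue for the A tag nodes, as suggested above.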
good
Posted by: kamal | 05.06.06 11:23
With regards to the comment from Keith. I think the differential between internal and external links is a great idea and will be changing our sites in the future.
www.gr8futures.co.uk
www.gr8websites.co.uk
www.gr8secrets.co.uk
Posted by: Stephen Cook | 05.06.06 14:02
That's pretty cool
Posted by: Monty Loree-AdMan | 07.06.06 01:34
awesome tool. the first thing i did was analyse my company website ... found a few things studying the graph and posted them in the forum.
awesome tool ... I love it ...thanx again for such a great stuff :)
Posted by: suman | 07.06.06 07:16
Would not work on my site. I have valid transitional xhtml on my site, only one table. The applet was blank. Worked on several other sites I tried. Very cool tool.
Posted by: Shane | 09.06.06 04:12
this is very cool, but i like the "worse" designed websites the best! because they make the prettiest pictures. but now i understand why i find some websites so much easier to navigate.
Posted by: jessica Smith | 11.06.06 06:46
Thank you! Very cool tool.
Posted by: Adrian | 13.06.06 19:47
I extended this nice idea by drawing graphs of a website's structure and the validation status of its html content.
It's a small free (GPL) standalone java application which spiders a website and paints its pages within such a nice particle graph.
See here for more:
http://blog.peter1402.de/archives/2006/06/13/Validation-Graphs
Posted by: Peter | 14.06.06 17:41
Sweet tool. How easy/hard would it be to create the same graph but instead of mapping the elements on a page, how about mapping the pages in a site? With page names on hover and/or a link?
This would be the sweetest sitemap tool ever. How easy/hard would it be to make it run on xml?
I'd definitely pay for that!
Marc
Posted by: Marc McHale | 19.06.06 12:50
Marc, look here:
http://blog.peter1402.de/archives/2006/06/13/Validation-Graphs
Posted by: Peter | 19.06.06 17:12