“Christian Nold” versus “Anne Helmond” at PICNIC09

At PICNIC09 Christian Nold challenged me in the Mediamatic RFID installation Ik win: challenge someone with your web popularity and show other people how high you rank, based on the number of Google results. For more info see the Mediamatic Ik win website.

PICNIC09

Winner: Christian Nold! Congratulations.

Thanks to Esther for taking the pictures! More PICNIC09 pictures on Flickr.

Archive 2020: Esther Weltevrede – Archiving Web Dynamics

Archive 2020
Internet researchers are confronted with an unstable object of study: the web is ephemeral. The question is how to make the medium permanent so that we can study it with care. The shape of the archive informs what one can ask of the archive.

This perspective on archives is placed within Weltevrede’s research into National Webs. To think nationally with the web might seem counterintuitive at first because dominant ideas of the web are so global. This originates from the 90s idea of cyberspace as a universal space, with its ideas of disembodiment and identity play. Crucially, cyberspace is a place that is disembedded from reality. After 2000 cyberspace was confronted with what Weltevrede calls “the national turn.”

This may be seen in a number of places. Probably the most familiar is that Google.com redirects you to the local version for your location, for example Google.nl, where you get a totally different results page. Another example is the message “This video is not available in your country”: intellectual property is really dominant in the nationalization of web content. You might also think in terms of language. English used to be the dominant universal language, but there is now a lot of clustering on the web based on a shared language.

To move to the web archive, the most exhaustive project in the field is the Internet Archive, which originates from the cyberspace period (1996). This can also be seen in how the archive was set up. First of all, the scope of the collection is the “whole” internet, which is a very broad collection aim. Secondly, when you look at the interface of the archive, the Wayback Machine, what you immediately notice is that you query it by URL and browse from that point on. It is characterized by browsing instead of the currently dominant form: searching. The Internet Archive therefore privileges single-site histories over research into a site's context.

The Internet Archive emerged from the web company Alexa; Alexa provides all the crawls and donates them to the archive. This means that the selection of sites is based on traffic data. If you have the Alexa toolbar installed, every page you visit will be included in the archive. It is a very smart way to start thinking about which pages should be included in the archive. After the Internet Archive started in 1996, a number of initiatives emerged with a national focus. The general thought behind them was that national web archives can best serve local wishes and demands, and so best serve the community (researchers, the general public).

As an example we will look at a Dutch web archive maintained by the Royal Library of the Netherlands, the KB. Before we go into the actual project, let’s get a sense of the size of the Dutch web. The .nl domain is the fourth largest country domain with 3.2 million sites, an enormous number.

How to demarcate the national web

  1. .nl is the 4th largest country domain
  2. Since you could argue that .nl is not the whole Dutch web, a second way to look at the national web is to take all the domains registered by the Dutch (sidn.nl 2008)
  3. Which sites do Dutch people find relevant? We can look at the most visited websites as listed by Alexa, where the number of visits indicates which sites we find important.

These are three ways to define the national web by web means. The Royal Library uses a different definition of the national aspect: they created a new definition of what counts as Dutch content.

  • A: Website in Dutch, registered in the Netherlands
  • B: Website in another language, registered in the Netherlands
  • C: Website in Dutch, registered in another country
  • D: Website in another language, registered in another country, topic aimed at the Netherlands.

All of these options seem technically feasible except for the last one. We cannot technically or automatically identify content that is aimed at the Netherlands. This makes it highly unlikely that the Dutch web, defined this way, can be archived. What the Royal Library has done is set this definition aside and select sites manually. They started with 100 sites, which became 400 and is now just over 1,000. They archive those sites really well.
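To make the point about feasibility concrete, here is a minimal sketch of how the four categories could be assigned. It is an illustration only, not the Royal Library's procedure: the `Site` record and the `kb_category` function are hypothetical names, and the sketch assumes the language and the registrant country of a site have already been determined by other means.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Site:
    """Hypothetical record; both fields are assumed to be known in advance."""
    url: str
    language: str            # e.g. "nl" or "en"
    registrant_country: str  # e.g. "NL" or "BE"


def kb_category(site: Site) -> Optional[str]:
    """Return "A", "B" or "C", or None when no automatic decision is possible."""
    in_dutch = site.language == "nl"
    registered_in_nl = site.registrant_country == "NL"
    if in_dutch and registered_in_nl:
        return "A"
    if registered_in_nl:
        return "B"
    if in_dutch:
        return "C"
    # Category D depends on whether the topic is aimed at the Netherlands,
    # which language and registration data alone cannot tell us.
    return None
```

Categories A to C fall out of two metadata fields, while every remaining site returns `None`: deciding whether it belongs in category D would still require a human reading the content.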

As an internet researcher Weltevrede is particularly interested in the dynamics of websites. The contribution she would like to put forward is the question of how else we can approach the object of collection, the Dutch web.

If you start web archiving, the easiest and most effective method is to follow the possibilities of the medium. You can automate a lot of things, and besides that you can also focus on the context and prominence of a website in a particular period. The first point calls attention to the challenge of developing methods that follow the medium to automate the collection process. You could schedule a recurring Google.nl query for “.nl”, because Google takes into account what is relevant: the links to a website. These sites are considered relevant not only by Google but by a large group of people. Hyperlink structures are human acts of association; links die and emerge. What would that information tell us about a site's context and its network? If you scheduled the query over time, you could see the relevance of a particular source in a particular period. It would provide context for sources or websites, the born digital.
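As a rough sketch of what such scheduled results could yield, assume the ranked result list for the query has been saved once per period. The snapshot structure, the `rank_history` function and the example hosts below are hypothetical and not part of the presentation; how the result lists are collected is deliberately left out.

```python
from datetime import date
from typing import Dict, List, Optional

# Hypothetical snapshot store: collection date -> ranked list of hosts.
Snapshots = Dict[date, List[str]]


def rank_history(snapshots: Snapshots, host: str) -> Dict[date, Optional[int]]:
    """Return the 1-based rank of `host` per snapshot, or None if it is absent."""
    history: Dict[date, Optional[int]] = {}
    for day in sorted(snapshots):
        results = snapshots[day]
        history[day] = results.index(host) + 1 if host in results else None
    return history


# Made-up example data for two monthly snapshots.
snapshots = {
    date(2009, 1, 1): ["a.example.nl", "b.example.nl"],
    date(2009, 2, 1): ["b.example.nl", "c.example.nl"],
}
history = rank_history(snapshots, "b.example.nl")
```

Here `history` records that the host rose from second to first place, while a host that drops out of the list shows up as `None` for that period. This is the kind of period-specific prominence that a single archived copy of a page does not capture.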

The final questions are:

  • What would the national Web archive look like when the focus is on capturing hyperlinks, search engine results, and other digital objects?
  • What aspects besides the digital document are relevant to save and why?
  • Can we learn from how born digital devices (e.g. search engines, platforms and recommendation systems) make use of the objects, and if so, how can such uses be repurposed for Web archiving?

Final personal note: The day after this presentation (this morning) my friend and colleague Esther Weltevrede graduated Cum Laude from the University of Amsterdam on her research on Archiving Web Dynamics. She will continue her research on National Webs as a PhD candidate with the Digital Methods Initiative. Congratulations Esther!

The e-book is more than a medium, it’s a platform

I’ve been thinking about buying an e-book lately. I recently moved house and, boy, books are heavy! I haven’t bought one yet because none of the available e-books currently meet my criteria:

  • Fast: no flickering when turning pages
  • Editable: make notes with qwerty keyboard or touchscreen writing: as a teacher this is a must have!
  • Export as pdf: as a teacher this is a must have!
  • Import pdf without converting into a proprietary format.
  • Access to e-bookstores outside of the USA. I cannot order books on Amazon as a Dutch resident.
  • A strong wish, not a must: wireless/bluetooth connection.

A few weeks ago, at the Lifehacking Academy Amsterdam, I briefly saw and tried the Amazon Kindle 1 (slow, sluggish page-turning, horrible interface) and the Sony Reader (pretty decent interface but limited editing capabilities). Two years ago I tried the iRex iLiad reader (a new version has since come out), which still seems to have the best editing capabilities compared to the Amazon Kindle and the Sony Reader. The main difference between the iLiad, Kindle and Sony Reader is the price. The iLiad is rather expensive at 599 euro compared to the other two, which are around 350-400 dollars.

New Network Theory

What recently sparked my interest in buying an e-book is Steven Johnson’s article in the Wall Street Journal: ‘How the E-Book Will Change the Way We Read and Write.’ In the article he points to the medium specificity of the e-book, connecting it to the web. Johnson describes the fragmentation of text that might occur, just as happened to ‘albums’ with the introduction of buying ‘songs’ in the iTunes store. The web seems to be a medium that favors fragmentation: from the webpage to the blogpost and from the album to the song. Will we be able to buy chapters or a few paragraphs?

Johnson describes a new phenomenon which he calls ‘booklogs’: a new way of publishing and sharing your commentary on paragraphs, chapters or whole books. Books have ISBN numbers, which could serve as the basis for a new kind of permalink, although the fragmented parts of a book would require a new way of referring.

One of the most intriguing parts of the article is the part titled ‘Writing for Google’ in which Johnson describes how the e-book is entangled in web relations with search engines:

Writers and publishers will begin to think about how individual pages or chapters might rank in Google’s results, crafting sections explicitly in the hopes that they will draw in that steady stream of search visitors. [...] Perhaps entire books written with search engines in mind. (Johnson, 2009)

This reflects my thoughts about bloggers writing for Google which I described in detail in ‘Blogging for Engines.’ Through a tight integration of web services the e-book could become a platform. Frank Meeuwsen imagined using e-books in hospitals as a medical resource and annotation tool.

Of course the final question is: to buy or not to buy an e-book? At its current price of 359 dollars the Kindle 2 seems reasonable, especially since I’ll be in the US again soon. Any thoughts?

Anne Helmond is…

This post is inspired by Alex Halavais’ post Halavais is…, in which Halavais listed “some of Google’s opinion of me via a search for ‘halavais is,’ ‘halavais was,’ and ‘halavais will be.’”

It reminded me of the iTea project made at the RFID Hacker’s Camp 2007:

iTea is an interactive installation in the form of a coffee table. In the coffee cup on top of the table, you can place your rfid tag – which is given to you at the entrance of the conference and linked to a social network – and the table will start to display information about you. At first it gets your name, description and keywords from the picnic network. Then it will start to Google your firstname and lastname. Then it just googles your first name but substitutes it by your full name. The result is visualized by a projection from within the table to the surface of the table, in the form of drops of information. (Erik Borra project description)

PICNIC07 - iTea

The iTea project makes users aware of both “the gossip” online and the voyeuristic tendencies of datamining search engine indexes. Ego Googling is not so much a self-centric tendency that points to the glorification of the self online as it points to an awareness of the presentation of the self online by search engines. The self online is not shaped by yourself but by search engines. So who am I anno 2009 according to Google?

Anne Helmond is:

  • Anne Helmond is a New Media Lecturer at the University of Amsterdam at the Media Studies department.
  • So the brief day that was Blog08 is over and our blogging reporter, Anne Helmond, is back home. She rounds up over on her own blog.
  • Anne Helmond is member #1 of journalism research.
  • Anne Helmond is a member of Mobile Monday Amsterdam.
  • Anne Helmond is a member of Wolvenstraat, Amsterdam OpenCoffee Meetup.
  • Anne Helmond is a new media researcher, graphic designer and photographer.
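  • Anne Helmond is docent New Media en onderzoeker bij de afdeling Mediastudies van de Universiteit van Amsterdam. [Anne Helmond is a New Media lecturer and researcher at the Media Studies department of the University of Amsterdam.]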
  • Lovink stresses that the Main object of research Anne Helmond is working on is that bloggers start to realize they are ‘working for google’ and contributing
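  • Het profiel van Anne Helmond is gewijzigd 25 Jan. [The profile of Anne Helmond was changed on 25 Jan.]
  • Anne Helmond is een absolute helemaal geekster. [Anne Helmond is an absolute, total geek girl.]
  • Volgens Software Studies onderzoeker Anne Helmond is een door software gecreëerde wereld opaak, het vormt ons media gedrag maar verbergt een achterliggende [According to Software Studies researcher Anne Helmond, a world created by software is opaque; it shapes our media behavior but hides an underlying]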
  • Anne Helmond (Anne Helmond is silvertje on Twitter).
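  • De reactie van Anne Helmond is:. March 25, 2008 @ 7:41 pm. [The response from Anne Helmond is:. March 25, 2008 @ 7:41 pm.]
  • Anne Helmond is lid #340 van Twitterborrel. [Anne Helmond is member #340 of Twitterborrel.]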

Anne Helmond was:

  • Anne Helmond was inspired by both projects and asks MTV to be MADE into a killer PHP programmer.
  • Anne Helmond was one of the fifty blogger from around Europe who participated in the European Bloggers Unconference
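  • Anne Helmond was er wel, en deze prachtige foto komt uit haar flickr-stream. [Anne Helmond was there, though, and this beautiful photo comes from her Flickr stream.]
  • Ken het verhaal (nog) niet achter dit zelfportret van Anne Helmond. Was ze melig, of juist heel overtuigd? [I don't know the story (yet) behind this self-portrait of Anne Helmond. Was she being silly, or actually very convinced?]
  • Anne Helmond was tot nu toe betrokken bij het weblog Masters of Media, een weblog gerelateerd aan de Mastersstudie Nieuwe Media. [Until now Anne Helmond was involved in the weblog Masters of Media, a weblog related to the New Media Master's program.]
  • Anne Helmond was haar naam. [Anne Helmond was her name.]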
  • Anne Helmond was your fan before and now you two are friends.

Anne Helmond will be:

  • Anne Helmond will be blogging back from Blog08 for us, with a focus on the online journalism aspects.
  • This Thursday Anne Helmond will be giving a lecture on ‘The Widgetized Self‘ a term coined by Nancy Baym.

What surprises me is the number of indexed social actions, such as “Anne Helmond is (now) a member of…”. It confirms the increasing trend of indexed social actions and memberships, as I previously described in: Google expands its indexing focus to actions within social networks.
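For anyone who wants to repeat the exercise with their own name, the three searches and the sorting of results can be sketched as follows. This is only an illustration under my own assumptions: the function names are made up, the queries are plain quoted phrases, and the snippets are assumed to have been copied from the result pages by hand.

```python
import re
from typing import Dict, List

TENSES = ("is", "was", "will be")


def ego_queries(name: str) -> List[str]:
    """Build one exact-phrase query per tense, e.g. '"Anne Helmond is"'."""
    return [f'"{name} {tense}"' for tense in TENSES]


def group_snippets(name: str, snippets: List[str]) -> Dict[str, List[str]]:
    """Sort snippets under the first tense whose phrase they contain."""
    groups: Dict[str, List[str]] = {tense: [] for tense in TENSES}
    for snippet in snippets:
        for tense in TENSES:
            pattern = rf"\b{re.escape(name)} {tense}\b"
            if re.search(pattern, snippet, flags=re.IGNORECASE):
                groups[tense].append(snippet)
                break
    return groups


queries = ego_queries("Anne Helmond")
groups = group_snippets("Anne Helmond", [
    "Anne Helmond is a member of Mobile Monday Amsterdam.",
    "Anne Helmond was haar naam.",
    "This Thursday Anne Helmond will be giving a lecture.",
])
```

Snippets that mention the name without one of the three phrases are simply left out of the groups.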

I’ve Gone Google

Google Knows t-shirts

Google already knows that I blog for Google but now I’ve almost completely gone Google. I recently switched from Netvibes to Google Reader, from the GTD-app Things to the online web service Remember the Milk and I moved a lot of my e-mail correspondence from Mail.app to Gmail online. While I’ve been fairly reluctant to store the main part of my data/information with one provider up in the clouds I have now been convinced.

Netvibes » Google Reader

I’ve been a fairly happy Netvibes reader for over a year, but Google Reader has added some great features since I last tried to find a cure for my RSS exhaustion. I started using Netvibes modules as a way of keeping track of my scattered self online, but after a while I got rid of the modules and I now prefer simple bookmarks in my browser.

Anne Bookmarks

In Netvibes I managed all my subscriptions with tabs, but switching between tabs and individual feed subscription items took too much time. On top of that, you cannot scroll through all the items at once, only through a single subscription.

The main reason I prefer Google Reader over Netvibes? The fantastic scrolling features and the “Mark Everything Read” button (in Netvibes you can only mark everything read per subscription or tab). Google automatically marks an item as read once you scroll past it. Love it.

Things » Remember the Milk

Yes, I know Remember the Milk is not a Google product (yet) but it perfectly integrates with Google Reader, Google Mail and Google Calendar. I signed up for a RTM account even before I downloaded Things but I never got the hang of it. Things is an excellent application to manage your to-do items in a flexible GTD-style. However, I did not use half of its features and it felt too much like a standalone application that did not integrate with my calendar.

Remember the Milk offers a tight integration with Google Mail, iGoogle, Google Calendar, Twitter and many more. Forgotthemilk.net wrote a GreaseMonkey script that integrates RTM into Google Reader. Now I can access my todo list from all the applications I use to get things done!

An iPhone application has already been released for Remember the Milk and over 167 users have requested a Symbian application, including me. The mobile website of Remember the Milk is pretty good but I would love an official application for my N95.

Mail.app » Gmail

Hitting Archive and/or starring items is the best way to keep my inbox clean and empty with Gmail. The Better Gmail Firefox add-on adds “useful extra features and skins to Gmail, like hierarchical labels, macros, file attachment icons, and more. ”

The main reason for moving my e-mail to Gmail is the excellent application for my N95 phone. It is far superior to the pre-installed e-mail application in speed, ease of use and functionality.

So what’s next?

I have no idea, please tell me!

Blogging for Engines. Blogs under the Influence of Software-Engine Relations

In February I graduated cum laude with a thesis on blog software and search engines titled ‘Blogging for Engines. Blogs under the Influence of Software-Engine Relations.’ It aims to add the study of software-engine relations to the emerging field of Software Studies, which may open up a new avenue in the field by accounting for the increasing entanglement of the engines with software thus further shaping the field.

This thesis wishes to contribute to the understanding of blogs by approaching them as both a medium and a by-product of practice, each entangled in software-engine relations. In the history of blogging both the medium and the practice are constantly being shaped by the search and indexing engines. Not only did the introduction of the ‘nofollow’ attribute have a major impact on the construction of the blogosphere, it also points to how the blogger is (un)willingly entangled in a relationship that the blog software establishes with the engines. The common blog practices of tagging, social bookmarking and the obsessive checking of blog statistics raise the question of whether we are now blogging to feed the engines. Continue to read an excerpt of my PhD proposal to continue my research on software-engine relations, or download the PDF ‘Blogging for Engines. Blogs under the Influence of Software-Engine Relations.’ (4.2 MB)

Excerpt PhD Proposal on Software-Engine Relations

Google as the number one search engine is regarded by many to be “the start page for the Internet” (Dodge, 2007) and “Google has become such a commonly used resource that people are beginning to regard it as synonymous with the Web.” (Searls in Gudrais, 2007). What is missing from the current studies into software is the recognition of the central role that the engines play on the web. The engines are considered to be the starting point of the web and play an important editorial role on the web. Introna and Nissenbaum (2000) describe the politics of search engines with the engines

[...] determining any systematic inclusions and exclusions, the wide-ranging factors that dictate systematic prominence for some sites, dictating systematic invisibility for others. These, we think, are political. They are important because what people (the seekers) are able to find on the Web determines what the Web consists of for them. And we all —individuals and institutions alike— have a great deal at stake in what the Web consists of.

The politics of inclusion and exclusion in the search engines, which may also be described as the drama of search engines (Govcom.org, 2007), is clearly visible in the case of the website 911truth.org, which suddenly disappeared from Google results. These issues raise the question if and how the web is structured by search engines. Rogers (2008) describes how the engines are demarcating different spheres on the Web. Previous research done with the Digital Methods Initiative (2007) not only showed how the engines construct different spheres but also how these spheres are constructed differently by different engines. What role does the software play in the construction of these different spheres?

Previous research into the role of software and the engines in the blogosphere showed that there is an increasingly symbiotic relationship between the two (Helmond, 2008). In this study of the most prevalent blog software, WordPress, it appeared that it is establishing strong ties with Google, Google Blog Search and Technorati. The blog software and blog engines determine the nature and construction of the blogosphere through co-construction. These software-engine relations enforce a steady regime in the blogosphere that puts the blogger in a position where the politics of inclusion and exclusion are played out in the game of search engine optimization and spam.

(Excerpt from my PhD proposal)