DMI mini-conference Day 1: Michael Stevenson on the Archived Blogosphere

The Digital Methods Initiative is holding a three-day mini-conference with workshops presenting papers and research proposals. Today I responded to Michael Stevenson’s paper on the history of the blogosphere through the eyes of EatonWeb and the Internet Archive. The following is my summary of his paper and argument, followed by questions.

Michael Stevenson. The archived blogosphere: exploring web historical methods using the Internet Archive

Respondent: Anne Helmond, University of Amsterdam. 20 January 2010.

One of the main questions of Stevenson’s research is: How can we use and repurpose the Internet Archive to study the history of the blogosphere? The Internet Archive is especially useful for single-site histories, as the Archive is browsed by URL. However, websites rarely exist in a vacuum. This is partly recognized by the Archive’s special collections on a particular topic or event. Blogs, and their (in)formal linking policies, constitute a different type of collection: sites that converge not on topic or event but on their formal characteristics, the blogosphere. As Stevenson notes, “The genre (of blogs) was defined less by content than by form, with reverse-chronology and the centrality of linking trumping the extent to which bloggers focused on similar topics.” How do we deal with a collection of websites in an archive that constitutes a separate websphere when the device used is best suited to studying the history of single sites?

Historical accounts of the blogosphere are often written from an anecdotal perspective (Blood 2000; Rosenberg 2009). Stevenson notes that:

What is missing in this approach, however, is reflection on the changing conditions for historical research when the object of study is the Web, or (as may increasingly be the case) is studied with the Web. (p. 75)

The Internet Archive is described as a legacy system in the sense that it is based on browsing rather than on the current trend of searching, and in this sense it displays aspects of an earlier (web) culture. What is sustained is cyberculture. Cyberculture (1980s-1990s) is characterized by a “commitment to egalitarian and universal access to information” (p. 78). Cyberspace is described as “somewhere else”, a notion still visible in the IA, which prefers browsing over querying. The rise of the blogosphere may be seen as “the rejection of cyberspace” and as a transition phase from cyberculture (egalitarian) to web culture (A-lists). The blogosphere is marked by a strong tension between the idea of egalitarianism and the actual compilation of A-lists through disproportionate linking.

Case study
How to delimit the object of study? DMI asks how the dominant devices do it; for example, the engines define a blog as anything that publishes a feed. In this case study the first dominant blogosphere device, EatonWeb, was taken as a starting point. EatonWeb was a manually created collection (expert list) of blogs, and inclusion was based on the formal characteristic of blogs: reverse-chronologically ordered entries. “Of the 947 blogs listed by the directory, 857 (or 85.5%) were present in the Internet Archive.” The missing blogs in the Archive were located by following the outlinks of the blogs in the set. This presents a map of the “whole” early blogosphere.
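The step of locating non-archived blogs through the outlinks of archived ones can be illustrated with a small Python sketch. This is not Stevenson's actual tooling; the function names, the input format (a dictionary of archived blog URLs and the URLs they link to) and the choice to count each linking blog once are my own assumptions, and the example hosts are made up.

```python
from urllib.parse import urlparse


def host(url):
    """Return the lowercased hostname of a URL, without a leading 'www.'."""
    name = (urlparse(url).hostname or "").lower()
    return name[4:] if name.startswith("www.") else name


def locate_missing(directory, archived_outlinks):
    """Count links from archived blogs to listed blogs that have no archive copy.

    directory: iterable of blog URLs listed by the directory (hypothetical input).
    archived_outlinks: dict mapping each archived blog URL to the URLs it links to.
    Returns a dict {missing_host: number of archived blogs linking to it}.
    """
    listed = {host(u) for u in directory}
    archived = {host(u) for u in archived_outlinks}
    missing = listed - archived
    inlinks = {m: 0 for m in missing}
    for source, targets in archived_outlinks.items():
        # Deduplicate so each archived blog counts once per missing blog.
        for target_host in {host(t) for t in targets}:
            if target_host in inlinks and target_host != host(source):
                inlinks[target_host] += 1
    return inlinks


if __name__ == "__main__":
    directory = ["http://a.example/", "http://www.b.example/",
                 "http://c.example/", "http://d.example/"]
    archived_outlinks = {
        "http://a.example/": ["http://b.example/post", "http://c.example/",
                              "http://c.example/archive"],
        "http://b.example/": ["http://www.c.example/"],
    }
    print(locate_missing(directory, archived_outlinks))
```

A missing blog that receives links from many archived blogs can then be placed in the network map even though no copy of it exists in the Archive; a missing blog with zero inlinks stays invisible.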

Contribution
Stevenson contributes to studies on the history of the blogosphere by compiling a new special collection, the Early Blogosphere (according to EatonWeb), that may be mapped and queried. By mapping the outlinks of the blogs in EatonWeb, the non-archived blogs (the pieces of the blogosphere missing from the Internet Archive) are positioned within the network.

Questions
“The organization of the EatonWeb Portal suggested egalitarianism”, which is in line with the characteristics of cyberspace. Are ranking devices the official end of cyberspace? Do you consider EatonWeb in that sense a transitional device?
You have now compiled your own special collection of the early blogosphere. Querying this collection, in contrast to the IA, is now possible. What would you like to ask the collection?
The focus is now on outlinks. Where were these links taken from? The whole page? Suggestion for detailed focus: blogroll analysis only. Do they provide a different map?

Further research
Platform-specific maps. Actors that receive links from EatonWeb blogs but are not in EatonWeb themselves are often blog platforms such as Blogger.com and Pitas.com. Redo the map with a focus on platforms. Do platforms cluster?
There are some specific Pitas blogs on the maps, but no specific Blogger.com websites. Is it possible to look “beyond” pitas.com (*.pitas.com) or blogger.com (*.blogger.com) to see which sites were there?
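Looking “beyond” a platform domain could start from the outlink data itself, by grouping linked hosts under the platform they are a subdomain of. The following is a minimal Python sketch under my own assumptions: the function name and example URLs are hypothetical, and it only finds blogs hosted as subdomains (*.pitas.com, *.blogger.com), not blogs published through a platform to another server.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Platform domains named in the text; extend as needed.
PLATFORMS = ("pitas.com", "blogger.com")


def sites_by_platform(urls, platforms=PLATFORMS):
    """Group linked hosts by the platform domain they are a subdomain of."""
    grouped = defaultdict(set)
    for url in urls:
        name = (urlparse(url).hostname or "").lower()
        for platform in platforms:
            # Match 'foo.pitas.com' but not 'pitas.com' or 'notpitas.com'.
            if name.endswith("." + platform):
                subdomain = name[: -len(platform) - 1]
                if subdomain != "www":  # the platform's own front page
                    grouped[platform].add(name)
    return {platform: sorted(hosts) for platform, hosts in grouped.items()}


if __name__ == "__main__":
    outlinks = [
        "http://foo.pitas.com/",
        "http://bar.pitas.com/archive.html",
        "http://www.pitas.com/",
        "http://www.blogger.com/",
    ]
    print(sites_by_platform(outlinks))
```

If the grouping returns many hosts for one platform and none for another, that already says something about how each platform published its users' blogs.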

More info on Michael Stevenson’s & DMI research on the DMI wiki:
Tracing And Mapping The Evolution Of The Early Blogosphere With The Internet Archive
Profiling the Archived Blogosphere
Wayback Web Collections
Early Blog Features

Blogs are boring

When I saw Clay Shirky’s appearance on the Colbert Report of the 3rd of April 2008 I wrote down this quote:

Communications tools don’t get socially interesting until they get technologically boring. Social effects are more important than just how the technology works. (Clay Shirky)

It has stuck with me ever since, and when I saw Shirky’s keynote talk at PICNIC08 in September he expanded on this quote.

Shirky states that social software has fewer features than other types of software, which is part of its social success. In the case of blog software this is true for Blogger, which turned the blog software and blogging world upside down with easy ‘one-button publishing’, but WordPress is a complicated tool. Even if one sticks with the default settings, it is not a tool one can use right away.

This is why blogging platforms such as Blogger stay very popular and new tools such as Habari find a small but solid user base. Simplicity. However, does simplicity equal technologically boring, or does technologically boring mean no longer thinking about the technique behind the tools?

In my thesis I argued that not only bloggers and blog software but also search engines cause the blogging phenomenon to grow. The simplification of blog technology and the masking of technological features such as automatic sending of trackbacks and pingbacks could be described as software becoming technologically boring. However, social effects are connected to the working of technology and to say that these social effects are more important is a rather simplified way of putting things. I would say that the way technology works may cause social effects such as update and/or stat addiction.

Shirky’s quote keeps on raising questions. As it is part of his larger argument in Here Comes Everybody, I just ordered the book to read during my Christmas holidays. Can’t wait to read it and continue thinking about this quote.

Blogging for Engines. Blogs under the Influence of Software-Engine Relations

In February I graduated cum laude with a thesis on blog software and search engines titled ‘Blogging for Engines. Blogs under the Influence of Software-Engine Relations.’ It aims to add the study of software-engine relations to the emerging field of Software Studies, which may open up a new avenue in the field by accounting for the increasing entanglement of the engines with software thus further shaping the field.

This thesis wishes to contribute to the understanding of blogs by approaching blogs as both a medium and by-product of practice that are both entangled in software-engine relations. In the history of blogging both the medium and practice are constantly being shaped by the search and indexing engines. Not only did the introduction of the ‘nofollow’ attribute have a major impact on the construction of the blogosphere, it also points to how the blogger is (un)willingly entangled in a relationship that the blog software establishes with the engines. The common blog practices of tagging, social bookmarking and the obsessive checking of blog statistics raise the question whether we are now blogging to feed the engines. Continue to read an excerpt of my PhD proposal to continue my research on software-engine relations, or download the PDF ‘Blogging for Engines. Blogs under the Influence of Software-Engine Relations.’ (4.2 MB)
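To make the ‘nofollow’ point concrete: the attribute value is added to a link’s rel attribute, and an engine that honors it does not count the link as an endorsement. The Python sketch below separates the two kinds of links in a piece of HTML using the standard library’s html.parser. It is an illustration only, not taken from the thesis; the class name and sample markup are my own.

```python
from html.parser import HTMLParser


class LinkSorter(HTMLParser):
    """Collect link targets, split by whether rel contains 'nofollow'."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollow = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return  # named anchors and empty links carry no target
        # rel is a space-separated token list, e.g. rel="external nofollow".
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollow.append(href)
        else:
            self.followed.append(href)


if __name__ == "__main__":
    sample = (
        '<p><a href="http://a.example/">a blogroll link</a> '
        '<a rel="external nofollow" href="http://b.example/">a comment link</a></p>'
    )
    sorter = LinkSorter()
    sorter.feed(sample)
    print("followed:", sorter.followed)
    print("nofollow:", sorter.nofollow)
```

Run over a blog’s comment section and its blogroll, a split like this shows which links the software lets the engines count and which it withholds.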

Excerpt PhD Proposal on Software-Engine Relations

Google as the number one search engine is regarded by many to be “the start page for the Internet” (Dodge, 2007) and “Google has become such a commonly used resource that people are beginning to regard it as synonymous with the Web.” (Searls in Gudrais, 2007). What is missing from the current studies into software is the recognition of the central role that the engines play on the web. The engines are considered to be the starting point of the web and play an important editorial role on the web. Introna and Nissenbaum (2000) describe the politics of search engines with the engines

[...] determining any systematic inclusions and exclusions, the wide-ranging factors that dictate systematic prominence for some sites, dictating systematic invisibility for others. These, we think, are political. They are important because what people (the seekers) are able to find on the Web determines what the Web consists of for them. And we all —individuals and institutions alike— have a great deal at stake in what the Web consists of.

The politics of inclusion and exclusion in the search engines, which may also be described as the drama of search engines (Govcom.org, 2007), is clearly visible in the case of the website 911truth.org, which suddenly disappeared from Google results. These issues raise the question if and how the web is structured by search engines. Rogers (2008) describes how the engines are demarcating different spheres on the Web. Previous research done with the Digital Methods Initiative (2007) not only showed how the engines construct different spheres but also how these spheres are constructed differently by different engines. What role does the software play in the construction of these different spheres?

Previous research into the role of software and the engines in the blogosphere showed that there is an increasingly symbiotic relationship between the two (Helmond, 2008). In this study into the most prevalent blog software, WordPress, it appeared that it is establishing strong ties with Google, Google Blog Search and Technorati. The blog software and blog engines determine the nature and construction of the blogosphere through co-construction. These software-engine relations enforce a steady regime in the blogosphere that puts the blogger in a position where the politics of inclusion and exclusion are played out in the game of search engine optimization and spam.

(Excerpt from my PhD proposal)

Rethinking the Blog as Database: My First Post on the Blog Herald

I am proud to announce that I have joined the Blog Herald. The Blog Herald has been blogging about the blogosphere since 2003 and has since become an established source in the blogosphere. I have been reading the Blog Herald for a while now and was absolutely thrilled when they asked me to write for them. I will be joining an excellent team of bloggers including Lorelle VanFossen, Tony Hung, Chris Garrett, (founder & ex-Blog Herald/now TechCrunch-blogger) Duncan Riley and more.

I will be blogging about blogging and blog software from an “academic” point of view. My first series of posts will be related to my upcoming thesis on Blog Software and the Act of Blogging.

You are welcome to read and comment on my first post at the Blog Herald: “Rethinking the Blog as Database”.

WordPress Glitches: Visual/Code editor & Advanced Visual editor

I noticed another glitch in the software when I was saving an unpublished post. When you press Save and Continue Editing, the editor switches from Visual to Code and back to Visual. So while saving you are temporarily faced with the code even though you are actually working in Visual mode. I think this is another example that demonstrates that the visual editor is an added option (see my previous post: WordPress problems: Visual editor gone (fixed), or: A glitch in the software).

What is even more interesting is that I recently discovered that the visual editor offers even more options than it actually displays. This extra toolbar is not activated by default nor is there a button to activate it. The extra formatting options were hidden from the general user until someone noticed it and posted it to a WordPress forum. Activating the extra options is done by pressing Alt-V in Internet Explorer or Alt-Shift-V in Firefox.

The standard visual editor toolbar
The advanced visual editor toolbar

But every time you press Save and Continue Editing the extra toolbar disappears again (not with automatic saving), so you have to press Alt-(Shift)-V again and again until you have finished your post. To solve this issue a Visualize Advanced Features plugin has been released that adds a button to your standard visual editor toolbar that enables you to toggle the advanced options on or off (and leaves the options on or off).

With some advanced knowledge of TinyMCE, the JavaScript WYSIWYG editor implemented in WordPress, the toolbar can contain many more features such as subscript, superscript, tables, layers, CSS layout etc. I wonder whether using the CSS layout creates conflicting stylesheets, whether it creates an inline stylesheet, whether the existing stylesheet always overrules, or whether there are other possible conflicts when using these advanced features (which might be a reason WordPress hasn’t implemented them).