Social buttons are breaking search

In my previous post I wondered whether social sharing services are breaking the web with data-rich hyperlinks. Today I would like to propose that social sharing services are also breaking search. Assume the following scenario: you search for Facebook “proprietary protocol” in Google Web (the “regular” Google) and are presented with the following results:

facebook _proprietary protocol_ - Google zoeken-1

While we are used to skimming through the results for the most relevant ones, the social buttons produce an artifact that disrupts the search index. A result titled “Is VTP a proprietary protocol of CISCO?” is the fifth result. It is irrelevant and is only shown because the site uses a Facebook social button. As a side effect of sharing technologies, social buttons are flooding the index with keywords such as Facebook, Twitter, Share and Add. Because of the high penetration of social buttons this may also disrupt research practices on the web.
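The mechanism can be illustrated with a toy example. This is a minimal sketch, assuming a naive indexer that treats every piece of text on a page as content, including the label of a share button. It is not how Google actually builds its index. The three pages, their URLs and the helper names (`VisibleText`, `build_index`, `search`) are invented for illustration.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect every text node, the way a naive indexer might."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def tokens(html):
    """Return the set of lowercase words found in the page text."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return set(re.findall(r"[a-z]+", " ".join(parser.chunks).lower()))


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, html in pages.items():
        for token in tokens(html):
            index[token].add(url)
    return index


def search(index, query):
    """Return the URLs that contain all words of the query."""
    terms = re.findall(r"[a-z]+", query.lower())
    results = [index.get(term, set()) for term in terms]
    return set.intersection(*results) if results else set()


# Hypothetical pages, invented for this example.
PAGES = {
    "http://example.org/vtp": (
        "<h1>Is VTP a proprietary protocol of CISCO?</h1>"
        "<p>Yes, VTP is Cisco proprietary.</p>"
        "<a class='share' href='#'>Share on Facebook</a>"
    ),
    "http://example.org/fb": (
        "<h1>Facebook chat</h1>"
        "<p>A post about Facebook and its proprietary protocol.</p>"
    ),
    "http://example.org/ospf": (
        "<h1>OSPF</h1><p>OSPF is an open protocol, not proprietary.</p>"
    ),
}

index = build_index(PAGES)
hits = search(index, "facebook proprietary protocol")
```

In this toy index the VTP page matches the query only because of its share button label, while the OSPF page, which has no button, does not match.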

The following example shows what happens when you search for the keywords Facebook homosexuality in Google Scholar.

facebook homosexuality - Google Scholar

None of the results shown are relevant to my query; they appear only because of a Facebook social button on their websites. Social buttons are producing an artifact that disrupts search.

Are social sharing services breaking the web with data-rich hyperlinks?

Social sharing services such as Summify allow users to subscribe to a daily digest of stories that have been shared by their Twitter and/or Facebook users in what they call a “summary of your social news feeds.” In the process of tracking shared links on social media platforms, these sharing services are renaming and transforming the shared links. A link to Dave Winer’s article on “Facebook is scaring me” in Summify’s daily summary no longer directly points to Dave Winer’s blogpost, but instead the URL has been renamed to a Summify URL and the blogpost is framed in a Summify toolbar.

Summify toolbar

Summify renames http://scripting.com/stories/2011/09/24/facebookIsScaringMe.html into http://summify.com/story/Tn3zdo3fhyiIAD6A/scripting.com/stories/2011/09/24/facebookIsScaringMe.html
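For hyperlink analysis, a renaming like the one above can in principle be reversed. The following is a minimal sketch, assuming that every Summify story URL follows the pattern of this single example (`/story/<id>/<original URL without scheme>`) and that the original scheme was `http`. Both are assumptions drawn from one URL, not documented Summify behavior, and `unwrap_summify` is a hypothetical helper name.

```python
import re

# Pattern inferred from the single example above; an assumption, not a spec.
SUMMIFY_STORY = re.compile(r"^https?://summify\.com/story/[^/]+/(?P<rest>.+)$")


def unwrap_summify(url):
    """Return the original URL behind a Summify story URL.

    URLs that do not match the assumed pattern are returned unchanged.
    The scheme is lost in the renamed URL, so "http" is assumed.
    """
    match = SUMMIFY_STORY.match(url)
    if match is None:
        return url
    return "http://" + match.group("rest")


wrapped = (
    "http://summify.com/story/Tn3zdo3fhyiIAD6A/"
    "scripting.com/stories/2011/09/24/facebookIsScaringMe.html"
)
original = unwrap_summify(wrapped)
```

A researcher would need one such rule per renaming service, which is exactly why renamed URLs complicate link tracking.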

By rerouting all hyperlinks through their service they are able to gather statistics on shared stories and track how many times a story has been tweeted, liked, shared and, of course, clicked. These statistics are visible to Summify only, not to users. They are creating data-rich links because the link does not only refer to the location of the source on the web but also carries quantitative metadata and possibly affective metadata; think, for example, of the possible new Facebook intentions of ToRead and Want. Short-URL services such as Bit.ly operate on the same principle: by transforming hyperlinks they are creating short but data-rich links.

What bothers me, as a researcher, is how this framing of the sharable web may break hyperlink analysis and affect research.

Look for example at the LinkedIn digest which provides me with the “Top Headlines in Internet, Online Media.” LinkedIn also renames the headlines’ URLs into LinkedIn URLs and presents these headlines in a frame with a LinkedIn toolbar on top.

LinkedIn toolbar

LinkedIn toolbar and frame

Because LinkedIn renamed the original URL into a data-rich LinkedIn URL, this is the URL we will now be working with, whatever action follows next. This seems disastrous, not only for services such as Delicious, but also for researchers, because the original URL will now also be saved (and possibly shared) as a LinkedIn URL, a Summify URL, or the URL of any other service that renames URLs. I am a URL purist and I want to save and share the original URL, not a renamed one, but many users will simply share or save the URL they are presented with. This means that tracking the original URL is no longer sufficient for analysis if the same page is also shared and saved under different URLs.

On top of that, the LinkedIn URL is either badly formatted or Delicious is not able to interpret it correctly. In any case, saving an article I discovered through the LinkedIn digest to Delicious is impossible, as Delicious attempts to save the generic “http://www.linkedin.com/news?actionBar=”.

Save a Bookmark on Delicious

Failed attempt to save a bookmark on Delicious

Finally, some websites such as the New York Times do not allow their content to be embedded within (social-sharing) frames, which breaks the user experience:

Summify: New York Times

Should I be worried as a URL purist and researcher about social sharing sites and short URL services renaming URLs?

This post is part of a larger series that looks into the status of the hyperlink in Web 2.0.

Facebook becomes a database for your life

These are my quick and short notes on the Facebook F8 Developers Conference 2011 related to my research.

Mark Zuckerberg describes how your Facebook profile acts as a five-minute introduction when you meet someone and share your common demographics such as your name, age, job and interests with them. The Facebook stream represents the next 15 minutes, in which you slowly get to know someone by seeing what they share and like. Facebook introduces the Timeline as the new heart of the Facebook experience to tell the story of your life by gathering all your stories, all your apps and all your activities in a new place as a new way to express who you are.

Curating your life
Facebook Timeline taps into two big web trends: documenting the self and the curation of stories (e.g. Storify). In ‘Identity 2.0: Constructing identity with cultural software’ I give a historical account of the documentation of the self, from ping messages to personal homepages, to blogs, to social network profiles, to lifestream platforms. Now Facebook wants to become the new central player for documenting and curating your life story. Facebook wants to be the database to store your life. It aims to provide a place that feels like home where you can highlight and curate all your stories to express who you are.

Your life was previously documented on your wall, the News Feed, but it provides a very fleeting type of documentation where old content is only accessible by infinitely scrolling down. Content and activities in the Timeline, on the other hand, are neatly organized per year or filtered by content type. Activities are presented in reports, and a summary of what you have done is deemed to be more relevant than everything you have done. These reports, or summaries, provide quantified overviews of your activities which may be capitalized on by Facebook.

Timeline is not a new concept. The documentation of the self is reminiscent of the ‘old’ Microsoft MyLifeBits project, which is itself based on Vannevar Bush’s 1945 Memex:

MyLifeBits is a lifetime store of everything. It is the fulfillment of Vannevar Bush’s 1945 Memex vision including full-text search, text & audio annotations, and hyperlinks. (Microsoft)

Re-centralisation of the self
Whereas MyLifeBits documented produced content and aimed to interlink it, the Timeline is a “re-centralisation of the self” (Carolin Gerlitz). It recentralizes all content and activities performed on external content through the Facebook platform using the Open Graph API (Helmond and Gerlitz 2011). While activities for Facebook were previously confined to Liking and Sharing, the Timeline opens up to new applications and new activities. A smart move is that Facebook is now re-centralizing all “quantified self” apps through its platform. During the F8 keynote the example of the Social Running app was shown: Facebook will now know how many times a week you run and how far. While quantified self apps are often used to document and evaluate the self in private, Facebook will now open up this trend to more public sharing with your friends.

Paper: Hit, Link, Like and Share. Organizing the social and the fabric of the web in a Like economy.

Co-authored paper by: Carolin Gerlitz (Goldsmiths, University of London) and Anne Helmond (University of Amsterdam). Paper presented at the DMI mini-conference, 24-25 January 2011 at the University of Amsterdam.

Introduction
Different types of social buttons have diffused across blogs, news websites, social media platforms and other types of websites. These buttons allow users to share, bookmark or recommend the webpage or blogpost across different platforms such as Facebook, Twitter, Digg, Reddit, Delicious, Stumbleupon, etc. The buttons often show a counter of how many times the page/post has been shared or recommended: x likes, x shares, x tweets. From a new media studies perspective, these likes, shares and tweets may be approached as new types of hyperlinks; from an economic sociology perspective, they open up questions about the increasing interrelation between the social, technicity and value online. Within new media studies the hyperlink has previously been studied as a form of currency of the web establishing an economy of links (Walker 2002; Jarvis 2009) and as an indicator of a discursive relationship (Rogers 2002).

The economy of links describes the link as a currency of the informational web in which search engines use hyperlinks to look at the relations between websites in order to establish a ranking. The term informational web is often used to describe the world wide web as a publication medium for publishing content (Ross 2009) and is characterized by the linking of information (Wesh 2007). In this web, search engines act as the main actors that make it possible to navigate through all the information, recommending pages based on authority measures.

According to social networking site Facebook “the informational Web is being eclipsed by the social Web” (Claburn 2009). In contrast to the informational web where search engines focus on links between websites, the social web “is a set of relationships that link together people over the Web” and “the applications and innovations that can be built on top of these relationships” (Halpin & Tuffield 2010) and is characterized by the linking of people (Wesh 2007). Within the social web search engines and social media platforms look at the connections between people and their relations to other web users or web objects. Facebook popularized the term Social Graph “to describe how Facebook maps out people’s connections” (Zuckerberg 2009). As Facebook considers its services inherently social and its plugins and buttons are called ‘Social plugins’ we summarize the activities they generate as so-called “social activities.”

Where Google can be seen as the main agent of the informational web and the regulator of the link economy, Facebook is currently seen as the emerging agent of the social web. In particular, the company’s recent efforts to make the entire web experience more social mark the advent of a different type of economy which is based on social indexing of the web: the Like economy. Key elements of this economy are the social buttons, the activities they generate and the way they connect Facebook with the entire web.

According to Facebook, liking and sharing are valuable for users and the company because they enable users to experience the web more socially. A similar connection between the social and economic value has been developed by Adam Arvidsson (2009) with his idea of an ethical economy in which value creation is based on collective negotiation and in which economic value creation is related to the quality of the social bonds that are generated. Within this paper we want to question the centrality of social dynamics and social relations as key drivers for platform engagement and the Like economy. By merging a new media perspective with an economic sociology perspective, we will shift attention away from the users and the social to the impact of issues on social activities, as well as their interrelation with technicity and the fabric of the web. Based on an extensive empirical study of button presence and engagement within a sample of 592 URLs, we ask how issues, technicity and the social create a productive assemblage of value creation in an emerging Like economy.

In what follows, this paper aims to address these questions by first looking at the history of different types of web economies over time. How do these ‘new’ social activities central within the social web relate to the hit and link economy of the informational web? What creates engagement and how does this engagement organize the fabric of the web and sociality? And finally, what are the perspectives of a Like economy?

Download full paper as PDF: GerlitzHelmond-HitLinkLikeShare.pdf

We’d be happy to receive any comments and feedback!

Article Series - The status of the hyperlink in Web 2.0

  1. How Web 1.0 is the Issuecrawler?
  2. The Like, the Share and the (Re)Tweet as pre-configured links
  3. Paper: Hit, Link, Like and Share. Organizing the social and the fabric of the web in a Like economy.
  4. Are social sharing services breaking the web with data-rich hyperlinks?
  5. Social buttons are breaking search

Visualizing data with Gephi: Abstract interpretations of the Dutch blogosphere #madewithgephi

Abstract interpretation of the Dutch blogosphere 2001 #1

I am currently working on analyzing the Dutch blogosphere with my colleague Esther Weltevrede, with the help of our colleague Erik Borra from the Digital Methods Initiative. In an early exploratory phase Esther and I started to learn how to use Gephi to visualize our data and networks. In one of my early attempts I created this beautifully abstract interpretation of the Dutch blogosphere. Gephi creates design by research!

Abstract interpretation of the Dutch blogosphere 2001 #2

Actual findings and paper will follow in a few weeks!

Article Series - Dutch Blogosphere Analysis

  1. Mapping the Dutch Blogosphere #Bloghelden
  2. Mapping Festival at Mediamatic
  3. Mapping the Dutch Blogosphere at Mapping Ignite
  4. Snapshot of the Dutch Blogosphere December 2010
  5. Visualizing data with Gephi: Abstract interpretations of the Dutch blogosphere #madewithgephi

Mapping the Dutch Blogosphere at Mapping Ignite

On July 9th, Esther Weltevrede and I presented our ongoing research on the Dutch Blogosphere at the Mediamatic Mapping Ignite event. Here are the slides and notes from our five-minute, superfast and condensed Ignite talk on researching and mapping the Dutch Blogosphere.



Slide 1:
Hi, I’m Anne and this is Esther and we are PhD candidates at the University of Amsterdam with the Digital Methods Initiative. We will be showing the first results of a mapping project on the Dutch Blogosphere. It is a work in progress.

Slide 2:
Frank Schaap, who has written on the Dutch blogosphere, distinguishes between two types of blogs: linklogs and lifelogs. Linklogs primarily post links to other websites (right), whereas lifelogs primarily post details about their authors’ personal lives and everyday experiences (left).

Slide 3:
The current Dutch blogosphere, however, seems to be characterized by the many references to social media platforms. Did the Dutch blogosphere transform from link- and lifelogs into platform-oriented blogs?

Slide 4:
Our aim is to map the changing linking practices of blogs in order to empirically analyze this shift. Following the definition of the blogosphere as the collection of all blogs and their interconnections we aim to map and characterize the Dutch blogosphere. So… which blogs?

Slide 5:
Well, good question! Starting points are very important! This collection of blogs is compiled from several expert sources, namely: lists from Frank Schaap, Merel Roze, Flabber, Frank Meeuwsen and Arie Altena.

Slide 6:
We used the Issue Crawler, a software tool that locates and visualizes networks on the web. It crawls the starting points, which means that it follows the hyperlinks from one page to the next, then analyzes and visualizes these connections.
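The basic idea of following hyperlinks and counting who links to whom can be sketched in a few lines. This is a minimal illustration using only the Python standard library; it is not the Issue Crawler's code or its analysis method. The page contents, the hostnames and the helper names (`LinkExtractor`, `outlinks`, `inlink_counts`) are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect the absolute target of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the page URL.
            self.links.append(urljoin(self.base_url, href))


def host(url):
    """Return the lowercase hostname of a URL ('' if it has none)."""
    return urlparse(url).netloc.lower()


def outlinks(page_url, html):
    """Return (source host, target host) pairs, skipping internal links."""
    parser = LinkExtractor(page_url)
    parser.feed(html)
    source = host(page_url)
    return {
        (source, host(link))
        for link in parser.links
        if host(link) and host(link) != source
    }


def inlink_counts(pages):
    """Count, per host, how many distinct other hosts link to it."""
    edges = set()
    for url, html in pages.items():
        edges |= outlinks(url, html)
    return Counter(target for _, target in edges)


# Hypothetical starting points with already-downloaded HTML.
PAGES = {
    "http://blog-a.example/": (
        '<a href="http://twitter.com/a">tweets</a> '
        '<a href="http://blog-b.example/post">a post</a> '
        '<a href="/about">about</a>'
    ),
    "http://blog-b.example/": '<a href="http://twitter.com/b">tweets</a>',
}

counts = inlink_counts(PAGES)
```

In this toy network the platform receives more inlinks than either blog, which is the kind of pattern the maps in this talk make visible at scale.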

Slide 7:
So what is the Dutch blogosphere? It is what the Dutch blogs link to. This means it also includes non-blogs. Moreover, these apparent strangers in our midst characterize the current Dutch blogosphere.

Slide 8:
First of all, there is a densely linked Dutch blogosphere. This snapshot from June 2010 shows the top 100 prominent blogs and related websites including news sites and social media platforms.

Slide 9:
When we zoom in we can see the links between the nodes and clusters made visible. What you see here is a literary cluster that includes professional writers like Ivo Victoria, Merel Roze, and Walter van den Berg.

Slide 10:
This second cluster is a marketing and technology cluster. It includes Bright, Frankwatching, and Dutch Cowboys. The latter is on the fringe of the network cluster because, as you can see, it does not link back.

Slide 11:
In this detailed view of the map we see the prominence of social media platforms in the Dutch blogosphere, including Twitter, Facebook, and YouTube. These platforms are most prominent within the marketing & technology and news & opinion clusters.

Slide 12:
One of the most central nodes, the micro-blogging platform Twitter, is also the largest node in the Dutch blogosphere. When we look at the statistics we see that Twitter receives almost 35 thousand links from the rest of the network.

Slide 13:
When we analyze the links from the current Dutch blogosphere, we see that platforms take a central and prominent position within it. How would one do an analysis of the historical Dutch blogosphere? Was the early 2003 blogosphere indeed organized around lifelogs and linklogs?

Slide 14:
Well, the historical Dutch blogosphere is a work in progress. The first question is: Which starting points to use? We took all the blogs on the Loglijst, a blog indexing site that was started in 2001. The Loglijst scraped and indexed Dutch blogs.

Slide 15:
However, when we checked all the blogs listed in the Loglijst for their response code, or put differently, checked to see if they are still online and alive, we noticed that many popular blogs from 2003 are no longer online.
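A response-code check of this kind could look roughly like the sketch below. It is a minimal illustration using Python's standard `urllib`, not the tool used in this project. The choice to count any 2xx or 3xx code as "alive" is an assumption, and the function names are hypothetical.

```python
import urllib.error
import urllib.request


def response_code(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrives."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 404.
        return error.code
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, timeout or malformed URL.
        return None


def split_alive_dead(urls, fetch=response_code):
    """Split urls into (alive, dead) based on their response code.

    Assumption: any 2xx or 3xx code counts as alive; everything else,
    including no response at all, counts as dead.
    """
    alive, dead = [], []
    for url in urls:
        code = fetch(url)
        if code is not None and 200 <= code < 400:
            alive.append(url)
        else:
            dead.append(url)
    return alive, dead
```

Note that a code of 200 only shows that something answers at the address. A domain that has been taken over by a parking page would still count as alive here, so a manual check remains useful.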

Slide 16:
Fortunately, many of the “dead” blogs live on in the Internet Archive, which has archived millions of pages from 1996 onward. One can revisit blogs from the past through the Wayback Machine, which is the interface to the archive.

Slide 17:
The Internet Archive allows one to search for the history of one specific website or blog and as such privileges single site histories. When entering a URL the output is a list of archived snapshots ordered by date. (Asterisks indicate changes to the website.)

Slide 18:
This is one of the earliest archived Dutch blogs from 1999. We are going to automatically look up all the blogs from the starting list with one of our tools. We will then rip all the links within the blogs and create network visualizations like the ones we have seen before.
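Such an automated lookup could be sketched as follows. This is not the tool mentioned in the talk. It is a minimal illustration that assumes the Internet Archive's public Wayback availability endpoint and the common `web.archive.org/web/<timestamp>/<url>` address form. The choice of 1 July as the target date within each year and the helper names are assumptions made for this example.

```python
import json
import urllib.request
from urllib.parse import urlencode

# One collection per year, as proposed in the talk: 1999 up to and including 2009.
YEARS = range(1999, 2010)


def availability_query(url, year):
    """Build a Wayback availability query for a snapshot near mid-year."""
    params = urlencode({"url": url, "timestamp": f"{year}0701"})
    return "https://archive.org/wayback/available?" + params


def snapshot_url(url, year):
    """Build a Wayback address that resolves to the capture nearest mid-year."""
    return f"https://web.archive.org/web/{year}0701000000/{url}"


def closest_snapshot(payload):
    """Return the archived URL from an availability response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def fetch_closest(url, year, timeout=10):
    """Ask the archive for the snapshot of url closest to mid-year (needs network)."""
    with urllib.request.urlopen(availability_query(url, year), timeout=timeout) as response:
        return closest_snapshot(json.load(response))
```

Calling `fetch_closest(blog, year)` for every blog on the starting list and every year in `YEARS` would give one set of archived pages per year. The snapshot returned is the closest one available, so its actual date should be checked before assigning it to a year.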

Slide 19:
The Dutch blogosphere is an understudied object and we wish to contribute by mapping its history. This proposed study enables us to create collections from the Dutch blogosphere for every year between 1999 and 2009, and to compare and analyze these past states of the Dutch blogosphere.

Slide 20:
Thank you for your attention, kthnxbai, see you on digitalmethods.net