Twitter data available in CSV and JSON with a nice HTML view

Eight months after I requested my own Twitter data through a legal request under European privacy law, Twitter now allows you to download your own tweets through its interface. The archive can be downloaded from the settings page (see this blog post from Twitter), and the downloaded file contains all your tweets from the beginning.

Twitter archive

The tweets are stored in two different formats, CSV and JSON, which makes it a versatile archive for both users and developers to work with. The archive contains not only your own tweets but also tweets you have retweeted; it excludes DMs and favorites. The archive is neatly organized: tweets are stored in one file per month per year, for example 2007_08.js. The .zip file also includes an interface to browse through your archive per year per month:

The JSON export is also used to power the archive browser interface (index.html).
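
For developers who want to work with the JSON export directly, here is a minimal Python sketch. It assumes, based on my own download, that each monthly .js file wraps a JSON array in a JavaScript variable assignment (e.g. Grailbird.data.tweets_2007_08 = [...]) and that the files live under data/js/tweets/:

```python
import json
from pathlib import Path

def load_month(path):
    """Parse one monthly tweet file from the downloadable archive."""
    text = Path(path).read_text(encoding="utf-8")
    # Drop the JavaScript variable assignment before the JSON array.
    _, _, json_part = text.partition("=")
    return json.loads(json_part)

tweets = load_month("data/js/tweets/2007_08.js")  # file layout as in my archive
print(len(tweets), "tweets from August 2007")
print(tweets[0]["text"])
```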

My previous archive, which I received from Twitter, contains more data because back then I requested all the data Twitter keeps about me, which includes direct messages, metadata and logins, IP addresses, contacts, etc. The data that is available per tweet in both archives is quite similar:

Tweet data from old archive

Tweet data from new archive

When comparing my old archive to the new one, what seems to be different, however, is the availability of a retweet count. The old archive contained a line “retweet_count”: *, showing the number of retweets for that particular tweet. This (valuable) data has been removed from the new archive.
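
A quick way to verify such differences yourself is to diff the keys of one parsed tweet from each archive. The two dicts below are hypothetical stand-ins for the real objects:

```python
# Hypothetical stand-ins for one parsed tweet from each archive.
old_tweet = {"id": 1, "text": "hello", "created_at": "...", "retweet_count": 3}
new_tweet = {"id": 1, "text": "hello", "created_at": "..."}

# Compare the sets of available fields in both directions.
print("removed:", sorted(set(old_tweet) - set(new_tweet)))  # ['retweet_count']
print("added:  ", sorted(set(new_tweet) - set(old_tweet)))  # []
```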

Where my username comes from

I have been using the username silvertje for several services and multiple social media platforms for years, and people have been asking me where my username comes from. The answer is IRC. When I first came online, sometime in 1995, I quickly discovered IRC and became an avid user :)

In contrast to current social networking sites and social media platforms, there was no way to “register” a username on IRC. This meant that you either had to hope no one else would use the same name, have a very unique username, or be online all the time and “claim” your username through persistent onlineness. Being on a 14K4 dial-up connection which cost over 5 Dutch guilders (about 5 dollars) per hour, I had a clear disadvantage compared to the American and Finnish IRC users who were able to be online all the time for free through their universities. So every time I used IRC I had to dial up, get online, get on IRC, and hope no one was using my username. The first username I picked was ‘sliver’ (after the Nirvana song), but that one was taken very often. Then I decided on ‘_sliver_’, but that one was also often taken. Then I chose ‘slivertje’ to create a Dutch diminutive, but apparently another Dutch user had been thinking the same thing. Finally, I settled on ‘silvertje’, which means ‘little silver’ in Dutch, which never seemed to be taken on IRC, and I have happily been using it ever since (although I am not on IRC anymore).

Trying out some new services: App.net, State, Branch, Medium, Kippt, Buffer

Over the past couple of weeks I have joined a variety of new services, including App.net, State, Branch, Medium, Kippt, and Buffer.
I recently backed my first Kickstarter-ish project ever and decided to join App.net (AppDotNet, or ADN). People keep asking me whether I think it can ever compete with Twitter, whether it will ever reach critical mass, or whether it will stay a ghost town like Google+. For me the question is not whether ADN will be able to “replace” Twitter; rather, I see it as a reflection of the current zeitgeist. ADN is not simply an ad-free alternative to Twitter. Instead, alternatives to major platforms such as Facebook and Twitter are increasingly gaining momentum. ADN is definitely not the first; think for example of Diaspora (launched as a Facebook alternative) and the service formerly known as Status.Net, which calls itself “a stream oriented social network service” (FAQ). Both services never really went mainstream, maybe because they were both ahead of their time.

ADN, at first glance, seems similar to the latter, but there is one important distinction, which also differentiates it from Twitter: “You can install the StatusNet software that runs on your own servers, since it’s Free and Open Source software. You can make groups, and share privately with those groups.” This allows you to run it on your own server, a decentralized model, while both Twitter and ADN rely on a centralized model, which is very common for the current era of social media platforms. As a platform, ADN operates as software as a service, “a software delivery model in which software and associated data are centrally hosted on the cloud,” and offers an API for developers. The API is the core of ADN, and the current web interface is only one possible way an ADN application or service can look or function. Two great write-ups deal with these issues: first, Dan Wineman describes the relation between the social graph, publishing and aggregation, and how social platforms like Twitter and ADN deal with these differently; second, Orian Marx describes what ADN is, what it could possibly be, and how it differs from its alternatives. Yes, ADN costs 50 dollars (or 100 if you are a developer) and it is still a centralized service, but I can’t even begin to describe what has been developed with the ADN API in less than three weeks.

ADN isn’t the only thing currently brewing as an alternative to Twitter, which is increasingly shutting out other services and third-party developers. Dave Winer hypothetically proposes “a microblogging server that’s a simple install on EC2 or Rackspace or any other easy cloud-based server”; in other words, a decentralized, easy self-install Twitter alternative in the cloud. Another initiative that is currently buzzing in the blogosphere is Tent, “a protocol for open, decentralized social networking,” which looks interesting, but Winer reminds us that “What matters is what software is supporting the protocol, what content is available through it and how compelling is the content.” There is also critique on developing Yet Another Protocol when existing protocols could be used, which reminded me of the following XKCD comic on standards:

My username is @silvertje if you would like to contact me on ADN. I have created a Google Doc which lists about 80 other Dutch ADN users. @adrianus has built Appnetizens streams, a “TweetDeck”-like interface for ADN (for which I did some CSS color advice) with a multiple-column view and tons of other features, such as a “Netherlands” view with all known Dutch users; @frankmeeuwsen has started a blog titled Appdotnet Culture, which documents ADN’s early developer and user culture; and @richardk writes about ADN developments. I’ve also created an IFTTT recipe that allows you to cross-post selectively from Twitter to ADN whenever the tweet contains the hashtag #adn.


I started using Buffer to cross-post some messages from Twitter to ADN using an IFTTT recipe I created: Send Tweets with Hashtag #ADN to App.net via Buffer. However, IFTTT just added ADN as a channel to their service, so I don’t have to pipe everything through Buffer anymore; until I find another use for this service I am putting it on pause.

At first glance, State looks like a Netvibes made for the platform & cloud era. It’s not simply a service to aggregate your streams, because State also allows you to interact with them: you can reply to your tweets and ADN posts, and when you click on a user it brings you to the user profile displayed within State. However, not all actions that can be performed on objects within these platforms are available yet. You can also add RSS feeds, but it is not immediately clear how this works. You can “search” for a feed, where it seems to search the web for your query and then grabs the feed from the results. When I ego-search for myself I get feeds for my Flickr photos, Quora profile, etc., but I cannot seem to find the main feed for my own blog. Adding a custom feed by URL would be a great option. I’ve only used it for a few hours, but I love it so far, and ReadWriteWeb calls it “A Streams App Of The Future.” It looks clean, minimal and good, and they respond very quickly to feature suggestions (they implemented a reply-to-Instagram-photos function after I suggested it on Twitter!), always a bonus :)

Update: Joshua from State kindly answered my question concerning the RSS feature. State is currently using “Google’s Feed API to search for feeds using the text you type into the box,” which interestingly enough brings up the feeds for my presence elsewhere but not my own blog.
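
As far as I understand the (since-deprecated) Feed API, the lookup State describes maps onto its “find” endpoint, which searches the web for feeds matching a plain-text query. A sketch, with the query as a placeholder:

```python
import requests

# Google's AJAX Feed API "find" endpoint, as documented at the time.
resp = requests.get(
    "https://ajax.googleapis.com/ajax/services/feed/find",
    params={"v": "1.0", "q": "anne helmond"},  # placeholder query
)
for entry in resp.json()["responseData"]["entries"]:
    print(entry["title"], "->", entry["url"])
```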



Branch, Medium, Kippt

Branch, Medium and Kippt are three more new platforms I joined recently for publishing, discussing and link sharing, but so far I have merely glanced at them, as one can only spend so much time online.

On a final note, I’m happy to contribute as a female to all these new services, which are dominated by “alpha geeks,” a.k.a. white males, according to BuzzFeed’s latest article on the early adopters of these platforms.

My Notes for Geert Lovink’s book launch of Networks Without a Cause: A critique of Social Media

Anne Helmond & Geert Lovink during Geert Lovink's book launch of Networks Without a Cause: A critique of Social Media. Photo by Sabine Niederer.

The Institute of Network Cultures, Eva van den Eijnde, and I would like to welcome you to the official book launch of Geert Lovink’s new book Networks Without a Cause: A Critique of Social Media. Thank you very much for being here. Today I would like to start with a brief introduction to Geert’s new book and how it relates to his previous work. Afterwards Geert will talk about his new book, followed by a few questions and comments from Eva van den Eijnde and myself, and of course questions from the audience.

Networks Without a Cause is the fourth book by Geert in his series of studies into critical internet culture. For those unfamiliar with Geert’s work: the first book in this series, Dark Fiber (2001), deals with early internet culture, from cyberculture to the dot-com mania. His second book, My First Recession (2003), describes the aftermath of that mania and looks at the transition period from the crash to the early blogging years. His third book, Zero Comments (2008), looks back on the blogging hype that had commenced by then and addresses blogs as an unfolding process of “massification” and blogging as a “nihilistic venture.” It also looks at the Web 2.0 hype, or Web 2.0 mini-bubble, which echoes the dot-com era but also differs from it, as described by Geert. His new monograph, Networks Without a Cause (2012), continues where Zero Comments left off by describing the late Web 2.0 era.

Geert Lovink's net critique series. Publisher: Polity Press, 2012. Design: Studio Leon Loes.

The introduction of Networks Without a Cause starts with the important umbrella question “How do we capture Web 2.0 before its disappearance?” The rise of the real-time signifies a fundamental shift from the static archive and hand-coded HTML websites toward “flow” and the “river” as metaphors of the real-time, where the software of social media platforms automatically generates content flows from the input of its users. Blogs and blog software have played an important role in this shift, with the reverse chronology of blog entries and the river of fresh content produced by RSS feeds. Real-time is a key feature of social media platforms such as Facebook, with its news feed, and Twitter, with its timeline, where content flows by. For researchers this raises the question of how to capture and archive this flow in order to be able to analyze it, and for Geert also the question of “why store a flow?” This relates to the notion of users no longer saving their files for offline retrieval but instead moving, storing and syncing everything in the cloud (think, for example, of Gmail and Dropbox), but also to the question of identity management, because “how do you shape the self in real-time flows?” (p. 11) These and many other questions posed throughout the book are part of a “Net criticism” project that seeks to develop sustainable concepts as individual building blocks that through dialogues and debates “will ultimately culminate in a comprehensive materialist (read: hardware- and software-focused) and affect-related theory.” (p. 22)1

Question 1: Web 2.0 versus social media
Is it a coincidence that a number of books dealing critically with “social media” are coming out at the same time? There is this book, Networks Without a Cause, with its subtitle A Critique of Social Media. There is also The Social Media Reader, a volume on the topic with contributions by well-known authors on the subject, in whose introduction the term Web 2.0 is called a buzzword that on the one hand has been “emptied of its referent, it is an empty signifier: it is a brand” (p. 4)2 but on the other hand encapsulates an aspect of the phenomenon of social media. And finally there is the upcoming book by Andrew Keen, Digital Vertigo, which addresses the threat of the social and the tension between the collective social and the individual in “today’s creeping tyranny of an ever-increasingly transparent social network that threatens the individual liberty.”3 Geert also addresses related issues in his book when he describes “the social as a feature”: “Social media as a buzzword of the outgoing Web 2.0 era is just a product of business management strategies and should be judged accordingly.” (p. 6)

Is Web 2.0 a thing of the past? As a lecturer in the first year of Media Studies at the University of Amsterdam, I was surprised to learn this year that my students were not familiar with the term Web 2.0 at all! Everyone had heard of social media, and everybody except for one privacy-conscious student was a member of Facebook, but none of them had heard of the term Web 2.0. This is also illustrated in the following image:

Social Media versus Web 2.0

What is the relation between Web 2.0 and social media when thinking not only about terminology but also about software, practices and critiques?

Question 2: Comment cultures
While in Zero Comments Geert focused on the average blog with its zero comments, in Networks Without a Cause he focuses on the other end of the power-law diagram and looks at blogs that have reached critical mass. In the introduction he writes how, in Web 2.0, “Current software invites users to leave short statements but often excludes the possibility for others to respond. Web 2.0 was not designed to facilitate debate with its thousands of contributions. […] What the back-office software does is merely measure ‘responsiveness’: in other words, there have been that many users, that much judgment, and that little debate.” (p. 19)

Measuring responsiveness

While blogs offer a form of facilitated debate through the possibility of comments, they are highly hierarchical due to the strict separation of content and comments. On top of that, bloggers are continuously debating how to improve the old blog comment infrastructure in order to deal with the “tragedy of the comments” that has caused some bloggers to shut down their comments.

Geert argues that thinking about the software architecture that shapes the comment ecology is important because software co-produces a social order. Could you further elaborate on current comment cultures, your ideas for going beyond taming the commenters, and the increasingly splintered comment ecology, with the conversation also moving to social media platforms such as Twitter and Facebook with no proper way to connect all these distributed comments back to the original text?

  1. Lovink, Geert. Networks Without a Cause: A Critique of Social Media. Polity Press, 2012.
  2. Mandiberg, Michael (ed.). The Social Media Reader. New York University Press, 2012.
  3. Keen, Andrew. Digital Vertigo: How Today’s Online Social Revolution Is Dividing, Diminishing, and Disorienting Us. Forthcoming, May 2012.

David Gelernter on the lifestream, time, pace and space.

Last year Erik Borra, Taina Bucher, Carolin Gerlitz, Esther Weltevrede and I worked on a project “One day on the internet is enough” which we have since referred to as “Pace Online.”

Pace Online by Erik Borra, Taina Bucher, Carolin Gerlitz, Anne Helmond, Esther Weltevrede

The project aims to contribute to thinking about temporality, or pace, online by focusing on the notion of spheres and distinct media spaces. Pace is not the only question; the respective objects and the relation between objects and pace per sphere are also of interest in this study, both in terms of how the engines and platforms handle freshness and in terms of the currency objects that the engines and platforms use to organize content. Moving beyond the more general conclusion that there are multiple presents, or a multiplicity of time, on the internet, we can try to start specifying empirically how paces differ and overlap. The aim is to specify paces and to investigate the relation between freshness and relevance per media space. The assumption is that freshness and relevance create different paces and that the pace within each sphere and platform is internally different and multiple in itself. (continue reading on the project wiki page)

I was reminded of the project when I read Rethinking the Digital Future, a piece in the Wall Street Journal on David Gelernter and the lifestream. Gelernter describes a particular relationship between streams and pace when talking about the worldstream and an individual stream. In such a subset of the worldstream, things move at a slower pace because individual objects are added less frequently than in the aggregate, the worldstream. We argue something similar in Pace Online, where – translated into Gelernter’s vocabulary – this worldstream consists of different spaces with different paces. Zooming into a space, such as Twitter or Facebook or Flickr, creates a subset within the worldstream. Numerous subsets of subsets may be created, as one can zoom into the stream of Twitter and then zoom further into this stream based on a hashtag or an individual user profile, where each of these subsets of streams has a different pace.

In “Time to start taking the internet seriously” (2010) David Gelernter describes a shift from space to time, and with it the lifestream as the organizing principle of the web: “The Internet’s future is not Web 2.0 or 200.0 but the post-Web, where time instead of space is the organizing principle.” Interestingly enough, he does see a history in the fleeting stream: “Every month, more and more information surges through the Cybersphere in lifestreams — some called blogs, ‘feeds,’ ‘activity streams,’ ‘event streams,’ Twitter streams. All these streams are specialized examples of the cyberstructure we called a lifestream in the mid-1990s: a stream made of all sorts of digital documents, arranged by time of creation or arrival, changing in realtime; a stream you can focus and thus turn into a different stream; a stream with a past, present and future. The future flows through the present into the past at the speed of time.” A stream with a past is something rare: for example, you cannot go back to your first tweet if you have published over 3,200 tweets on Twitter, and you cannot search for tweets more than 14 days old. While Twitter partner Gnip announced “Historical Twitter Data” yesterday, this history of tweets only goes back 30 days. It also points to an interesting relation between the past, present and future of a stream, as it offers the past because we cannot anticipate the future:

“We have solved a fundamental challenge our customers face when working with realtime social data streams,” said Jud Valeski, Co-Founder and CEO of Gnip. “Since you can’t predict the future, it’s impossible to filter the realtime stream to capture every Tweet you need. Hindsight, however, is 20/20. With 30-Day Replay for Twitter, our customers can now replay history to get the data they want.” (Gnip Blog)

This is also one of the problems, or challenges, for researchers using Twitter: it is impossible to predict an event and its hashtag, and most publicly available tools for collecting tweets by hashtag do not go back in time. One can only research the now, not the past.
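
The collection mechanics make this concrete: the streaming API only delivers tweets from the moment a client connects onward, so the collector has to be running before the event. A minimal sketch, assuming Twitter’s statuses/filter streaming endpoint and placeholder OAuth credentials:

```python
import requests
from requests_oauthlib import OAuth1

# Placeholder credentials; the streaming API requires OAuth.
auth = OAuth1("CONSUMER_KEY", "CONSUMER_SECRET",
              "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")

# statuses/filter returns matching tweets from the moment of
# connection onward -- nothing earlier, hence "only the now".
stream = requests.post(
    "https://stream.twitter.com/1.1/statuses/filter.json",
    data={"track": "#examplehashtag"},  # hashtag chosen before the event
    auth=auth,
    stream=True,
)

with open("tweets.jsonl", "a", encoding="utf-8") as archive:
    for line in stream.iter_lines():
        if line:  # the stream sends blank keep-alive lines
            archive.write(line.decode("utf-8") + "\n")
```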

Digital Methods Winterschool 2012: APIs – Variations and Change

After the introduction to APIs and API critiques, Bernhard Rieder talked about APIs from the perspective of “Variation and Change.” This transcript is compiled from collaborative notes by the Digital Methods Initiative.

API: a means and protocol for two systems to exchange data and functionality.

APIs can be seen as data sources and as objects of study that can be historicized, analyzed, critiqued, etc. Before taking the API as a research object, we also need to get a better understanding of “what we can get” out of APIs and assess our level of confidence when researching. Can the API be used as a means to study a service, and possibly the evolution of the Web?
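
One modest way to operationalize “what we can get” is to fetch a single object and inventory the fields the API exposes; everything in this sketch, endpoint included, is hypothetical:

```python
import requests

# Hypothetical endpoint standing in for any platform API.
resp = requests.get("https://api.example.com/v1/items/1")
resp.raise_for_status()
item = resp.json()

# The exposed fields delimit which research questions the API can answer.
for field, value in sorted(item.items()):
    print(f"{field}: {type(value).__name__}")
```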

The ‘past’

Andrew D. Birrell and Bruce Jay Nelson, “Implementing Remote Procedure Calls,” ACM Transactions on Computer Systems 2(1): 39–59, February 1984.

Web services, SOA – XML-RPC, SOAP, WSDL – B2B, e-commerce

Google SOAP Web API: 2002 (Java, .NET), Amazon Web Services: 2002

The history of APIs: they came out of a business context – B2B, e-commerce transactions – to ensure transactional integrity. They were heavy protocols, first written in ‘hard-core’ programming languages such as Java rather than Perl, PHP or JavaScript.
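
For contrast with the lightweight APIs that followed, here is what an RPC-style call of that era looks like in miniature, using Python’s standard XML-RPC client; the server URL and method name are made up:

```python
import xmlrpc.client

# Hypothetical B2B-style endpoint; XML-RPC serializes the call and its
# arguments as XML over HTTP, in the remote-procedure-call tradition.
server = xmlrpc.client.ServerProxy("https://example.com/rpc")
status = server.orders.getStatus(42)  # remote method invocation
print(status)
```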

The ‘turn’

Flickr (Feb 2004), API (Aug 2004): an easy-to-use API, less about transactional integrity.

Google Maps (Feb 2005). The Housing Maps project (March 2005) used two scrapers: Google Maps was reverse-engineered to extract the tiles (the individual images that make up the map), and the housing data was scraped from Craigslist; the two were then combined. Google subsequently hired the developer and released an official API a few months later (June 2005).

ProgrammableWeb has an API directory and lists the most used APIs, which allows for historical comparison. For example, in 2007 there were no social networks in the list; Google Maps was 1st and Flickr 2nd. Now, in 2012, Google Maps is still 1st, but Twitter is 2nd and Facebook is 6th.

The turn also entails a shift from a hard, heavy business logic to a soft logic.

Lines of variation and change

An investigation into synchronous/diachronous lines of variation and change can serve as API critique or historical analysis. Questions may concern the following (a short code sketch of the coverage question follows the list):

  • technical structure and use (how?) – how similar is the technical infrastructure offered to developers to the platform’s own view?
  • intended audience, intended use (who?) – audience: both developers and end-users
  • economic model (why?)
  • restriction and tolerance (legal, technical, transparency, etc.) – restrictions may be explicit or not
  • developer relations (communication, support, etc.) – questions: How do they organize their documentation? How do they communicate it? What does it say about their relationship with users? How does it change over time?
  • publicness and authentication (privacy, ego-view) – Facebook has an open API, the search API. There are variations of authentication.
  • coverage and discrepancy (API, “user view”) – the API and the front end often do not return the same results (see the sketch after this list)
  • read/write capacities (location in the flow) – and possible use of this information to infer how the service views itself vis-à-vis other systems
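
To make the coverage-and-discrepancy question concrete, here is a hypothetical sketch comparing result counts from an API and from the user-facing site; both endpoints are made up, and in practice the front-end count would be scraped from the HTML results page:

```python
import requests

QUERY = "digital methods"

# Hypothetical API search endpoint returning a JSON hit count.
api_total = requests.get(
    "https://api.example.com/search", params={"q": QUERY}
).json()["total"]

# Stand-in for the user-facing count, which would normally be
# scraped from the site's HTML search page.
frontend_total = requests.get(
    "https://example.com/search.json", params={"q": QUERY}
).json()["total"]

print(f"API: {api_total}, front end: {frontend_total}")
print("discrepancy:", abs(api_total - frontend_total))
```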

