All your tweets are belong to us
This is big. The ENTIRE archive containing ALL tweets? The official announcement on the Library of Congress blog actually says “all public tweets”, which suggests that protected accounts and direct messages will not be included. The LoC blog went down under the amount of attention, so the Library posted the announcement on Facebook instead (as it contained more than 140 characters ;)), where a discussion immediately started. Users are either surprised by the acquisition because they don’t see the value in it, or upset because the Library has acquired their personal tweets. However, as Manuel Magaña notes on Facebook, every time you press “tweet” you agree to Twitter’s Terms of Service. Even if Twitter feels like a common good, it is still a company that can sell your personal user-generated content. At the same time, the Library of Congress is a “federal cultural institution and serves as the research arm of Congress” (About), and as such it serves the members of Congress, which may raise critical questions about the use of Twitter’s archive for political purposes and investigations.
Twitter as a historical tool
So how could the LoC tweet archive be used by researchers? Responding to questions about the value of the Twitter archive, Randy Rice describes on Facebook how Twitter may serve historians as a people’s history. At the Digital Methods Initiative we have previously used Twitter to write about the Iran (Green) Revolution, using tweets containing the #iranelection hashtag. At present, Twitter is of very limited use as a source of historical accounts documented by people present at events: Twitter’s search archive only goes back two weeks, and only a custom-built scraper may be able to retrieve older tweets. Building one is not within the skills of the sociologist or historian, but an accessible archive may open up a huge new set of sources. How does one make sense of an enormous database filled with tweets? One way is to scrape the tweets carrying the hashtag of a certain event. Two questions remain: 1. Will the entire archive become public? 2. Will it contain a search function?
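To make the hashtag approach concrete, here is a minimal sketch in Python. It assumes the tweets are already available as a list of records with a "text" field. That format is an assumption, since nothing is known yet about how the LoC archive will be delivered. The function names and the sample tweets are made up for illustration.

```python
import re

def extract_hashtags(text):
    """Return the set of hashtags in a tweet's text, lowercased and without '#'."""
    return {tag.lower() for tag in re.findall(r"#(\w+)", text)}

def filter_by_hashtag(tweets, hashtag):
    """Keep only the tweets whose text contains the given hashtag (case-insensitive)."""
    wanted = hashtag.lstrip("#").lower()
    return [t for t in tweets if wanted in extract_hashtags(t["text"])]

# Hypothetical sample records; a real archive would supply these.
sample_tweets = [
    {"user": "alice", "text": "Crowds gathering downtown #IranElection"},
    {"user": "bob", "text": "Lunch was great today"},
    {"user": "carol", "text": "Following the news closely #iranelection #news"},
]

matches = filter_by_hashtag(sample_tweets, "#iranelection")
for tweet in matches:
    print(tweet["user"], "-", tweet["text"])
```

Matching on the whole hashtag, rather than searching for the bare word anywhere in the text, avoids picking up tweets that merely mention the term or use a longer hashtag that happens to start with it.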
A mosaic of humanity
Jonathan Harris and Sep Kamvar’s “I Want You To Want Me” is an installation that documents our search for love on online dating sites. The public data scraped from dating sites is “a very fertile ground for building a mosaic of humanity”, according to Harris. When we enter our thoughts and feelings into databases, that data can be mined to say something about our culture. And that is exactly what the Library of Congress seems to want. It acknowledges that not only books are part of our cultural heritage, but also the updates on Twitter:
We also operate the National Digital Information Infrastructure and Preservation Program www.digitalpreservation.gov, which is pursuing a national strategy to collect, preserve and make available significant digital content, especially information that is created in digital form only, for current and future generations. (Raymond 2010)