What does Twitter know about me? My .zip file with 50Mb of data
Three weeks ago I read a tweet from @web_martin, who had requested all his data from Twitter under European law and received it as a .zip file. He linked to the Privacy International blog, which describes step by step how to request your own data. On March 27, 2012 I initiated my request following the instructions from the Privacy International blog, which included faxing Twitter (fortunately I work at the Media Studies department) a copy of a photo ID to verify my request. I blanked out all personal info and kept only my picture and name visible. Within a day, after verification of my identity, I received an email reply with instructions on how to get my own basic data. These instructions were basically API calls, which provide very limited data.
While the above did not provide me with any new information, I did appreciate Twitter's quick response pointing out how to get publicly accessible data through the API. However, I was more interested in the data that they keep but do not allow me to access directly, that is, without a legal request. Three weeks later, well within the 40-day timeframe, Twitter sent me a big .zip file with all my data. They explained in detail in their email what is in the .zip file:
The previously emailed API calls are also in other-sources.txt in the .zip file and provide a way into the “real time data” in contrast to the archived data: “Certain information about your Twitter account is available in real time via the links below or by making the following API calls while authenticated to your Twitter account.”
Let’s briefly go into some findings:
silvertje-contacts.txt
- Contains all the contacts in my phone, which is a Google phone, so it has my complete Gmail address book, enabled by the ‘Find Friends’ feature. The file lists 152 phone numbers and 1186 e-mail addresses. I must have used the ‘Find Friends’ feature once, probably when I first installed the official Twitter Android app. After becoming aware that Twitter copies your complete address book, I have avoided this feature and similar features in other applications and on other social media platforms. However, my data is still being kept by Twitter and there is no way to delete it. Twitter knows all my friends and acquaintances. Update: learn how to remove this data.
silvertje-dms.txt:
- The first DM in this file is from 2009: created_at: Tue Nov 24 19:33:12 +0000 2009
- Unfortunately I have no way to check whether this file contains deleted DMs because I cannot access old DMs through the new interface anymore.
- Lists all logins to my Twitter account and associated IP addresses between February 1, 2012 – April 12, 2012.
- Quite a few of the listed IP addresses resolve to: Host: ec2-107-20-112-109.compute-1.amazonaws.com. Country: United States. Any idea what this might be? An external service I have authorized to access my Twitter account that uses Amazon Web Services?
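The hostname quoted above already carries the IP address in its first label, so it can be taken apart without any network access. Here is a minimal Python sketch, assuming the `ec2-a-b-c-d.<label>.amazonaws.com` naming pattern seen in that one hostname; the function name `parse_ec2_hostname` and the regular expression are my own illustration, not an official Amazon specification.

```python
import re

# Assumed pattern, derived from the single hostname quoted in the post:
# ec2-<four dash-separated octets>.<one or more labels>.amazonaws.com
EC2_HOST = re.compile(
    r"^ec2-(\d{1,3})-(\d{1,3})-(\d{1,3})-(\d{1,3})\.([a-z0-9.-]+)\.amazonaws\.com\.?$"
)

def parse_ec2_hostname(hostname):
    """Return (ip, label) if the name looks like an EC2 public hostname, else None."""
    match = EC2_HOST.match(hostname.strip().lower())
    if match is None:
        return None
    octets = match.groups()[:4]
    # Reject anything that cannot be a valid IPv4 octet.
    if any(int(octet) > 255 for octet in octets):
        return None
    return ".".join(octets), match.group(5)

print(parse_ec2_hostname("ec2-107-20-112-109.compute-1.amazonaws.com"))
```

To go the other way, from an IP address in the login file to a hostname, Python's standard `socket.gethostbyaddr` performs the reverse lookup, but that needs a network connection and the answer can change over time.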
silvertje-tweets.txt
- This almost 50MB text file contains all my tweets. All 47455 of them.
My computer had a hard time opening this large text file:
The collection is a really readable and searchable archive of all my tweets. It contains the ID of every tweet, so you can also easily view the tweet on Twitter by filling in the following permalink: https://twitter.com/#!/username/status/tweetid. Here’s my first tweet:
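Filling in that permalink can be scripted for every ID in the file. A minimal Python sketch follows; the function name `tweet_permalink` is made up for illustration and the ID used is a placeholder, not a real tweet ID from my archive.

```python
def tweet_permalink(username, tweet_id):
    """Build a permalink in the format quoted in the post."""
    return "https://twitter.com/#!/{}/status/{}".format(username, tweet_id)

# "1234567890" is a placeholder ID for illustration only.
print(tweet_permalink("silvertje", "1234567890"))
```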
Finally joined Twitter. Working at Lowlands for VPRO 3voor12. Just finished my last photos: M.I.A. was great!
— Anne Helmond (@silvertje) August 18, 2007
Here’s an overview of what is contained for every tweet:
While this is a rather rigorous method to retrieve your own data I do hope that more (European) users will request their own data and as a consequence further open up the debate about being able to easily download your own data from a service.
To start archiving your own tweets I recommend using ThinkUp, “a free, open source web application that captures all your activity on social networks like Twitter, Facebook and Google+.” Because it works by scheduling API calls, and the Twitter API only allows you to fetch your latest 3200 tweets, it cannot retrieve all your older tweets, but it does build a good archive from now on.
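To get a feeling for what the 3200-tweet limit means in my case, here is a small Python calculation using only the two numbers mentioned in this post; the variable names are my own.

```python
TOTAL_TWEETS = 47455  # tweets in silvertje-tweets.txt
API_LIMIT = 3200      # latest tweets the API lets you fetch, as stated in the post

reachable = min(TOTAL_TWEETS, API_LIMIT)
unreachable = TOTAL_TWEETS - reachable
share = 100.0 * reachable / TOTAL_TWEETS

print("reachable via the API:", reachable)
print("only in the requested archive:", unreachable)
print("share reachable: {:.1f}%".format(share))
```

In other words, 44,255 of my tweets could only be recovered through the legal request.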
Update 1: As my colleague Bernhard Rieder points out, the data is in JSON format and can be directly picked up with a script without parsing. That opens up possibilities to further use, process and analyze the data.
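As a rough illustration of what picking up the data with a script could look like, here is a minimal Python sketch. It assumes each record is a JSON object with a `created_at` field in the timestamp format shown for the first DM; the field name `text`, the sample text and the function name `load_record` are assumptions for illustration, and the actual layout of the files may differ.

```python
import json
from datetime import datetime

# Hypothetical record: only the created_at value comes from the post
# (the first DM in silvertje-dms.txt); the "text" field is a placeholder.
sample = '{"created_at": "Tue Nov 24 19:33:12 +0000 2009", "text": "placeholder text"}'

def load_record(raw):
    """Parse one JSON record and turn created_at into a timezone-aware datetime."""
    record = json.loads(raw)
    record["created_at"] = datetime.strptime(
        record["created_at"], "%a %b %d %H:%M:%S %z %Y"
    )
    return record

record = load_record(sample)
print(record["created_at"].isoformat(), record["text"])
```

Once the timestamps are real datetime objects, sorting, filtering by period or counting tweets per month becomes straightforward.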
Update 2: This morning The Guardian published an interview with Tim Berners-Lee, who calls on people to “demand your data from Google and Facebook”, and from Twitter of course.
Update 3: One of the major Dutch newspapers, NRC, has written a story about this case: 50 MB aan tweets, adressen en al je nummers. Dit is wat Twitter van je weet. (“50 MB of tweets, addresses and all your numbers. This is what Twitter knows about you.”)
Update 4: This is Facebook’s automatic answer to my request: http://pastebin.com/xe0LvJJY. In other words: “We’ll fix it with a new tool in a few months.” They do not give a timeframe in which I can expect this new tool, nor do I expect the tool to give me full access to my data. The Europe versus Facebook group, where I got my instructions from, notes the following: “Facebook has made it more and more difficult to get access to your data. The legal deadline of 40 days is currently ignored. Users get rerouted to a “download tool” that only gives you a copy of your own profile (about 22 data categories of 84 categories). You can make a complaint to the Irish Data Protection Commission, but the Commission seems to turn down all complaints that were filed. Therefore we have now also posted forms which allow you to complain at the European Commission if the Irish authority does not enforce your right to access.”
I do not expect to get my data from Facebook within 40 days, or at all, and I do plan to file a complaint with the Irish Data Protection Commission and the European Commission if they fail to comply with my request.
Update 5: An easy solution to remove your contacts from Twitter.
What does Twitter know about me? My .zip file with 50Mb of data « Anne Helmond http://t.co/ej6l6Cw7
@davewiner Maybe JSON? I got all my Twitter data from Twitter (50.000 tweets) in JSON: http://t.co/7wJXzm89
Missed this previously – the results of a data protection subject access request to twitter http://t.co/pmZRBYkG via @charlesarthur
[...] What does Twitter know about me? My .zip file with 50Mb of data « Anne Helmond Three weeks ago I read a tweet from @web_martin who had requested all his data from Twitter under European law and received a .zip file with his data from Twitter. He linked to the Privacy International blog which has written down step by step how to request your own data. [...]
What twitter knows about you (and your contacts): http://t.co/mxHw4wxo
What Twitter knows about you; http://t.co/7HdOJmGZ
@ictrecht As an experiment I want to repeat this: http://t.co/7wJXzm89 with someone from the USA requesting his tweets, curious about your expertise
Anne Helmond>> What does Twitter know about me? My .zip file with 50Mb of data: http://t.co/0XPI572v #medialaw
What does Twitter know about me? http://t.co/pFhukPeT
What does Twitter know about me? My .zip file with 50Mb of data http://t.co/8vgwYsDM
@frak and this: http://t.co/bKeKFduC
[...] is. Even if you think you have deleted it. Anne Helmond requested all her data from Twitter and came to the conclusion that even tweets she had deleted were still in this database [...]
@babetterumt Scrolling doesn't work. There's a limit on it. This does: http://t.co/RiGjeNsf
What does Twitter know about you? @silvertje requested all her data from Twitter under EU law & received 50MB of data http://t.co/OyAjm3FS
What does Twitter know about you…
http://t.co/iLVPi8JC
Twitter launches an option to download all your tweets http://t.co/NRQCjjHS So no difficult route like @silvertje's http://t.co/og8PvMaO
Look! Someone doing their civic duty. http://t.co/og8PvMaO
[...] months after I requested my own Twitter data from Twitter through a legal request under the European privacy law, Twitter now allows you to download your own tweets through their [...]
@Bjorn_W That is not via a tool. I requested it from Twitter via a legal request, see: http://t.co/7wJXzm89
[...] a way to get my tweet archive, I came across this article by the Dutch Anne Helmond, with a step-by-step guide to requesting a copy of all your data from the legal department of [...]