Session 1: Data Production, Usage and Integration
The Guardian was collecting data all the time but not doing anything with it. So we set up a blog (the Guardian Data Blog), thinking that the people using it would be developers. However, “real people” want the data too, not only developers. We put the raw data on the blog and publish it as a Google Spreadsheet, which is easy for people to download and copes well with high traffic. We use known tools, such as ManyEyes, because they are quick and easy. What we try to do is engage the public: journalists used to be the people who created stories, but it is now a mutualized process. E.g., there is a Flickr group where people post visualizations. We also invite the public to participate, as in “Investigate your MP’s expenses”, where we ask people to help review and classify the expenses, crowdsourcing the investigative work. We want to be a source for data and information. Most popular dataset: Dr Who.