Jonathan Gray: Open Data and Data Driven Journalism

Session 1: Data Production, Usage and Integration

From rivalrous goods (print) to non-rivalrous goods (bits).
We need to move beyond datasets used to illustrate reports.

There is an opportunity for an ecosystem of open data:
- small pieces, loosely joined
- easy to reuse, easy to recombine
- lots of contributors/maintainers
- distributed, decentralised
- divide and conquer
- revision, iterative, wiki-like

Making the news:
- Finding new stories in datasets
- Bigger picture by linking datasets
- More pairs of eyes to spot patterns
- Harnessing more external expertise
- Analysing data behind the stories
- Responding to interest from public
- Putting stories into context
- Publishing datasets with stories
- Create new interfaces to data and stories

Spreading the news:
- Visually representing data
- Demand driven delivery
- Datasets for others to reuse
- Enabling users to comment and flag
- Integration with other services
- Connecting data to stories

Data-Driven Journalism: Status and Outlook

Data-driven Journalism: What Is There to Learn?
Organized by the European Journalism Centre

Opening – Mirko Lorenz: Five Ws (and one H)

Premise: data provides a new perspective for journalism.
Where do journalists meet within the new field of data-driven journalism? What is their common ground? Programming journalists versus storytelling journalists.

Data-driven journalism (DDJ) is a workflow in which data is the basis for analysis, visualization and storytelling.

Foggy: Platforms, tools, formats, business models, financing.

Data > filter > visualize > story (value to the public chain)
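To make the chain above a little more concrete, here is a minimal sketch in Python of the four steps as small functions. Everything in it is an assumption for illustration: the sample records, the spending figures, the threshold and the function names are invented and do not come from the talk.

```python
from typing import Dict, List

# Hypothetical sample data: regions and figures are invented for illustration.
RECORDS: List[Dict] = [
    {"region": "North", "spending": 120},
    {"region": "South", "spending": 45},
    {"region": "East", "spending": 300},
    {"region": "West", "spending": 80},
]


def filter_records(records: List[Dict], threshold: int) -> List[Dict]:
    """Filter step: keep only records at or above the threshold."""
    return [r for r in records if r["spending"] >= threshold]


def visualize(records: List[Dict]) -> str:
    """Visualize step: a plain-text bar chart, one '#' per 20 units."""
    return "\n".join(
        f"{r['region']:<6}{'#' * (r['spending'] // 20)}" for r in records
    )


def story(records: List[Dict]) -> str:
    """Story step: turn the filtered data into a one-sentence lead."""
    top = max(records, key=lambda r: r["spending"])
    return f"{top['region']} leads spending with {top['spending']} units."


# Data > filter > visualize > story
selected = filter_records(RECORDS, threshold=100)
chart = visualize(selected)
lead = story(selected)
print(chart)
print(lead)
```

In practice each step would be far richer (real datasets, real charts, real reporting), but the shape of the chain stays the same: every step takes the output of the previous one.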

DDJ as the rescue for the old journalism model of subscriptions and ads that has fallen apart? The future directions of DDJ:
- Reduce time to search
- Minimize need to reformat
- Enable decisions (e.g. politicians, oil cleaning companies)
- Detect anomalies earlier
- Be trustable

How? DDJ is in search of insights into how to structure its material better, in platforms like Holovaty/Everyblock. Inspiration: Alexander L. Holley (1832-1882), who had a deep interest in technology, trained as an engineer and writer, and worked as a foreign technical correspondent for the New York Times.

The Wired Graveyard

My colleague Esther Weltevrede has compiled this excellent list of things Wired has declared dead over the past years.

Wired: intitle:"* is dead" site:wired.com

The Wired Graveyard:
The Web Is Dead
Futurism Is Dead
Skype on Fring Is Dead
The desktop is dead
“Cyberspace” Is Dead
Devo Is Dead
Copyright Dead
DRM Is Dead
The Police Interceptor Is Dead.
Cthulhu Is Dead
Pontiac is Dead
Advertising Dead?
The Viper is Dead
Missile Defense Is Dead
The Western Blot is Dead
eDonkey Is Dead
‘Long-Term Accumulate’ Is Dead
WiFi is Dead
Garfield is Dead
Rock Is Dead
Tim Russert is Dead
Divx Is Dead
Paper Is Dead
iTunes Dead?
PUNK IS NOT DEAD!!!
The Album Is Dead
The Tivo Box Is Dead
Code Red Is Dead
PDA Is Dead
The Creator of the Ramen InstaNoodle Is Dead
Is Interactive Dead?
The TV Is Dead

Thanks @esthr!

How Web 1.0 is the Issuecrawler?

This is the transcript of the Digital Methods Initiative Advanced Program Projects week 2 opening talk on Issuecrawler 1.0 and Social Media by Anne Helmond.

The 2.0 denotes an ‘improved’ or progressional version of the web that builds upon and develops Web 1.0. [...] Implicitly rooted in this vision of the web is a sense of teleological progress, of purposeful and directed development, of continual and designed improvement. (Beer 2009: 986)

Instead of looking at Web 2.0 as the “next” version of the web, we can also look at the changes in the structure of the web, specifically at web native objects. In this view, Web 1.0 consists of the static page, whereas Web 2.0 consists of dynamic pages filled with the web native object of the status update or the post. This may be seen in the blog, and specifically in RSS (which denotes changes to a page), which could be considered a main object of study in the shift from Web 1.0 to Web 2.0, and in the social networking site, with its profiles that display a page (The Wall) filled with posts. An important shift has taken place in the structure of the web: in Web 1.0 hyperlinks mainly link to static pages and objects, and in Web 2.0 the hyperlink links to dynamic pages and objects. This shift affects the way we map and analyze the web.

In general terms, Web 2.0 is a concept that forms part of the lexicon of a range of emerging accounts that commentate on a large-scale shift toward a ‘participatory’ and ‘collaborative’ version of the web, where users are able to get involved and create content. (Beer 2009: 986)

This ‘participatory’ and ‘collaborative’ web has created new objects and new types of hyperlinks that characterize Web 2.0: the subscribe, the like, the share, the number of retweets, the submit to Digg, the save to Delicious, the social network profile, the shortened url, etc. The question also becomes: are these new characteristics forming a new currency of the web? In Links and Power: The Political Economy of Linking on the Web, Jill Walker describes links as the currency of the web and asks what its currency is. Even though there is a black market for links, she notes that “The more common form of trade in this economy of links is barter exchange. Reciprocal linking and link exchange are common practice, and are loosely organised as favours or more systematically in web rings and blogrolling.” (Walker 2002)

Is the hyperlink still the currency of the web in Web 2.0?

If we want to map the current web, how can we use, or adjust, the IssueCrawler to deal with these new objects and new types of links? How do we map a dynamic web? Currently, the IssueCrawler collapses all social networking links from platforms like Twitter and Facebook. Current web mapping and analysis focuses on the interrelations between users on a single platform, for example Twitter, by isolating it. How can we map the current web by looking at these platforms not in isolation but as part of the so-called “ecosystem” they belong to?

The traditional web site is static, but the Internet specializes in flowing, changing information. The “velocity of information” is important — not just the facts but their rate and direction of flow. […] The structure called a cyberstream or lifestream is better suited to the Internet than a conventional website because it shows information-in-motion, a rushing flow of fresh information instead of a stagnant pool. […] Internet culture is a culture of nowness. (Gelernter 2010)

The lifestream is characterized by both time (which we will deal with later) and cross-syndication: the interwoven social media platforms are gathered into a central source. How can we analyze cross-platform syndication? Which tools do we currently have at hand, and which tools do we need to perform such an analysis?

The profile is a common feature of Web 2.0, and is the place where information is gathered about us, our activities, our choices, tastes and preferences and so on. (Beer 2009: 996)

One way into operationalizing the Web 2.0’ifying of the IssueCrawler is to look at the structure of different social networking sites and platforms. Profile structures may be checked by looking into username checkers. A second way is to categorize sites by type of platform instead of by their domain name (.edu, .us, .nl). A third way is to move beyond the hyperlink as the prime object of mapping, as proposed by, for example, Greg Elmer (2006).
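As a rough illustration of the second way, the sketch below classifies the same URL once by domain suffix and once by platform type. It is only a sketch under stated assumptions: the `PLATFORM_TYPES` table, its category labels and the function names are hypothetical, and none of this reflects how the IssueCrawler actually works.

```python
from urllib.parse import urlparse

# Hypothetical lookup table: hosts and category labels are assumptions
# made for this example, not an existing IssueCrawler feature.
PLATFORM_TYPES = {
    "twitter.com": "microblog",
    "facebook.com": "social network",
    "delicious.com": "social bookmarking",
    "digg.com": "social news",
}


def host_of(url: str) -> str:
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def by_domain_suffix(url: str) -> str:
    """Domain-based categorization: .com, .nl, .edu, ..."""
    return "." + host_of(url).rsplit(".", 1)[-1]


def by_platform_type(url: str) -> str:
    """Platform-based categorization, falling back to 'other'."""
    host = host_of(url)
    for platform_host, kind in PLATFORM_TYPES.items():
        if host == platform_host or host.endswith("." + platform_host):
            return kind
    return "other"


for url in ("http://www.twitter.com/example", "http://www.uva.nl/"):
    print(url, by_domain_suffix(url), by_platform_type(url))
```

The domain suffix says little about a Twitter profile (it is simply ".com"), whereas the platform type tells us what kind of object the link points to.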

How are networks formed in 2.0? One could argue that a network is formed through liking, sharing and saving in addition to linking. What are the web native objects and characteristics that form networks in the 2.0? What is the role of platforms in the formation of networks in 2.0? Considering the politics of platforms (Gillespie 2010), are some platforms more central than others? How open or closed are these platforms and how does this affect mapping?
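One way to think about this is to treat the like, the share and the save as typed ties alongside the hyperlink, and to compare the resulting networks. The Python sketch below does this for a handful of invented observations. The actor names, the action labels and the helper functions are all assumptions for illustration, not data from the Dutch blogosphere or output of any existing tool.

```python
from collections import Counter, defaultdict

# Hypothetical observations as (source, action, target) triples.
ACTIONS = [
    ("blog-a", "link", "blog-b"),
    ("blog-a", "like", "blog-c"),
    ("blog-b", "share", "blog-c"),
    ("blog-c", "link", "blog-a"),
    ("blog-b", "save", "blog-a"),
]


def build_network(actions, kinds):
    """Keep only ties whose action type is in `kinds`."""
    network = defaultdict(set)
    for source, kind, target in actions:
        if kind in kinds:
            network[source].add(target)
    return network


def indegree(network):
    """Count how many distinct actors point at each target."""
    counts = Counter()
    for targets in network.values():
        counts.update(targets)
    return counts


links_only = indegree(build_network(ACTIONS, {"link"}))
all_actions = indegree(build_network(ACTIONS, {"link", "like", "share", "save"}))
print("links only: ", dict(links_only))
print("all actions:", dict(all_actions))
```

In this toy data, blog-c receives no hyperlinks at all and would be invisible in a link-only map, yet it is among the most referenced actors once likes and shares count as ties.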

The text above describes several meta-issues, which would translate into the following projects:

  1. Issuecrawler 2.0 > How to deal with the 2.0 in the network?
  2. Types of 2.0 links/The link 2.0 > Is the hyperlink still the currency of the web in Web 2.0? How to compare recommendation objects? Hyperlink vs. the like or the share? What do they do to the quality of the web?
  3. Cross-platform syndication > cross-spherical comparison of platforms? Content circulation analysis has become difficult in the social web.
  4. Platform dependency > Changing linking practices > Dutch Blogosphere. How and where to find issues in 2.0? How do you define what an actor is?

Video from my presentation on Identity 2.0

Last week I was invited by Sven Goyvaerts from the Transmedia Postgraduate Program in Arts + Media + Design to give a lecture on Identity 2.0 as part of the Social Media & the Avatar Day, organized at the Memories of the Future symposium in Vooruit Ghent, Belgium.

ANNE HELMOND / Identity 2.0 from sven g on Vimeo.

June 25th 2010 – 1h30min lecture presentation + group discussion

Anne Helmond is a New Media PhD candidate with the Digital Methods Initiative at the Media Studies department at the University of Amsterdam, where she studied New Media from 2004 to 2008. For our Social Media & the Avatar Day, organized at the Memories of the Future symposium in Vooruit Ghent, we invited Anne to elaborate on her recent paper IDENTITY 2.0 – Constructing identity with cultural software.

Mapping Festival at Mediamatic

Mediamatic is organizing a three-day mapping festival where Esther Weltevrede and I will present our research on the Dutch blogosphere at the Mapping Ignite evening.

“Map Fest takes place at Mediamatic on July 6, 8 and 9. Map Fest brings together kindred spirits to explore, create, define and oppose maps.”

Day 1 is Mapping for Change. Day 2 is Mapping for Clarity with our Professor Richard Rogers. Day 3 is Mapping Ignite with super-fast-speedy-wonderful lightning talks including one by Esther and me!

Come and join us!

Sneak preview:

[Image: Snapshot of the Dutch blogosphere 27th June 2010, with a marketing & technology blog cluster]