On the Evolution of Methods: Banditry and the Volatility of Methods

I was honored to be invited by The Berkman Center for Internet & Society and the University of St. Gallen to participate in the expert-workshop “Research Methods in the Digitally Networked Information Age” in Brunnen, Switzerland from 10 to 12 May 2010.

Switzerland 2010

Rob Faris and Christian Sandvig

On Tuesday Christian Sandvig and I moderated the “Evolution of Methods” panel in which we addressed two topics: 1. banditry (Sandvig) and 2. the volatility of methods (Helmond).

Banditry

Christian Sandvig proposes banditry as a metaphor for looking at the evolution of methods. We need to celebrate the bandits and ask ourselves how we can become better bandits in order to take banditry seriously. What methods can we borrow or which methods would we like to steal from other disciplines? In line with the banditry metaphor I would like to add a biological notion of evolution to this idea by taking into account the parasitic and symbiotic way of transferring methods or taking them from other disciplines. Banditry in that sense could be considered a parasitic method of transferring methods.

Gerhard Buurman notes that banditry has a negative connotation: it is an angry action that often involves a victim. Instead of thinking about the evolution of methods through a banditry metaphor, it might be more useful to think through the notions of translation and evolution.

Does transferring methods from one discipline to another involve a translation? Different disciplines use different definitions, which complicates the interdisciplinary movement of methods. Adopting methods should ideally involve a process of cultivation: “to improve by labor, care, or study.”1 Cultivated methods have been moved away from the environment or object of study to which they were originally applied and as such are “no longer in the natural state.”2

The volatility of methods

The web has a focus on freshness (see The Perceived Freshness Fetish) and an update culture and as such “Internet methods are incessantly volatile due to the update culture of the Internet itself.”3 Digital methods may be volatile if we build tools (scrapers, crawlers, plugins) on top of devices that change.

There are different data gathering methods. Using the API is the polite way of gathering data, while scraping could be considered the impolite way of harnessing data: “You can arrange digital research methods on a spectrum of niceness. On the one hand you use the industry-provided API. On the other you scrape Facebook for all it is worth.”4 APIs often limit both which information you can retrieve and how much of it. They bring back the notion of scarcity in the digital age, which is often considered to be the domain of abundance. According to Chris Anderson in ‘The Tragically Neglected Economics of Abundance’, “clearly abundance (AKA “plentitude”) is all around us, especially in technology”, but the limits on API calls suggest otherwise.

The Twitter REST API allows general users only 150 requests per hour. Once you pass this number you are temporarily ‘banned’.5 For developers this can be expanded to 20,000 requests per hour by whitelisting an IP address or account, but the update and follower limits remain in place. Social graph and social network analysis applications built on top of Twitter using the API, like Wow.ly and Mailana, still very often hit the API limits. Another important aspect for researchers is that the Twitter Search API is limited: “We also restrict the size of the search index by placing a date limit on the updates we allow you to search. This limit is currently around 1.5 weeks but is dynamic and subject to shrink as the number of tweets per day continues to grow.” Artificial limits cause a scarcity in retrieval methods.
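To make the scarcity concrete: 150 requests per hour averages out to one request every 24 seconds. A research tool that wants to stay on the polite end of the spectrum could keep its own count of requests and pause before the limit is reached. The following is a minimal sketch of such a budget, assuming a sliding one-hour window; the class name `HourlyBudget` and its methods are hypothetical and not part of any Twitter library, and the real API may count requests differently.

```python
from collections import deque


class HourlyBudget:
    """Sliding-window request budget (hypothetical helper for illustration)."""

    def __init__(self, limit=150, window_seconds=3600):
        self.limit = limit            # e.g. 150 requests per hour, as in the text
        self.window = window_seconds  # length of the window in seconds
        self.sent = deque()           # timestamps of requests inside the window

    def wait_time(self, now):
        """Return the number of seconds to wait before the next request."""
        # Forget requests that have already left the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            return 0.0
        # Budget exhausted: wait until the oldest request leaves the window.
        return self.window - (now - self.sent[0])

    def record(self, now):
        """Register that a request was sent at time `now` (in seconds)."""
        self.sent.append(now)
```

In a live tool, `now` would come from `time.monotonic()` and the caller would sleep for the returned number of seconds before sending the next request. Passing the time in explicitly keeps the sketch easy to test without waiting an hour.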

APIs often change, which has major implications for the applications built on top of them. In a worst-case scenario applications may stop functioning, especially if the platform providing the API fails to notify developers. Gowalla developer Ben Dodson wrote an extensive open letter to Gowalla about their lack of communication about API changes:

The major problem with the API is its fluid and changeable nature. Whilst we accept that any application will inevitably have bug fixes and changes, an API is supposed to provide a stable endpoint on which third party services can rely on. (Dodson via Techcrunch)

In a ‘perfect’ networked information ecosystem an API is open and stable for developers and researchers to be able to rely on the continuity of tools.
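Since such stability cannot be taken for granted, a tool can at least notice when the shape of an API response changes instead of silently collecting incomplete data. The sketch below is one possible check, not a description of how any particular application handles this; the function name `missing_fields` and the field names in the usage example are assumptions made for illustration.

```python
def missing_fields(record, expected):
    """Return the expected field names that are absent from an API record."""
    # A non-empty result suggests the API changed and the tool needs attention.
    return sorted(set(expected) - set(record))


# Usage with made-up field names: the third field has disappeared.
record = {"id": 1, "name": "Brunnen"}
print(missing_fields(record, ["id", "name", "checkins_count"]))
```

Running a check like this on every response turns a silent API change into a visible warning, which matters for researchers who need to document when and how their data collection changed.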

In the case of scraping, a seemingly simple interface change can also break the tools built on top of the scraped page. This happened to Scroogle, “serving up privacy-friendly Google search results,” which was built on top of google.com/ie. When Google decided to discontinue IE6 support, the google.com/ie page automatically redirected to http://www.google.com/toolbar/ie8/sidebar.html and Scroogle stopped working. Scroogle has since been brought back to life with the help of its users.6
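A scraper can guard against this kind of silent redirect by comparing the address it asked for with the address it actually ended up at. The following is a minimal sketch under that assumption; the function name `endpoint_moved` is hypothetical, and the comparison deliberately ignores query strings and trailing slashes.

```python
from urllib.parse import urlsplit


def endpoint_moved(requested_url, final_url):
    """Return True if a request ended up on a different host or path."""
    requested, final = urlsplit(requested_url), urlsplit(final_url)
    same_host = requested.netloc.lower() == final.netloc.lower()
    same_path = requested.path.rstrip("/") == final.path.rstrip("/")
    return not (same_host and same_path)


# The redirect described above, using the two addresses from the text.
print(endpoint_moved(
    "http://www.google.com/ie",
    "http://www.google.com/toolbar/ie8/sidebar.html",
))
```

In a live scraper the final address could be read from the response object returned by `urllib.request.urlopen`, for instance via its `geturl()` method. A check like this does not repair the tool, but it tells the researcher that the page being scraped is no longer the page the method was designed for.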

So how can we address the issues of volatile methods caused by the ephemerality of the web? Martina Mertz introduces the notion of plastic methods, methods that are not solid, and methods that can monitor change. Urs Gasser calls for methods that can learn themselves. Sandvig notes that the pace of science is different than the pace of the web. Can scientific methods keep up with the pace of the web?


Eszter Hargittai and Christian Sandvig at the workshop

  1. “cultivating.” Merriam-Webster Online Dictionary. 2010. Merriam-Webster Online. 17 May 2010 <http://www.merriam-webster.com/dictionary/cultivating>
  2. “cultivated.” WordNet Search. 2010. 17 May 2010 <http://wordnetweb.princeton.edu/perl/webwn?s=cultivated>
  3. Helmond paraphrased by Sandvig
  4. Helmond paraphrased by Sandvig
  5. banned implies that you cannot access Twitter but your Twitter activity is actually ‘frozen’ until your rate limit is over
  6. Tuesday evening, thanks to some help from a trio of Scroogle users, Brandt was able to replicate his setup via another page – google.com/search – by adding an extra parameter (“&output=ie”) to the url. “It appears that both methods,” Brandt says, meaning the old and the new, “amount to the same thing.” Metz, Cade, ‘Scroogle scrapes back to life’, The Register, 2010 [accessed 17 May 2010].

15 Replies

  1. Farida Vis

    Thanks for posting this. As someone who is thinking through similar issues, it was good to be reminded of some of the sticking points you highlight, especially in relation to tools and applications built on top of APIs.

    I am also wondering if the bandit metaphor is very helpful and to even take it a step further, whilst certain methods have traditionally been used within certain disciplines, this clinging on to disciplines is just not that helpful.

    Much of this familiar (and safe) disciplinary groundedness is called into question when trying to research online platforms such as YouTube, or in your example Twitter. In my opinion such research is a not-to-be-missed opportunity for researchers across disciplines to work collectively to better grasp these technologies, their users and the larger social implications.

    Whilst my own disciplinary home is the broad church of Media and Cultural studies, I benefitted enormously from working with a computer scientist on my current project. We look at the ways in which young people have used YouTube to respond to the anti-Islam film Fitna, made by right-wing MP Geert Wilders. One of the main issues we faced was the collection of the data and it wasn’t until I came up against a whole host of problems (coding metadata manually!) that I ended up in a spontaneous collaboration with Mike Thelwall, webometrics expert, who built us a (polite) tool using the YouTube/Google API. Such tools, or knowledge thereof, are not readily available to many within ‘my’ field.

    But that’s only the start of it. Whilst Computer and Information Science may be very good at gathering data on a large scale, their focus is quantitative. So whilst it was a phenomenal evolution in the project to all of a sudden have 1413 videos, with all metadata automatically coded, the real questions of what this data actually means, why it matters socially and so on can of course not be answered by such quantitative means. We thus employed a range of further methods (genre analysis, content analysis, network analysis, survey/interviews and other textual analyses) to investigate more deeply what can be said about what is going on with this corpus. The data gathering, although a very important step, is thus only an initial one.

    Moreover, in our collaboration, Mike was not simply reduced to being a technician, but through constant feedback loops he has become an integral part of the project and its methodological development. So much so, that following this project (ending this month) we will continue to work together in this area.

    I like what you say about the ‘plastic’ methods, the ephemeral nature of the web and the possibility for tracking change. Your points about the API are well put and we will face the same problem with our tool, but in line with the open access mentality we have made it available for anyone who wishes to use it, pointing out its limitations and its potential sell-by date. It cannot bypass the search limit imposed by Google, but by widely extending your search terms you can still build a strong corpus with a reasonable level of reliability.

    A conference that you might be interested in, The Social Life of Methods (Mike and I are presenting a methods paper on YouTube there), will hopefully deal with some of the issues you raised: http://bit.ly/acLUs9

    For more information on our project: http://bit.ly/cj0m2G

    • Anne

      @Farida Thank you very much for your extensive and insightful comment. I think you’re absolutely right that interdisciplinary research is important in the current era of humanities and specifically media studies. The field of humanities is changing with an increase of ‘texts’ appearing on the web, which has its own structuring mechanisms, and we need new methods to study and analyze these texts and their structuring mechanisms. These new methods often require skills that previously belonged to computer science or to fields like artificial intelligence.

      At the Digital Methods Initiative we are also blessed to have someone from a science and technology studies background who is not only a collaborator but a full member and one of the main forces behind the new tools and methods. It seems that research groups like the one you describe with Mike may be a key element in developing new methods. Do you maybe have a link to the tool you describe that you’re currently working on?

      Thank you for the link to The Social Life of Methods, it sounds very interesting!

  2. Daniel Haeusermann

    Hi Anne, very nice post!

    I guess the difference between banditry and parasitic behavior and the use of methods from other disciplines is that methods are a public good, whereas bandits and parasites steal private goods. The banditry metaphor is nonetheless nice, not just because the victim will be angry but also because it relates to the effrontery which is necessary on the part of the bandit.

  3. Farida Vis

    For more information on the e-tool, including a user guide and how it can be downloaded along with a set of other webometrics tools developed by Mike Thelwall, please visit: http://lexiurl.wlv.ac.uk/searcher/youtube.html

    Your Digital Methods Initiative looks really great and I wonder if we cannot organise some sort of meet-up in the not so distant future. We might think about doing something together.

    I was quite last minute in sending in my abstract for the Social Life of Methods conference, but I did send out a rather last minute tweet to see who might be interested in putting a panel together (The_Ed cc you in), but I think you missed that one and it was super last minute in any case. Next time!

  4. Anne

    Thanks for the links, I sent them to my colleagues!
