<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Anne Helmond &#187; web archiving</title>
	<atom:link href="http://www.annehelmond.nl/tag/web-archiving/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.annehelmond.nl</link>
	<description>Anne Helmond. New Media Research Blog</description>
	<lastBuildDate>Thu, 12 Jan 2012 20:06:53 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<atom:link rel='hub' href='http://www.annehelmond.nl/?pushpress=hub'/>
		<item>
		<title>Archive 2020: Esther Weltevrede &#8211; Archiving Web Dynamics</title>
		<link>http://www.annehelmond.nl/2009/05/19/archive-2020-esther-weltevrede-archiving-web-dynamics/</link>
		<comments>http://www.annehelmond.nl/2009/05/19/archive-2020-esther-weltevrede-archiving-web-dynamics/#comments</comments>
		<pubDate>Tue, 19 May 2009 21:39:58 +0000</pubDate>
		<dc:creator>Anne</dc:creator>
				<category><![CDATA[Digital Methods Initiative]]></category>
		<category><![CDATA[Events]]></category>
		<category><![CDATA[archive2020]]></category>
		<category><![CDATA[cyberspace]]></category>
		<category><![CDATA[dmi]]></category>
		<category><![CDATA[esther weltevrede]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[internet archive]]></category>
		<category><![CDATA[national webs]]></category>
		<category><![CDATA[web archiving]]></category>

		<guid isPermaLink="false">http://www.annehelmond.nl/?p=769</guid>
		<description><![CDATA[Internet researchers are confronted with an unstable object of study: the web is ephemeral. The question is how to make the medium permanent so that we can study it with care. The shape of the archive informs what we can ask of it. This perspective on archives is placed within Weltevrede&#8217;s research into National Webs. [...]]]></description>
			<content:encoded><![CDATA[<p><a target="_blank" href="http://www.flickr.com/photos/silvertje/3545352811/" title="Archive 2020 by Anne Helmond, on Flickr" ><img src="http://farm3.static.flickr.com/2459/3545352811_7058cd69ae.jpg" alt="Archive 2020" width="500" height="332" /></a><br />
Internet researchers are confronted with an unstable object of study: the web is ephemeral. The question is how to make the medium permanent so that we can study it with care. The shape of the archive informs what we can ask of it.</p>
<p>This perspective on archives is placed within Weltevrede&#8217;s research into National Webs. To think nationally with the web might seem counterintuitive at first because dominant ideas of the web are so global. This originates from the 1990s idea of cyberspace as a universal space, with its notions of disembodiment and identity play. Crucially, cyberspace is a place that is disembedded from reality. After 2000 cyberspace was confronted with what Weltevrede calls &#8220;the national turn.&#8221;</p>
<p>This may be seen in a number of places. Probably the most familiar is that Google.com redirects you to the version for your location, for example Google.nl, where you get a totally different results page. Another example is the message &#8220;This video is not available in your country&#8221;: intellectual property is a dominant force in the nationalization of web content. You might also think in terms of language. English used to be the dominant universal language, but there is now a lot of clustering on the web based on shared language.</p>
<p>To move to the web archive, the most exhaustive project in the field is the Internet Archive, which originates from the cyberspace period (1996). This can also be seen in how the archive was set up. First of all, the scope of the collection is the &#8220;whole&#8221; internet, which is a very broad collection aim. Secondly, when you look at the interface of the archive, <a target="_blank" href="http://www.archive.org/index.php" title="IA" >the Wayback Machine</a>, what you immediately notice is that you query it by URL and browse from that point on. It is characterized by browsing instead of the currently dominant form: searching. The Internet Archive therefore privileges single-site histories over research into a site&#8217;s context.</p>
<p>The Internet Archive emerged from the web company Alexa, which provides all the crawls and donates them to the archive. This means that the selection of sites is based on traffic data. If you have the Alexa toolbar installed, every page you visit will be included in the archive. It is a very smart way to start thinking about which pages should be included in the archive. After the Internet Archive was founded in 1996, a number of initiatives emerged with a national focus. The general thought behind them was that national web archives can best serve local wishes and demands and serve the community (researchers, general public) best.</p>
<p>As an example we will look at a Dutch web archive maintained by the <a target="_blank" href="http://www.kb.nl/index-en.html" title="KB" >Royal Library of the Netherlands</a>, the KB. Before we go into the actual project, let&#8217;s get a sense of the size of the Dutch web. The .nl domain is the fourth largest country domain with 3.2 million sites, an enormous amount.</p>
<p><a target="_blank" href="http://www.flickr.com/photos/silvertje/3545360557/" title="Archive 2020 by Anne Helmond, on Flickr" ><img src="http://farm4.static.flickr.com/3301/3545360557_3beee6ee07.jpg" alt="Archive 2020" width="500" height="332" /></a></p>
<p><strong>How to demarcate the national web</strong></p>
<ol>
<li>.nl is the 4th largest country domain</li>
<li>A second way to look at the national web (you could argue that .nl is not the whole Dutch web) is to look at all the domains registered by the Dutch (sidn.nl 2008)</li>
<li>Which sites do we Dutch people find relevant? We can look at the most visited websites as listed by Alexa, where importance is measured by the number of visits.</li>
</ol>
<p>These are three ways to define the national web by web means. The Royal Library defines the national aspect differently: it created a new definition of what counts as Dutch content.</p>
<ul>
<li>A: Website in Dutch, registered in the Netherlands</li>
<li>B: Website in another language, registered in the Netherlands</li>
<li>C: Website in Dutch, registered in another country</li>
<li>D: Website in another language, registered in another country, topic aimed at the Netherlands.</li>
</ul>
<p>All of these options seem technically feasible except for the last one. We cannot technically or automatically identify content that is aimed at the Netherlands. This makes it highly unlikely that this Dutch web can be archived. What the Royal Library has done is set this definition aside and manually select sites. They started with 100 sites, which grew to 400 and now just over 1,000. They archive those sites really well.</p>
<p>As an internet researcher Weltevrede is particularly interested in the dynamics of websites. The question she would like to put forward is: how else can we approach the object of collection, the Dutch web?</p>
<p><a target="_blank" href="http://www.flickr.com/photos/silvertje/3545359413/" title="Archive 2020 by Anne Helmond, on Flickr" ><img src="http://farm3.static.flickr.com/2462/3545359413_5f368140ea.jpg" alt="Archive 2020" width="500" height="332" /></a></p>
<p>If you start web archiving, the easiest and most effective method is to follow the possibilities of the medium. You can automate a lot of things, and besides that you can also focus on the context and prominence of a website in a particular period. The first point calls attention to the challenge of developing methods that follow the medium to automate the collection process. You could schedule recurring queries for &#8220;.nl&#8221; on Google.nl, because Google takes into account what is relevant: the links to a website. These sites are considered relevant not only by Google but by a large group of people. Hyperlink structures are human acts of association; links die and emerge. What would that information tell us about the context and its network? If you scheduled the query over time you could see the relevance of a particular source in a particular period. It would provide context for sources or websites, the born digital.</p>
<p><strong>The final questions are:</strong></p>
<ul>
<li>What would the national Web archive look like when the focus is on capturing hyperlinks, search engine results, and other digital objects?</li>
<li>What aspects besides the digital document are relevant to save and why?</li>
<li>Can we learn from how born digital devices (e.g. search engines, platforms and recommendation systems) make use of the objects, and if so, how can such uses be repurposed for Web archiving?</li>
</ul>
<p><a target="_blank" href="http://www.flickr.com/photos/silvertje/3545358105/" title="Archive 2020 by Anne Helmond, on Flickr" ><img src="http://farm3.static.flickr.com/2178/3545358105_e3073fd18a.jpg" alt="Archive 2020" width="500" height="332" /></a></p>
<p>Final personal note: The day after this presentation (this morning) my friend and colleague Esther Weltevrede graduated cum laude from the University of Amsterdam with her research on Archiving Web Dynamics. She will continue her research on <a target="_blank" href="http://wiki.digitalmethods.net/Dmi/NationalWebConditionDiagnostics" title="NationalWebConditionDiagnostics" >National Webs</a> as a PhD candidate with the <a target="_blank" href="http://www.digitalmethods.net/" title="DMI" >Digital Methods Initiative</a>. Congratulations Esther!</p>
<div class='series_toc'><h4><strong>Article Series - Archive 2020 </strong></h4><ol><li><a href="http://www.annehelmond.nl/2009/05/19/archive-2020-introduction-by-annet-dekker/"  title='Archive 2020: Introduction by Annet Dekker'>Archive 2020: Introduction by Annet Dekker</a></li><li><a href="http://www.annehelmond.nl/2009/05/19/archive-2020-christiane-paul-whitney-artport/"  title='Archive 2020: Christiane Paul &#8211; Whitney Artport'>Archive 2020: Christiane Paul &#8211; Whitney Artport</a></li><li><a href="http://www.annehelmond.nl/2009/05/19/archive-2020-eric-kluitenberg-the-living-archive/"  title='Archive 2020: Eric Kluitenberg &#8211; The Living Archive'>Archive 2020: Eric Kluitenberg &#8211; The Living Archive</a></li><li><a href="http://www.annehelmond.nl/2009/05/19/archive-2020-olga-goriunova-runmeorg-reversion/"  title='Archive 2020: Olga Goriunova &#8211; Runme.org Reversion'>Archive 2020: Olga Goriunova &#8211; Runme.org Reversion</a></li><li><a href="http://www.annehelmond.nl/2009/05/19/archive-2020-monika-fleischmann-netzspannungorg/"  title='Archive 2020: Monika Fleischmann &#8211; Netzspannung.org'>Archive 2020: Monika Fleischmann &#8211; Netzspannung.org</a></li><li>Archive 2020: Esther Weltevrede &#8211; Archiving Web Dynamics</li></ol></div> <div class='series_links'><a href="http://www.annehelmond.nl/2009/05/19/archive-2020-monika-fleischmann-netzspannungorg/"  title='Archive 2020: Monika Fleischmann &#8211; Netzspannung.org'>Previous in series</a> </div>]]></content:encoded>
			<wfw:commentRss>http://www.annehelmond.nl/2009/05/19/archive-2020-esther-weltevrede-archiving-web-dynamics/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Archive 2020: Introduction by Annet Dekker</title>
		<link>http://www.annehelmond.nl/2009/05/19/archive-2020-introduction-by-annet-dekker/</link>
		<comments>http://www.annehelmond.nl/2009/05/19/archive-2020-introduction-by-annet-dekker/#comments</comments>
		<pubDate>Tue, 19 May 2009 15:30:38 +0000</pubDate>
		<dc:creator>Anne</dc:creator>
				<category><![CDATA[Events]]></category>
		<category><![CDATA[archive2020]]></category>
		<category><![CDATA[born digital]]></category>
		<category><![CDATA[virtueel platform]]></category>
		<category><![CDATA[web archiving]]></category>

		<guid isPermaLink="false">http://www.annehelmond.nl/?p=763</guid>
		<description><![CDATA[On Monday 18 May 2009 in Amsterdam Virtueel Platform organized Archive 2020, an international event on the archiving of born digital cultural content. Born digital content describes digital materials that originated in the digital realm, and have no print or analog counterpart. (see full project description) Virtueel Platform bought second-hand diskettes at Marktplaats [...]]]></description>
			<content:encoded><![CDATA[<p><a target="_blank" href="http://www.flickr.com/photos/silvertje/3546180428/" title="Archive 2020 by Anne Helmond, on Flickr" ><img src="http://farm3.static.flickr.com/2153/3546180428_30f2f36f82.jpg" alt="Archive 2020" width="500" height="332" /></a></p>
<p>On Monday 18 May 2009 in Amsterdam Virtueel Platform organized Archive 2020, an international event on the archiving of born digital cultural content. Born digital content describes digital materials that originated in the digital realm, and have no print or analog counterpart. (<a target="_blank" href="http://www.virtueelplatform.nl/en/#2489" >see full project description</a>)</p>
<p>Virtueel Platform bought second-hand diskettes at Marktplaats and transformed them into name badges. I received a badge made from disk 1 of a Windows 3.1 installation set, which I will never be able to use because I don&#8217;t own any equipment with a diskette drive and I don&#8217;t have the other disks. The problems archives are facing are thus addressed before coffee is even served.</p>
<p>Annet Dekker opens the Archive 2020 expert meeting with the remark that Virtueel Platform was slightly surprised that archiving is still hot. Recent publications about the topic point to the urgency of archiving in the case of <a target="_blank" href="http://www.abc.net.au/news/stories/2009/05/07/2563251.htm" >Australia&#8217;s online history &#8216;facing extinction.&#8217;</a> The Wired article on &#8216;<a target="_blank" href="http://www.wired.com/epicenter/2008/12/forget-storage/" >Forget Storage</a>, If You Want Files to Last Try Movage&#8217; includes <a target="_blank" href="http://www.kk.org/thetechnium/archives/2008/12/movage.php" >Kevin Kelly&#8217;s</a> somewhat poetic approach to archiving which he describes as &#8220;in, out, in, out. Copy, move, copy, move.&#8221;</p>
<p>The title of the event refers to two things:</p>
<ol>
<li>The archive as potentially envisioned in 2020. This includes the idea that 2020 is just a year and that the internet as we know it will not be there anymore.</li>
<li>20/20 also means perfect vision: archives are looking for a perfect vision.</li>
</ol>
<p><a target="_blank" href="http://www.flickr.com/photos/silvertje/3545375539/" title="Archive 2020 by Anne Helmond, on Flickr" ><img src="http://farm4.static.flickr.com/3413/3545375539_42499a9ae3.jpg" alt="Archive 2020" width="500" height="332" /></a></p>
<div class='series_toc'><h4><strong>Article Series - Archive 2020 </strong></h4><ol><li>Archive 2020: Introduction by Annet Dekker</li><li><a href="http://www.annehelmond.nl/2009/05/19/archive-2020-christiane-paul-whitney-artport/"  title='Archive 2020: Christiane Paul &#8211; Whitney Artport'>Archive 2020: Christiane Paul &#8211; Whitney Artport</a></li><li><a href="http://www.annehelmond.nl/2009/05/19/archive-2020-eric-kluitenberg-the-living-archive/"  title='Archive 2020: Eric Kluitenberg &#8211; The Living Archive'>Archive 2020: Eric Kluitenberg &#8211; The Living Archive</a></li><li><a href="http://www.annehelmond.nl/2009/05/19/archive-2020-olga-goriunova-runmeorg-reversion/"  title='Archive 2020: Olga Goriunova &#8211; Runme.org Reversion'>Archive 2020: Olga Goriunova &#8211; Runme.org Reversion</a></li><li><a href="http://www.annehelmond.nl/2009/05/19/archive-2020-monika-fleischmann-netzspannungorg/"  title='Archive 2020: Monika Fleischmann &#8211; Netzspannung.org'>Archive 2020: Monika Fleischmann &#8211; Netzspannung.org</a></li><li><a href="http://www.annehelmond.nl/2009/05/19/archive-2020-esther-weltevrede-archiving-web-dynamics/"  title='Archive 2020: Esther Weltevrede &#8211; Archiving Web Dynamics'>Archive 2020: Esther Weltevrede &#8211; Archiving Web Dynamics</a></li></ol></div> <div class='series_links'> <a href="http://www.annehelmond.nl/2009/05/19/archive-2020-christiane-paul-whitney-artport/"  title='Archive 2020: Christiane Paul &#8211; Whitney Artport'>Next in series</a></div>]]></content:encoded>
			<wfw:commentRss>http://www.annehelmond.nl/2009/05/19/archive-2020-introduction-by-annet-dekker/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Goodbye Geocities: On Archiving Websites</title>
		<link>http://www.annehelmond.nl/2009/04/28/goodbye-geocities-on-archiving-websites/</link>
		<comments>http://www.annehelmond.nl/2009/04/28/goodbye-geocities-on-archiving-websites/#comments</comments>
		<pubDate>Tue, 28 Apr 2009 21:34:05 +0000</pubDate>
		<dc:creator>Anne</dc:creator>
				<category><![CDATA[Digital Methods Initiative]]></category>
		<category><![CDATA[Events]]></category>
		<category><![CDATA[Software Studies]]></category>
		<category><![CDATA[aesthetics]]></category>
		<category><![CDATA[geocities]]></category>
		<category><![CDATA[nostalgia]]></category>
		<category><![CDATA[web archiving]]></category>

		<guid isPermaLink="false">http://www.annehelmond.nl/?p=751</guid>
		<description><![CDATA[In 1996 I created one of my first homepages, a tribute website to the Canadian band Eric&#8217;s Trip. I was able to claim a beautiful Geocities url: http://www.geocities.com/SunsetStrip/3500/ It&#8217;s one big piece of pure nostalgia and 1996 web aesthetics: Photoshop flares, optimized for Netscape and hand-coded with HTML Notepad. The Last Update JavaScript stamp reads [...]]]></description>
			<content:encoded><![CDATA[<p>In 1996 I created one of my first homepages, a tribute website to the Canadian band Eric&#8217;s Trip. I was able to claim a beautiful Geocities url: <a target="_blank" href="http://www.geocities.com/SunsetStrip/3500/ " title="http://www.geocities.com/SunsetStrip/3500/ " >http://www.geocities.com/SunsetStrip/3500/</a></p>
<p><img class="alignnone" title="ET" src="http://www.annehelmond.nl/geocities/welcome.jpg" alt="" width="556" height="96" /></p>
<p>It&#8217;s one big piece of pure nostalgia and 1996 web aesthetics: Photoshop flares, optimized for Netscape and hand-coded with HTML Notepad. The Last Update JavaScript stamp reads 08/30/1997 18:50:02 but I haven&#8217;t updated the page since 1996. The stamp says 1997 because Geocities used to insert advertising which would fool the script into thinking the page had been updated.</p>
<p>ASCII and the Archive Team have started to archive Geocities and the progress is described in <a target="_blank" href="http://ascii.textfiles.com/archives/1961" >&#8216;Geocities: Lessons So Far.&#8217;</a> There are two great applications to back up your own, long forgotten, Geocities website:</p>
<ul>
<li>PC: <a target="_blank" href="http://www.tensons.com/products/websiterippercopier/" title="Website Ripper Copier" >Website Ripper Copier</a></li>
<li>MAC: <span class="status-body"><span class="entry-content"><a target="_blank" href="http://www.sitesucker.us/home.html" title="SiteSucker" >SiteSucker</a> (Thanks <a target="_blank" href="http://twitter.com/daniel_rehn" >Daniel Rehn!</a>) </span></span></li>
</ul>
<p>I now host <a href="http://www.annehelmond.nl/geocities/" title="Anne Geocities" >The Unofficial Eric&#8217;s Trip Homepage</a> on this webserver, so have a look at my 1996 design skills (watch your step: broken links, hint: choose l0-fi). PC World wrote a great nostalgic article on the end of the Geocities era: &#8216;<a target="_blank" href="http://www.pcworld.com/article/163765/so_long_geocities_we_forgot_you_still_existed.html" >So Long, GeoCities: We Forgot You Still Existed</a>&#8217;</p>
<p>Esther Weltevrede, my colleague at the <a target="_blank" href="http://www.digitalmethods.net" title="Digital Methods Initiative" >Digital Methods Initiative</a>, will be talking about Archiving Web Dynamics at the <a target="_blank" href="http://www.virtueelplatform.nl/en/#2489" >Archive 2020 meeting</a>, which I will blog about for Virtueel Platform. Looking forward to it!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.annehelmond.nl/2009/04/28/goodbye-geocities-on-archiving-websites/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>

