Lesson about the fundamentals of the web and web archiving for historians.
This lesson teaches historians the basic technology of the web through a case study comparing two websites, created in 2000 and 2009, about the interview collection of the psychologist David Boder. It then explores the uses and challenges of web archives for historians.
A 10-minute animated video lecture introduces the basics of web technology. A series of assignments then covers the history of the web, the technologies upon which it is based, and how to deal with web archives.
Lars Wieneke about changes in web technology
In the clip above, engineer Lars Wieneke explains how, over time, web technologies increasingly broadened the range and scale of data that could be shared and shown through the web. To illustrate these changes, he discusses the two websites, developed in 2000 and 2009, about the interview collection of the psychologist David Boder (the topic of another lesson on Ranke2). Understanding the changes brought about by software and languages such as XML (Extensible Markup Language) and PHP (Hypertext Preprocessor) is crucial for applying source criticism to a website. However, as historians we should first place the topic into its historical context: how did websites evolve in the first place, and what technologies were needed to make them work? These assignments will briefly explore the history of the web and the technological developments that make it work. They will then dive into the differences between the web and the internet, before discussing the physical infrastructure that allows the world to be globally connected.
Watch this 35-minute documentary created by the Web Foundation about how Tim Berners-Lee created the World Wide Web.
As described in the clip by Lars Wieneke, the David Boder websites from 2000 and 2009 changed over time as new technology became available. The older version from 2000 no longer exists on the “live” web, but a comparison between the version published in 2000 and the new 2009 version is possible thanks to the archived web. In this assignment, you will learn about the basics of the archived web and become familiar with one of the most popular and useful resources to access archived web pages – the Internet Archive’s Wayback Machine. At the same time you will learn about the challenges and limits of web archiving from a historian’s point of view.
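To make the idea of looking up an archived page more concrete, the following minimal Python sketch builds the two kinds of address commonly used with the Wayback Machine: a direct link to the capture closest to a given date, and a query to the Internet Archive's availability service. The page address "example.org/boder" and the function names are hypothetical placeholders chosen for illustration, not the real address of the Boder websites, and the sketch only constructs the addresses; it does not contact the Internet Archive.

```python
from datetime import datetime
from urllib.parse import urlencode

# Public address patterns of the Wayback Machine.
WAYBACK_PREFIX = "https://web.archive.org/web"
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def wayback_url(page_url: str, when: datetime) -> str:
    """Build a link to the archived capture of page_url closest to `when`.

    The Wayback Machine expects a timestamp of the form YYYYMMDDhhmmss
    and redirects to the nearest capture it actually holds.
    """
    timestamp = when.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{timestamp}/{page_url}"


def availability_query(page_url: str, when: datetime) -> str:
    """Build a query asking whether a capture exists near `when`.

    The service answers in JSON; this function only builds the address.
    """
    params = urlencode({"url": page_url, "timestamp": when.strftime("%Y%m%d")})
    return f"{AVAILABILITY_ENDPOINT}?{params}"


if __name__ == "__main__":
    # Hypothetical page address, used purely as an example.
    page = "example.org/boder"
    print(wayback_url(page, datetime(2000, 6, 1)))
    print(wayback_url(page, datetime(2009, 6, 1)))
    print(availability_query(page, datetime(2000, 6, 1)))
```

Opening the two printed capture links side by side is one simple way to compare how a page looked at two moments in time. A request for a given date returns the nearest capture that exists, which may be months or years away from the date asked for, so the capture date shown by the Wayback Machine should always be checked.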
See an archived version of the very first website, created in 1991 by Tim Berners-Lee and archived by CERN in Switzerland:
Fortunately, the Internet Archive is not the only institution that is trying to archive the web. Several other institutions are active on a smaller scale, usually for websites considered important or relevant to specific countries. Several European countries, such as Finland, France, Ireland, Spain and Sweden, have even included web archives in their national legal deposit, which gives websites a status similar to that held for centuries by archive material such as books and newspapers. See the examples of the UK, Denmark and France.
Yet archiving the entire web is no easy task. The explosive growth of online content, especially since the 2000s, has made it impossible for archives and organisations to archive every single website and its various versions over time. Even as the technological expertise at these institutions has increased, a decrease in activities at the national level can be observed, which leads to a stronger dependency on the Internet Archive (IA). Fortunately, this is not the case in Luxembourg, where the National Library (Bibliothèque nationale du Luxembourg or BnL) has been tasked with archiving the Luxembourgish web since 2016.
The act of archiving is not driven solely by neutral concerns for preservation. It is very much embedded in ways of prolonging and solidifying one’s identity, status and position. According to Janne Nielsen, who proposes a clear distinction between “macro” and “micro” archiving, it is important to differentiate, for example, between a powerful institution that designs a preservation strategy for posterity with a broad imaginary future audience in mind (“macro”) and a scholar at the end of a funded project who manages to conserve her data for future use within her academic career (“micro”). In the case of the EU, as the examples below show, preservation is also relevant for reasons of transparency about how decisions are taken or how legal frameworks intended to protect citizens and their cultural heritage evolve over time. The case study presented here (how the European Union deals with the preservation of its web archives) is an example of macro-archiving. The “level” of archiving in this context should be kept in mind throughout the example.
As creating websites has become increasingly easy for people without a technological background, there has been a steady increase in historical websites created by individuals who make use of open source applications. WordPress, Google and Weebly are examples of such resources. The people who set up websites like this are often volunteers who want to share their passion for a subject, or relatives of an older generation who want to preserve a record of a certain historical practice that they regard as important. The focus is often on local topics: the landscape, the village, the city or the neighbourhood. Yet the most documented experiences are of migration and war. The practice of passing family photo albums from one generation to the next has no equivalent when it comes to digital family heritage. How should websites or pages with potential historical relevance that are not embedded in an institutional setting, or that are posted on social media platforms, be preserved for the future? Using the case study below, you are going to explore the best strategies for micro-archiving.