The digitisation of newspapers: how to turn a page

Newspapers have been an important medium since the 19th century. They offer valuable information about events and opinions in past and present societies. The large-scale digitisation of newspapers that is currently taking place in many (national) libraries holds the promise of a bonanza of data for historical research. However, in the process of creating online access to digitised newspapers, choices are made and changes occur that affect the informational and artefactual value of a source, and historians should be able to identify and understand these changes. What appears on your screen at home is quite different from what you can hold in your hands at an archive. Moreover, national libraries have to deal with all kinds of financial, legal and technical constraints, which, together with their own institutional history, determine their digitisation policy. Some libraries outsource the work of processing and exploiting data through public-private partnerships with companies. This assignment covers two approaches: 1. You are asked to consider the choices made and the technologies used for the digitisation and online publication of various newspapers; 2. You will explore the diversity of news coverage and exploit the opportunities of online access to newspapers by conducting comparative research.


Instructions

For a general introduction to the subject of newspaper digitisation, watch these clips on two large-scale digitisation projects, one from the United States that discusses all aspects of the Chronicling America project and the digitisation process, and one from Europe, produced to promote the Europeana Newspapers project. Also watch the clip about retired engineer Tom Tryniski, who decided to digitise a gigantic corpus of newspapers and offer them online free of charge.

Then go through the key questions that should be asked when applying source criticism to digitised newspapers:

Selection

  • Why was this collection selected for digitisation?
  • Is it complete? What part of the collection is covered?
  • Is it representative?

Transformation from analogue to digital source

  • How were the newspapers processed? Have all the newspaper titles been processed in the same way?
  • What do the original newspapers look like? Are they in black and white or in colour? What is the colour of the paper? Is this visible on your screen?
  • Was the digitisation carried out from microfilms or from hard copies?
  • What colour makes them easier to read in digital form?
  • How does digitisation handle differences between newspapers from different periods?

Retrieval

  • Are the newspapers searchable? Can we perform a full text search on their content?
  • What is the quality of the optical character recognition (OCR) process?
  • Has article segmentation been applied? Can we search within advertisements and images?
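One rough way to put a number on the OCR quality question is to transcribe a short passage by hand and compare it with the text the archive returns for the same passage. The sketch below computes a character error rate (edit distance divided by the length of the hand transcription). It is only an illustration: the function names and the sample strings are made up for this example, and archives may measure OCR quality differently.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions and substitutions that turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, ocr_output: str) -> float:
    """Share of characters in the hand-made reference that the OCR got wrong."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(reference, ocr_output) / len(reference)


# Hypothetical sample: a hand transcription and the OCR text for the same line.
reference = "The Titanic sank"
ocr_output = "Tbe Titanlc sank"
print(character_error_rate(reference, ocr_output))  # 0.125
```

A rate of 0.125 means that roughly one character in eight is wrong, which is enough to make a full text search miss many occurrences of a word.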

Instructions

You are going to find out how and why newspapers are digitised and what is needed to be able to consult them online. Because of the diversity in approaches to digitisation projects and web design and possible unfamiliarity with online archives, there is a risk of not being able to see the wood for the trees while completing this assignment.
To give you an example of what kind of answers are expected in response to these questions and what the variables in the table mean, we have created a sample answer on the basis of the Luxembourg online newspaper archive eluxemburgensia.lu and the Tageblatt newspaper.

  • Open this PDF, print it and use it as a reference when searching for other newspapers.

6.a Who digitises, for what purpose, and how is access arranged?

  • A list of initiatives across the globe has been collected on Wikipedia. What projects have been completed or are ongoing in your own country?
  • Open the links to the collections of digitised newspapers in the table, browse through the content and choose two newspapers in a language that you read fluently.
Collection | 1. Institution | 2. Type of access | 3. Interface | 4. Collection specificities | 5. Metadata title
Europeana newspapers (multilingual) | | | | |
E-Newspapers CH (FR+DE) | | | | |
Gallica (FR) | | | | |
Retronews (FR) | | | | |
ProQuest (EN) | | | | |
Chronicling America (EN) | | | | |
Delpher (NL) | | | | |
ANNO (DE) | | | | |
Zeitungsportal DDR-Presse (DE) | | | | |

Complete the table in your template with the answers to the following questions:

  • What type of institution carried out the digitisation? Public/private, for commercial/academic/preservation purposes?
  • How can you access these collections?
  • Is a subscription needed? Is there remote access or is it limited to the premises of the institution?
  • Is there a dedicated interface for the digitised newspapers or is the interface common to other sources such as books and images?
  • Where do the titles come from? What languages are they available in? Are there any special subcollections?
  • What information is provided on each individual title by the institution?

6.b The newspaper as a historical source

Newspapers are published regularly and collected on a daily or weekly basis, representing events that take place in society. There is generally a great diversity of newspapers, sometimes with unexpected titles or languages, in a national library’s collection. For instance, in the Austrian National Library, we can find newspapers printed throughout the former Austro-Hungarian Empire until 1918. In the Polish National Library, there are newspapers reflecting the presence of specific cultural communities in the past, written in Hebrew, Yiddish, Russian and German. This item on the news website Gizmodo about newspapers with long runs illustrates the continuity of some newspapers. It is striking to see the difference in appearance between old and current newspapers. Reading a newspaper is also a practice that contributes to a sense of national identity. You will be comparing print newspapers with digital newspapers and the representation of events over time.

  • Collect articles on two events, one from print newspapers and one from digitised newspapers, and recreate the chronology of each event.

For the “hard-copy” search, go to the library and collect articles on an event of your choice from the available newspapers (suggestions: an election, a sports event, an artistic performance such as a concert, etc.).

For the digitised version, choose a historical event such as the sinking of the Titanic in 1912, the assassination of Archduke Franz Ferdinand of Austria in 1914, the publication of Anne Frank’s diary (first in Dutch in 1947, in English in 1952), the Woodstock festival of 1969, the Watergate scandal of 1972–1974 or the fall of the Berlin Wall in 1989.

  • For each, collect:
    • 3 articles: one published before the event (anticipation), one during the event (description) and one after the event (comments, consequences),
    • 3 articles commemorating the event 10, 20 and 100 years later (as far as the date of the event allows).
       
  • Complete the tables in your template.
  • Write a short essay of about 500 words on your findings based on your answers to these questions:
    • What has changed in the perception of the event/people involved/situation? (short term, long term)
    • What are the main differences between collecting articles from print newspapers and digitised newspapers? (sitting behind your screen, going to the archive/library)
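The anniversary dates for the commemoration search above can be worked out before you start querying an archive, which also shows at a glance whether a given anniversary has taken place yet. The following is a minimal sketch; the function names are invented for this example, and the fixed `today` value is only there to make the output reproducible.

```python
from datetime import date
from typing import List, Optional, Tuple


def anniversary(event_date: date, years_later: int) -> date:
    """Return the date exactly `years_later` years after the event."""
    try:
        return event_date.replace(year=event_date.year + years_later)
    except ValueError:
        # 29 February has no counterpart in a non-leap year: use 28 February.
        return event_date.replace(year=event_date.year + years_later, day=28)


def commemoration_dates(
    event_date: date,
    offsets: Tuple[int, ...] = (10, 20, 100),
    today: Optional[date] = None,
) -> List[Tuple[int, date, bool]]:
    """List (years later, anniversary date, has it already happened?)."""
    today = today or date.today()
    plan = []
    for years_later in offsets:
        when = anniversary(event_date, years_later)
        plan.append((years_later, when, when <= today))
    return plan


# The Titanic sank on 15 April 1912; the Berlin Wall fell on 9 November 1989.
for label, event in [("Titanic", date(1912, 4, 15)),
                     ("Berlin Wall", date(1989, 11, 9))]:
    for years_later, when, reachable in commemoration_dates(
            event, today=date(2024, 1, 1)):
        status = "search around" if reachable else "not yet reached:"
        print(f"{label}, {years_later} years later: {status} {when.isoformat()}")
```

For the Titanic all three anniversaries lie in the past, whereas the 100th anniversary of the fall of the Berlin Wall falls in 2089, so only the 10- and 20-year commemorations can be searched for that event.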

6.c Newspapers and websites as sources of knowledge

Depending on your age and educational background, you will be accustomed to specific kinds of news media: television, radio, newspapers, news feeds on your mobile phone or websites that publish news. This assignment asks you to reflect on the sources of your knowledge of current cultural, economic and political developments.

  • Choose a current topic or event and compare how it is represented in a print newspaper, on a news website, in a news bulletin on the radio and on television.
  • Complete the table in your template on the basis of these key questions:
    • Why have you chosen these particular news sources?
    • Do you trust these sources?
    • What is your trust based on?

Reading/viewing suggestions