Complexity Explorer Santa Fe Institute

Foundations & Applications of Humanities Analytics (spring 2022)

Lead instructor:

This course is no longer in session.

8.1 Case Study: Capitalism & Democracy » Chapter 5 part 2 Overview

What you will learn from this chapter

We will continue to explore the case study of "capitalism" and "democracy" in newspapers over time. You will learn how simple quantitative techniques can be used to interrogate linkages between concepts within a corpus. Specifically, you will learn how to use co-occurrence to measure the degree to which the two concepts "capitalism" and "democracy" are linked within the newspaper corpus. You will also learn how to identify historical inflection points in that linkage by performing a diachronic analysis, that is, by tracking how the linkage changes over time.


Key terms to keep in mind


Co-occurrence   Being present in the same place or at the same time. Two words "co-occur" when they appear in the same unit of analysis within a corpus. In the case study used in this chapter, the unit of analysis is a newspaper article, so two words co-occur when they both appear in the same article. As used in humanities analytics, co-occurrence does not require the words to be adjacent or to appear in a particular order. The lecture also uses "co-appear" with the same meaning.
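The definition above can be illustrated with a minimal Python sketch. The helper names (`tokens`, `co_occur`), the crude regular-expression tokenizer, and the three toy "articles" are assumptions made for illustration; they are not taken from the lecture or its corpus.

```python
import re

def tokens(text):
    """Lowercase the text and return the set of alphabetic word tokens in it."""
    return set(re.findall(r"[a-z]+", text.lower()))

def co_occur(text, word_a, word_b):
    """True when both words appear anywhere in the same text, in any order."""
    words = tokens(text)
    return word_a in words and word_b in words

# Toy articles (invented for this example); each string is one unit of analysis.
articles = [
    "Democracy and capitalism were debated in the senate.",
    "Critics of capitalism spoke at length; nobody mentioned voting.",
    "The weather was fine.",
]

# One flag per article: do "capitalism" and "democracy" co-occur in it?
flags = [co_occur(a, "capitalism", "democracy") for a in articles]
print(flags)
```

Only the first toy article contains both words, so only it counts as a co-occurrence; word order and distance within the article play no role.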


Linkage   An operationalization of conceptual connection: the extent to which two ideas (A and B) are connected in a particular context (e.g., a particular newspaper’s archives). Linkage is measured as the ratio between (1) the fraction of articles that contain at least one word associated with idea A and at least one word associated with idea B, and (2) the fraction of articles in which words associated with idea A and words associated with idea B would be expected to appear together by chance. A value near 1 means the two ideas co-occur about as often as chance would predict; the larger the value, the more likely it is that the two ideas are meaningfully (intentionally) connected.

A specific example of linkage used in this lecture is the R-ratio: we divide (1) the probability that an article in a given newspaper in a given year mentions both "capitalism" and "democracy" by (2) the probability of co-occurrence that we would expect under the null model, in which "capitalism" and "democracy" are not conceptually linked. The larger the R-value, the more strongly capitalism and democracy are linked in that newspaper in that year.
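As a rough sketch of this calculation, the Python below computes an R-ratio from article counts. It assumes that the null-model probability is estimated as the product of the two individual mention rates, which is what independence implies; the lecture may estimate it differently. The function name `r_ratio` and the per-year counts are invented for illustration and are not data from the newspaper corpus.

```python
def r_ratio(n_articles, n_a, n_b, n_both):
    """Observed co-occurrence probability divided by the probability
    expected if mentions of A and B were independent."""
    p_a = n_a / n_articles        # P(article mentions A)
    p_b = n_b / n_articles        # P(article mentions B)
    p_both = n_both / n_articles  # observed P(article mentions A and B)
    expected = p_a * p_b          # assumed null model: P(A) * P(B)
    if expected == 0:
        return float("nan")       # ratio is undefined if either word never appears
    return p_both / expected

# Made-up counts per year: (articles, mention A, mention B, mention both).
counts_by_year = {
    1950: (1000, 100, 200, 20),
    1990: (1000, 100, 200, 60),
}

# Diachronic view: one R-value per year.
r_by_year = {year: r_ratio(*c) for year, c in counts_by_year.items()}
for year, r in sorted(r_by_year.items()):
    print(year, round(r, 2))
```

In these toy numbers the two words co-occur about as often as chance predicts in 1950 (R close to 1) and about three times as often in 1990 (R close to 3). A jump like that between years is the kind of change a diachronic analysis looks for.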


Null model   A model that assumes that a given hypothesis does not hold. In the case study used in this chapter, the hypothesis is: "There is a conceptual linkage between capitalism and democracy", and so the null model is: "There is no conceptual linkage between capitalism and democracy." The null model therefore assumes independence: whether one term appears in an article tells us nothing about whether the other appears.
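One way to build intuition for the null model is to simulate it. The sketch below is an illustration and not the lecture's method: it breaks any real connection between two terms by randomly shuffling which articles mention idea B, then checks that the resulting co-occurrence rate is close to the product of the two individual rates. The function `simulate_null`, the toy corpus, and the number of shuffles are all assumptions.

```python
import random

def simulate_null(has_a, has_b, n_shuffles=500, seed=0):
    """Average co-occurrence rate after repeatedly shuffling B's labels
    across articles, which removes any real association with A."""
    rng = random.Random(seed)
    shuffled_b = list(has_b)
    total = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled_b)
        total += sum(a and b for a, b in zip(has_a, shuffled_b))
    return total / (n_shuffles * len(has_a))

# Toy corpus: 1000 articles; 100 mention idea A and 200 mention idea B.
# Every article that mentions A also mentions B, so the ideas are strongly linked.
has_a = [i < 100 for i in range(1000)]
has_b = [i < 200 for i in range(1000)]

p_a = sum(has_a) / len(has_a)
p_b = sum(has_b) / len(has_b)
expected_rate = p_a * p_b  # independence: P(A and B) = P(A) * P(B)
simulated_rate = simulate_null(has_a, has_b)
observed_rate = sum(a and b for a, b in zip(has_a, has_b)) / len(has_a)

print(expected_rate, simulated_rate, observed_rate)
```

The shuffled corpus should give a co-occurrence rate close to `expected_rate`, while the observed rate in this deliberately linked toy corpus is several times higher. That gap is what rejecting the null model looks like.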