The Use of Sentiment Analysis to Understand Historical Public Opinion

Introduction: A New Lens on the Past

For centuries, historians relied on a limited set of primary sources—eyewitness accounts, official documents, and published works—to reconstruct the emotional landscape of bygone eras. These sources, while invaluable, are necessarily selective and often reflect the views of a small, literate elite. The vast majority of historical voices—the reactions of ordinary citizens, the shifting moods of a populace, the silent waves of approval or dissent—remained largely inaccessible. The emergence of digital humanities and computational text analysis has begun to change this. Sentiment analysis, or opinion mining, offers historians a powerful new tool to systematically measure emotional tone across massive collections of text, providing a quantitative backbone for understanding public opinion in the past. By training algorithms to detect positive, negative, or neutral sentiment, researchers can trace how collective feelings evolved in response to wars, economic depressions, political movements, and cultural shifts. This article explores the methodology, applications, challenges, and transformative potential of using sentiment analysis to uncover the emotional history of societies.

What Is Sentiment Analysis? A Technical Primer

At its core, sentiment analysis is a subfield of natural language processing (NLP) that uses computational methods to identify and extract subjective information from text. The goal is to classify the polarity of a given passage—whether it expresses a positive, negative, or neutral sentiment—and often to gauge the intensity of that sentiment. Modern approaches include both lexicon-based methods and machine learning models.

Lexicon-Based Approaches

Traditional sentiment analysis relies on pre-built dictionaries of words annotated with emotional valence. For example, a word like "joyful" might be scored as +3, while "despair" might be scored as -3. The algorithm looks up each word of a text in the dictionary and sums or averages the matched scores to produce an overall sentiment score. Popular lexicons include AFINN, SentiWordNet, and the General Inquirer. For historical research, lexicons must be carefully adapted, as word meanings and emotional connotations shift over time. A term like "awful" in the 18th century could mean "awe-inspiring" rather than "terrible." Researchers often build domain-specific or period-specific lexicons to improve accuracy.
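As a minimal sketch of the lexicon-based idea, the Python fragment below averages the valence of matched words and shows how a period-specific override changes the result. The two dictionaries, their scores, and the sample sentence are invented for illustration and are not taken from AFINN or any other published lexicon.

```python
import re

# Hypothetical toy lexicon; real lexicons hold thousands of entries.
MODERN_LEXICON = {"joyful": 3, "despair": -3, "awful": -3, "victory": 2}

# Hypothetical override for an 18th-century corpus,
# where "awful" could mean "awe-inspiring".
OVERRIDES_18C = {"awful": 2}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def sentiment_score(text, lexicon):
    """Return the mean valence of the lexicon words found in the text.

    Words absent from the lexicon are ignored; a text with no
    lexicon words scores 0.0.
    """
    hits = [lexicon[w] for w in tokenize(text) if w in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


# Period lexicon = modern lexicon with the period-specific scores layered on top.
period_lexicon = {**MODERN_LEXICON, **OVERRIDES_18C}

sentence = "The awful majesty of the mountains left us joyful."
modern = sentiment_score(sentence, MODERN_LEXICON)   # "awful" cancels "joyful"
period = sentiment_score(sentence, period_lexicon)   # both words count as positive
print(modern, period)
```

With the modern scores the sentence comes out neutral, while the period-adjusted lexicon reads it as clearly positive, which is the kind of discrepancy that motivates period-specific lexicons.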

Machine Learning and Deep Learning

More advanced systems train models on large corpora of text that have been manually labeled for sentiment. These models learn to recognize patterns, including word sequences, syntactic structures, and context. Recurrent neural networks (RNNs) and transformer-based architectures like BERT have achieved state-of-the-art performance in general sentiment tasks. For historical text, fine-tuning these models on period-appropriate language, so that they capture shifts in grammar, vocabulary, and idiom, remains an active area of research. Hybrid approaches that combine lexicons with machine learning are often used for large-scale studies, and they offer a practical balance for historians who may lack the resources to train custom deep learning models.
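To make the supervised idea concrete without a deep learning stack, the sketch below swaps in a much simpler technique: a multinomial Naive Bayes classifier written from scratch. It is not an RNN or BERT, and the four training sentences and their labels are invented; it only illustrates how a model can learn sentiment from labeled examples.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesSentiment:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter(labels)        # label -> number of documents
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            logp = math.log(n_docs / total_docs)  # class prior
            for w in tokenize(text):
                if w in self.vocab:  # skip words never seen in training
                    logp += math.log((counts[w] + 1) / denom)
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


# Invented, hand-labeled training sentences (illustrative only).
train_texts = [
    "glorious news of victory and great rejoicing",
    "we are full of hope and good cheer",
    "a dreadful defeat and much suffering",
    "sickness and hunger fill the camp with despair",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayesSentiment().fit(train_texts, train_labels)
prediction = model.predict("great rejoicing at the news")
print(prediction)
```

A real study would need thousands of labeled passages in period-appropriate language and a held-out test set; the point here is only the shape of the train-then-predict workflow.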

Building the Historical Corpus: Sources of Textual Data

The quality and scope of a sentiment analysis study depend critically on the data. Historical researchers compile datasets from a variety of digitized sources, each with its own strengths and limitations.

Digitized Newspapers and Periodicals

Massive collections like the Chronicling America database from the Library of Congress or the British Newspaper Archive provide millions of pages spanning centuries. Newspapers are rich in editorial commentary, letters to the editor, and reported speeches—making them ideal for tracking public discourse. However, they represent the views of publishers and the literate public, not necessarily the entire population. OCR (optical character recognition) errors can introduce noise, requiring careful preprocessing.
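The preprocessing mentioned above might look something like the following sketch. The specific repairs chosen here (Unicode normalization, rejoining words split across line breaks, collapsing whitespace) are assumptions about what a typical newspaper corpus needs, and the function name clean_ocr is hypothetical.

```python
import re
import unicodedata


def clean_ocr(text):
    """Apply a few conservative repairs to raw OCR output."""
    # NFKC normalization maps ligatures such as U+FB01 to "fi"
    # and the long s (U+017F) of older typefaces to a plain "s".
    text = unicodedata.normalize("NFKC", text)
    # Rejoin words hyphenated across a line break: "Consti-\ntution" -> "Constitution".
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Collapse remaining line breaks and runs of whitespace into single spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


raw = "The Con\u017fti-\ntution  was\nrati\ufb01ed."
cleaned = clean_ocr(raw)
print(cleaned)
```

One known trade-off: the rejoining rule also fuses genuine hyphenated compounds that happen to break at a line end ("well-" / "known" becomes "wellknown"), so a dictionary check may be worth adding for corpora where that matters.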

Personal Diaries and Letters

Collections of personal writings offer a more intimate window into sentiment. Projects like the Diaries of War or the transcribed correspondence of historical figures allow researchers to track individual emotional trajectories over time. These texts are often less formal and more emotionally direct, making them valuable for sentiment analysis. The trade-off is that they are less representative of the general public and require careful handling of idiosyncratic language.

Political Speeches and Government Documents

Official records, including parliamentary debates, presidential addresses, and diplomatic cables, provide a formal register of sentiment. These texts are highly structured and often use carefully crafted language to project a specific tone. Sentiment analysis on speeches can reveal shifts in political rhetoric—for example, the rising negative sentiment in the months leading up to a war. The American Presidency Project and the Hansard corpus are commonly used sources.

Pamphlets and broadsides were the viral content of earlier centuries, often filled with strong opinions. Novels and literary works, while fictional, encode cultural sentiments and moral panics. More recently, archives of early social media—such as Usenet posts from the 1980s and 1990s—offer a bridge to the digital era. The widening availability of digitized text through initiatives like Google Books and HathiTrust continues to expand the historian's toolbox.

Applications: Case Studies in Historical Sentiment Analysis

Sentiment analysis has been applied to a growing range of historical questions, yielding insights that complement traditional qualitative methods.

Tracking Wartime Morale and Public Anxiety

In the study of the American Civil War, researchers have analyzed thousands of diaries and letters using sentiment analysis to map fluctuations in morale among soldiers and civilians. By scoring each entry for emotional valence, they observed that negative sentiment spiked after major battles and during periods of economic hardship, while positive sentiment correlated with Union victories and news of emancipation. This quantitative approach confirmed patterns that historians had long suspected, but also revealed unexpected regional variations: for example, morale on the Confederate home front often followed a different rhythm than that of soldiers.
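Turning per-entry scores into a morale curve is mostly a matter of grouping by time period. The sketch below averages scores by calendar month; the dates and scores are made up for illustration and do not come from any real diary or from the studies described above.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def monthly_mean_sentiment(entries):
    """Average sentiment scores by calendar month.

    `entries` is an iterable of (date, score) pairs; the result maps
    "YYYY-MM" strings to mean scores, in chronological order.
    """
    buckets = defaultdict(list)
    for day, score in entries:
        buckets[f"{day.year:04d}-{day.month:02d}"].append(score)
    return {month: mean(scores) for month, scores in sorted(buckets.items())}


# Made-up (date, score) pairs; each score would come from a sentiment scorer.
entries = [
    (date(1863, 6, 28), 0.5),
    (date(1863, 7, 2), -2.0),
    (date(1863, 7, 5), -1.0),
    (date(1863, 6, 14), 1.5),
]
series = monthly_mean_sentiment(entries)
print(series)
```

The same grouping can be keyed on region or author type as well as month, which is one way regional differences in morale could be compared.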

The Great Depression of the 1930s left a massive textual footprint in newspapers, government reports, and letters to the White House. One study applied sentiment analysis to letters sent to President Franklin D. Roosevelt, finding that negative sentiment peaked around bank holidays and the passage of the Social Security Act—the latter generating controversy. The research demonstrated how sentiment shifted from desperation to cautious optimism as New Deal programs took effect. Similarly, analysis of newspaper editorials during the Panic of 1893 tracked the rise of populist rhetoric and anti-immigration sentiment.

The women's suffrage movement in the early 20th century can be studied through the lens of sentiment in newspapers and pamphlets. Researchers have charted how pro-suffrage sentiment increased in mainstream newspapers following major protests and the entry of the U.S. into World War I, which drew attention to women's contributions to the war effort. Sentiment analysis on the writings of suffragists themselves reveals a complex mix of frustration, hope, and strategic anger. The method also allows for comparative studies across different regions and periods, showing how the movement's emotional tenor varied.

Tracing Long-Term Shifts in Cultural Mood

Perhaps the most ambitious applications attempt to measure the "mood" of an entire cultural period. For example, researchers have used sentiment analysis on millions of books from the 19th and 20th centuries to track the ebb and flow of optimism versus pessimism. These studies suggest that the Victorian era had a distinct sentiment cycle, with peaks during periods of imperial expansion and troughs during economic depressions. While such large-scale analyses are coarse-grained, they provide a panoramic view that complements in-depth studies of individual events.

Methodological Challenges and Best Practices

Applying sentiment analysis to historical text is not a simple matter of running a modern algorithm on old data. Historians must contend with a set of unique challenges.

Linguistic Change and Context

Word meanings evolve. "Nice" once meant "foolish" or "ignorant"; "silly" originally meant "blessed" or "happy." Sarcasm, irony, and metaphor are notoriously difficult for algorithms, and historical texts are rich with them. A sentence like "What a brilliant idea to sink the ship" could be scored as positive by a naive lexicon-based method that sees only the word "brilliant." Researchers must train models on period-specific language and often use human annotations to validate results. They also need to account for genre-specific conventions: academic writing differs in emotional tone from a personal diary.

Sampling Bias and Representativeness

The texts that survive and are digitized are not a random sample of historical opinion. Archives often overrepresent the voices of the elite, the literate, and the politically active. Women, minorities, and the poor are underrepresented. Sentiment analysis can inadvertently reinforce these biases if the dataset is not carefully constructed. Researchers must combine computational methods with traditional historical criticism, asking who wrote the text, for whom, and under what circumstances.

Validation and Ground Truth

How does one know the algorithm is correctly measuring sentiment? In modern applications, ground truth is established through human labeling. For historical text, it is often impossible to ask people from the era how they felt. Researchers must use other historical evidence—such as election results, protest participation, or contemporaneous accounts—to validate sentiment scores. They may also rely on multiple independent human annotators trained in historical contexts to label a subset of the data.
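When multiple annotators label the same subset, their agreement is commonly summarized with a chance-corrected statistic. The sketch below computes Cohen's kappa for two annotators; the label lists are invented, and studies with more than two annotators would need a different statistic.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each annotator labeled at random
    # according to their own label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Invented labels from two hypothetical annotators for eight passages.
annotator_1 = ["pos", "pos", "neg", "neg", "neu", "pos", "neg", "neu"]
annotator_2 = ["pos", "neg", "neg", "neg", "neu", "pos", "neg", "pos"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

Here the annotators agree on six of eight passages, but kappa is lower than the raw 0.75 because some of that agreement would be expected by chance. Low agreement on historical text is itself informative: it suggests the labeling guidelines, or the category scheme, do not fit the period's language.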

Scale vs. Depth

Sentiment analysis excels at pattern detection across thousands of texts, but it often misses the subtlety of individual voices. A single diary entry might be deeply ambivalent, but the algorithm may average it to neutral. The most effective studies combine quantitative analysis with close reading of representative texts. This mixed-methods approach respects the complexity of human emotion while leveraging the power of computation.
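One hedge against ambivalent texts collapsing to "neutral" is to keep positive and negative totals separate and flag entries that are strong on both. The sketch below assumes a hypothetical toy lexicon and an arbitrary threshold; the diary sentence is invented.

```python
import re

# Hypothetical toy lexicon for illustration only.
LEXICON = {"relief": 2, "hope": 2, "grief": -2, "dread": -2}


def polarity_profile(text, lexicon):
    """Return separate positive and negative totals plus the net score."""
    tokens = re.findall(r"[a-z]+", text.lower())
    values = [lexicon[w] for w in tokens if w in lexicon]
    positive = sum(v for v in values if v > 0)
    negative = -sum(v for v in values if v < 0)  # stored as a positive magnitude
    return {"positive": positive, "negative": negative, "net": positive - negative}


def is_ambivalent(profile, min_strength=2):
    """Flag texts with substantial positive and negative content at once."""
    return profile["positive"] >= min_strength and profile["negative"] >= min_strength


entry = "I feel relief and hope at his return, yet grief and dread remain."
profile = polarity_profile(entry, LEXICON)
print(profile, is_ambivalent(profile))
```

The net score for this entry is zero, the same as for a sentence with no emotional words at all, but the separate totals show that it is anything but neutral. Flagged entries are natural candidates for the close reading described above.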

The Future of Historical Sentiment Analysis

As NLP technology advances and more historical text becomes available in machine-readable form, the field is poised for significant growth. Several emerging trends promise to deepen our understanding of historical public opinion.

Multilingual and Cross-Cultural Analysis

Current sentiment analysis is heavily focused on English. Extending it to other languages and, more challengingly, to historical languages such as Latin or Old and Middle English will open new vistas for global historical research. Projects like Translatio are developing historical language models for ancient and medieval texts. Cross-cultural studies could compare public sentiment during similar events, for example, how different societies reacted to the outbreak of World War I.

Fine-Grained Emotion Detection

Rather than simple positive/negative polarity, new models can detect discrete emotions such as anger, fear, trust, surprise, and joy. This is particularly useful for history, where the emotional palette is nuanced. Anger might precede a revolution, while fear might accompany a plague. Fine-grained emotion analysis can help historians pinpoint the triggers of collective action.
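The simplest version of discrete emotion detection replaces a valence score with an emotion category per word. The sketch below uses a hypothetical miniature emotion lexicon rather than a trained model; the word-to-emotion assignments and the sample sentence are invented for illustration.

```python
import re
from collections import Counter

# Hypothetical miniature emotion lexicon; a real one would be far larger
# and adapted to the period under study.
EMOTION_LEXICON = {
    "outrage": "anger", "tyranny": "anger", "furious": "anger",
    "plague": "fear", "dread": "fear", "terror": "fear",
    "rejoice": "joy", "delight": "joy",
    "faithful": "trust", "loyal": "trust",
}


def emotion_profile(text, lexicon=EMOTION_LEXICON):
    """Return the share of emotion-bearing words assigned to each emotion."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(lexicon[t] for t in tokens if t in lexicon)
    total = sum(counts.values())
    return {emotion: n / total for emotion, n in counts.items()} if total else {}


text = "The plague fills the town with dread, yet the people are furious at this tyranny."
profile = emotion_profile(text)
print(profile)
```

A polarity scorer would label this sentence simply "negative"; the profile separates the fear from the anger, which is the distinction the paragraph above argues matters for explaining collective action.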

Temporal Dynamics and Event Detection

Sentiment time series can be correlated with historical events, but the algorithms themselves can also help identify events. For example, a sudden spike in negative sentiment across a corpus might signal an unnoticed crisis or turning point. This event detection capability turns sentiment analysis into a tool for discovery, not just measurement.
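A simple way to sketch this kind of event detection is to compare each point in a sentiment series with a trailing baseline and flag large negative deviations. The window size, the threshold, and the monthly scores below are all assumptions chosen for illustration, not values from any study.

```python
from statistics import mean, stdev


def find_negative_spikes(scores, window=6, threshold=2.0):
    """Return indices where a score falls far below its trailing baseline.

    Each point is compared with the mean and standard deviation of the
    preceding `window` points; a z-score below -threshold is flagged.
    """
    spikes = []
    for i in range(window, len(scores)):
        baseline = scores[i - window:i]
        sd = stdev(baseline)
        if sd == 0:
            continue  # flat baseline: z-score is undefined
        z = (scores[i] - mean(baseline)) / sd
        if z < -threshold:
            spikes.append(i)
    return spikes


# Made-up monthly mean sentiment scores with one sharp drop at index 7.
monthly_scores = [0.2, 0.3, 0.1, 0.2, 0.3, 0.2, 0.25, -1.5, 0.1, 0.2]
spikes = find_negative_spikes(monthly_scores)
print(spikes)
```

A flagged index is only a prompt for investigation. A drop could reflect a genuine crisis, but also a change in which newspapers were digitized for that month or a batch of poor OCR, so each candidate needs to be checked against the underlying texts.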

Integrating with Other Data Sources

The richest historical insights will come from combining sentiment analysis with other forms of quantitative data—economic indicators, weather records, demographic statistics. Linking emotional tone to crop prices or mortality rates, for instance, can reveal how economic hardship translated into public mood. The rise of the Historical Data Initiative at the NBER and similar projects provides the infrastructure for such interdisciplinary work.
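Linking two such series usually starts with a correlation coefficient. The sketch below computes Pearson's r by hand for a price series and a sentiment series; both series are invented, and the variable names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


# Made-up yearly values: a rising grain price index and mean newspaper sentiment.
grain_price_index = [1.0, 1.2, 1.5, 1.9, 2.4]
mean_sentiment = [0.4, 0.3, 0.1, -0.2, -0.3]
r = pearson_r(grain_price_index, mean_sentiment)
print(round(r, 3))
```

A strongly negative r in data like this would be consistent with hardship depressing public mood, but it does not establish it: both series may simply trend over time, so detrending or lagged comparisons are usually needed before drawing conclusions.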

Conclusion: Augmenting Historical Understanding

Sentiment analysis will never replace the historian's craft of interpretation, context, and empathy. The algorithm does not know that a sigh of relief in a letter from a soldier on the eve of battle carries a different weight than a politician's manufactured enthusiasm. But as a tool for scale and pattern recognition, it offers something no single historian can achieve: the ability to trace the emotional currents of an entire society across decades. It forces us to confront the aggregate—to see not just the extraordinary voices but the murmur of the crowd. When used with rigor, historical sensitivity, and a clear understanding of its limitations, sentiment analysis opens a new window onto the past. It allows us to ask, with greater precision, how people really felt—and to begin to answer that question with evidence drawn from millions of words left behind.

As this field matures, it will likely become a standard component of the historian's methodological toolkit, complementing traditional sources and narratives. The result will be a richer, more textured understanding of the emotional life of previous generations—and, perhaps, a deeper appreciation of the role sentiment plays in shaping history itself.