Using Multimodal Textual Analysis to Study Historical Political Cartoons and Texts
Introduction: Why Multimodal Analysis Matters for Historical Sources
Historical political cartoons and printed texts are not mere records of past events; they are carefully engineered rhetorical artifacts. A single caricature from the 18th century or a pamphlet from the Civil War era blends visual spectacle, textual argument, and spatial design to persuade, provoke, or entertain. For too long, historians have treated images and words as separate domains, analyzing a cartoon's visual symbolism in isolation from its caption or ignoring how a newspaper's layout shaped reader interpretation. This fragmentation misses the essential synergy: meaning often emerges precisely at the intersection of modes. Multimodal textual analysis provides a systematic toolkit for decoding that synergy, allowing researchers to reconstruct how historical actors crafted their messages for specific audiences and purposes.
The approach, rooted in social semiotics as developed by Gunther Kress and Theo van Leeuwen, posits that all communicative modes (image, typography, color, spatial arrangement) carry distinct potentials for meaning. Their seminal work Reading Images: The Grammar of Visual Design laid the foundation for analyzing how visual elements function as a kind of grammar. For historians, applying this framework to materials like Thomas Nast's anti-corruption cartoons or World War I propaganda posters reveals not only the explicit argument but also the unspoken cultural assumptions embedded in design choices. The method forces scholars to ask more than what a source says; it demands an account of how every semiotic element works together to produce a rhetorical effect.
Defining Multimodal Textual Analysis: A Framework for Historians
At its core, multimodal textual analysis examines how multiple semiotic resources interact to create meaning. These resources include:
- Visual imagery: Icons, symbols, caricatures, allegorical figures, and stylistic conventions that carry culturally specific connotations.
- Written language: Word choice, syntax, rhetorical figures, and typographic variation that establish tone and emphasis.
- Layout and composition: Spatial hierarchy, framing, size contrasts, reading paths, and the use of white space to guide attention.
- Materiality: Paper quality, format (broadside, pamphlet, bound volume), and physical condition that signal intended audience, cost, and durability.
These modes do not operate independently; they co-articulate meaning. A bold headline in red above a somber illustration creates a different effect than the same headline in a small serif font below the image. The analyst's task is to trace how each mode contributes to the overall rhetorical effect, paying attention to moments of agreement, tension, or contradiction. For example, a 1790s cartoon that pairs a dignified portrait with a mocking caption may deliberately undermine its own visual authority, using the gap between modes to produce satire.
Historical and Theoretical Foundations
Multimodal analysis did not emerge from a vacuum. It draws on several established intellectual traditions, each offering a lens for understanding how visual and textual elements combine to persuade.
Iconography and Iconology (Panofsky)
Erwin Panofsky's three-tiered method aligns closely with multimodal thinking. It moves from pre-iconographic description (identifying objects and events) through iconographic analysis (interpreting conventional symbols) to iconological interpretation (discovering underlying cultural values). Panofsky showed that even seemingly simple images encode layers of meaning available only to those familiar with the visual language of a particular time and place. Historians can apply this layered approach to both visual and textual elements simultaneously. For instance, a Victorian-era cartoon showing Britannia surrounded by colonial figures requires iconographic knowledge of allegorical attributes (helmet, shield, trident) and iconological understanding of imperial ideology. The text in such images often anchors or extends the iconography, making Panofsky's method a natural precursor to contemporary multimodal analysis.
Social Semiotics and Critical Discourse Analysis
Kress and van Leeuwen, along with later theorists such as David Machin, extended semiotics from language to all modes. Their work emphasizes that modes are shaped by social contexts and power relations, a point crucial for analyzing propaganda, satire, and political rhetoric. Critical discourse analysis (CDA) further provides tools for uncovering how language and image work together to naturalize ideology. In a 1950s anti-communist cartoon, for example, the visual depiction of a red stain spreading across a map interacts with a caption warning of "infiltration." The combination naturalizes the threat as inevitable and biological, rather than as a political choice.
Art History and Visual Culture Studies
Traditional art historical methods—examining style, attribution, and iconography—offer deep insights into visual conventions. Multimodal analysis enriches these by explicitly linking visual choices to textual and design strategies, making it especially useful for studying mass-produced items like cartoons and posters. Art historians have long analyzed the compositional rules of satire, but multimodal analysis adds a systematic account of how typography, layout, and materiality amplify or undercut those rules.
Applying Multimodal Analysis to Historical Political Cartoons
Political cartoons are ideal case studies because they compress complex arguments into a single, often humorous or satirical frame. The following process guides systematic analysis of these dense artifacts.
Step 1: Establish Context and Provenance
Identify the publication date, periodical, cartoonist (if known), and the specific political or social issue addressed. For example, a 1784 cartoon by James Gillray satirizing Charles James Fox and the East India Bill cannot be understood without knowing the furious parliamentary debates of that year. Context determines the cartoon's target, allusions, and intended effect on readers. Without context, a historian might miss that a figure holding a feathered headdress is not a generic "Indian" but a personification of the British East India Company's corruption. Always consult contemporary news reports, political pamphlets, and private correspondence to reconstruct the original intended meaning.
Step 2: Catalog Visual Elements Thoroughly
Make an exhaustive inventory of every visual component: characters (real persons, mythical figures, animals with symbolic meaning), objects (chains, scales, bags of money, national flags), setting (courtroom, battlefield, fantasy land), and expressive details (facial expressions, body language, clothing). For instance, in Gillray's "The Plumb-pudding in Danger" (1805), the image of a globe being carved like a pudding by William Pitt and Napoleon is rich with meaning: the prime minister and emperor personify their nations' imperial ambitions, while the globe-as-pudding trivializes global conflict as a gluttonous meal. A thorough inventory also captures the use of color, including whether the original was hand-colored or printed in black and white, since color carried additional symbolic weight in prints of this period.
Step 3: Analyze All Written Components
Transcribe and scrutinize every piece of text: captions, speech bubbles, labels on objects, signatures, and any embedded inscriptions. Note rhetorical devices—irony, understatement, hyperbole—and how they interact with the visual. A caption that directly contradicts the image (e.g., "A Just and Fair Treaty" above a cartoon showing a dog being forced to sign a contract with a wolf) often delivers the satirical punch. Typographic choices matter: a bold, sans-serif caption in a 1917 poster conveys urgency, while an ornate script in a 1780s print suggests elegance or satire. Always check whether the text was added after the image or was integral to the original design—some printers inserted captions later, changing the intended meaning.
Step 4: Assess Composition and Visual Hierarchy
Map the layout: what occupies the center? What is in the foreground vs. background? Are there framing devices like borders or speech bubbles? How does the eye move across the image? Larger figures and bold text draw primary attention; smaller background details often encode secondary commentary or inside jokes for attentive viewers. In Herblock's famous 1950s cartoons about McCarthyism, the oversized figure of Senator McCarthy holding a brush painting "communist" labels dominates the center, while tiny marginalized victims cower at the edges. The composition visually argues that McCarthy's accusations were out of proportion and targeted the vulnerable. In Kress and van Leeuwen's terms, this hierarchy is a matter of salience: the relative visual weight of each element, which sets up a reading path that directs the viewer's attention in a deliberate sequence.
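Where elements have already been annotated with bounding boxes, the hierarchy described above can be roughed out numerically as a starting point for discussion. The sketch below is a minimal illustration, not an established metric: the `Region` type, the `salience` score (relative area weighted by closeness to the image center), and the sample coordinates are all hypothetical assumptions made for demonstration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Region:
    """A hand-annotated rectangular region of a cartoon (hypothetical schema)."""
    label: str
    x: float       # left edge, in pixels
    y: float       # top edge, in pixels
    width: float
    height: float


def salience(region: Region, image_w: float, image_h: float) -> float:
    """Crude salience score: relative area times closeness to the center.

    This weighting is an assumption for illustration, not a standard measure.
    """
    area_share = (region.width * region.height) / (image_w * image_h)
    cx = region.x + region.width / 2
    cy = region.y + region.height / 2
    # Distance from the image center, normalized so a corner is 1.0.
    max_dist = hypot(image_w / 2, image_h / 2)
    dist = hypot(cx - image_w / 2, cy - image_h / 2) / max_dist
    return area_share * (1.0 - dist)


def reading_order(regions, image_w, image_h):
    """Return region labels from most to least salient."""
    ranked = sorted(
        regions, key=lambda r: salience(r, image_w, image_h), reverse=True
    )
    return [r.label for r in ranked]


# Invented coordinates for a 1000 x 800 pixel scan.
regions = [
    Region("central figure", 300, 150, 400, 500),
    Region("caption", 100, 720, 800, 60),
    Region("background figures", 20, 600, 150, 100),
]
order = reading_order(regions, 1000, 800)
print(order)
```

A score like this ignores color, contrast, and gaze direction, all of which shape salience, so it is best treated as a prompt for closer looking and not as a finding in itself.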
Step 5: Interpret the Multimodal Interaction
Bring together the findings to construct a unified interpretation. Ask: How does the visual metaphor reinforce or subvert the text? What assumptions about the audience's knowledge are required? What cultural stereotypes are invoked? For example, a 1930s cartoon showing Uncle Sam offering a "New Deal" medicine bottle to a ragged figure labeled "The Unemployed" uses both visual (the medicine bottle as panacea) and textual (the ironic term "New Deal") modes to critique government promises. The interaction between modes creates a layered argument that neither image nor text could achieve alone. This step also requires the historian to consider potential polysemy: could the same cartoon be read differently by a supporter of the government versus a critic? Multimodal analysis helps reveal those divergent readings.
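For researchers who keep their notes in structured form, the five steps can be captured in a simple record. The following is a minimal sketch: the class names, field names, and the three-way `ModeRelation` vocabulary are hypothetical conveniences, not part of any established cataloging standard. The sample entry borrows the treaty caption from Step 3; the "Contract" label and the composition note are invented for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class ModeRelation(Enum):
    """How a text element relates to the image (hypothetical vocabulary)."""
    REINFORCES = "reinforces"
    EXTENDS = "extends"
    CONTRADICTS = "contradicts"


@dataclass
class TextElement:
    kind: str                 # e.g. "caption", "label", "speech bubble"
    transcription: str
    relation: ModeRelation    # the analyst's judgment from Step 5


@dataclass
class CartoonRecord:
    # Step 1: context and provenance
    title: str
    date: str
    publication: str
    issue: str
    # Step 2: visual inventory
    visual_elements: list[str] = field(default_factory=list)
    # Step 3: written components
    text_elements: list[TextElement] = field(default_factory=list)
    # Step 4: composition notes
    composition: str = ""

    def tensions(self) -> list[TextElement]:
        """Text elements judged to contradict the image."""
        return [
            t for t in self.text_elements
            if t.relation is ModeRelation.CONTRADICTS
        ]


record = CartoonRecord(
    title="(untitled example)",
    date="unknown",
    publication="unknown",
    issue="treaty negotiations",
    visual_elements=["dog signing a contract", "wolf"],
    text_elements=[
        TextElement("caption", "A Just and Fair Treaty", ModeRelation.CONTRADICTS),
        TextElement("label", "Contract", ModeRelation.REINFORCES),
    ],
    composition="wolf looms over dog at center",
)
for element in record.tensions():
    print(f"{element.kind}: {element.transcription!r} {element.relation.value} the image")
```

Recording the relation between modes as an explicit field keeps the interpretive judgment visible and open to revision, separate from the descriptive inventory gathered in the earlier steps.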
Analyzing Historical Texts Through a Multimodal Lens
While cartoons are overtly multimodal, many historical texts that appear "just words" also rely on visual features that affect meaning. A newspaper article from the 18th century might use decorative headpieces, column rules, and varying typefaces to signal importance. A 19th-century abolitionist pamphlet often juxtaposes stark woodcuts of enslaved people with eloquent appeals in contrasting fonts. Multimodal analysis of such texts considers:
- Typography: Bold, italic, and size variations signal emphasis or contrast. For example, printers of 18th-century newspapers often set quotations in italic and news text in roman, and some reserved blackletter for official proclamations.
- Layout: Column width, margins, and insertion of images or lists guide reading pace. A narrow column forces the eye to move quickly, while a wide central column invites slower, more reflective reading.
- Paratext: Titles, subtitles, bylines, and publication information frame the text rhetorically. A bold headline in large type signals significance, while a small subtitle may qualify or undercut the headline.
Case Study: The Printed Speech
Consider Abraham Lincoln's 1863 Gettysburg Address as it appeared in contemporary newspapers. Some versions placed the speech under a large headline with ornamental borders; others condensed it into a single column. The Chicago Tribune ran it with a glowing editorial introduction. Analyzing these paratextual elements reveals how the speech's reception was shaped, not just by Lincoln's words but by the visual prestige or modesty of its presentation. A plain version in a small-town paper might have been read as a routine event, while a prominent presentation with an approving introduction, as in the Tribune, elevated the address to a national moment. The multimodal historian therefore reads not only the text but also the design as a primary source.
Propaganda Posters: A Multimodal Masterclass
World War I and World War II propaganda posters represent a peak of deliberate multimodal engineering. Every element—visual, textual, spatial, and even implied temporal sequence (e.g., "Now or never")—was designed to elicit specific emotions and actions. Unlike cartoons, which often rely on humor or irony, propaganda posters often deploy direct appeals and stark contrasts.
Take the famous British "Your Country Needs YOU" poster with Lord Kitchener: the piercing finger pointing directly at the viewer (addressing the "you") combined with bold, uncompromising type creates a sense of personal obligation. The red color of the text signals urgency and sacrifice. The composition forces a reading path: first the face, then the finger, then the command. Multimodal analysis unpacks how these choices transform a simple portrait into an imperative. Later parodies by anti-war groups subvert the same layout to critique jingoism, demonstrating how changing the text can repurpose the visual mode. The United States adopted similar designs, such as James Montgomery Flagg's "I Want YOU for U.S. Army," which used the same finger-pointing gesture but replaced the British uniform with Uncle Sam's suit—a subtle shift that personalized the call for an American audience.
Digital Tools and Archival Resources for Multimodal Research
Historians today can leverage digital tools to support multimodal analysis. Image annotation platforms like Apero or Mirador allow layered tagging of visual elements alongside transcriptions. Text mining of OCR-corrected captions across thousands of cartoons can identify recurring visual metaphors (e.g., "ship of state," "national uncle"). Geographical information systems (GIS) can map the spread of specific propaganda images. Digital archives such as the Library of Congress Prints and Photographs Division provide high-resolution scans of political cartoons and posters, often with contextual metadata. However, digital tools cannot replace hermeneutic interpretation. The historian must always ask: what does the tool highlight, and what does it obscure? The loss of original color, scale, and materiality in digitized images demands caution. Where possible, combine digital access with study of originals or high-quality facsimiles. For instance, a cartoon printed on cheap newsprint in 1865 may appear crisp in a scan, but the frayed edges and yellowed paper that signal its ephemeral nature are lost.
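The caption-mining idea mentioned above can be illustrated with Python's standard library alone. This is a minimal sketch under stated assumptions: the captions are invented, the phrase list is a hypothetical one that an analyst would have to compile, and real OCR output would need correction and normalization well beyond the lowercasing shown here.

```python
import re
from collections import Counter

# Phrases to look for; this list is an assumption chosen by the analyst.
METAPHORS = ["ship of state", "body politic", "house divided"]

# Invented sample captions standing in for OCR-corrected transcriptions.
captions = [
    "The Ship of State founders on the rocks of faction",
    "Steady hands for the ship of state!",
    "A House Divided cannot stand",
    "Doctoring the Body Politic",
]


def normalize(text: str) -> str:
    """Lowercase and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def count_metaphors(captions, metaphors):
    """Count how many captions contain each phrase, matching whole words."""
    counts = Counter()
    for caption in captions:
        # Pad with spaces so phrases match only on word boundaries.
        padded = f" {normalize(caption)} "
        for phrase in metaphors:
            if f" {phrase} " in padded:
                counts[phrase] += 1
    return counts


counts = count_metaphors(captions, METAPHORS)
for phrase, n in counts.most_common():
    print(f"{phrase}: {n}")
```

Exact phrase matching of this kind finds only metaphors that are named in the text. A metaphor carried purely by the image leaves no trace in the caption, which is one concrete instance of what such a tool obscures.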
Challenges and Ethical Considerations
Multimodal analysis carries risks. Anachronism is a primary danger: symbols that seem racist or misogynistic today may have been mainstream in their time, but the historian must evaluate whether they were intended as critique or reinforcement. For example, a 19th-century cartoon depicting Irish immigrants with simian features may have been read as humorous by some contemporaries, but it also reinforced ethnic stereotypes. The analyst must carefully contextualize such images using contemporary reviews, letters, and other primary sources that reveal how original audiences decoded the artifact. Another ethical consideration is the reproduction of offensive images: scholars must decide whether to show the source in its original form or to describe it without reproduction. The researcher's own cultural biases can also distort interpretation. To mitigate this, always ground analysis in a range of primary sources and be transparent about the limits of one's historical perspective.
Additionally, digitization often strips material evidence—paper texture, fold marks, scent—that conveyed status and use. A cheaply printed broadside pasted on a tavern wall operated differently than a fine color plate bound in a magazine. Historians must account for these material contexts whenever possible. A poster that was originally displayed outdoors with adhesive marks tells a different story from a pristine copy kept in a portfolio. The multimodal analysis should note these material traces as part of the evidence.
Pedagogical Uses in the History Classroom
Teaching students to perform multimodal analysis develops critical thinking and visual literacy. A scaffolded approach works well:
- Description phase: Students list everything they see and read without interpretation. This trains observation and prevents premature conclusions.
- Analysis phase: Students identify patterns, symbols, and rhetorical devices in each mode separately. They might compare the use of color in two posters or the typographic choices in two pamphlets.
- Integration phase: Students explain how modes interact to produce a unified argument. This is where they articulate the synergy between image and text.
- Evaluation phase: Students assess the source's bias, intended audience, and effectiveness based on multimodal evidence.
Assignments might include comparing two cartoons on the same event from opposing newspapers, or re-creating a propaganda poster based on a contemporary issue and writing a reflection on design choices. Such exercises prepare students for rigorous source analysis that goes beyond "I like this picture." Assessment should focus on the quality of multimodal reasoning, not on aesthetic judgment. For example, a student who can explain why a cartoonist chose a particular font or framing demonstrates deeper historical understanding than one who merely describes the cartoon's message.
Benefits for Historical Understanding
Multimodal textual analysis offers concrete advantages for historians:
- Recovers strategic communication: It reveals how authors deliberately orchestrated multiple modes to maximize persuasion for specific audiences.
- Exposes hidden ideologies: Implicit values—about race, gender, class, nation—often emerge more clearly in visual choices than in explicit text. A 1915 poster showing women in domestic roles while men march to war naturalizes gender divisions without stating them.
- Validates diverse sources: It legitimates studying ephemeral material (posters, cartoons, advertisements) as serious historical evidence, not merely as illustrations of textual sources.
- Fosters interdisciplinary rigor: It bridges history, art history, linguistics, and media studies, producing richer scholarship that can speak to multiple academic audiences.
As digital archives expand access to visual and printed materials, the historian equipped with multimodal tools can ask more interesting questions: not just what a source says, but how it says it and why that design was chosen. This approach ensures that the past is understood not as a flat text but as a vivid, contested interplay of images, words, and materials. The method also prepares historians for analyzing contemporary media, where multimodal communication is the norm.
Whether analyzing a 1790s satire by Gillray or a 1917 recruitment poster, multimodal textual analysis helps historians attend to the full rhetoric of the past: the visual composition, the typographic choices, and the material conditions through which public opinion was shaped and cultural assumptions were reinforced or challenged. It transforms the study of historical media from a simple extraction of "facts" into a nuanced investigation of how meaning was made, and can still be made today. The next generation of digital humanities tools, combined with rigorous theory, promises to deepen this understanding even further, making multimodal analysis an essential component of historical methodology.