Using Digital Archives to Teach Data Set Creation and Visualization Design

screenshot of Visual Haggard home page

Kate Holterhoff @KateHolterhoff is a Marion L. Brittain Postdoctoral Fellow at the Georgia Institute of Technology. Her research areas include nineteenth- and early-twentieth-century British literature, visual culture, digital humanities, and the history of science. She directs and edits the literary and art historical resource Visual Haggard, which has recently become a federated archive with NINES, the Networked Infrastructure for Nineteenth-Century Electronic Scholarship.

Useful as digital archives have become for academics and historians, integrating these projects into college classrooms still poses significant challenges. Because electronic scholarship often concerns niche areas of study, its usefulness is usually reserved for those possessing a broad and not inconsiderable base of knowledge. Advanced undergraduate and graduate students alone tend to benefit from the growing number of projects archiving primary documents online. But digital archives providing access to rare images, correspondence, reviews, and manuscripts have the capacity to benefit a spectrum of undergraduates, including those majoring in STEM fields.

Although the practice is perhaps less common, archives can enhance introductory humanities and first-year composition courses aimed at non-liberal arts majors. This semester I designed assignments centered on Visual Haggard (VH), a digital archive I edit that is intended to centralize and improve access to the illustrations of popular Victorian novelist H. Rider Haggard. These multimodal assignments encouraged undergraduates to practice critical thinking, writing and communication, and visual and digital design, all of which are course goals for first-year composition seminars at Georgia Tech. At the same time, students actively collaborated with me on the project by composing public-facing descriptive metadata tags that demonstrate close reading, as well as well-designed charts and graphs visualizing this data.

Visual Haggard

Unlike many undergraduate writing and communication projects that use data generated by professional researchers, students in this course composed their own datasets about the illustrations posted to VH (full assignment description). Using the assigned course text King Solomon’s Mines as a model, students learned how to tag book illustrations on a spreadsheet using these designated criteria:

  1. All objects in the illustration (window, horse, hat, tree)
  2. Setting (continent, country, house, forest)
  3. Names of depicted characters (Allan Quatermain, Sir Henry Curtis)
  4. Posture and expression of characters (lean, sit, stand)
  5. Number of human figures (1 figure, 3 figures, crowd)
  6. Style of art (realism, symbolist)
  7. If appropriate, critical themes relating to race, gender, religion, disability, etc.
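One way to picture the resulting spreadsheet is as one row per illustration, with one column per criterion. The sketch below is only an illustration of that idea, not the format students actually used: the column names, the illustration IDs, the tag values, and the semicolon separator are all assumptions made for this example.

```python
import csv
import io

# Assumed column names mirroring the seven tagging criteria listed above.
FIELDS = ["illustration_id", "objects", "setting", "characters",
          "posture", "figure_count", "style", "themes"]

# Hypothetical rows: the IDs and tag values are invented for illustration.
ROWS = [
    {"illustration_id": "example-01", "objects": "hat; tree",
     "setting": "forest", "characters": "Allan Quatermain; Sir Henry Curtis",
     "posture": "stand", "figure_count": "2 figures",
     "style": "realism", "themes": ""},
    {"illustration_id": "example-02", "objects": "window; horse",
     "setting": "house", "characters": "Allan Quatermain",
     "posture": "sit", "figure_count": "1 figure",
     "style": "realism", "themes": ""},
]


def write_tags(rows):
    """Serialize tag rows to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def split_tags(cell):
    """Split a semicolon-separated cell into a list of individual tags."""
    return [tag.strip() for tag in cell.split(";") if tag.strip()]


csv_text = write_tags(ROWS)
parsed = list(csv.DictReader(io.StringIO(csv_text)))
```

Keeping several tags in one cell keeps the sheet readable for taggers, while `split_tags` recovers the individual tags when it is time to count them.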

Tags acted as a form of close reading and provided material for discussion about contentious labels (identifying characters was sometimes a challenge, but critical themes are always contentious). Groups of five then created their own spreadsheets to tag all illustrations available on VH for a second Haggard novel. The number of illustrations for each group varied widely depending on the novel selected. For instance, the group assigned to The Brethren tagged 51 illustrations, while The People of the Mist group tagged only 22. As critics, groups debated the appropriateness of tags, but they also practiced audience awareness by considering which words and phrases would be most useful to VH’s diverse visitors.

In the process of writing tags, students noticed patterns across their assigned novel’s illustrations. Students who selected Haggard’s Allan and the Holy Flower, for instance, wondered what Maurice Greiffenhagen’s illustrations could reveal about the prevalence of certain characters over others, and how often these characters appeared together. To answer these questions, they created the “Allan and the Holy Flower Character Map,” which shows character frequency through bubble size, while the number of lines connecting these bubbles indicates the number of illustrations in which two or more characters appear.

character map

Like a visual studies version of Christopher Warren’s Six Degrees of Francis Bacon, this chart exposes relationships between characters that individual illustrations alone conceal. Moreover, the frequency of “background crowds” composed of African figures caused these students to ask, to quote their supporting document: “Is the role of the African supporting characters in Greiffenhagen’s illustrations to influence and contribute to the plot or to just serve as a sort of setting for the story?” Visualizing descriptive metadata tags allowed these students to identify a troubling trend in Haggard’s visual history: African bodies often serve as a homogeneous and exotic mass, rather than as individuals, to form a backdrop for Haggard’s heroic white protagonists.
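The two quantities behind a character map of this kind, how often each character appears and how often two characters share an illustration, can be derived directly from the character tags. The following is a minimal sketch under stated assumptions: the tag lists are invented (including the placeholder "Second character"), and it does not reproduce the students' actual data or tools.

```python
from collections import Counter
from itertools import combinations

# Hypothetical character tags, one list per illustration.
ILLUSTRATIONS = [
    ["Allan Quatermain", "Second character"],
    ["Allan Quatermain", "Second character", "background crowd"],
    ["Allan Quatermain"],
    ["Second character", "background crowd"],
]


def character_map(illustrations):
    """Return (frequency, links) for a bubble-and-line character map.

    frequency[name] is the number of illustrations showing that character
    (bubble size); links[(a, b)] is the number of illustrations in which
    both characters appear (number of connecting lines).
    """
    frequency = Counter()
    links = Counter()
    for characters in illustrations:
        # Deduplicate and sort so each pair is always counted under one key.
        unique = sorted(set(characters))
        frequency.update(unique)
        links.update(combinations(unique, 2))
    return frequency, links


frequency, links = character_map(ILLUSTRATIONS)
for name, count in frequency.most_common():
    print(f"{name}: appears in {count} illustration(s)")
for (first, second), count in links.most_common():
    print(f"{first} + {second}: together in {count} illustration(s)")
```

The counts can then feed whatever charting tool a group prefers. A pattern such as a frequently tagged "background crowd" shows up in the numbers before any chart is drawn.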

The majority of Haggard’s approximately fifty novels were lushly illustrated, many of them repeatedly in different editions and by different illustrators. Students studying Haggard’s She used their data visualization assignment to compare the 14 illustrations by E. K. Johnson, published in the 1887 1st US edition, with the 30 illustrations from Maurice Greiffenhagen and Charles Kerr’s edition, reprinted from the 1887 1st UK edition. This visualization compares the emotions, characters, setting, and number of figures between these two very different illustrated texts.

visualization of illustrations in She

heart imagery in Haggard

As is so often the case with primary documents, the evidence in VH is messy and uneven. However frustrating, this messiness proved useful for teaching data set creation and visualization as interpretive, subjective, and argumentative practices. For example, although I identify the character Mahomed in two of Johnson’s illustrations, the assignment allows students to determine the appropriateness of tags, and how best to explain patterns, without the instructor overruling their judgments. Instead, I evaluated each data visualization according to the accuracy of its design, the clarity of the story it tells, and its capacity to advance knowledge about the subject matter. For instance, it is more accurate to use percentages rather than raw totals to juxtapose these illustrated editions because there are fewer than half as many illustrations by Johnson as by Kerr and Greiffenhagen. The story of this comparison is explained well in the heading, and the categories for this comparison are useful to literary and illustration studies scholars researching Haggard’s She.
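A short calculation illustrates why percentages suit this comparison better than raw totals. The edition sizes (14 and 30) and the two Johnson illustrations tagged with Mahomed come from the discussion above. The count of three for the Greiffenhagen and Kerr edition is a hypothetical value chosen only to make the point.

```python
# Edition sizes as given in the text.
EDITION_TOTALS = {"Johnson": 14, "Greiffenhagen and Kerr": 30}

# Illustrations tagged with one character. The Johnson count follows the
# text; the Greiffenhagen and Kerr count is HYPOTHETICAL.
TAG_COUNTS = {"Johnson": 2, "Greiffenhagen and Kerr": 3}


def share(count, total):
    """Return count as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * count / total


shares = {
    edition: share(TAG_COUNTS[edition], EDITION_TOTALS[edition])
    for edition in EDITION_TOTALS
}

for edition, percent in shares.items():
    print(f"{edition}: {TAG_COUNTS[edition]} of {EDITION_TOTALS[edition]} "
          f"illustrations ({percent:.1f}%)")
```

With these numbers the larger edition has the higher raw count (3 against 2) but the smaller share (10.0% against roughly 14.3%), so a chart of raw totals would point to the opposite conclusion.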

I encourage instructors to replicate my assignment in their own courses using VH, but the basic format of this project and its pedagogical benefits work for any number of other archives. Students might tag one year of correspondence from the Walt Whitman Archive, or a paired poem and picture Doublework from the Rossetti Archive, and then visualize patterns that emerge from their findings. The process of careful examination, tagging, and visualizing patterns encourages students to think critically about the relationship of data creation to close reading and design.

Do you have a favorite assignment for incorporating digital archives into general education courses? Please share in the comments.

Main image is a screenshot of
