A Science of Literature
Aug 3, 2015
A social scientist's take on the digital humanities.
Cover detail from Electronic Computers, by S. H. Hollingdale and G. C. Tootill (1965). Image: Penguin Press.
Verso, $29.95 (paper)
The Bourgeois: Between History and Literature
Verso, $19.95 (paper)
Macroanalysis: Digital Methods and Literary History
University of Illinois Press, $30 (paper)
Text Analysis with R for Students of Literature
Springer, $49.99 (cloth)
For some years, humanities scholars have sought to integrate computing technology into their research. These efforts, known as the “digital humanities,” have inspired public debate well out of proportion to the number of researchers involved or the scope of their findings. P.T. Barnums and Chicken Littles have proclaimed that computation will either revolutionize humanistic inquiry or mark its end. Actual literary research in this vein suggests otherwise.
Much of this work is driven by tools rather than by questions; when scholars have the means to manipulate large bodies of text, they will fiddle with the data and see what happens. To the extent that digital projects do have clear goals, they tend to yield recognizably humanist products, such as a new edition of a book, a map of the places discussed in a narrative, or attribution of authorship to a formerly anonymous text.
The literary scholar Franco Moretti and his colleagues, most notably Matthew Jockers, are exceptions: their project of “distant reading,” developed steadily over more than a decade, has an ambitious, nontraditional goal. In its strongest formulation, it seeks to explain long-term patterns of literary stability and change through the quantitative study of all surviving literary texts. Forms of change include large developments, such as the rise and fall of novel genres, already recognized by critics, but also many small shifts, such as changes in sentence structure, the gradual emergence of themes, or the increased use of locative prepositions, which are readily detected with statistics but mostly unnoticeable to a human reader. Jockers’s nascent work on novel plots, which suggests that nearly all novels conform to a half dozen basic structures, derives from many small measurements concerning the emotional sentiment of individual sentences.
This is an unmistakably scientific aspiration. Unfortunately, few scientists, or social scientists, have taken notice of this body of work, while humanists have responded to it in a remarkably partisan fashion. The main difficulty is that two distinct issues have been blurred. The first is legitimate disagreement about the goals of humanistic inquiry. But both critics and proponents tend to jump straight to a second, larger conflict about the transformation of the university and the proper place of the humanities in education and intellectual life. These are important value questions; however, the work of these digital humanists should not be expected to answer them.
The results of this earnest scientific project are mixed. Moretti and Jockers are obviously enthusiastic, and in many cases their findings are interesting and surprising. (As a frequent reader of academic social science, I wish more scholars could make statistics seem so exciting.) But there are serious conceptual difficulties. While this work presents extensive new descriptions of literary change, it has not persuasively explained the causes of that change. The evidence available—published texts—is not sufficient to explain it. The statistics used in these works are mainly descriptive, and the faith placed in these descriptions is limited. A table or graph is treated as an object to be interpreted. In this and many other respects, distant reading remains a recognizably humanistic practice.
• • •
The premise of distant reading is simple: most literary scholars know only a couple of dozen works well, and are familiar with perhaps a couple hundred in total. But extant works from a given period in a given language—say, nineteenth-century English literature—may number in the tens of thousands. This corpus may differ markedly from the works that have held the attention of scholars; indeed, because critical attention is directed toward books according to their perceived significance, the difference between an excellent book and a mediocre book may be considerable. If the goal is to track change over time or discover plot patterns, the common run of books may reveal what excellent ones cannot. So, too, may attention to textual units smaller than the whole book. A scholar cannot read thousands of books, or millions of sentences, but can describe some of their properties with statistics. Thinking about books in this way can prompt scholars to approach texts differently; some of Moretti’s most interesting findings do not require a computer at all.
In many cases computational techniques confirm what more traditional scholarship has already concluded. In the paper “Style, Inc.,” Moretti uses a catalog of 7,000 books to show that eighteenth-century novels had cumbersome titles. Jockers, equipped with the full text of thousands of novels, finds that Moby-Dick is mainly about life on a whale ship. In fact, a fair part of Jockers’s Macroanalysis seeks to prove that computational techniques can reproduce the judgments made by a trained reader. One chapter, for instance, establishes that his techniques can accurately identify the genres to which chunks of novels belong. Such seemingly trivial findings have led critics such as Adam Kirsch to scoff, but Jockers modestly suggests that replication of known findings is a scientific virtue. Moreover, if computational methods could not confirm what we already know to be true, there would be little reason to entertain more adventurous claims.
Moretti and Jockers do produce new claims. “The Slaughterhouse of Literature,” an essay that marks Moretti’s “new beginning” as an empirical scholar, sets out to explain why Arthur Conan Doyle survived as the preeminent mystery writer from a period when the genre was especially popular. In a memorable bit of investigation, Moretti groups mystery stories according to their use of clues. The winning formula, he finds, provides clues that are visible and comprehensible to the reader and necessary to the resolution of the mystery. (Many other mystery writers were surprisingly inept in their use of clues.) In “Network Theory, Plot Analysis,” Moretti draws a network of the social connections between all the characters in Hamlet. The complex, bloody conclusion of the play, he finds, arises from a simple network structure: the characters who die are those who have interacted with both Hamlet and Claudius. Jockers also has plenty to contribute. In his own specialty, Irish literature, he shows that many well-established critical beliefs about Irish-American literature—and the Irish experience in the United States—derive from scholars’ focus on texts that are mostly about life in eastern cities. The picture changes in light of a large but mostly ignored set of works about the Irish experience in the American West.
As Macroanalysis proceeds, the complexity of Jockers’s statistical techniques and the scope of their application steadily increase. In his final empirical chapter, “Influence,” he gives a sense of what these tools can do in concert. For each of 3,346 novels from the eighteenth and nineteenth centuries, he computes the values of 578 variables, including the frequencies of thematically related words (“topics”) and stylistic features such as sentence length, kinds of grammatical clauses, and use of punctuation. The differences between books along each of these dimensions can be understood as distance in an abstract space. On the “Native Americans” topic, for example, James Fenimore Cooper’s The Deerslayer would score high, while Jane Austen’s Mansfield Park would score low, making them distant along this dimension. Distance across all 578 dimensions yields a complex measure of the similarity between books. Jockers then organizes the books by their date of publication, producing an approximation of literary influence: an influential work is close to another one in the abstract space and preceded it in time.
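The logic of this measure can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Jockers's actual procedure: the book titles, the three features per book, the `radius` threshold, and the rule of counting close successors are all hypothetical stand-ins for his 578 variables and his own scoring.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def influence_scores(books, radius):
    """Count, for each book, the later-published books that lie within `radius`.

    `books` maps a title to a (year, feature_vector) pair. Counting close
    successors is an assumed simplification, not Jockers's published measure.
    """
    scores = {}
    for title, (year, vec) in books.items():
        scores[title] = sum(
            1
            for other, (other_year, other_vec) in books.items()
            if other != title
            and other_year > year
            and distance(vec, other_vec) <= radius
        )
    return scores


# Hypothetical toy corpus: invented titles, years, and three made-up features
# (for example, the share of words belonging to three different topics).
toy_books = {
    "Book A": (1800, [0.90, 0.10, 0.20]),
    "Book B": (1820, [0.80, 0.20, 0.20]),
    "Book C": (1840, [0.85, 0.15, 0.25]),
    "Book D": (1850, [0.10, 0.90, 0.70]),
}

scores = influence_scores(toy_books, radius=0.2)
```

In this toy corpus the earliest book collects the highest score simply because two later books sit near it in the feature space. The sketch has no way to tell whether those later books drew on it or arrived at similar features on their own.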
Some of the results are unsurprising. Authors of the same gender tend to write similar works, and novels written at about the same time tend to be similar. Works a human reader would find similar do indeed cluster together: near Moby-Dick are several other works by Melville, along with novels from James Fenimore Cooper and Edgar Allan Poe’s The Narrative of Arthur Gordon Pym of Nantucket. Other results are provocative. The most influential works according to this measure (Laurence Sterne’s Tristram Shandy, George Gissing’s The Whirlpool, and Benjamin Disraeli’s Venetia) are not ones scholars have deemed decisive in the development of the English novel. Jockers acknowledges this puzzling result. I suggest a partial explanation: the analysis here cannot address the possibility that many authors might independently adopt a new set of themes. The first to do so will seem influential by Jockers’s measures but may only be prescient in addressing a topic that will later become important or quick to write about a historical event, such as the French Revolution, that has recently occurred.
• • •
So Moretti and Jockers can reproduce some basic critical findings and yield new results that appear plausible and meaningful. But much is missing.
For one thing, there are simply not enough trained scholars to pursue a more systematic investigation using digital techniques. For this reason Jockers’s textbook, Text Analysis with R for Students of Literature, may prove to be of more enduring importance than Macroanalysis. Given the controversy about computation in the humanities, a successful programming textbook cannot simply teach programming; Text Analysis, though modest in scope, displays a sound psychological understanding. Jockers appreciates that scholars who wish to dabble with programming may not want to sign up for a revolution, so he stresses the many ways that computation complements traditional humanistic inquiry and assures his audience that computers will never supplant expert readers. Furthermore, Jockers understands how daunting it can be to learn computer programming and statistics and designs the book accordingly. Readers with no programming experience will be able to follow along. They will also receive near-instant gratification: starting with the second chapter, the exercises help students produce meaningful results about real works of literature.
Any introductory work must scant something, especially if it is addressed to learners who have doubts about the value of the enterprise. In this case, the main omission is a discussion of the statistical methods being employed. Though Jockers possesses a good grasp of the relevant statistics, the omission here is consistent with broader weaknesses in computational studies of literature. Works from Moretti and others associated with Stanford’s Literary Lab place surprisingly little faith in statistics, a point that becomes particularly clear when reading the lab’s freely accessible pamphlet series. The common approach might be called “describe and interpret.” That is, the data will be used to produce a descriptive summary in the form of a graph or figure, which is then treated as an object to be “read” by the investigator as though it were a text. (Indeed, in a 2011 issue of Victorian Studies, members of the Literary Lab talk about “learning to read” statistical data in a manner analogous to the interpretation of literary text.) These descriptive representations are often informative, but critical intuitions, rather than inferential statistical methods, are used to draw the conclusions.
When Moretti and colleagues do pay attention to the quantitative measures produced by their studies, they often take note of the wrong one: statistical significance. Roughly speaking, significance refers to a low probability that an observed pattern in the data would arise by chance. However, in large data sets, many observed patterns will be significant according to standard measures. As a result, patterns may be statistically significant but substantively trivial: the magnitude of the differences (the “effect size”) may be extremely small, or it could disappear entirely if the researcher tweaks the model.
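A small worked example shows how a result can be significant yet trivial. The figures below are invented for illustration: two hypothetical groups of one million sentences each, with average lengths of 20.1 and 20.0 words and a common standard deviation of 10 words. The sketch assumes a simple two-sample z-test and uses Cohen's d as the effect size; neither choice is taken from the works under review.

```python
import math


def two_sample_z(mean_a, mean_b, sd, n):
    """Return the z statistic and two-sided p-value for two groups.

    Assumes both groups have the same size `n` and the same standard
    deviation `sd`, and that the normal approximation holds.
    """
    standard_error = sd * math.sqrt(2.0 / n)
    z = (mean_a - mean_b) / standard_error
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


def cohens_d(mean_a, mean_b, sd):
    """Effect size: the difference in means expressed in standard deviations."""
    return (mean_a - mean_b) / sd


# Hypothetical summary statistics, not drawn from any real corpus.
z, p_value = two_sample_z(20.1, 20.0, sd=10.0, n=1_000_000)
d = cohens_d(20.1, 20.0, sd=10.0)

print(f"z = {z:.2f}, p = {p_value:.2e}, Cohen's d = {d:.3f}")
```

With a million sentences per group, a difference of a tenth of a word clears any conventional significance threshold by a wide margin, yet it amounts to one hundredth of a standard deviation, far below the 0.2 that is conventionally labeled a small effect.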
In many disciplines there is growing criticism of research that concerns itself with significance measures at the expense of the concrete meaning of the results or the coherence of the study design. The editors of the Journal of Graduate Medical Education, addressing medical researchers who use statistics of this kind, put it bluntly: “the effect size is the main finding of a quantitative study.” The journal Basic and Applied Social Psychology has banned significance testing entirely, on the grounds that it has contributed to a ruinously high level of unreproducible studies in social psychology. Because numerical results, including effect sizes, are usually not reported in computational literary research, it is difficult to judge the soundness of many of the findings.
The point is that scholars who are serious about explaining—rather than merely describing—patterns of literary development still have to resolve a number of methodological questions: How should certain techniques be employed, and what do the results mean? What constitutes an important, as opposed to a statistically significant, finding? How can we make causal inferences about influence or the emergence of a new genre? These are questions that do not, as yet, have definite answers, and there is active debate in the social sciences about how computational techniques should be used to study cultural products such as literary texts. Both sides would benefit if literary scholars such as Moretti were part of the social scientific conversation.
At the same time, these are questions that cannot be resolved merely by contemplating the nature of causal inference. The issue would become much more tractable if the computational study of literature were working from a theory. By this I mean, broadly, a provisional general explanation for how and why literature develops, along with some specific claims or predictions that could be explored using data.
Moretti and Jockers are hindered because they make halfhearted use of two kinds of explanation: evolution and social change. The digital humanities have produced crudely evolutionist theories such as literary Darwinism, which views literary creation as an evolved human behavior. Moretti and Jockers are much more thoughtful. They use concepts from biological evolution to describe changes in literary texts and genres, not the humans who create them. Moretti writes an entire chapter, “Evolution, World-Systems, Weltliteratur,” that likens stylistic changes to the formation of biological species. Jockers, after a page-long caveat questioning evolutionary analogies, says he “cannot resist the great temptation to liken these data to the genome.” He often says that literature “evolves.” This is the signal weakness of evolutionary explanations: books do not self-reproduce as biological organisms do; they are artifacts of deliberate human activity. Evolutionary reasoning sometimes produces striking descriptions, as in Moretti’s discussion of clues. However, to the extent that this view of literature excludes human actors, there is a great deal that it cannot explain. Texts are not self-caused. They are written, read, and interpreted by humans, and a persuasive explanation of historical change in literature will require some account of the social side of literature, including matters such as practices of authorship, institutions of publishing, and the experience of reading. In short, distant reading wants for a theory.
A more promising approach examines literary development in light of changes in the societies that produce the literature. Explanations of this kind have the virtue of connecting literature to the agents that produce and consume it. Moretti and Jockers both show, for example, that the lifespan of novel genres is broadly in line with generational boundaries. Certain genres that rise and fall more quickly, such as the Jacobin and Anti-Jacobin novel, correspond to major historical events. It seems plausible that demographic change could account for many patterns of literary development, at least at the large scale. Jockers, in his chapters “Style” and “Nationality,” suggests that genre and language are, in some ways, stronger than authors, apt to constrain what they write. This, he offers, is the reason why statistical regularities exist in the first place.
Of course, these larger features of social structure say little about the meaning literature holds for the people who read and write it. It is on this point that literary scholars have a decided advantage over social scientists. Large social theories about culture are at their weakest when addressing the content of literary works or the subjective experience of composition and reading. Many computational techniques are avowedly indifferent to meaning or experience: if a model produces a serviceable prediction, it scarcely matters why.
Critics have much to offer on this point. Moretti suggests that his The Bourgeois, a book published more or less simultaneously with Distant Reading, represents a very different, more traditional project. I respectfully disagree. The Bourgeois is an interpretive study of nineteenth-century European literature that pays particular attention to keywords, sentiments, and innovations in prose. It is nonetheless an attempt to explain the emergence of the modern middle classes and their effect on literature. Though The Bourgeois uses a different kind of evidence, its spirit is not so different from that of Distant Reading. This kind of expert knowledge would undoubtedly play a key role in producing a strong explanation of how literature has changed over time, especially when broad trends must be connected to particular groups, authors, or works.
• • •
In some ways, my criticism of the program of distant reading is unfair. What I point to as weaknesses—the problems of making causal claims, relating data to subjective experiences, explaining the extent to which individual action is constrained, connecting large-scale and small-scale phenomena—are also core theoretical challenges that have dogged American sociology for the last half-century. These are stock objections that would be raised in any graduate seminar. But the difficulties are real and will become more pressing as computational literary research gains the wider attention it deserves.
I have tabled the value questions humanists have raised, but I wish to conclude with a serious value question: Who stands to benefit most from the success of this research? At its core, this enterprise is about using computers to understand how humans use language, including what that language means to people and how it persuades. Just down the road from Stanford is an enormous industry whose profitability depends on understanding this. Academia and Silicon Valley are converging in this territory: technology firms are eager to hire researchers who can help them turn large volumes of human-generated text into money and eager to collaborate with scholars in order to reap the practical benefit of academic knowledge. And an enormous surveillance apparatus, in the United States and abroad, is similarly interested in extracting meaning and predictions from volumes of text. In both cases, the problem is not assembling the data, which has already been acquired even though the people surveilled often don’t know it. The problem is making sense of it, which requires social scientists and humanists.
As the empirical project of distant reading enjoys greater success, it cannot be innocent about the outside parties that will take an interest. Jockers is surely aware of this: he helped write an influential amicus brief for Authors Guild, Inc. v. HathiTrust (2014), a major copyright victory for digital humanities scholars and academic libraries, but also for Google. Of course, this wouldn’t be the first time private industry and intelligence agencies have found uses for humanistic enterprises. Peircean semiotics, Bauhaus design, New Criticism, the Iowa Writers’ Workshop, and post-structuralism have all been put to corporate and government ends. Intellectuals are always eager to point out that ideas have real, high stakes. In the case of the digital humanities, they are certainly right.