A couple of weeks ago, after reading abstracts from DH2013, I said that the take-away for me was that “literary theory is about to get interesting again” – subtweeting the course of history in a way that I guess I ought to explain.
In the twentieth century, “literary theory” was often a name for the sparks that flew when literary scholars pushed back against challenges from social science. Theory became part of the academic study of literature around 1900, when the comparative study of folklore seemed to reveal coherent patterns in national literatures that scholars had previously treated separately. Schools like the University of Chicago hired “Professors of Literary Theory” to explore the controversial possibility of generalization.* Later in the century, structural linguistics posed an analogous challenge, claiming to glimpse an organizing pattern in language that literary scholars sought to appropriate and/or deconstruct. Once again, sparks flew.
I think literary scholars are about to face a similarly productive challenge from the discipline of machine learning — a subfield of computer science that studies learning as a problem of generalization from limited evidence. The discipline has made practical contributions to commercial IT, but it’s an epistemological method founded on statistics more than it is a collection of specific tools, and it tends to be intellectually adventurous: lately, researchers are trying to model concepts like “character” (pdf) and “gender,” citing Judith Butler in the process (pdf).
At DH2013 and elsewhere, I see promising signs that literary scholars are gearing up to reply. In some cases we’re applying methods of machine learning to new problems; in some cases we’re borrowing the discipline’s broader underlying concepts (e.g. the notion of a “generative model”); in some cases we’re grappling skeptically with its premises. (There are also, of course, significant collaborations between scholars in both fields.)
This could be the beginning of a beautiful friendship. I realize a marriage between machine learning and literary theory sounds implausible: people who enjoy one of these things are pretty likely to believe the other is fraudulent and evil.** But after reading through a couple of ML textbooks,*** I’m convinced that literary theorists and computer scientists wrestle with similar problems, in ways that are at least loosely congruent. Neither field is interested in the mere accumulation of data; both are interested in understanding the way we think and the kinds of patterns we recognize in language. Both fields are interested in problems that lack a single correct answer, and have to be mapped in shades of gray (ML calls these shades “probability”). Both disciplines are preoccupied with the danger of overgeneralization (literary theorists call this “essentialism”; computer scientists call it “overfitting”). Instead of saying “every interpretation is based on some previous assumption,” computer scientists say “every model depends on some prior probability,” but there’s really a similar kind of self-scrutiny involved.
It’s already clear that machine learning algorithms (like topic modeling) can be useful tools for humanists. But I think I glimpse an even more productive conversation taking shape, where instead of borrowing fully-formed “tools,” humanists borrow the statistical language of ML to think rigorously about different kinds of uncertainty, and return the favor by exposing the discipline to boundary cases that challenge its methods.
Won’t quantitative models of phenomena like plot and genre simplify literature by flattening out individual variation? Sure. But the same thing could be said about Freud and Lévi-Strauss. When scientists (or social scientists) write about literature they tend to produce models that literary scholars find overly general. Which doesn’t prevent those models from advancing theoretical reflection on literature! I think humanists, conversely, can warn scientists away from blind alleys by reminding them that concepts like “gender” and “genre” are historically unstable. If you assume words like that have a single meaning, you’re already overfitting your model.
Of course, if literary theory and computer science do have a conversation, a large part of the conversation is going to be a meta-debate about what the conversation can or can’t achieve. And perhaps, in the end, there will be limits to the congruence of these disciplines. Alan Liu’s recent essay in PMLA pushes against the notion that learning algorithms can be analogous to human interpretation, suggesting that statistical models become meaningful only through the inclusion of human “seed concepts.” I’m not certain how deep this particular disagreement goes, because I think machine learning researchers would actually agree with Liu that statistical modeling never starts from a tabula rasa. Even “unsupervised” algorithms have priors. More importantly, human beings have to decide what kind of model is appropriate for a given problem: machine learning aims to extend our leverage over large volumes of data, not to take us out of the hermeneutic circle altogether.
But as Liu’s essay demonstrates, this is going to be a lively, deeply theorized conversation even where it turns out that literary theory and computer science have fundamental differences. These disciplines are clearly thinking about similar questions: Liu is right to recognize that unsupervised learning, for instance, raises hermeneutic questions of a kind that are familiar to literary theorists. If our disciplines really approach similar questions in incompatible ways, it will be a matter of some importance to understand why.
* <plug> For more on “literary theory” in the early twentieth century, see the fourth chapter of Why Literary Periods Mattered: Historical Contrast and the Prestige of English Studies (2013, hot off the press). The book has a lovely cover, but unfortunately has nothing to do with machine learning. </plug>
** This post grows out of a conversation I had with Eleanor Courtemanche, in which I tried to convince her that machine learning doesn’t just reproduce the biases you bring to it.
*** Practically, I usually rely on Data Mining: Practical Machine Learning Tools and Techniques (Ian Witten, Eibe Frank, Mark Hall), but to understand the deeper logic of the field I’ve been reading Machine Learning: A Probabilistic Perspective (Kevin P. Murphy). Literary theorists may appreciate Murphy’s remark that wealth has a long-tailed distribution, “especially in plutocracies such as the USA” (43).
PS later that afternoon: Belatedly realize I didn’t say anything about the most controversial word in my original tweet: “literary theory is about to get interesting again.” I suppose I tacitly distinguish literary theory (which has been a little sleepy lately, imo) from theory-sans-adjective (which has been vigorous, although hard to define). But now I’m getting into a distinction that’s much too slippery for a short blog post.