Distant reading and the blurry edges of genre.

There are basically two different ways to build collections for distant reading. You can build up collections of specific genres, selecting volumes that you know belong to them. Or you can take an entire digital library as your base collection, and subdivide it by genre.

Most people do it the first way, and having just spent two years learning to do it the second way, I’d like to admit that they’re right. There’s a lot of overhead involved in mining a library. The problem becomes too big for your desktop; you have to schedule batch jobs; you have to learn to interpret MARC records. All this may be necessary eventually, but it’s not the ideal place to start.

But some of the problems I’ve encountered have been interesting. In particular, the problem of “dividing a library by genre” has made me realize that literary studies is constituted by exclusions that are a bit larger and more arbitrary than I used to think.

First of all, why is dividing by genre even a problem? Well, most machine-readable catalog records don’t say much about genre, and even if they did, a single volume usually contains multiple genres anyway. (Think introductions, indexes, collected poems and plays, etc.) With support from the ACLS and NEH, I’ve spent the last year wrestling with that problem, and in a couple of weeks I’m going to share an imperfect page-level map of genre for English-language books in HathiTrust 1700-1923.

But the bigger thing I want to report is that the ambiguity of genre may run deeper than most scholars who aren’t librarians currently imagine. To be sure, we know that subgenres like “detective fiction” are social institutions rather than natural forms. And in a vague way we also accept that broader categories like “fiction” and “poetry” are social constructs with blurry edges. We can all point to a few anomalies: prose poems, eighteenth-century journalistic fictions like The Spectator, and so on.

But somehow, in spite of knowing this for twenty years, I never grasped the full scale of the problem. For instance, I knew the boundary between fiction and nonfiction was blurry in the 18c, but I thought it had stabilized over time. By the time you got to the Victorians, surely, you could draw a circle around “fiction.” Exceptions would just prove the rule.

Selecting volumes one by one for genre-specific collections didn’t shake my confidence. But if you start with a whole library and try to winnow it down, you’re forced to consider a lot of things you would otherwise never look at. I’ve become convinced that the subset of genre-typical cases (should we call them cis-genred volumes?) is nowhere near as paradigmatic as literary scholars like to imagine. A substantial proportion of the books in a library don’t fit those models.

This is both a photograph of a real, unnamed mother and baby, and a picture of a fictional character named Shinkah. Frontispiece to Shinkah, The Osage Indian (1916).


Consider the case of Shinkah, the Osage Indian, published in 1916 by S. M. Barrett. The preface to this volume informs us that it’s intended as a contribution to “the sociology of the Osage Indians.” But it’s set a hundred years in the past, and the central character Shinkah is entirely fictional (his name just means “child”). On the other hand, the book is illustrated with photographs of real contemporary people, who stand for the characters in an ethnotypical way.

After wading through 872,000 volumes, I’m sorry to report that odd cases of this kind are more typical of nineteenth- and early twentieth-century fiction than my graduate-school training had led me to believe. There’s a smooth continuum, for instance, between Shinkah and Old Court Life in France (1873), by Frances Elliot. This book has a bibliography and a historiographical preface, but otherwise reads like a historical novel, complete with invented dialogue. I’m not sure how to distinguish it from other historical novels with real historical personages as characters.

Literary critics know there’s a problem with historical fiction. We also know about the blurry boundary between fiction, journalism, and travel writing represented by the genre of the “sketch.” And anyone who remembers James Frey being kicked out of Oprah Winfrey’s definition of nonfiction knows that autobiographies can be problematic. And we know that didactic fiction blurs into philosophical dialogue. And anyone who studies children’s literature knows that the boundary between fiction and nonfiction gets especially blurry there. And probably some of us know about ethnographic novels like Shinkah. But I’m not sure many of us (except for librarians) have added it all up. When you’re sorting through an entire library you’re forced to see the scale of it: in the period 1700-1923, maybe 10% of the volumes that could be cataloged as fiction present puzzling boundary cases.

You run into a lot of these works even if you browse or select titles at random; that’s how I met Shinkah. But I’ve also been training probabilistic models of genre that report, among other things, how certain or uncertain they are about each page. These models are good at identifying clear cases of our received categories; I found that they agreed with my research assistants almost exactly as often as the research assistants agreed with each other (93-94% of the time, about broad categories like fiction/nonfiction). But you can also ask a model to sift through several thousand volumes looking for hard cases. When I did that I was taken aback to discover that about half the volumes it had most trouble with were things I also found impossible to classify. The model was most uncertain, for instance, about The Terrific Register (1825) — an almanac that mixes historical anecdote, urban legend, and outright fiction randomly from page to page. The second-most puzzling book was Madagascar, or Robert Drury’s Journal (1729), a book that offers itself as a travel journal by a real person, and was for a long time accepted as one, although scholars have more recently argued that it was written by Defoe.
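To make "asking a model to look for hard cases" concrete, here is a minimal sketch, not the code I actually ran. It assumes each volume comes with a list of page-level probabilities that the page is fiction, scores every page by binary entropy (highest when the probability is near 0.5), and ranks volumes by their mean page entropy. The function names, the volume ids, and the numbers are all invented for illustration.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a yes/no prediction made with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def rank_hard_cases(page_probs_by_volume):
    """Return volume ids sorted from most to least uncertain.

    Uncertainty is the mean entropy of the page-level probabilities.
    """
    scores = {
        vol: sum(binary_entropy(p) for p in probs) / len(probs)
        for vol, probs in page_probs_by_volume.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical volumes with made-up probabilities of "fiction" per page.
example = {
    "clear_novel": [0.97, 0.99, 0.95],
    "clear_history": [0.03, 0.05, 0.02],
    "mixed_miscellany": [0.55, 0.40, 0.62],
}
hardest_first = rank_hard_cases(example)
```

Mean entropy is only one possible choice. A volume that alternates confidently between fiction and nonfiction from page to page would need a different measure, such as the variance of its page-level predictions.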

Of course, a statistical model of fiction doesn’t care whether things “really happened”; it pays attention mostly to word frequency. Past-tense verbs of speech, personal names, and “the,” for instance, are disproportionately common in fiction. “Is” and “also” and “mr” (and a few hundred other words) are common in nonfiction. Human readers probably think about genre in a more abstract way. But it’s not particularly miraculous that a model using word frequencies should be confused by the same examples we find confusing. The model was trained, after all, on examples tagged by human beings; the whole point of doing that was to reproduce as much as possible the contours of the boundary that separates genres for us. The only thing that’s surprising is that trawling the model through a library turns up more books right in the middle of the boundary region than our habits of literary attention would have suggested.
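For readers who want to see what "paying attention to word frequency" can look like, here is a toy sketch. It is not the model described above: the handful of weights is invented, and a real model would learn several hundred of them from tagged examples. The sketch turns a page into relative word frequencies and passes a weighted sum through a logistic function, so the output is a probability rather than a yes-or-no verdict.

```python
import math
import re
from collections import Counter

# Invented weights for illustration only: positive values push toward
# fiction, negative values toward nonfiction.
HYPOTHETICAL_WEIGHTS = {
    "said": 40.0,
    "the": 5.0,
    "is": -30.0,
    "also": -60.0,
    "mr": -20.0,
}


def relative_frequencies(text):
    """Map each lowercase alphabetic token to its share of all tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    total = len(tokens)
    if total == 0:
        return {}
    return {word: count / total for word, count in Counter(tokens).items()}


def probability_of_fiction(text):
    """Logistic function of a weighted sum of relative word frequencies."""
    freqs = relative_frequencies(text)
    score = sum(
        weight * freqs.get(word, 0.0)
        for word, weight in HYPOTHETICAL_WEIGHTS.items()
    )
    return 1.0 / (1.0 + math.exp(-score))


story_like = "she said the door was open and he said nothing"
essay_like = "it is also true that mr smith is the author"
```

With these made-up weights, `probability_of_fiction(story_like)` lands above 0.5 and `probability_of_fiction(essay_like)` below it. A page with none of the marker words scores exactly 0.5, which is the model's way of shrugging.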

A lot of discussions of distant reading have imagined it as a move from canonical to popular or obscure examples of a (known) genre. But reconsidering our definitions of the genres we’re looking for may be just as important. We may come to recognize that “the novel” and “the lyric poem” have always been islands floating in a sea of other texts, widely read but never genre-typical enough to be replicated on English syllabi.

In the long run, this may require us to balance two kinds of inclusiveness. We already know that digital libraries exclude a lot. Allen Riddell has nicely demonstrated just how much: he concludes that there are digital scans for only about 58% of the novels listed in bibliographies as having been published between 1800 and 1836.

One way to ensure inclusion might be to start with those bibliographies, which highlight books invisible in digital libraries. On the other hand, bibliographies also make certain things invisible. The Terrific Register (1825), for instance, is not in Garside’s bibliography of early-nineteenth-century fiction. Neither is The Wonder-Working Water Mill (1791), to mention another odd thing I bumped into. These aren’t oversights; Garside et al. acknowledge that they’re excluding certain categories of fiction from their conception of the novel. But because we’re trained to think about novels, the scale of that exclusion may only become visible after you spend some time trawling a library catalog.

I don’t want to present this as an aporia that makes it impossible to know where to start. It’s not. Most people attempting distant reading are already starting in the right place — which is to build up medium-sized collections of familiar generic categories like “the novel.” The boundaries of those categories may be blurrier than we usually acknowledge. But there’s also such a thing as fretting excessively about the synchronic representativeness of your sample. A lot of the interesting questions in distant reading are actually trends that involve relative, diachronic differences in the collection. Subtle differences of synchronic coverage may more or less drop out of questions about change over time.

On the other hand, if I’m right that the gray areas between (for instance) fiction and nonfiction are bigger and more persistently blurry than literary scholarship usually mentions, that’s probably in the long run an issue we should consider! When I release a page-level map of genre in a couple of weeks, I’m going to try to provide some dials that allow researchers to make more explicit choices about degrees of inclusion or exclusion.

Predictive models that report probabilities give us a natural way to handle this, because they allow us to characterize every boundary as a gradient, and explicitly acknowledge our compromises (for instance, trade-offs between precision and recall). People who haven’t done much statistical modeling often imagine that numbers will give humanists spuriously clear definitions of fuzzy concepts. My experience has been the opposite: I think our received disciplinary practices often make categories seem self-evident and stable because they teach us to focus on easy cases. Attempting to model those categories explicitly, on a large scale, can force you to acknowledge the real instability of the boundaries involved.
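A small sketch may make the trade-off tangible. Assume a list of predicted probabilities that each volume is fiction, plus human judgments for the same volumes; both lists below are invented. Moving the threshold is the "dial": a strict setting keeps only clear cases (high precision, lower recall), while a loose setting admits boundary cases (higher recall, lower precision).

```python
def precision_recall(probabilities, labels, threshold):
    """Precision and recall when items at or above threshold count as fiction."""
    predicted = [p >= threshold for p in probabilities]
    tp = sum(1 for pred, y in zip(predicted, labels) if pred and y)
    fp = sum(1 for pred, y in zip(predicted, labels) if pred and not y)
    fn = sum(1 for pred, y in zip(predicted, labels) if not pred and y)
    # By convention, report 1.0 when a denominator would be zero.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return precision, recall


# Made-up model outputs and human labels (True means "fiction").
probs = [0.95, 0.85, 0.70, 0.55, 0.45, 0.30, 0.10]
labels = [True, True, False, True, True, False, False]

strict = precision_recall(probs, labels, threshold=0.8)  # (1.0, 0.5)
loose = precision_recall(probs, labels, threshold=0.4)   # (0.8, 1.0)
```

Neither setting is correct in the abstract. The point is that the choice becomes explicit and reportable instead of being buried in a selection process.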

References and acknowledgments

Training data for this project was produced by Shawn Ballard, Jonathan Cheng, Lea Potter, Nicole Moore and Clara Mount, as well as me. Michael L. Black and Boris Capitanu built a GUI that helped us tag volumes at the page level. Material support was provided by the National Endowment for the Humanities and the American Council of Learned Societies. Some information about results and methods is online as a paper and a poster, but much more will be forthcoming in the next month or so — along with a page-level map of broad genre categories and types of paratext.

The project would have been impossible without help from HathiTrust and HathiTrust Research Center. I’ve also been taught to read MARC records by librarians and information scientists including Tim Cole, M. J. Han, Colleen Fallaw, and Jacob Jett, any of whom could teach a course on “Cursed Metadata in Theory and Practice.”

I mention Garside’s bibliography of early nineteenth-century fiction. This is Garside, Peter, and Rainer Schöwerling. The English novel, 1770-1829 : a bibliographical survey of prose fiction published in the British Isles. Ed. Peter Garside, James Raven, and Rainer Schöwerling. 2 vols. Oxford: Oxford University Press, 2000.

Paul Fyfe directed me to a couple of useful works on the genre of the sketch. Michael Widner has recently written a dissertation about the cognitive dimension of genre titled Genre Trouble. I’ve also tuned into ongoing thoughts about the temporal and social dimensions of genre from Daniel Allington and Michael Witmore. The now-classic pamphlet #1 from the Stanford Literary Lab, “Quantitative Formalism,” is probably responsible for my interest in the topic.

A window on the twentieth century may be about to open.

The nineteenth century gets a lot of attention from scholars interested in text mining, simply because it’s in the public domain. After 1923, you run into copyright laws that make it impossible to share digital texts of many volumes.

“Ray of Light,” by Russell H Cribb, 2006. CC-BY 2.0.

One of the most promising solutions to that problem is the non-consumptive research portal being designed by the HathiTrust Research Center. In non-consumptive research, algorithms characterize a collection without exposing the original texts to human reading or copying.

This could work in a range of ways. Some of them are complex — for instance, if worksets and algorithms have to be tailored to individual projects. HTRC is already supporting that kind of research, but expanding it to the twentieth century may pose problems of scale that take a while to solve. But where algorithms can be standardized, calculations can run once, in advance, across a whole collection, creating datasets that are easy to serve up in a secure way. This strategy could rapidly expand opportunities for research on twentieth-century print culture.

For instance, a great deal of interesting macroscopic research can be done, at bottom, by counting words. JSTOR has stirred up a lot of interest by making word counts available for scholarly journal articles. Word counts from printed books would be at least equally valuable, and relatively easy to provide.

So people interested in twentieth-century history and literary history should prick up their ears at the news that HathiTrust Research Center is releasing an initial set of word counts from public-domain works as an alpha test. This set only includes 250,000 of the eleven million volumes in HathiTrust, and does not yet include any data about works after 1923, but one can hope that the experiment will soon expand to cover the twentieth century. (I’m just an interested observer, so I don’t know how rapid the expansion will be, but the point of this experiment is ultimately to address obstacles to twentieth-century text mining.)

The data provided by HTRC is in certain ways richer than the data provided by JSTOR, and it may already provide a valuable service for scholars who study the nineteenth or early twentieth centuries. Words are tagged with parts of speech, and word counts are provided at the page level — an important choice, since a single volume may combine a number of different kinds of text. HTRC is also making an effort to separate recurring headers and footers from the main text on each page; they’re providing line counts and sentence counts for each page, and also providing a count of the characters that begin and end lines. In my own research, I’ve found that it’s possible to use this kind of information to separate genres and categories of paratext within a volume (the lines of an index tend to begin with capital letters and end with numbers, for instance).
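Here is a rough sketch of the index heuristic. The HTRC data supplies counts of the characters that begin and end lines rather than the lines themselves, so working from raw page text, as this example does, is a simplification. The function names and the 0.8 cutoff are hypothetical, and in practice I would feed features like these to a trained model rather than a hard rule.

```python
def line_edge_features(page_text):
    """Share of non-empty lines that start with a capital or end with a digit."""
    lines = [ln.strip() for ln in page_text.splitlines() if ln.strip()]
    if not lines:
        return {"starts_capital": 0.0, "ends_digit": 0.0}
    n = len(lines)
    return {
        "starts_capital": sum(ln[0].isupper() for ln in lines) / n,
        "ends_digit": sum(ln[-1].isdigit() for ln in lines) / n,
    }


def looks_like_index(page_text, cutoff=0.8):
    """Crude rule: most lines begin with a capital and end with a number."""
    feats = line_edge_features(page_text)
    return feats["starts_capital"] >= cutoff and feats["ends_digit"] >= cutoff


# Invented sample pages.
index_page = "Addison, Joseph, 12\nBurke, Edmund, 45, 47\nDefoe, Daniel, 101"
prose_page = (
    "It was a dark night and the\n"
    "rain fell upon the roofs of\n"
    "London without ceasing."
)
```

On these two samples, `looks_like_index(index_page)` is true and `looks_like_index(prose_page)` is false. Poetry, where most lines also begin with capitals, shows why the line-ending feature matters.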

Of course, researchers would like to pose many questions that can’t be answered by page-level unigram counts. Some of those questions will require custom-tailored algorithms. But other questions might be possible to address with precalculated features extracted in a relatively standard way.

Whatever kinds of information interest you, speak up for them, using the e-mail address provided on the HTRC feature-extraction page. And if this kind of service would have value for your research, please write in to say how you would use it. Part of the point of this experiment is to assess the degree of scholarly interest.

You can’t govern reception.

I’ve read a number of articles lately that posit “digital humanities” as a coherent intellectual movement that makes strong, scary normative claims about the proper future of the humanities as a whole.

Adam Kirsch’s piece in The New Republic is the latest of these; he constructs an opposition between a “minimalist” DH that simply uses computers to edit or read things as we have always done, and a “maximalist” version where technology is taking over English departments and leveling solitary genius in order to impose a cooperative but “post-verbal” vision of the future.

I think there’s a large excluded middle in that picture, where everything interesting actually happens. But I’m resisting — or trying to resist — the urge to write a blog post of clarification and explanation. Increasingly, I believe that’s a futile impulse, not only because “DH” can be an umbrella for many different projects, but more fundamentally because “the meaning of DH” is a perspectival question.

I mean it’s true, objectively, that the number of scholars actually pursuing (say) digital history or game studies is still rather small. But I nevertheless believe that Kirsch is sincere in perceiving them as the narrow end of a terrifying wedge. And there’s no way to prove he’s wrong about that, because threats are very much in the eye of the beholder. Projects don’t have to be explicitly affiliated with each other, or organized around an explicit normative argument, in order to be perceived collectively as an implicit rebuke to some existing scheme of values. In fact, people don’t even really get to choose whether they’re part of a threatening phenomenon. Franco Moretti hasn’t been cheerleading for anything called “digital humanities,” but that point is rapidly becoming moot.

I’m reminded of a piece of advice Mark Seltzer gave me sixteen years ago, during my dissertation defense. Like all grad students in the 90s, I had written an overly long introduction explaining what my historical research meant in some grander theoretical way. As I recall, he said simply, “you can’t govern your own reception.” A surprisingly hard thing to accept! People of course want to believe that they’re the experts about the meaning of their own actions. But that’s not how social animals work.

So I’m going to try to resist the temptation to debate the meaning of “DH,” which is not in anyone’s control. Instead I’m going to focus on doing cool stuff. Like Alexis Madrigal’s reverse-engineering of Netflix genres, or Mark Sample’s Twitter bots, or the Scholars’ Lab project PRISM, which apparently forgot to take over English departments and took over K-12 education instead. At some future date, historians can decide whether any of that was digital humanities, and if so, what it meant.

(Comments are turned off, because you can’t moderate a comment thread titled “you can’t govern reception.”)

Postscript May 10th: This was written quickly, in the heat of the occasion, and I think my anecdote may be better at conveying a feeling than explaining its underlying logic. Obviously, “you can’t govern reception” cannot mean “never try to change what other people think.” Instead, I mean that “digital humanities” seems to me a historical generalization more than a “field” or a “movement” based on shared premises that could be debated. I see it as closer to “modernism,” for instance, than to “psychology” or “post-structuralism.”

You cannot really write editorials convincing people to like “modernism.” You’d have to write a book. Even then, understandings of the historical phenomenon are going to differ, and some people are going to feel nostalgic for impressionist painting. The analogy to “DH” is admittedly imperfect; DH is an academic phenomenon (mostly! at times it’s hard to distinguish from data journalism), and has slightly more institutional coherence than modernism did. But I’m not sure it has more intellectual coherence.

How much DH can we fit in a literature department?

It’s an open secret that the social phenomenon called “digital humanities” mostly grew outside the curriculum. Library-based programs like Scholars’ Lab at UVA have played an important role; so have “centers” like MITH (Maryland) and CHNM (George Mason) — not to mention the distributed unconference movement called THATCamp, which started at CHNM. At Stanford, the Literary Lab is a sui generis thing, related to departments of literature but not exactly contained inside them.

The list could go on, but I’m not trying to cover everything — just observing that “DH” didn’t begin by embedding itself in the curricula of humanities departments. It went around them, in improvisational and surprisingly successful ways.

That’s a history to be proud of, but I think it’s also setting us up for predictable frustrations at the moment, as disciplines decide to import “DH” and reframe it in disciplinary terms. (“Seeking a scholar of early modern drama, with a specialization in digital humanities …”)

Of course, digital methods do have consequences for existing disciplines; otherwise they wouldn’t be worth the trouble. In my own discipline of literary study, it’s now easy to point to a long sequence of substantive contributions that use digital methods to make thesis-driven interventions in literary history and even interpretive theory.

But although the research payoff is clear, the marriage between disciplinary and extradisciplinary institutions may not be so easy. I sense that a lot of friction around this topic is founded in a feeling that it ought to be straightforward to integrate new modes of study in disciplinary curricula and career paths. So when this doesn’t go smoothly, we feel there must be some irritating mistake in existing disciplines, or in the project of DH itself. Something needs to be trimmed to fit.

What I want to say is just this: there’s actually no reason this should be easy. Grafting a complex extradisciplinary project onto existing disciplines may not completely work. That’s not because anyone made a mistake.

Consider my home field of literary study. If digital methods were embodied in a critical “approach,” like psychoanalysis, they would be easy to assimilate. We could identify digital “readings” of familiar texts, add an article to every Norton edition, and be done with it. In some cases that actually works, because digital methods do after all change the way we read familiar texts. But DH also tends to raise foundational questions about the way literary scholarship is organized. Sometimes it valorizes things we once considered “mere editing” or “mere finding aids”; sometimes it shifts the scale of literary study, so that courses organized by period and author no longer make a great deal of sense. Disciplines can be willing to welcome new ideas, and yet (understandably) unwilling to undertake this sort of institutional reorganization.

Training is an even bigger problem. People have argued long and fiercely about the amount of digital training actually required to “do DH,” and I’m not going to resolve that question here. I just want to say that there’s a reason for the argument: it’s a thorny problem. In many cases, humanists are now tackling projects that require training not provided in humanities departments. There are a lot of possible fixes for that — we can make tools easier to use, foster collaboration — but none of those fixes solve the whole problem. Not everything can be externalized as a “tool.” Some digital methods are really new forms of interpretation; packaging them in a GUI would create a problematic black box. Collaboration, likewise, may not remove the need for new forms of training. Expecting computer scientists to do all the coding on a project can be like expecting English professors to do all the spelling.

I think these problems can find solutions, but I’m coming to suspect that the solutions will be messy. Humanities curricula may evolve, but I don’t think the majority of English or History departments are going to embrace rapid structural change — for instance, change of the kind that would be required to support graduate programs in distant reading. These disciplines have already spent a hundred years rejecting rapprochement with social science; why would they change course now? English professors may enjoy reading Moretti, but it’s going to be a long time before they add a course on statistical methods to the major.

Meanwhile, there are other players in this space (at least at large universities): iSchools, Linguistics, Departments of Communications, Colleges of Media. Digital methods are being assimilated rapidly in these places. New media, of course, are already part of media studies, and if a department already requires statistics, methods like topic modeling are less of a stretch. It’s quite possible that the distant reading of literary culture will end up being shared between literature departments and (say) Communications. The reluctance of literary studies to become a social science needn’t prevent social scientists from talking about literature.

I’m saying all this because I think there’s a strong tacit narrative in DH that understands extradisciplinary institutions as a wilderness, in which we have wandered that we may reach the promised land of recognition by familiar disciplinary authority. In some ways that’s healthy. It’s good to have work organized by clear research questions (so we aren’t just digitizing aimlessly), and I’m proud that digital methods are making contributions to the core concerns of literary studies.

But I’m also wary of the normative pressures associated with that narrative, because (if you’ll pardon the extended metaphor) I’m not sure this caravan actually fits in the promised land. I suspect that some parts of the sprawling enterprise called “DH” (in fact, some of the parts I enjoy most) won’t be absorbed easily in the curricula of History or English. That problem may be solved differently at different schools; the nice thing about strong extradisciplinary institutions is that they allow us to work together even if the question of disciplinary identity turns out to be complex.

postscript: This whole post should have footnotes to Bethany Nowviskie every time I use the term “extradisciplinary,” and to Matt Kirschenbaum every time I say “DH” with implicit air quotes.

New models of literary collectivity.

This is a version of a response I gave at session 155 of MLA 2014, “Literary Criticism at the Macroscale.” Slides and/or texts of the original papers by Andrew Piper and Hoyt Long and Richard So are available on the web, as is another response by Haun Saussy.

* * *

The papers we heard today were not picking the low-hanging fruit of text mining. There’s actually a lot of low-hanging fruit out there still worth picking — big questions that are easy to answer quantitatively and that only require organizing large datasets — but these papers were tackling problems that are (for good or ill) inherently more difficult. Part of the reason involves their transnational provenance, but another reason is that they aren’t just counting or mapping known categories but trying to rethink some of the basic concepts we use to write literary history — in particular, the concept we call “influence” or “diffusion” or “intertextuality.”

I’m tossing several terms at this concept because I don’t think literary historians have ever agreed what it should be called. But to put it very naively: new literary patterns originate somehow, and somehow they are reproduced. Different generations of scholars have modeled this differently. Hoyt and Richard quote Laura Riding and Robert Graves exploring, in 1927, an older model centered on basically personal relationships of imitation or influence. But early-twentieth-century scholars could also think anthropologically about the transmission of motifs or myths or A. O. Lovejoy’s “unit ideas.” In the later 20th century, critics got more cautious about implying continuity, and reframed this topic abstractly as “intertextuality.” But then the specificity of New Historicism sometimes pushed us back in the direction of tracing individual sources.

I’m retelling a story you already know, but trying to retell it very frankly, in order to admit that (while we’ve gained some insight) there is also a sense in which literary historians keep returning to the same problem and keep answering it in semi-satisfactory ways. We don’t all, necessarily, aspire to give a causal account of literary change. But I think we keep returning to this problem because we would like to have a kind of narrative that can move more smoothly between individual examples and the level of the discourse or genre. When we’re writing our articles the way this often works in practice is: “here’s one example, two examples — magic hand-waving — a discourse!”

Something interesting and multivocal about literary history gets lost at the moment when we do that hand-waving. The things we call genres or discourses have an internal complexity that may be too big to illustrate with examples, but that also gets lost if you try to condense it into a single label, like “the epistolary novel.” Though we aspire to subtlety, in practice it’s hard to move from individual instances to groups without constructing something like the sovereign in the frontispiece for Hobbes’ Leviathan – a homogenous collection of instances composing a giant body with clear edges.

While they offer different solutions, I think both of the papers we heard today are imagining other ways to move between instances and groups. They both use digital methods to describe new forms of similarity between texts. And in both cases, the point of doing this lies less in precision than in creating a newly flexible model of collectivity. We gain a way of talking about texts that is collective and social, but not necessarily condensed into a single label. For Andrew, the “Werther effect” is less about defining a new genre than about recognizing a new set of relationships between different communities of works. For Hoyt and Richard, machine learning provides a way of talking about the reception of hokku that isn’t limited to formal imitation or to a group of texts obviously “influenced” by specific models. Algorithms help them work outward from clear examples of a literary-historical phenomenon toward a broader penumbra of similarity.

I think this kind of flexibility is one of the most important things digital tools can help us achieve, but I don’t think it’s on many radar screens right now. The reason, I suspect, is that it doesn’t fit our intuitions about computers. We understand that computers can help us with scale (distant reading), and we also get that they can map social networks. But the idea that computers can help us grapple with ambiguity and multiple determination doesn’t feel intuitive. Aren’t computers all about “binary logic”? If I tell my computer that this poem both is and is not a haiku, won’t it probably start to sputter and emit smoke?

Well, maybe not. And actually I think this is a point that should be obvious but just happens to fall in a cultural blind spot right now. The whole point of quantification is to get beyond binary categories — to grapple with questions of degree that aren’t well-represented as yes-or-no questions. Classification algorithms, for instance, are actually very good at shades of gray; they can express predictions as degrees of probability and assign the same text different degrees of membership in as many overlapping categories as you like. So I think it should feel intuitive that a quantitative approach to literary history would have the effect of loosening up categories that we now tend to treat too much as homogenous bodies. If you need to deal with gradients of difference, numbers are your friend.
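As a minimal illustration of "degrees of membership," suppose a classifier has produced one raw score per category for a single poem. The scores and category names below are invented. Passing each score through its own logistic function gives a probability for each category separately, so the numbers need not sum to one, and a poem can be somewhat a haiku and quite a lot a free-verse lyric at the same time.

```python
import math


def degrees_of_membership(scores):
    """Convert per-category scores to independent probabilities in (0, 1)."""
    return {cat: 1.0 / (1.0 + math.exp(-s)) for cat, s in scores.items()}


# Hypothetical raw scores for one poem.
poem_scores = {"haiku": 0.2, "free_verse": 1.5, "epigram": -2.0}
membership = degrees_of_membership(poem_scores)
```

Here the poem's membership in "haiku" comes out a little above one half: the computer reports that the poem both is and is not a haiku, and nothing emits smoke.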

Of course, how exactly this is going to work remains an open question. Technically, the papers we heard today approach the problem of similarity in different ways. Hoyt and Richard are borrowing machine learning algorithms that use the contrast between groups of texts to define similarity. Andrew’s improvising a different approach that uses a single work to define a set of features that can then be used to organize other works as an “exotext.” And other scholars have approached the same problem in other ways. Franco Moretti’s chapter on “Trees” also bridges the gap I’m talking about between individual examples and coherent discourses; he does it by breaking the genre of detective fiction up into a tree of differentiations. It’s not a computational approach, but for some problems we may not need computation. Matt Jockers, on the other hand, has a chapter on “influence” in Macroanalysis that uses topic modeling to define global criteria of similarity for nineteenth-century novels. And I could go on: Sara Steger, for instance, has done work on sentimentality in the nineteenth century novel that uses machine learning in a loosely analogous way to think about the affective dimension of genre.

The differences between these projects are worth discussing, but in this response I’m more interested in highlighting the common impulse they share. While these projects explore specific problems in literary history, they can also be understood as interventions in literary theory, because they’re all attempting to rethink certain basic concepts we use to organize literary-historical narrative. Andrew’s concept of the “exotext” makes this theoretical ambition most overt, but I think it’s implicit across a range of projects. For me the point of the enterprise, at this stage, is to brainstorm flexible alternatives to our existing, slightly clunky, models of literary collectivity. And what I find exciting at the moment is the sheer proliferation of alternatives.

Measurement and modeling.

If the Internet is good for anything, it’s good for speeding up the Ent-like conversation between articles and making that rumble more perceptible to human ears. I thought I might help the process along by summarizing the Stanford Literary Lab’s latest pamphlet: a single-authored piece by Franco Moretti, “‘Operationalizing’: or the function of measurement in modern literary theory.”

One of the many strengths of Moretti’s writing is a willingness to dramatize his own learning process. This pamphlet situates itself as a twist in the ongoing evolution of “computational criticism,” a turn from literary history to literary theory.

“Measurement as a challenge to literary theory, one could say, echoing a famous essay by Hans Robert Jauss. This is not what I expected from the encounter of computation and criticism; I assumed, like so many others, that the new approach would change the history, rather than the theory of literature ….”

Measurement challenges literary theory because it asks us to “operationalize” existing critical concepts: to say, for instance, exactly how we know that one character occupies more “space” in a work than another. Are we talking simply about the number of words they speak? Or perhaps about their degree of interaction with other characters?
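A toy sketch in Python shows how quickly those two candidate definitions can part ways. Everything here is an assumption for illustration: the dialogue is invented rather than taken from any real play, and `words_spoken` and `interaction_degree` are hypothetical names for two of the many ways one might operationalize “character-space.”

```python
from collections import Counter, defaultdict

# Invented exchange: (speaker, addressee, line). Not from any real work.
LINES = [
    ("A", "B", "Where have you been all this long night"),
    ("B", "A", "Walking"),
    ("A", "C", "Did you see which way he went"),
    ("C", "A", "Toward the river I think"),
    ("D", "C", "I have a great deal to say about rivers and their moods "
               "and the towns that drown in them"),
]


def words_spoken(lines):
    """Operationalization 1: total number of words each character speaks."""
    totals = Counter()
    for speaker, _, text in lines:
        totals[speaker] += len(text.split())
    return totals


def interaction_degree(lines):
    """Operationalization 2: number of distinct characters each one
    speaks to or is spoken to by."""
    partners = defaultdict(set)
    for speaker, addressee, _ in lines:
        partners[speaker].add(addressee)
        partners[addressee].add(speaker)
    return {name: len(others) for name, others in partners.items()}


spoken = words_spoken(LINES)
degree = interaction_degree(LINES)
for name in sorted(spoken):
    print(name, "words:", spoken[name], "partners:", degree[name])
```

In this invented scene D speaks the most words but exchanges speech with only one other character, while A says less and interacts with two. Which of them occupies more “space” depends entirely on the definition we choose, which is exactly the kind of question measurement forces into the open.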

Moretti uses Alex Woloch’s concept of “character-space” as a specific example of what it means to operationalize a concept, but he’s more interested in exploring the broader epistemological question of what we gain by operationalizing things. When literary scholars discuss quantification, we often tacitly assume that measurement itself is on trial. We ask ourselves whether measurement is an adequate proxy for our existing critical concepts. Can mere numbers capture the ineffable nuances we assume those concepts possess? Here, Moretti flips that assumption and suggests that measurement may have something to teach us about our concepts: as we’re forced to make them concrete, we may discover that we understood them imperfectly. At the end of the article, he suggests for instance (after begging divine forgiveness) that Hegel may have been wrong about “tragic collision.”

I think Moretti is frankly right about the broad question this pamphlet opens. If we engage quantitative methods seriously, they’re not going to remain confined to empirical observations about the history of predefined critical concepts. Quantification is going to push back against the concepts themselves, and spill over into theoretical debate. I warned y’all back in August that literary theory was “about to get interesting again,” and this is very much what I had in mind.

At this point in a scholarly review, the standard procedure is to point out that a work nevertheless possesses “oversights.” (Insight, meet blindness!) But I don’t think Moretti is actually blind to any of the reflections I add below. We have differences of rhetorical emphasis, which is not the same thing.

For instance, Moretti does acknowledge that trying to operationalize concepts could cause them to dissolve in our hands, if they’re revealed as unstable or badly framed (see his response to Bridgman on pp. 9-10). But he chooses to focus on a case where this doesn’t happen. Hegel’s concept of “tragic collision” holds together, on his account; we just learn something new about it.

In most of the quantitative projects I’m pursuing, this has not been my experience. For instance, in developing statistical models of genre, the first thing I learned was that critics use the word genre to cover a range of different kinds of categories, with different degrees of coherence and historical volatility. Instead of coming up with a single way to operationalize genre, I’m going to end up producing several different mapping strategies that address patterns on different scales.

Something similar might be true even about a concept like “character.” In Vladimir Propp’s Morphology of the Folktale, for instance, characters are reduced to plot functions. Characters don’t have to be people or have agency: when the hero plucks a magic apple from a tree, the tree itself occupies the role of “donor.” On Propp’s account, it would be meaningless to represent a tale like “Le Petit Chaperon Rouge” as a social network. Our desire to imagine narrative as a network of interactions between imagined “people” (wolf ⇌ grandmother) presupposes a separation between nodes and edges that makes no sense for Propp. But this doesn’t necessarily mean that Moretti is wrong to represent Hamlet as a social network: Hamlet is not Red Riding Hood, and tragic drama arguably envisions character in a different way. In short, one of the things we might learn by operationalizing the term “character” is that the term has genuinely different meanings in different genres, obscured for us by the mere continuity of a verbal sign. [I should probably be citing Tzvetan Todorov here, The Poetics of Prose, chapter 5.]

Illustration from “Learning Latent Personas of Film Characters,” Bamman et al.

Another place where I’d mark a difference of emphasis from Moretti involves the tension, named in my title, between “measurement” and “modeling.” Moretti acknowledges that there are people (like Graham Sack) who assume that character-space can’t be measured directly, and therefore look for “proxy variables.” But concepts that can’t be directly measured raise a set of issues that are quite a bit more challenging than the concept of a “proxy” might imply. Sack is actually trying to build models that postulate relations between measurements. Digital humanists are probably most familiar with modeling in the guise of topic modeling, a way of mapping discourse by postulating latent variables called “topics” that can’t be directly observed. But modeling is a flexible heuristic that could be used in a lot of different ways.

The illustration above is a probabilistic graphical model drawn from a paper on the “Latent Personas of Film Characters” by Bamman, O’Connor, and Smith. The model represents a network of conditional relationships between variables. Some of those variables can be observed (like the words in a plot summary, w, and external information about the film being summarized, md), but some have to be inferred, like recurring character types (p) that are hypothesized to structure film narrative.
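For readers who find plate notation forbidding, the underlying logic can be sketched in a few lines of Python. This is a drastically simplified stand-in, not the model in the paper: a single latent persona is drawn with a probability that depends on one piece of film metadata (here, genre), and the observed words are drawn from that persona. Every name and number below (`PERSONA_GIVEN_GENRE`, `WORD_GIVEN_PERSONA`, `persona_posterior`, the personas, the probabilities) is a hand-set assumption for illustration; a real model would learn these quantities from data.

```python
import math

# Hypothetical, hand-set parameters; a real model would estimate them.
# P(persona | genre): how metadata shifts expectations about character types.
PERSONA_GIVEN_GENRE = {
    "western": {"outlaw": 0.6, "mentor": 0.4},
    "romance": {"outlaw": 0.2, "mentor": 0.8},
}
# P(word | persona): which actions each latent persona tends to produce.
WORD_GIVEN_PERSONA = {
    "outlaw": {"shoots": 0.5, "flees": 0.3, "advises": 0.1, "teaches": 0.1},
    "mentor": {"shoots": 0.1, "flees": 0.1, "advises": 0.4, "teaches": 0.4},
}


def persona_posterior(words, genre):
    """Infer P(persona | observed words, genre) with Bayes' rule.

    Only words listed in WORD_GIVEN_PERSONA are supported in this sketch.
    """
    log_joint = {}
    for persona, prior in PERSONA_GIVEN_GENRE[genre].items():
        log_p = math.log(prior)
        for word in words:
            log_p += math.log(WORD_GIVEN_PERSONA[persona][word])
        log_joint[persona] = log_p
    # Normalize in a numerically stable way.
    top = max(log_joint.values())
    unnormalized = {p: math.exp(lp - top) for p, lp in log_joint.items()}
    total = sum(unnormalized.values())
    return {p: value / total for p, value in unnormalized.items()}


observed = ["shoots", "flees", "advises"]   # the observed words
in_western = persona_posterior(observed, "western")
in_romance = persona_posterior(observed, "romance")
print("western:", in_western)
print("romance:", in_romance)
```

The persona is never observed; it is inferred from the words and the metadata together. The same three words point fairly clearly to an “outlaw” in a western but leave the question nearly open in a romance, because the assumed link between genre and persona is written into the model where anyone can inspect it and argue with it.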

Having empirically observed the effects of illustrations like this on literary scholars, I can report that they produce deep, Lovecraftian horror. Nothing looks bristlier and more positivist than plate notation.

But I think this is a tragic miscommunication produced by language barriers that both sides need to overcome. The point of model-building is actually to address the reservations and nuances that humanists correctly want to interject whenever the concept of “measurement” comes up. Many concepts can’t be directly measured. In fact, many of our critical concepts are only provisional hypotheses about unseen categories that might (or might not) structure literary discourse. Before we can attempt to operationalize those categories, we need to make underlying assumptions explicit. That’s precisely what a model allows us to do.

It’s probably going to turn out that many things are simply beyond our power to model: ideology and social change, for instance, are very important and not at all easy to model quantitatively. But I think Moretti is absolutely right that literary scholars have a lot to gain by trying to operationalize basic concepts like genre and character. In some cases we may be able to do that by direct measurement; in other cases it may require model-building. In some cases we may come away from the enterprise with a better definition of existing concepts; in other cases those concepts may dissolve in our hands, revealed as more unstable than even poststructuralists imagined. The only thing I would say confidently about this project is that it promises to be interesting.

The imaginary conflicts disciplines create.

One thing I’ve never understood about humanities disciplines is our insistence on staging methodology as ethical struggle. I don’t think humanists are uniquely guilty here; at bottom, it’s probably the institution of disciplinarity itself that does it. But the normative tone of methodological conversation is particularly odd in the humanities, because we have a reputation for embracing multiple perspectives. And yet, where research methods are concerned, we actually seem to find that very hard.

It never seems adequate to say “hey, look through the lens of this method for a sec — you might see something new.” Instead, critics practicing historicism feel compelled to justify their approach by showing that close reading is the crypto-theological preserve of literary mandarins. Arguments for close reading, in turn, feel compelled to claim that distant reading is a slippery slope to takeover by the social sciences — aka, a technocratic boot stomping on the individual face forever. Or, if we do admit that multiple perspectives have value, we often feel compelled to prescribe some particular balance between them.

Imagine if biologists and sociologists went at each other in the same way.

“It’s absurd to study individual bodies, when human beings are social animals!”

“Your obsession with large social phenomena is a slippery slope — if we listened to you, we would eventually forget about the amazing complexity of individual cells!”

“Both of your methods are regrettably limited. What we need, today, is research that constantly tempers its critique of institutions with close analysis of mitochondria.”

As soon as we back up and think about the relation between disciplines, it becomes obvious that there’s a spectrum of mutually complementary approaches, and different points on the spectrum (or different combinations of points) can be valid for different problems.

So why can’t we see this when we’re discussing the possible range of methods within a discipline? Why do we feel compelled to pretend that different approaches are locked in zero-sum struggle — or that there is a single correct way of balancing them — or that importing methods from one discipline to another raises a grave ethical quandary?

It’s true that disciplines are finite, and space in the major is limited. But a debate about “what will fit in the major” is not the same thing as ideology critique or civilizational struggle. It’s not even, necessarily, a substantive methodological debate that needs to be resolved.