Categories
DH as a social phenomenon

Hold on loosely; or, Gemeinschaft and Gesellschaft on the web.

I want to try a quick experiment.

The digital humanities community must …

If that sounds like a plausible beginning to a sentence, what about this one?

The literary studies community must …

Does that sound as odd to you as it does to me? No one pretends literary studies is a community. In the U.S., the discipline becomes visible to itself mainly at the spectacular, but famously alienating, yearly ritual of the MLA. A hotel that contains disputatious full professors and brilliant underemployed jobseekers may be many interesting things, but “community” is not the first word that comes to mind.

“Digital humanities,” on the other hand, frequently invokes itself as a “community.” The reasons may stretch back into the 90s, and to the early beleaguered history of humanities computing. But the contemporary logic of the term is probably captured by Matt Kirschenbaum, who stresses that the intellectually disparate projects now characterized as DH are unified above all by reliance on social media, especially Twitter.

In many ways that’s a wonderful thing. Twitter is not a perfectly open form, and it’s certainly not an egalitarian one; it has a one-to-many logic. But you don’t have to be a digital utopian to recognize that academic fields benefit from frequent informal contact among their members — what Dan Cohen has described as “the sidewalk life of successful communities.” Twitter is especially useful for establishing networks that cross disciplinary (and professional) boundaries; I’ve learned an amazing amount from those networks.

On the other hand, the illusion of open and infinitely extensible community created by Twitter has some downsides. Ferdinand Tönnies’s distinction between Gemeinschaft and Gesellschaft may not describe all times and places well, but I find it useful here as a set of ideal types. A Gemeinschaft (community) is bound together by personal contact among members and by shared implicit values. It may lack formal institutions, so its members have to be restrained by moral suasion and peer pressure. A Gesellschaft (society) doesn’t expect all its members to share the same values; it expects them to be guided mostly by individual aims, restrained and organized by formal institutions.

Given that choice, wouldn’t everyone prefer to live in cozy Gemeinschaft? Well, sure, except … remember you’re going to have to agree on a set of values! Digital humanists have spent a lot of time discussing values (Lisa Spiro, “Why We Fight”), but as the group gets larger that discussion may prove quite difficult. In the humanities, disagreeing about values is part of our job. It may be just one part of the job in humanities computing, which has a collaborative emphasis. But disagreeing about values has been almost the whole job in more traditional precincts of the humanities. As DH expands, that difference creates yet another layer of disagreement — a meta-struggle over meta-values labeled “hack” and “yack.”

But you know that. Why am I saying all this? I hope the frame I’m offering here is a useful way to understand the growing pains of a web-mediated academic project. DH has at times done a pretty good imitation of Gemeinschaft, but as it gets bigger it’s necessarily going to become more Gesellschaft-y. Which may sound sadder than it is; here’s where I invoke the title of this post. Academic community doesn’t have to be impersonal, but in the immortal words of .38 Special, we need to give each other “a whole lot of space to breathe in.”

This may involve consciously bracketing several values that we celebrate in other contexts. For instance, the centrifugal logic of a growing field isn’t a problem that can be solved by “niceness.” Resolving academic debates by moral suasion on Twitter is not just a bad idea because it produces flame wars. It would be an even worse idea if it worked — because we don’t really want an academic project to have that kind of consensus, enforced by personal ties and displays of collective solidarity.

On the other hand, the values of “candor” and “open debate” may be equally problematic on the web. Filter bubbles have their uses. I want to engage all points of view, but I can’t engage them all at one-hour intervals.

An open question that I can’t answer concerns the role of Twitter here. I’ve found it enormously valuable, both as a latecomer to “DH,” and as an interested lurker in several other fields (machine learning, linguistics, computational social science). I also find it personally enjoyable. But it’s possible that Twitter will just structurally tempt humanists into attempting a more cohesive, coercive kind of Gemeinschaft than academic social networks can (or should) sustain. It’s also possible that we’ll see a kind of cyclic logic here, where Twitter remains valuable for newcomers but tends to become a drain on the time and energy of scholars who already have extensive networks in a field. I don’t know.

Postscript a few hours later: The best reflection on the “cyclic logic” of academic projects online is still Bethany Nowviskie’s “Eternal September of the Digital Humanities,” which remains strikingly timely even after the passage of (gasp) three years.

Categories
genre comparison interpretive theory methodology

One way numbers can after all make us dumber.

[Used to have a more boring title still preserved in the URL. -Ed.] In general I’m deeply optimistic about the potential for dialogue between the humanities and quantitative disciplines. I think there’s a lot we can learn from each other, and I don’t think the humanities need any firewall to preserve their humanistic character.

But there is one place where I’m coming to agree with people who say that quantitative methods can make us dumber. To put it simply: numbers tend to distract the eye. If you quantify part of your argument, critics (including your own internal critic) will tend to focus on problems in the numbers, and ignore the deeper problems located elsewhere.

I’ve discovered this in my own practice, for instance when I blogged about genre in large digital collections. I got a lot of useful feedback on those blog posts; it was probably the most productive conversation I’ve ever had as a scholar. But most of the feedback focused on potential problems in the quantitative dimension of my argument. E.g., how representative was this collection as a sample of print culture? Or, what smoothing strategies should I be using to plot results? My own critical energies were focused on similar questions.

Those questions were useful, and improved the project greatly, but in most cases they didn’t rock its foundations. And with a year’s perspective, I’ve come to recognize that there were after all foundation-rocking questions to be posed. For instance, in early versions of this project, I hadn’t really ironed out the boundary between “poetry” and “drama.” Those categories overlap, after all! This wasn’t creating quantitative problems (Jordan Sellers and I were handling cases consistently), but it was creating conceptual ones: the line “poetry” below should probably be labeled “nondramatic verse.”

Results I think are still basically reliable, although we need to talk more about that word “genre.”
The biggest problem was even less quantitative, and more fundamental: I needed to think harder about the concept of genre itself. As I model different kinds of genre, and read about similar (traditional and digital) projects by other scholars, I increasingly suspect the elephant in the room is that the word may not actually hold together. Genre may be a box we’ve inherited for a whole lot of basically different things. A bibliography is a genre; so is the novel; so is science fiction; so is the Kailyard school; so is acid house. But formally, socially, and chronologically, those are entities of very different kinds.

Skepticism about foundational concepts has been one of the great strengths of the humanities. The fact that we have a word for something (say genre or the individual) doesn’t necessarily imply that any corresponding entity exists in reality. Humanists call this mistake “reification,” and we should hold onto our skepticism about it. If I hand you a twenty-page argument using Google ngrams to prove that the individual has been losing ground to society over the last hundred years, your response should not be “yeah, but how representative is Google Books, and how good is their OCR?” (Those problems are relatively easy to solve.) Your response should be, “Uh … how do you distinguish ‘the individual’ from ‘society’ again?”

As I said, humanists have been good at catching reification; it’s a strength we should celebrate. But I don’t see this habit of skepticism as an endangered humanistic specialty that needs to be protected by a firewall. On the contrary, we should be exporting our skepticism! This habit of questioning foundational concepts can be just as useful in the sciences and social sciences, where quantitative methods similarly distract researchers from more fundamental problems. [I don’t mean to suggest that it’s never occurred to scientists to resist this distraction: as Matt Wilkens points out in the comments, they’re often good at it. -Ed.]

In psychology, for instance, emphasis on clearing a threshold of statistical significance (defined in terms of a p-value) frequently distracts researchers from more fundamental questions of experimental design (like, are we attempting to measure an entity that actually exists?). Andrew Gelman persuasively suggests that this is not just a problem caused by quantification but can be more broadly conceived as a “dangerous lure of certainty.” In any field, it can be tempting to focus narrowly on the degree of certainty associated with a hypothesis. But it’s often more important to ask whether the underlying question is interesting and meaningfully framed.
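
To make the distraction concrete, here is a minimal sketch in Python (using numpy and scipy; the group means, the standard deviation, and the sample size are all invented for illustration). With a large enough sample, a practically negligible difference clears any significance threshold, while the harder question of whether the measured construct is real or interesting goes untouched.

```python
import numpy as np
from scipy import stats

# Hypothetical example: two groups whose questionnaire scores differ by a
# practically negligible amount (about a hundredth of a standard deviation).
rng = np.random.default_rng(0)
n = 500_000                                           # very large sample
group_a = rng.normal(loc=50.0, scale=10, size=n)
group_b = rng.normal(loc=50.1, scale=10, size=n)

t_stat, p_value = stats.ttest_ind(group_a, group_b)
effect_size = (group_b.mean() - group_a.mean()) / 10  # in standard deviations

print(f"p-value: {p_value:.2e}")                      # far below the usual .05
print(f"observed effect: {effect_size:.3f} standard deviations")
# The result is "significant," but the interesting questions (is the
# construct real? is the difference meaningful?) are untouched by that
# certainty.
```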

On the other hand, this doesn’t mean that humanists need to postpone quantitative research until we know how to define long-debated concepts. I’m now pretty skeptical about the coherence of this word genre, for instance, but it’s a skepticism I reached precisely by attempting to iron out details in a quantitative model. Questions about accuracy can prompt deeper conceptual questions, which reframe questions of accuracy, in a virtuous cycle. The important thing, I think, is not to let yourself stall out on the “accuracy” part of the cycle: it offers a tempting illusion of perfectibility, but that’s not actually our goal.

Postscript: Scott Weingart conveys the point I’m trying to make in a nicely compressed way by saying that it flips the conventional worry that the mere act of quantification will produce unearned trust. In academia, the problem is more often the inverse: we’re so strongly motivated to criticize numbers that we forget to be skeptical about everything else.

Categories
disciplinary history interpretive theory machine learning

Interesting times for literary theory.

A couple of weeks ago, after reading abstracts from DH2013, I said that the take-away for me was that “literary theory is about to get interesting again” – subtweeting the course of history in a way that I guess I ought to explain.

A 1915 book by Chicago’s “Professor of Literary Theory.”

In the twentieth century, “literary theory” was often a name for the sparks that flew when literary scholars pushed back against challenges from social science. Theory became part of the academic study of literature around 1900, when the comparative study of folklore seemed to reveal coherent patterns in national literatures that scholars had previously treated separately. Schools like the University of Chicago hired “Professors of Literary Theory” to explore the controversial possibility of generalization.* Later in the century, structural linguistics posed an analogous challenge, claiming to glimpse an organizing pattern in language that literary scholars sought to appropriate and/or deconstruct. Once again, sparks flew.

I think literary scholars are about to face a similarly productive challenge from the discipline of machine learning — a subfield of computer science that studies learning as a problem of generalization from limited evidence. The discipline has made practical contributions to commercial IT, but it’s an epistemological method founded on statistics more than it is a collection of specific tools, and it tends to be intellectually adventurous: lately, researchers are trying to model concepts like “character” (pdf) and “gender,” citing Judith Butler in the process (pdf).

At DH2013 and elsewhere, I see promising signs that literary scholars are gearing up to reply. In some cases we’re applying methods of machine learning to new problems; in some cases we’re borrowing the discipline’s broader underlying concepts (e.g. the notion of a “generative model”); in some cases we’re grappling skeptically with its premises. (There are also, of course, significant collaborations between scholars in both fields.)

This could be the beginning of a beautiful friendship. I realize a marriage between machine learning and literary theory sounds implausible: people who enjoy one of these things are pretty likely to believe the other is fraudulent and evil.** But after reading through a couple of ML textbooks,*** I’m convinced that literary theorists and computer scientists wrestle with similar problems, in ways that are at least loosely congruent. Neither field is interested in the mere accumulation of data; both are interested in understanding the way we think and the kinds of patterns we recognize in language. Both fields are interested in problems that lack a single correct answer, and have to be mapped in shades of gray (ML calls these shades “probability”). Both disciplines are preoccupied with the danger of overgeneralization (literary theorists call this “essentialism”; computer scientists call it “overfitting”). Instead of saying “every interpretation is based on some previous assumption,” computer scientists say “every model depends on some prior probability,” but there’s really a similar kind of self-scrutiny involved.
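
For readers who haven’t met the term, here is a minimal sketch of “overfitting” in Python (numpy only; the data are invented). A model flexible enough to memorize every accident of its training examples fits those examples beautifully but generalizes worse than a simpler one, which is roughly the failure the self-scrutiny described above is meant to catch.

```python
import numpy as np

# Invented data: a simple linear trend plus noise.
rng = np.random.default_rng(42)
x_train = np.linspace(0, 1, 12)
x_test = np.linspace(0, 1, 100)

def true_curve(x):
    return 2 * x + 0.5

y_train = true_curve(x_train) + rng.normal(scale=0.2, size=x_train.size)
y_test = true_curve(x_test) + rng.normal(scale=0.2, size=x_test.size)

def fit_and_score(degree):
    """Fit a polynomial of the given degree; report error on old and new data."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_err, test_err

for degree in (1, 11):
    train_err, test_err = fit_and_score(degree)
    print(f"degree {degree:2d}: train error {train_err:.3f}, test error {test_err:.3f}")
# The degree-11 polynomial passes through the training points almost exactly,
# but it has memorized their accidents and usually does worse on held-out
# data than the straight line does.
```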

It’s already clear that machine learning algorithms (like topic modeling) can be useful tools for humanists. But I think I glimpse an even more productive conversation taking shape, where instead of borrowing fully-formed “tools,” humanists borrow the statistical language of ML to think rigorously about different kinds of uncertainty, and return the favor by exposing the discipline to boundary cases that challenge its methods.

Won’t quantitative models of phenomena like plot and genre simplify literature by flattening out individual variation? Sure. But the same thing could be said about Freud and Lévi-Strauss. When scientists (or social scientists) write about literature they tend to produce models that literary scholars find overly general. Which doesn’t prevent those models from advancing theoretical reflection on literature! I think humanists, conversely, can warn scientists away from blind alleys by reminding them that concepts like “gender” and “genre” are historically unstable. If you assume words like that have a single meaning, you’re already overfitting your model.

Of course, if literary theory and computer science do have a conversation, a large part of the conversation is going to be a meta-debate about what the conversation can or can’t achieve. And perhaps, in the end, there will be limits to the congruence of these disciplines. Alan Liu’s recent essay in PMLA pushes against the notion that learning algorithms can be analogous to human interpretation, suggesting that statistical models become meaningful only through the inclusion of human “seed concepts.” I’m not certain how deep this particular disagreement goes, because I think machine learning researchers would actually agree with Liu that statistical modeling never starts from a tabula rasa. Even “unsupervised” algorithms have priors. More importantly, human beings have to decide what kind of model is appropriate for a given problem: machine learning aims to extend our leverage over large volumes of data, not to take us out of the hermeneutic circle altogether.

But as Liu’s essay demonstrates, this is going to be a lively, deeply theorized conversation even where it turns out that literary theory and computer science have fundamental differences. These disciplines are clearly thinking about similar questions: Liu is right to recognize that unsupervised learning, for instance, raises hermeneutic questions of a kind that are familiar to literary theorists. If our disciplines really approach similar questions in incompatible ways, it will be a matter of some importance to understand why.

* <plug> For more on “literary theory” in the early twentieth century, see the fourth chapter of Why Literary Periods Mattered: Historical Contrast and the Prestige of English Studies (2013, hot off the press). The book has a lovely cover, but unfortunately has nothing to do with machine learning. </plug>

** This post grows out of a conversation I had with Eleanor Courtemanche, in which I tried to convince her that machine learning doesn’t just reproduce the biases you bring to it.

*** Practically, I usually rely on Data Mining: Practical Machine Learning Tools and Techniques (Ian Witten, Eibe Frank, Mark Hall), but to understand the deeper logic of the field I’ve been reading Machine Learning: A Probabilistic Perspective (Kevin P. Murphy). Literary theorists may appreciate Murphy’s remark that wealth has a long-tailed distribution, “especially in plutocracies such as the USA” (43).

PS later that afternoon: Belatedly realize I didn’t say anything about the most controversial word in my original tweet: “literary theory is about to get interesting again.” I suppose I tacitly distinguish literary theory (which has been a little sleepy lately, imo) from theory-sans-adjective (which has been vigorous, although hard to define). But now I’m getting into a distinction that’s much too slippery for a short blog post.

Categories
methodology

On not trusting people who promise “to use their powers for good.”

Data mining is troubling for some of the same reasons that social science in general is troubling. It suggests that our actions are legible from a perspective we don’t immediately possess, and reveal things we haven’t consciously chosen to reveal. This asymmetry of knowledge is unsettling even when posed abstractly as a question of privacy. It becomes more concretely worrisome when power is added to the equation. Kieran Healy has written a timely blog post showing how the network analysis that allows us to better understand Boston in the 1770s could also be used as an instrument of social control. The NSA’s programs of secret surveillance are Healy’s immediate target, but it’s not difficult to imagine that corporate data mining could be used in equally troubling ways.

Right now, for reasons of copyright law, humanists mostly mine data about the dead. But if we start teaching students how to do this, it’s very likely that some of them will end up working in corporations or in the government. So it’s reasonable to ask how we propose to deal with the political questions these methods raise.

My own view is that we should resist the temptation to say anything reassuring, because professional expertise can’t actually resolve the underlying political problem. Any reassurance academics might offer will be deceptive.

The classic form of this deception is familiar from the opening scenes of a monster movie. “Relax! I can assure you that the serum I have developed will only be used for good.”

Poster from the 1880s, courtesy Wikimedia commons.
Of course, something Goes Horribly Wrong. But since monster movies aren’t usually made about humanists, we may not recognize ourselves in this picture. We don’t usually “promise to use our powers for good”; we strike a different tone.

For instance: “I admit that in their current form, these methods are problematic. They have the potential to reduce people to metadata in a way that would be complicit with state and corporate power. But we can’t un-invent computers or statistical analysis. So I think humanists need to be actively involved in these emerging discourses as cultural critics. We must apply our humanistic values to create a theoretical framework that will ensure new forms of knowledge get used in cautious, humane, skeptical ways.”

I suspect some version of that statement will be very popular among humanists. It strikes a tone we’re comfortable with, and it implies that there’s an urgent need for our talents. And in fact, there’s nothing wrong with articulating a critical, humanistic perspective on data mining. It’s worth a try.

But if you back up far enough — far enough that you’re standing outside the academy altogether — humanists’ claims about the restraining value of cultural critique sound a lot like “I promise only to use my powers for good.” The naive scientist says “trust me; my professional integrity will ensure that this gets used well.” The naive humanist says “trust me; my powers of skeptical critique will ensure that this gets used well.” I wouldn’t advise the public to trust either of them.

I don’t have a solution to offer, either. Just about everything human beings have invented — from long pointy sticks to mathematics to cultural critique — can be used badly. It’s entirely possible that we could screw things up in a major way, and end up in an authoritarian surveillance state. Mike Konczal suggests we’re already there. I think history has some useful guidance to offer, but ultimately, “making sure we don’t screw this up” is not a problem that can be solved by any form of professional expertise. It’s a political problem — which is to say, it’s up to all of us to solve it.

The case of Edward Snowden may be worth a moment’s thought here. I’m not in a position to decide whether he acted rightly. We don’t have all the facts yet, and even when we have them, it may turn out to be a nasty moral problem without clear answers. What is clear is that Snowden was grappling with exactly the kinds of political questions data mining will raise. He had to ask himself, not just whether the knowledge produced by the NSA was being abused today, but whether it was a kind of knowledge that might structurally invite abuse over a longer historical timeframe. To think that question through you have to know something about the ways societies can change; you have to imagine the perspectives of people outside your immediate environment, and you have to have some skepticism about the distorting effects of your own personal interest.

These are exactly the kinds of reflection that I hope the humanities foster; they have a political value that reaches well beyond data mining in particular. But Snowden’s case is especially instructive because he’s one of the 70% of Americans who don’t have a bachelor’s degree. Wherever he learned to think this way, it wasn’t from a college course in the humanities. Instead he seems to have relied on a vernacular political tradition that told him certain questions ought to be decided by “the public,” and not delegated to professional experts.

Again, I don’t know whether Snowden acted rightly. But in general, I think traditions of democratic governance are a more effective brake on abuses of knowledge than any code of professional ethics. In fact, the notion of “professional ethics” can be a bit counter-productive here since it implies that certain decisions have to be restricted to people with an appropriate sort of training or cultivation. (See Timothy Burke’s related reflections on “the covert imagination.”)

I’m not suggesting that we shouldn’t criticize abuses of statistical knowledge; on the contrary, that’s an important topic, and I expect that many good things will be written about it both by humanists and by statisticians. What I’m saying is that we shouldn’t imagine that our political responsibilities on this topic can ever be subsumed in or delegated to our professional identities. The tension between authoritarian and democratic uses of social knowledge is not a problem that can be resolved by a more chastened or enlightened methodology, or by any form of professional expertise. It requires concrete political action — which is to say, it has to be decided by all of us.

Categories
problems of scale

Against (talking about) “big data.”

Is big data the future of X? Yes, absolutely, for all X. No, forget about big data: small data is the real revolution! No, wait. Forget about big and small — what matters is long data.

Conversation about “big data” has become a hilarious game of buzzword bingo, aggravated by one of the great strengths of social media — the way conversations in one industry or field seep into another. I’ve seen humanists retweet an article by a data scientist criticizing “big data,” only to discover a week later that its author defines “small data” as anything less than a terabyte. Since the projects that humanists would call “big” usually involve less than a tenth of a terabyte, it turns out that our brutal gigantism is actually artisanal and twee.

The discussion is incoherent, but human beings like discussion, and are reluctant to abandon a lively one just because it makes no sense. One popular way to save this conversation is to propose that the “big” in “big data” may be a purely relative term. It’s “whatever is big for you.” In other words, perhaps we’re discussing a generalized expansion of scale, across all scales? For Google, “big data” might mean moving from petabytes to exabytes. For a biologist, it might mean moving from gigabytes to terabytes. For a humanist, it might mean any use of quantitative methods at all.

This solution is rhetorically appealing, but still incoherent. The problem isn’t just that we’re talking about different sizes of data. It’s that the concept of “big data” conflates trends located in different social contexts, trends that raise fundamentally different questions.

To sort things out a little, let me name a few of the different contexts involved:

1) Big IT companies are simply confronting new logistical problems. E.g., if you’re wrangling a petabyte or more, it no longer makes sense to move the data around. Instead you want to clone your algorithm and send it to the (various) machines where the data already lives.

2) But this technical sense of the word shades imperceptibly into another sense where it’s really a name for new business opportunities. The fact that commerce is now digital means that companies can get a new stream of information about consumers. This sort of market research may or may not actually require managing “big data” in sense (1). A widely-cited argument from Microsoft Research suggests that most applications of this kind involve less than 14GB and could fit into memory on a single machine.

3) Interest in these business opportunities has raised the profile of a loosely-defined field called “data science,” which might include machine learning, data mining, information retrieval, statistics, and software engineering, as well as aspects of social-scientific and humanistic analysis. When The New York Times writes that a Yale researcher has “used Big Data” to reveal X — with creepy capitalization — they’re not usually making a claim about the size of the dataset at all. They mean that some combination of tools from this toolkit was involved.

4) Social media produces new opportunities not only for corporations, but for social scientists, who now have access to a huge dataset of interactions between real, live, dubiously representative people. When academics talk about “big data,” they’re most often discussing the promise and peril of this research. Jean Burgess and Axel Bruns have focused explicitly on the challenges of research using Twitter, as have Melissa Terras, Shirley Williams, and Claire Warwick.

5) Some prominent voices (e.g., the editor-in-chief of Wired) have argued that the availability of data makes explicit theory-building less important. Most academics I know are at least slightly skeptical. The best case for this thesis might be something like machine translation, where a brute-force approach based on a big corpus of examples turns out to be more efficient than a painstakingly crafted linguistic model. Clement Levallois, Stephanie Steinmetz, and Paul Wouters have reflected thoughtfully on the implications for social science.

6) In a development that may or may not have anything to do with senses 1-5, quantitative methods have started to seem less ridiculous to humanists. Quantitative research has a long history in the humanities, from ARTFL to the Annales school to nineteenth-century philology. But it has never occupied center stage — and still doesn’t, although it is now considered worthy of debate. Since humanists usually still work with small numbers of examples, any study with n > 50 is in danger of being described as an example of “big data.”

These are six profoundly different issues. I don’t mean to deny that they’re connected: contemporaneous trends are almost always connected somehow. The emergence of the Internet is probably a causal factor in everything described above.

But we’re still talking about developments that are very different — not just because they involve different scales, but because they’re grounded in different institutions and ideas. I can understand why journalists are tempted to lump all six together with a buzzword: buzz is something that journalists can’t afford to ignore. But academics should resist taking the bait: you can’t make a cogent argument about a buzzword.

I think it’s particularly a mistake to assume that interest in scale is associated with optimism about the value of quantitative analysis. That seems to be the assumption driving a lot of debate about this buzzword, but it doesn’t have to be true at all.

To take an example close to my heart: the reason I don’t try to mine small datasets is that I’m actually very skeptical about the humanistic value of quantification. Until we get full-blown AI, I doubt that computers will add much to our interpretation of one, or five, or twenty texts. Amid the boosterism surrounding “big data,” people tend to understand this hesitation as a devaluation of something called (strangely) “small data.” But the issue is really the reverse: the interpretive problems in individual works are interesting and difficult, and I don’t think digital technology provides enough leverage to crack them. In the humanities, numbers help mainly with simple problems that happen to be too large to fit in human memory.

To make a long story short: “big data” is not an imprecise-but-necessary term. It’s a journalistic buzzword with a genuinely harmful kind of incoherence. I personally avoid it, and I think even journalists should proceed with caution.

Categories
fiction

A new approach to the history of character?

In Macroanalysis, Matt Jockers points out that computational stylistics has found it hard to grapple with “the aspects of writing that readers care most deeply about, namely plot, character, and theme” (118). He then proceeds to use topic modeling to pretty thoroughly anatomize theme in the nineteenth-century novel. One down, I guess, two to go!

But plot and character are probably harder than theme; it’s not yet clear how we would trace those patterns in thousands of volumes. So I think it may be worth flagging a very promising article by David Bamman, Brendan O’Connor, and Noah A. Smith. Computer scientists don’t often develop a new methodology that could seriously enrich criticism of literature and film. But this one deserves a look. (Hat tip to Lynn Cherny, by the way, for this lead.)

The central insight in the article is that character can be modeled grammatically. If you can use natural language processing to parse sentences, you should be able to identify what’s being said about a given character. The authors cleverly sort “what’s being said” into three questions: what does the character do, what do they suffer or undergo, and what qualities are attributed to them? The authors accordingly model character types (or “personas”) as a set of three distributions over these different domains. For instance, the ZOMBIE persona might do a lot of “eating” and “killing,” get “killed” in turn, and find himself described as “dead.”

The authors try to identify character types of this kind in a collection of 42,306 movie plot summaries extracted from Wikipedia. The model they use is a generative one, which entails assumptions that literary critics would call “structuralist.” Movies in a given genre have a tendency to rely on certain recurring character types. Those character types in turn “generate” the specific characters in a given story, which in turn generate the actions and attributes described in the plot summary.

Using this model, they reason inward from both ends of the process. On the one hand, we know the genres that particular movies belong to. On the other hand, we can see that certain actions and attributes tend to recur together in plot summaries. Can we infer the missing link in this process — the latent character types (“personas”) that mediate the connection from genre to action?

It’s a very thoughtful model, both mathematically and critically. Does it work? Different disciplines will judge success in different ways. Computer scientists tend to want to validate a model against some kind of ground truth; in this case they test it against character patterns described by fans on TV Tropes. Film critics may be less interested in validating the model than in seeing whether it tells them anything new about character. And I think the model may actually have some new things to reveal; among other things, it suggests that the vocabulary used to describe character is strongly coded by genre. In certain genres, characters “flirt,” in others, they “switch” or “are switched.” In some genres, characters merely “defeat” each other; in other genres, they “decapitate” or “are decapitated”!

Since an association with genre is built into the generative assumptions that define the article’s model of character, this might be a predetermined result. But it also raises a hugely interesting question, and there’s lots of room for experimentation here. If the authors’ model of character is too structuralist for your taste, you’re free to sketch a different one and give it a try! Or, if you’re skeptical about our ability to fully “model” character, you could refuse to frame a generative model at all, and just use clustering algorithms in an ad hoc exploratory way to find clues.
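
To make the “ad hoc exploratory” option concrete, here is a minimal sketch in Python (using scikit-learn; the characters, verbs, and counts are all invented). It is not the authors’ generative model: it simply represents each character by what they do, what is done to them, and how they are described, and then clusters those vectors into rough persona-like groups.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.cluster import KMeans

# Invented toy data: each character is a bag of counts over the verbs they
# govern (agent), the verbs they undergo (patient), and the adjectives
# attributed to them. Role prefixes keep the three domains distinct.
characters = {
    "zombie_1":  {"agent:eat": 4, "agent:kill": 3, "patient:kill": 2, "attr:dead": 5},
    "zombie_2":  {"agent:eat": 2, "agent:attack": 3, "patient:shoot": 2, "attr:dead": 3},
    "ingenue_1": {"agent:flirt": 3, "patient:rescue": 2, "attr:young": 4, "attr:charming": 2},
    "ingenue_2": {"agent:flirt": 2, "agent:sing": 1, "patient:court": 3, "attr:charming": 3},
}

vectorizer = DictVectorizer(sparse=False)
X = vectorizer.fit_transform(characters.values())

# Normalize so prolific characters don't dominate, then cluster.
X_normalized = X / X.sum(axis=1, keepdims=True)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_normalized)

for name, label in zip(characters, labels):
    print(f"{name}: persona {label}")
# With data this clean, the zombies and the ingenues fall into separate
# clusters; on real plot summaries the interest lies in which groupings
# emerge that we didn't already expect.
```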

Critics will probably also cavil about the dataset (which the authors have generously made available). Do Wikipedia plot summaries tell us about recurring character patterns in film, or do they tell us about the character patterns that are most readily recognized by editors of Wikipedia?

But I think it would be a mistake to cavil. When computer scientists hand you a new tool, the question to ask is not, “Have they used it yet to write innovative criticism?” The question to ask is, “Could we use this?” And clearly, we could.

The approach embodied in this article could be enormously valuable: it could help distant reading move beyond broad stylistic questions and start to grapple with the explicit social content of fiction (and for that matter, nonfiction, which may also rely on implicit schemas of character, as the authors shrewdly point out). Ideally, we would not only map the assumptions about character that typify a given period, but describe how those patterns have changed across time.

Making that work will not be simple: as always, the real problem is the messiness of the data. Applying this technique to actual fictive texts will be a lot harder than applying it to a plot summary. Character names are often left implicit. Many different voices speak; they’re not all equally reliable. And so on.

But the Wordseer Project at Berkeley has begun to address some of these problems. Also, it’s possible that the solution is to scale up instead of sweating the details of coreference resolution: an error rate of 20 or 30% might not matter very much, if you’re looking at strongly marked patterns in a corpus of 40,000 novels.

In any case, this seems to me an exciting lead, worthy of further exploration.

Postscript: Just to illustrate some of the questions that come up: How gendered are character types? The article by Bamman et al. explicitly models gender as a variable, but the types it ends up identifying are less gender-segregated than I might expect. The heroes and heroines of romantic comedy, for instance, seem to be described in similar ways. Would this also be true in nineteenth-century fiction?

Categories
impressionistic criticism

On trolling.

Does our fixation on the character of “the troll” obscure a deeper problem — that the Internet allows us to continuously troll ourselves?

"Troll," by Jolande RM, CC-BY-NC-ND.
“Troll,” by Jolande RM, CC-BY-NC-ND.
Since trolls monopolize every discussion they’re involved in, it should come as no surprise that reflection on trolling itself tends to be preoccupied by the persona of the troll. Wikipedia, for instance, discusses trolling only as a subtopic in its article on “internet trolls.” This sounds straightforward enough: surely, trolling means behaving like a troll. But a more interesting question opens up if we recognize that the verb can float free of the noun — that trolling pervades contemporary discourse, and is performed by everyone.

After all, why does the New York Times write about real estate in the Hamptons, for an audience that mostly can’t afford it? Why does The Atlantic scour every corner of society for trends that prevent professional women from achieving work/life balance? Why do publications for an audience that has already entered or finished grad school run articles advising them not to go to grad school?

They’re all trolling us.

“Wait,” you say. “The way you’re using the word, trolling is just another name for targeted journalistic provocation.”

Trolling may have been perfected by journalists who hold their audience captive in a filter bubble, but trolling is older than journalism. As far as I can tell, Socrates was the first person to practice it. “Why hello there, Gorgias. I hear you’re a rhetorician. By the way, I’ve always wondered, what exactly is rhetoric?”

"Socrates," photo by Sebastià Giralt, CC-BY-NC-SA
“Socrates,” photo by Sebastià Giralt, CC-BY-NC-SA
In fact, Socrates may have been a troll in the noun sense as well, because he clearly enjoyed tormenting interlocutors. But that’s ad hominem and beside the point. I call Socratic discourse “trolling,” not because it was malicious, but because it was in principle interminable. When you first sat down with Socrates, you may have thought “I’m just going to answer this one question and then go buy some olives.” But the first question never gets answered. It always leads on to deeper puzzles, and although you may finally give up and leave, the discourse will be taken up tomorrow by some other victim.

Journalism is, similarly, designed to be interminable. There’s a thin pretense that you’re familiarizing yourself with world events in order to become an informed citizen, but if you actually stopped watching once you had enough information to act, cable news wouldn’t make money.

So I propose to define trolling, generally, as a discourse that is structurally incapable of reaching the conclusion it promises. It seems to be about some determinate object, but either that object endlessly recedes as you approach it, or the rules of the discourse guarantee that other topics can be substituted for the original one, so that a conclusion is never reached.

The Internet is trolling, elevated to Hegelian World Spirit. It’s easy to imagine that people lurk on comment threads denying climate change with endlessly shifting rationales because they are personally insincere, or because online anonymity creates a cool shady place where they can multiply. But in a deeper sense trolls are merely incarnating the structural logic of the Internet. On the Internet, discourse can continue endlessly, unconfined by ordinary social limits. On the Internet, there’s always a new interlocutor — and conversely, there’s always a new provocation, guaranteed to play on your most urgent anxieties, because you designed the filter that selected it yourself.

Of course, once we define trolling this broadly, it becomes nearly useless as a normative concept. It’s hard to locate a line of division between this sort of trolling and legitimate critical reflection. Which will be frustrating, unless you’re a post-structuralist or a troll.

Postscript: The italicized subhed was added on April 22, and wording was changed in minor ways to improve clarity.

Categories
20c sociology of literature

The long history of humanistic reaction to sociology.

N+1’s recent editorial on the sociology of taste is worth reading. Whatever it gets wrong, it’s probably right about the real source of tension in the humanities* right now.

People spend a lot of time arguing about the disruptive effects of technology. But if the humanities were challenged primarily by online delivery of recorded lectures, I would sleep very well at night.

The challenge humanists are confronting springs from social rather than technological change. And n+1 is right that part of the problem involves cynicism about the model of culture that justified the study of literature and other arts in the twentieth century. For much of that century, humanists felt comfortable claiming that their disciplines conveyed a kind of cultivation that transcended mere specialized learning. You learned about literary form not because it was in itself useful, but because it transformed you in a way that gave you full possession of a collective human legacy. I have to admit that the sociology of culture has made it harder to write sentences like that last one with a straight face. “Transformation” and “possession” are too obviously metaphors for cultural distinction.

John Guillory, Cultural Capital, Chicago, 1993.
This isn’t to say that Pierre Bourdieu and John Guillory are personally responsible for our predicament. I remember reading Guillory in 1993, and Cultural Capital didn’t come as a great shock. Rather, it seemed to explain, more candidly than usual, a state of imperial unclothedness that sidelong glances had already led most of us to privately suspect.

The n+1 editorial seems weakest when it tries to inflate this recent dilemma for humanists into a broader crisis for left politics or individual agency as such. If social theory necessarily sapped individuals’ will to action, we would be in very hot water indeed! We’d have to avoid reading Marx, as well as Bourdieu. But social analysis can of course coexist with a commitment to social change, and it’s not clear that the sociology of culture has done anything to undermine that commitment. The solidarity of middle and working classes against oligarchic power may even be in better shape today than it was in 1993.

That’s a bit beside the point, however, because n+1 doesn’t seem primarily interested in politics as such. They cite a few dubiously representative examples of contemporary(ish) political(ish) debate (e.g., David Brooks on bobos). But their heart seems to be in the academy, and their real concern appears to be that sociology is undermining academic humanists’ ability to defend their own institutions forcefully, untroubled by any doubt that those institutions merely reproduce cultural distinction. At least that’s what I infer when the editors write that “the spokespeople most effectively diminished by Bourdieu’s influence turn out to be those already in the precarious position of having to articulate and transmit a language of aesthetic experience that could remain meaningful outside either a regime of status or a regime of productivity.”

But here it seems to me that the editors are conflating two conversations. On the one hand, there’s a social and institutional debate about reforming and/or defending specific academic disciplines. On the other, there’s an abstract debate about the tension between social analysis and “aesthetic experience.” The rationale for treating them as the same seems weak.

Bowie, Heroes, 45 rpm, photo by Affendaddy. CC-BY-NC-SA.
For after all, aesthetic appreciation is doing just fine these days: the sociology of culture hasn’t even dented it. I don’t find my appreciation of David Bowie, for instance, even slightly compromised when I acknowledge that he concocted a specific kind of glamour out of racial, national, gender, and class identities. A historically specific fabulousness is no less fabulous.

The social specificity of Bowie’s glam does, on the other hand, complicate the kind of rationale I could provide for requiring students to study his music. It makes it harder to invoke him as a vehicle for a general cultivation that transcends mere specialized learning. And that’s why the sociology of culture has posed a problem for the humanities: not that it undermines aesthetic discourse as such, but that it complicates claims about the social necessity of aesthetic cultivation.

This is a real dilemma that I can’t begin to resolve in a blog post; instead I’ll just gesture at recent scholarly conversation on the topic broadly construed, including articles, courses, and presentations by Rachel Buurma, James English, Andrew Goldstone, and Laura Heffernan, among others.

The one detail I’d like to add to that conversation is that the concept of “the humanities” we are now tempted to defend may have been shaped in the early twentieth century by a reaction to social science rather like the reaction n+1 is now articulating.

It has been almost completely erased from the discipline’s collective memory, but between 1895 and 1925, literary studies came rather close to becoming a social science. The University of Chicago had a “Professor of Literary Theory and Interpretation” in 1903 — and what literary theory meant, at the time, was an ambitious project to articulate general laws of historical development for literary form. At other institutions this project was often called “general literatology” or “comparative literature,” but it had little in common with contemporary comparative literature. If you go back and read H. M. Posnett’s Comparative Literature (1886), you discover a project that resembles comparative anthropology more than contemporary literary study.

This period of the discipline’s history is now largely forgotten. English professors remember Matthew Arnold; we remember the New Criticism, and we vaguely remember that there was something dusty called “philology” in between. But we probably don’t remember that Chicago had a Professorship of (anthropologically conceived) “Literary Theory” in 1903.

The reason we don’t remember is that there was intense and effective push-back against the incorporation of social sciences (including history) in the study of arts and letters. The reaction stretched from works like Norman Foerster’s American Scholar (1929) to René Wellek’s widely-reprinted Theory of Literature (1949), and it argued at times rather explicitly that social-scientific approaches to culture would reduce the prestige of the arts by undermining the authority of personal cultivation. (One might almost say that critics of this period foresaw the danger posed by Bourdieu.)

It may not be an accident that this was also the period when a concept of “the humanities” (newly identified as an alternative to social science) became institutionally central in American universities (see Geoffrey Harpham’s Humanities and the Dream of America and my related blog post).

I’ll have a little more to say about the anthropologically-ambitious literary theory of the early twentieth century in a book forthcoming this summer (Why Literary Periods Mattered, Stanford UP). I don’t expect that book will resolve contemporary tension between the humanities and social sciences, but I do want to point out that the debate has been going on for more than a hundred years, and that it has constituted the humanities as a distinct entity at least as much as it has threatened them.

Postscript: For a response to n+1 by an actual sociologist of culture, see whatisthewhat.

* Postscript two days later: I now disagree with one aspect of this post — the way its opening paragraphs talk generally about a challenge “for the humanities.” Actually, it’s not clear to me that Bourdieu et al. have posed a problem for historians. I was describing a challenge “for the study of literature and the arts,” and I ought to have said that specifically. In fact, the tendency to inflate doubts about a specific model of literary culture into a generalized “crisis in the humanities” is part of what’s wrong with the n+1 editorial, and part of what I ought to be taking aim at here. But I guess blogging is about learning in public.

Categories
18c 19c genre comparison historicism interpretive theory methodology representativeness

Distant reading and representativeness.

Digital collections are vastly expanding literary scholars’ field of view: instead of describing a few hundred well-known novels, we can now test our claims against corpora that include tens of thousands of works. But because this expansion of scope has also raised expectations, the question of representativeness is often discussed as if it were a weakness rather than a strength of digital methods. How can we ever produce a corpus complete and balanced enough to represent print culture accurately?

I think the question is wrongly posed, and I’d like to suggest an alternate frame. As I see it, the advantage of digital methods is that we never need to decide on a single model of representation. We can and should keep enlarging digital collections, to make them as inclusive as possible. But no matter how large our collections become, the logic of representation itself will always remain open to debate. For instance, men published more books than women in the eighteenth century. Would a corpus be correctly balanced if it reproduced those disproportions? Or would a better model of representation try to capture the demographic reality that there were roughly as many women as men? There’s something to be said for both views.

To take another example, Scott Weingart has pointed out that there’s a basic tension in text mining between measuring “what was written” and “what was read.” A corpus that contains one record for every title, dated to its year of first publication, would tend to emphasize “what was written.” Measuring “what was read” is harder: a perfect solution would require sales figures, reviews, and other kinds of evidence. But, as a quick stab at the problem, we could certainly measure “what was printed,” by including one record for every volume in a consortium of libraries like HathiTrust. If we do that, a frequently-reprinted work like Robinson Crusoe will carry about a hundred times more weight than a novel printed only once.

We’ll never create a single collection that perfectly balances all these considerations. But fortunately, we don’t need to: there’s nothing to prevent us from framing our inquiry instead as a comparative exploration of many different corpora balanced in different ways.

For instance, if we’re troubled by the difference between “what was written” and “what was read,” we can simply create two different collections — one limited to first editions, the other including reprints and duplicate copies. Neither collection is going to be a perfect mirror of print culture. Counting the volumes of a novel preserved in libraries is not the same thing as counting the number of its readers. But comparing these collections should nevertheless tell us whether the issue of popularity makes much difference for a given research question.

I suspect in many cases we’ll find that it makes little difference. For instance, in tracing the development of literary language, I got interested in the relative prominence of words that entered English before and after the Norman Conquest — and more specifically, in how that ratio changed over time in different genres. My first approach to this problem was based on a collection of 4,275 volumes that were, for the most part, limited to first editions (773 of these were prose fiction).
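
For readers curious what that measurement looks like in practice, here is a minimal sketch in Python. The etymological wordlist and the per-volume texts are placeholders (a real version would need a lexicon dating each word’s entry into English, and the actual volumes); the point is only the shape of the computation: a ratio for each volume, then an aggregate for each year.

```python
import re
from collections import defaultdict

# Hypothetical lexicon: maps a word to True if it entered English before
# the Norman Conquest. A real version would come from etymological data.
pre_conquest = {"stone": True, "shell": True, "king": True,
                "castle": False, "beauty": False, "language": False}

def pre_post_ratio(text):
    """Ratio of pre-Conquest to post-Conquest tokens, among words we can date."""
    tokens = re.findall(r"[a-z]+", text.lower())
    pre = sum(1 for t in tokens if pre_conquest.get(t) is True)
    post = sum(1 for t in tokens if pre_conquest.get(t) is False)
    return pre / post if post else None

# Hypothetical corpus: (year, text) pairs standing in for whole volumes.
volumes = [
    (1810, "the king took the stone from the castle"),
    (1810, "the beauty of language in the castle"),
    (1850, "stone upon stone, a castle of language"),
]

yearly = defaultdict(list)
for year, text in volumes:
    ratio = pre_post_ratio(text)
    if ratio is not None:
        yearly[year].append(ratio)

for year in sorted(yearly):
    ratios = yearly[year]
    print(year, sum(ratios) / len(ratios))   # average per-volume ratio per year
```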

But I recognized that other scholars would have questions about the representativeness of my sample. So I spent the last year wrestling with 470,000 volumes from HathiTrust, correcting their OCR and using classification algorithms to separate fiction from the rest of the collection. This produced a collection with a fundamentally different structure — where a popular work of fiction could be represented by dozens or scores of reprints scattered across the timeline. What difference did that make to the result? (click through to enlarge)

The same question posed to two different collections. 773 hand-selected first editions on the left; on the right, 47,549 volumes, including many translations and reprints. Yearly ratios are plotted rather than individual works.

It made almost no difference. The scatterplots look different, of course, because the hand-selected collection (on the left) is relatively stable in size across the timespan, and has a consistent kind of noisiness, whereas the HathiTrust collection (on the right) gets so huge in the nineteenth century that noise almost disappears. But the trend lines are broadly comparable, although the collections were created in completely different ways and rely on incompatible theories of representation.

I don’t regret the year I spent getting a binocular perspective on this question. Although in this case changing the corpus made little difference to the result, I’m sure there are other questions where it will make a difference. And we’ll want to consider as many different models of representation as we can. I’ve been gathering metadata about gender, for instance, so that I can ask what difference gender makes to a given question; I’d also like to have metadata about the ethnicity and national origin of authors.

But the broader point I want to make here is that people pursuing digital research don’t need to agree on a theory of representation in order to cooperate.

If you’re designing a shared syllabus or co-editing an anthology, I suppose you do need to agree in advance about the kind of representativeness you’re aiming to produce. Space is limited; tradeoffs have to be made; you can only select one set of works.

But in digital research, there’s no reason why we should ever have to make up our minds about a model of representativeness, let alone reach consensus. The number of works we can select for discussion is not limited. So we don’t need to imagine that we’re seeking a correspondence between the reality of the past and any set of works. Instead, we can look at the past from many different angles and ask how it’s transformed by different perspectives. We can look at all the digitized volumes we have — and then at a subset of works that were widely reprinted — and then at another subset of works published in India — and then at three or four works selected as case studies for close reading. These different approaches will produce different pictures of the past, to be sure. But nothing compels us to make a final choice among them.

Categories
methodology

Wordcounts are amazing.

People new to text mining are often disillusioned when they figure out how it’s actually done — which is still, in large part, by counting words. They’re willing to believe that computers have developed some clever strategy for finding patterns in language — but think “surely it’s something better than that?”

Uneasiness with mere word-counting remains strong even in researchers familiar with statistical methods, and it makes us search restlessly for something better than “words” to apply those methods to. Maybe if we stemmed words to make them more like concepts? Or parsed sentences? In my case, this impulse made me spend a lot of time mining two- and three-word phrases. Nothing wrong with any of that: these are all good ideas, but they may not be quite as essential as we imagine.
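
Since “counting words” can sound more mysterious than it is, here is a minimal sketch in plain Python (the two toy sentences are invented for illustration). The whole trick is to tally single words and, if you like, two- and three-word phrases.

```python
import re
from collections import Counter

documents = [
    "The baby sat on the blanket.",
    "The baby laughed at the camera.",
]

def ngrams(tokens, n):
    """All contiguous n-word phrases in a list of tokens."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

counts = Counter()
for doc in documents:
    tokens = re.findall(r"[a-z]+", doc.lower())
    for n in (1, 2, 3):                      # words, two- and three-word phrases
        counts.update(ngrams(tokens, n))

print(counts.most_common(5))
# Every document contributes a bag of countable features "for free" --
# no parsing required -- and those counts are already enough to support
# most of the statistical methods text mining relies on.
```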

I suspect the core problem is that most of us learned language a long time ago, and have forgotten how much leverage it provides. We can still recognize that syntax might be worthy of analysis — because it’s elusive enough to be interesting. But the basic phenomenon of the “word” seems embarrassingly crude.

Baby, 1949, from the Galt Museum, on Creative Commons.
We need to remember that words are actually features of a very, very high-level kind. As a thought experiment, I find it useful to compare text mining to image processing. Take the picture on the right. It’s pretty hard to teach a computer to recognize that this is a picture that contains a face. To recognize that it contains “sitting” and a “baby” would be extraordinarily impressive. And it’s probably, at present, impossible to figure out that it contains a “blanket.”

Working with text is like working with a video where every element of every frame has already been tagged, not only with nouns but with attributes and actions. If we actually had those tags on an actual video collection, I think we’d recognize it as an enormously valuable archive. The opportunities for statistical analysis are obvious! We have trouble recognizing the same opportunities when they present themselves in text, because we take the strengths of text for granted and only notice what gets lost in the analysis. So we ignore all those free tags on every page and ask ourselves, “How will we know which tags are connected? And how will we know which clauses are subjunctive?”

Natural language processing is going to be important for all kinds of reasons — among them, it can eventually tell us which clauses are subjunctive (should we wish to know). But I think it’s a mistake to imagine that text mining is now in a sort of crude infancy, whose real possibilities will only be revealed after NLP matures. Wordcounts are amazing! An enormous amount of our cultural history is already tagged, in a detailed way that is also easy to analyze statistically. That’s not an embarrassingly babyish method: it’s a huge and obvious research opportunity.