The long history of humanistic reaction to sociology.

N+1’s recent editorial on the sociology of taste is worth reading. Whatever it gets wrong, it’s probably right about the real source of tension in the humanities* right now.

People spend a lot of time arguing about the disruptive effects of technology. But if the humanities were challenged primarily by online delivery of recorded lectures, I would sleep very well at night.

The challenge humanists are confronting springs from social rather than technological change. And n+1 is right that part of the problem involves cynicism about the model of culture that justified the study of literature and other arts in the twentieth century. For much of that century, humanists felt comfortable claiming that their disciplines conveyed a kind of cultivation that transcended mere specialized learning. You learned about literary form not because it was in itself useful, but because it transformed you in a way that gave you full possession of a collective human legacy. I have to admit that the sociology of culture has made it harder to write sentences like that last one with a straight face. “Transformation” and “possession” are too obviously metaphors for cultural distinction.

John Guillory, Cultural Capital, Chicago, 1993.

This isn’t to say that Pierre Bourdieu and John Guillory are personally responsible for our predicament. I remember reading Guillory in 1993, and Cultural Capital didn’t come as a great shock. Rather, it seemed to explain, more candidly than usual, a state of imperial unclothedness that sidelong glances had already led most of us to privately suspect.

The n+1 editorial seems weakest when it tries to inflate this recent dilemma for humanists into a broader crisis for left politics or individual agency as such. If social theory necessarily sapped individuals’ will to action, we would be in very hot water indeed! We’d have to avoid reading Marx, as well as Bourdieu. But social analysis can of course coexist with a commitment to social change, and it’s not clear that the sociology of culture has done anything to undermine that commitment. The solidarity of middle and working classes against oligarchic power may even be in better shape today than it was in 1993.

That’s a bit beside the point, however, because n+1 doesn’t seem primarily interested in politics as such. They cite a few dubiously representative examples of contemporary(ish) political(ish) debate (e.g., David Brooks on bobos). But their heart seems to be in the academy, and their real concern appears to be that sociology is undermining academic humanists’ ability to defend their own institutions forcefully, untroubled by any doubt that those institutions merely reproduce cultural distinction. At least that’s what I infer when the editors write that “the spokespeople most effectively diminished by Bourdieu’s influence turn out to be those already in the precarious position of having to articulate and transmit a language of aesthetic experience that could remain meaningful outside either a regime of status or a regime of productivity.”

But here it seems to me that the editors are conflating two conversations. On the one hand, there’s a social and institutional debate about reforming and/or defending specific academic disciplines. On the other, there’s an abstract debate about the tension between social analysis and “aesthetic experience.” The rationale for treating them as the same seems weak.

Bowie, Heroes, 45 rpm, photo by Affendaddy. CC-BY-NC-SA.

For after all, aesthetic appreciation is doing just fine these days: the sociology of culture hasn’t even dented it. I don’t find my appreciation of David Bowie, for instance, even slightly compromised when I acknowledge that he concocted a specific kind of glamour out of racial, national, gender, and class identities. A historically specific fabulousness is no less fabulous.

The social specificity of Bowie’s glam does, on the other hand, complicate the kind of rationale I could provide for requiring students to study his music. It makes it harder to invoke him as a vehicle for a general cultivation that transcends mere specialized learning. And that’s why the sociology of culture has posed a problem for the humanities: not that it undermines aesthetic discourse as such, but that it complicates claims about the social necessity of aesthetic cultivation.

This is a real dilemma that I can’t begin to resolve in a blog post; instead I’ll just gesture at recent scholarly conversation on the topic broadly construed, including articles, courses, and presentations by Rachel Buurma, James English, Andrew Goldstone, and Laura Heffernan, among others.

The one detail I’d like to add to that conversation is that the concept of “the humanities” we are now tempted to defend may have been shaped in the early twentieth century by a reaction to social science rather like the reaction n+1 is now articulating.

It has been almost completely erased from the discipline’s collective memory, but between 1895 and 1925, literary studies came rather close to becoming a social science. The University of Chicago had a “Professor of Literary Theory and Interpretation” in 1903 — and what literary theory meant, at the time, was an ambitious project to articulate general laws of historical development for literary form. At other institutions this project was often called “general literatology” or “comparative literature,” but it had little in common with contemporary comparative literature. If you go back and read H. M. Posnett’s Comparative Literature (1886), you discover a project that resembles comparative anthropology more than contemporary literary study.

This period of the discipline’s history is now largely forgotten. English professors remember Matthew Arnold; we remember the New Criticism, and we vaguely remember that there was something dusty called “philology” in between. But we probably don’t remember that Chicago had a Professorship of (anthropologically conceived) “Literary Theory” in 1903.

The reason we don’t remember is that there was intense and effective push-back against the incorporation of social sciences (including history) in the study of arts and letters. The reaction stretched from works like Norman Foerster’s American Scholar (1929) to René Wellek’s widely-reprinted Theory of Literature (1949), and it argued at times rather explicitly that social-scientific approaches to culture would reduce the prestige of the arts by undermining the authority of personal cultivation. (One might almost say that critics of this period foresaw the danger posed by Bourdieu.)

It may not be an accident that this was also the period when a concept of “the humanities” (newly identified as an alternative to social science) became institutionally central in American universities (see Geoffrey Harpham’s Humanities and the Dream of America and my related blog post).

I’ll have a little more to say about the anthropologically-ambitious literary theory of the early twentieth century in a book forthcoming this summer (Why Literary Periods Mattered, Stanford UP). I don’t expect that book will resolve contemporary tension between the humanities and social sciences, but I do want to point out that the debate has been going on for more than a hundred years, and that it has constituted the humanities as a distinct entity at least as much as it has threatened them.

Postscript: For a response to n+1 by an actual sociologist of culture, see whatisthewhat.

* Postscript two days later: I now disagree with one aspect of this post, namely the way its opening paragraphs talk generally about a challenge “for the humanities.” Actually, it’s not clear to me that Bourdieu et al. have posed a problem for historians. I was describing a challenge “for the study of literature and the arts,” and I ought to have said that specifically. In fact, the tendency to inflate doubts about a specific model of literary culture into a generalized “crisis in the humanities” is part of what’s wrong with the n+1 editorial, and part of what I ought to be taking aim at here. But I guess blogging is about learning in public.

What can topic models of PMLA teach us about the history of literary scholarship?

by Andrew Goldstone and Ted Underwood

Of all our literary-historical narratives it is the history of criticism itself that seems most wedded to a stodgy history-of-ideas approach—narrating change through a succession of stars or contending schools. While scholars like John Guillory and Gerald Graff have produced subtler models of disciplinary history, we could still do more to complicate the narratives that organize our discipline’s understanding of itself.

A browsable network based on Underwood's model of PMLA. Click through, then mouse over or click on individual topics.

The archive of scholarship is also, unlike many twentieth-century archives, digitized and available for “distant reading.” Much of what we need is available through JSTOR’s Data for Research API. So last summer it occurred to a group of us that topic modeling PMLA might provide a new perspective on the history of literary studies. Although Goldstone and Underwood are writing this post, the impetus for the project also came from Natalia Cecire, Brian Croxall, and Roger Whitson, who may do deeper dives into specific aspects of this archive in the near future.

Topic modeling is a technique that automatically identifies groups of words that tend to occur together in a large collection of documents. It was developed about a decade ago by David Blei among others. Underwood has a blog post explaining topic modeling, and you can find a practical introduction to the technique at the Programming Historian. Jonathan Goodwin has explained how it can be applied to the word-frequency data you get from JSTOR.

Obviously, PMLA is not an adequate synecdoche for literary studies. But, as a generalist journal with a long history, it makes a useful test case to assess the value of topic modeling for a history of the discipline.

Goldstone and Underwood each independently produced several different models of PMLA, using different software, stopword lists, and numbers of topics. Our results overlapped in places and diverged in places. But we’ve reached a shared sense that topic modeling can enrich the history of literary scholarship by revealing trends that are presently invisible.

What is a topic?
A “topic model” assigns every word in every document to one of a given number of topics. Every document is modeled as a mixture of topics in different proportions. A topic, in turn, is a distribution of words—a model of how likely given words are to co-occur in a document. The algorithm (called LDA) knows nothing “meta” about the articles (when they were published, say), and it knows nothing about the order of words in a given document.
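To make the two halves of that definition concrete, here is a minimal sketch of how a document's mixture and a topic's word distribution can be read off per-word topic assignments. The documents, words, and topic numbers are invented for illustration, and the sketch only counts assignments that are already given; it does not run LDA itself, and real implementations also smooth these counts with priors.

```python
from collections import Counter, defaultdict

# Hypothetical toy output: each document is a list of (word, topic) pairs,
# i.e. one topic assignment per word, with word order ignored.
assignments = {
    "doc_a": [("structure", 0), ("pattern", 0), ("symmetry", 0), ("che", 1)],
    "doc_b": [("che", 1), ("gli", 1), ("piu", 1), ("structure", 0)],
}

def document_mixtures(assignments):
    """Share of each document's words assigned to each topic."""
    mixtures = {}
    for doc, pairs in assignments.items():
        counts = Counter(topic for _, topic in pairs)
        total = sum(counts.values())
        mixtures[doc] = {topic: n / total for topic, n in counts.items()}
    return mixtures

def topic_word_distributions(assignments):
    """Share of each topic's assigned words accounted for by each word."""
    counts = defaultdict(Counter)
    for pairs in assignments.values():
        for word, topic in pairs:
            counts[topic][word] += 1
    return {
        topic: {word: n / sum(words.values()) for word, n in words.items()}
        for topic, words in counts.items()
    }

mixtures = document_mixtures(assignments)
topics = topic_word_distributions(assignments)
```

In this toy case "doc_a" is three-quarters topic 0, while topic 1 looks like the embedded-Italian topic discussed below: a pattern of co-occurring words, not a theme.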

100 topics from PMLA.
This is a picture of 5940 articles from PMLA, showing the changing presence of each of 100 "topics" in PMLA over time. (Click through to enlarge; a longer list of topic keywords is here.) For example, the most probable words in the topic arbitrarily numbered 59 in the model visualized above are, in descending order:

che gli piu nel lo suo sua sono io delle perche questo quando ogni mio quella loro cosi dei

This is not a “topic” in the sense of a theme or a rhetorical convention. What these words have in common is simply that they’re basic Italian words, which appear together whenever an extended Italian text occurs. And this is the point: a “topic” is neither more nor less than a pattern of co-occurring words.

Nonetheless, a topic like topic 59 does tell us about the history of PMLA. The articles where this topic achieved its highest proportion were:

Antonio Illiano, “Momenti e problemi di critica pirandelliana: L’umorismo, Pirandello e Croce, Pirandello e Tilgher,” PMLA 83 no. 1 (1968): pp. 135-143
Domenico Vittorini, “I Dialogi ad Petrum Histrum di Leonardo Bruni Aretino (Per la Storia del Gusto Nell’Italia del Secolo XV),” PMLA 55 no. 3 (1940): pp. 714-720
Vincent Luciani, “Il Guicciardini E La Spagna,” PMLA 56 no. 4 (1941): pp. 992-1006

And here’s a plot of the changing proportions of this topic over time, showing moving 1-year and 5-year averages:

We see something about PMLA that is worth remembering for the history of criticism, namely, that it has embedded Italian less and less frequently in its language since midcentury. (The model shows that the same thing is true of French and German.)
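The smoothing behind a plot of this kind can be sketched in a few lines. This is an illustration under assumptions, not the plotting code used for the post: the yearly figures are invented, and the choice of a centered window that shrinks at the edges of the series is ours.

```python
def moving_average(values, window):
    """Centered moving average; the window shrinks at the edges of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

# Hypothetical yearly proportions of one topic (not PMLA data).
yearly = [0.04, 0.00, 0.06, 0.02, 0.00, 0.01, 0.00]
one_year = moving_average(yearly, 1)   # identical to the raw yearly series
five_year = moving_average(yearly, 5)  # smoother long-term trend
```

A wider window suppresses the spikes caused by a single long article in Italian, which is why the 5-year line is easier to read as a trend.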

What can topics tell us about the history of theory?
Of course a topic can also be a subject category—modeling PMLA, we have found topics that are primarily “about Beowulf” or “about music.” Or a topic can be a group of words that tend to co-occur because they’re associated with a particular critical approach.

Here, for instance, we have a topic from Underwood’s 150-topic model associated with discussions of pattern and structure in literature. We can characterize it by listing words that occur more commonly in the topic than elsewhere, or by graphing the frequency of the topic over time, or by listing a few articles where it’s especially salient.

Topic 109 from Underwood's model of 150 topics.
At first glance this topic might seem to fit neatly into a familiar story about critical history. We know that there was a mid-twentieth-century critical movement called “structuralism,” and the prominence of “structure” here might suggest that we’re looking at the rise and fall of that movement. In part, perhaps, we are. But the articles where this topic is most prominent are not specifically “structuralist.” In the top four articles, Ferdinand de Saussure, Claude Lévi-Strauss, and Northrop Frye are nowhere in evidence. Instead these articles appeal to general notions of symmetry, or connect literary patterns to Neoplatonism and Renaissance numerology.

By forcing us to attend to concrete linguistic practice, topic modeling gives us a chance to bracket our received assumptions about the connections between concepts. While there is a distinct mid-century vogue for structure, it does not seem strongly associated with the concepts that are supposed to have motivated it (myth, kinship, language, archetype). And it begins in the 1940s, a decade or more before “structuralism” is supposed to have become widespread in literary studies. We might be tempted to characterize the earlier part of this trend as “New Critical interest in formal unity” and the latter part of it as “structuralism.” But the dividing line between those rationales for emphasizing pattern is not evident in critical vocabulary (at least not at this scale of analysis).

This evidence doesn’t necessarily disprove theses about the history of structuralism. Topic modeling might not reveal varying “rationales” for using a word even if those rationales did vary. The strictly linguistic character of this technique is a limitation as well as a strength: it’s not designed to reveal motivation or conflict. But since our histories of criticism are already very intellectual and agonistic, foregrounding the conscious beliefs of contending critical “schools,” topic modeling may offer a useful corrective. This technique can reveal shifts of emphasis that are more gradual and less conscious than the ones we tend to celebrate.

It may even reveal shifts of emphasis of which we were entirely unaware. “Structure” is a familiar critical theme, but what are we to make of this?

Topic 79 from Underwood's 150-topic model.
A fuller list of terms included in this topic would include “character”, “fact,” “choice,” “effect,” and “conflict.” Reading some of the articles where the topic is prominent, it appears that in this topic “point” is rarely the sort of point one makes in an argument. Instead it’s a moment in a literary work (e.g., “at the point where the rain occurs,” in Robert apRoberts 379). Apparently, critics in the 1960s developed a habit of describing literature in terms of problems, questions, and significant moments of action or choice; the habit intensified through the early 1980s and then declined. This habit may not have a name; it may not line up neatly with any recognizable school of thought. But it’s a fact about critical history worth knowing.

Note that this concern with problem-situations is embodied in common words like “way” and “cannot” as well as more legible, abstract terms. Since common words are often difficult to interpret, it can be tempting to exclude them from the modeling process. It’s true that a word like “the” isn’t likely to reveal much. But subtle, interesting rhetorical habits can be encoded in common words. (E.g. “itself” is especially common in late-20c theoretical topics.)

We don’t imagine that this brief blog post has significantly contributed to the history of criticism. But we do want to suggest that topic modeling could be a useful resource for that project. It has the potential to reveal shifts in critical vocabulary that aren’t well described, and that don’t fit our received assumptions about the history of the discipline.

Why browse topics as a network?
The fact that a word is prominent in topic A doesn’t prevent it from also being prominent in topic B. So certain generalizations we might make about an individual topic (for instance, that Italian words decline in frequency after midcentury) will be true only if there’s not some other “Italian” topic out there, picking up where the first one left off.

For that reason, interpreters really need to survey a topic model as a whole, instead of considering single topics in isolation. But how can you browse a whole topic model? We’ve chosen relatively small numbers of topics, but it would not be unreasonable to divide literary scholarship into, say, 500 topics. Information overload becomes a problem.

A browsable image map of 150 topics from PMLA. After you click through you can mouseover (or click) individual topics for more information.

We’ve found network graphs useful here. Click on the image of the network on the right to browse Underwood’s 150-topic model. The size of each node (roughly) indicates the number of words in the topic; color indicates the average date of words. (Blue topics are older; yellow topics are more recent.) Topics are linked to each other if they tend to appear in the same articles. Topics have been labeled with their most salient word—unless that word was already taken for another topic, or seemed misleading. Mousing over a topic reveals a list of words associated with it; with most topics it’s also possible to click through for more information.

The structure of the network makes a loose kind of sense. Topics in French and German form separate networks floating free of the main English structure. Recent topics tend to cluster at the bottom of the page. And at the bottom, historical and pedagogical topics tend to be on the left, while formal, phenomenological, and aesthetic categories tend to be on the right.

But while it’s a little eerie to see patterns like this emerge automatically, we don’t advise readers to take the network structure too seriously. A topic model isn’t a network, and mapping one onto a network can be misleading. For instance, topics that are physically distant from each other in this visualization are not necessarily unrelated. Connections below a certain threshold go unrepresented.

Goldstone's 100-topic model of PMLA; click through to enlarge.

Moreover, as you can see by comparing illustrations in this post, a little fiddling with dials can turn the same data into networks with rather different shapes. It’s probably best to view network visualization as a convenience. It may help readers browse a model by loosely organizing topics—but there can be other equally valid ways to organize the same material.

How did our models differ?
The two models we’ve examined so far in this post differ in several ways at once. They’re based on different spans of PMLA’s print run (1890–1999 and 1924–2006). They were produced with different software. Perhaps most importantly, we chose different numbers of topics (100 and 150).

But the models we’re presenting are only samples. Goldstone and Underwood each produced several models of PMLA, changing one variable at a time, and we have made some closer apples-to-apples comparisons.

Broadly, the conclusion we’ve reached is that there’s both a great deal of fluidity and a great deal of consistency in this process. The algorithm has to estimate parameters that are impossible to calculate exactly. So the results you get will be slightly different every time. If you run the algorithm on the same corpus with the same number of topics, the changes tend to be fairly minor. But if you change the number of topics, you can get results that look substantially different.

On the other hand, to say that two models “look substantially different” isn’t to say that they’re incompatible. A jigsaw puzzle cut into 100 pieces looks different from one with 150 pieces. If you examine them piece by piece, no two pieces are the same—but once you put them together you’re looking at the same picture. In practice, there was a lot of overlap between our models; on the older end of the spectrum you often see a topic like “evidence fact,” while the newer end includes topics that foreground narrative, rhetoric, and gender. Some of the more surprising details turned out to be consistent as well. For instance, you might expect the topic “literary literature” to skew toward the older end of the print run. But in fact this is a relatively recent topic in both of our models, associated with discussion of canonicity. (Perhaps the owl of Minerva flies only at dusk?)

Contrasting models: a short example
While some topics look roughly the same in all of our models, it’s not always possible to identify close correlates of that sort. As you vary the overall number of topics, some topics seem to simply disappear. Where do they go? For example, there is no exact counterpart in Goldstone’s model to that “structure” topic in Underwood’s model. Does that mean it is a figment? Underwood isolated the following article as the most prominent exemplar:

Robert E. Burkhart, The Structure of Wuthering Heights, Letter to the Editor, PMLA 87 no. 1 (1972): 104–5. (Incidentally, JSTOR has miscategorized this as a “full-length article.”)

Goldstone’s model puts roughly half of Burkhart’s comment in three topics:

0.24 topic 38 time experience reality work sense form present point world human process structure concept individual reader meaning order real relationship

0.13 topic 46 novels fiction poe gothic cooper characters richardson romance narrator story novelist reader plot novelists character reade hero heroine drf

0.12 topic 13 point reader question interpretation meaning make reading view sense argument words word problem makes evidence read clear text readers

The other prominent documents in Underwood’s 109 are connected to similar topics in Goldstone’s model. The keywords for Goldstone’s topic 38, the top topic here, immediately suggest an affinity with Underwood’s topic 109. Now compare the time course of Goldstone’s 38 with Underwood’s 109 (the latter is above):

It is reasonable to infer that some portion of the words in Underwood’s “structure” topic are absorbed in Goldstone’s “time experience” topic. But “time experience reality work sense” looks less like vocabulary for describing form (although “form” and “structure” are included in it, further down the list; cf. the top words for all 100 topics), and more like vocabulary for talking about experience in generalized ways—as is also suggested by the titles of some articles in which that topic is substantially present:

“The Vanishing Subject: Empirical Psychology and the Modern Novel”
“Toward a Modern Humanism”
“Wordsworth’s Inscrutable Workmanship and the Emblems of Reality”

This version of the topic is no less “right” or “wrong” than the one in Underwood’s model. They both reveal the same underlying evidence of word use, segmented in different but overlapping ways. Instead of focusing our vision on affinities between “form” and “structure”, Goldstone’s 100-topic model shows a broader connection between the critical vocabulary of form and structure and the keywords of “humanistic” reflection on experience.

The most striking contrast to these postwar themes is provided by a topic which dominates in the prewar period, then gives way before “time experience” takes hold. Here are box plots by ten-year intervals of the proportions of another topic, Goldstone’s topic 40, in PMLA articles:

Underwood’s model shows a similar cluster of topics centering on questions of evidence and textual documentation, which similarly decrease in frequency. The language of PMLA has shown a consistently declining interest in “evidence found fact” in the era of the postwar research university.

So any given topic model of a corpus is not definitive. Each variation in the modeling parameters can produce a new model. But although topic models vary, models of the same corpus remain fundamentally consistent with each other.

Using LDA as evidence
It’s true that a “topic model” is simply a model of how often words occur together in a corpus. But information of that kind has a deeper significance than we might at first assume. A topic model doesn’t just show you what people are writing about (a list of “topics” in our ordinary sense of the word). It can also show you how they’re writing. And that “how” seems to us a strong clue to social affinities—perhaps especially for scholars, who often identify with a methodology or critical vocabulary. To put this another way, topic modeling can identify discourses as well as subject categories and embedded languages. Naturally we also need other kinds of evidence to produce a history of the discipline, including social and institutional evidence that may not be fully manifest in discourse. But the evidence of topic modeling should be taken seriously.

As you change the number of topics (and other parameters), models provide different pictures of the same underlying collection. But this doesn’t mean that topic modeling is an indeterminate process, unreliable as evidence. All of those pictures will be valid. They are taken (so to speak) at different distances, and with different levels of granularity. But they’re all pictures of the same evidence and are by definition compatible. Different models may support different interpretations of the evidence, but not interpretations that absolutely conflict. Instead the multiplicity of models presents us with a familiar choice between “lumping” or “splitting” cultural phenomena—a choice where we have long known that multiple levels of analysis can coexist. This multiplicity of perspective should be understood as a strength rather than a limitation of the technique; it is part of the reason why an analysis using topic modeling can afford a richly detailed picture of an archive like PMLA.

Appendix: How did we actually do this?
The PMLA data obtained from JSTOR was independently processed by Goldstone and Underwood for their different LDA tools. This created some quantitative subtleties that we’ve saved for this appendix to keep this post accessible to a broad audience. If you read closely, you’ll notice that we sometimes talk about the “probability” of a term in a topic, and sometimes about its “salience.” Goldstone used MALLET for topic modeling, whereas Underwood used his own Java implementation of LDA. As a result, we also used slightly different formulas for ranking words within a topic. MALLET reports the raw probability of terms in each topic, whereas Underwood’s code uses a slightly more complex formula for term salience drawn from Blei & Lafferty (2009). In practice, this did not make a huge difference.
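For readers curious about the difference between raw probability and salience, here is a sketch of the term score as we recall it from Blei & Lafferty (2009): a word's probability in a topic is weighted by the log of its ratio to the geometric mean of its probability across all topics. Whether Underwood's code matches this form exactly is an assumption, and the vocabulary and probabilities below are invented.

```python
import math

def term_scores(beta):
    """beta[k][v] is the probability of word v in topic k (all entries > 0).

    Returns score[k][v] = beta[k][v] * log(beta[k][v] / geometric mean of
    beta[j][v] over all topics j).
    """
    n_topics = len(beta)
    n_words = len(beta[0])
    # Log of the geometric mean of each word's probability across topics.
    log_geo_mean = [
        sum(math.log(beta[k][v]) for k in range(n_topics)) / n_topics
        for v in range(n_words)
    ]
    return [
        [beta[k][v] * (math.log(beta[k][v]) - log_geo_mean[v])
         for v in range(n_words)]
        for k in range(n_topics)
    ]

# Hypothetical two-topic model over a three-word vocabulary.
vocab = ["the", "structure", "che"]
beta = [
    [0.60, 0.39, 0.01],  # topic 0
    [0.60, 0.01, 0.39],  # topic 1
]
scores = term_scores(beta)

def top_word(row):
    """Vocabulary item with the highest value in one topic's row."""
    return vocab[max(range(len(vocab)), key=lambda v: row[v])]

by_probability = [top_word(row) for row in beta]
by_score = [top_word(row) for row in scores]
```

Ranked by raw probability, both toy topics are led by "the"; ranked by term score, the word that is equally common everywhere drops to zero and the distinctive words rise to the top. With real models and a stopword list the two rankings usually overlap heavily, which fits our experience that the choice did not make a huge difference.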

MALLET also has a “hyperparameter optimization” option which Goldstone’s 100-topic model above made use of. Before you run screaming, “hyperparameters” are just dials that control how much fuzziness is allowed in a topic’s distribution across words (beta) or across documents (alpha). Allowing alpha to vary allows greater differentiation between the sizes of large topics (often with common words), and smaller (often more specialized) topics. (See “Why Priors Matter,” Wallach, Mimno, and McCallum, 2009.) In any event, Goldstone’s 100-topic model used hyperparameter optimization; Underwood’s 150-topic model did not. A comparison with several other models suggests that the difference between symmetric and asymmetric (optimized) alpha parameters explains much of the difference between their structures when visualized as networks.
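The effect of a symmetric versus an asymmetric alpha can be illustrated by simulating document mixtures directly from a Dirichlet prior. This is only a sketch of the prior, not of MALLET's optimization routine, and the alpha values, topic count, and function names are hypothetical.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one probability vector from a Dirichlet via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def mean_topic_shares(alphas, n_docs=2000, seed=0):
    """Average share of each topic across many simulated documents."""
    rng = random.Random(seed)
    totals = [0.0] * len(alphas)
    for _ in range(n_docs):
        for k, share in enumerate(sample_dirichlet(alphas, rng)):
            totals[k] += share
    return [t / n_docs for t in totals]

# Symmetric alpha: every topic is expected to be about the same size.
symmetric = mean_topic_shares([0.5, 0.5, 0.5, 0.5])

# Asymmetric alpha: the first topic is allowed to be much larger than the rest.
asymmetric = mean_topic_shares([2.0, 0.5, 0.5, 0.5])
```

Under the symmetric prior each of the four topics averages about a quarter of a document. Under the asymmetric one the first topic takes more than half, which is the kind of size difference between large, common-word topics and small, specialized ones that optimization permits.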

Goldstone’s processing scripts are online in a GitHub repository. The same repository includes R code for making the plots from Goldstone’s model. Goldstone would also like to thank Bob Gerdes of Rutgers’s Office of Instructional and Research Technology for support for running MALLET on the university’s server, Ben Schmidt for helpful comments at a THATCamp Theory session, and Jon Goodwin for discussion and his excellent blog posts on topic-modeling JSTOR data.

Underwood’s network graphs were produced by measuring Pearson correlations between topic distributions (across documents) and then selecting the strongest correlations as network edges using an algorithm Underwood has described previously. That data structure was sent to Gephi. Underwood’s Java implementation of LDA, as well as his PMLA model, and code for translating a model into a network, are on GitHub, although at this point he can’t promise a plug-and-play workflow. Underwood would like to thank Matt Jockers for convincing him to try topic modeling (see Matt’s impressive, detailed model of the nineteenth-century novel) and Michael Simeone for convincing him to try force-directed network graphs. David Mimno kindly answered some questions about the innards of MALLET.
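The general idea of turning topic correlations into network edges can be sketched as follows. The edge-selection rule here (link each topic to its single most correlated neighbour) is a stand-in chosen for brevity, not the algorithm Underwood actually used, and the topic names and proportions are invented.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)

def strongest_edges(topic_docs, per_topic=1):
    """Link each topic to its `per_topic` most correlated neighbours (hypothetical rule)."""
    edges = set()
    for a in topic_docs:
        ranked = sorted(
            (b for b in topic_docs if b != a),
            key=lambda b: pearson(topic_docs[a], topic_docs[b]),
            reverse=True,
        )
        for b in ranked[:per_topic]:
            edges.add(tuple(sorted((a, b))))
    return sorted(edges)

# Hypothetical proportion of each topic in four documents.
topic_docs = {
    "structure": [0.30, 0.25, 0.02, 0.01],
    "form":      [0.28, 0.20, 0.05, 0.02],
    "evidence":  [0.01, 0.03, 0.30, 0.35],
    "text":      [0.02, 0.05, 0.25, 0.30],
}
edges = strongest_edges(topic_docs)
```

Any rule of this kind discards the weaker correlations, which is why topics that sit far apart in the visualization are not necessarily unrelated.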

[Cross-posted: Arcade (to appear).]

[Edit (AG) 12/12/16: 10×10 grid image now with topics in numerical order. Original version still available: overview.png.]

More reflections on the apparent “structuralism” in the Google dataset

In my last post, I argued that groups of related terms that express basic sensory oppositions (wet/dry, hot/cold, red/green/blue/yellow) have a tendency to correlate strongly with each other in the Google dataset. When “wet” goes up in frequency, “dry” tends to go up as well, as if the whole sensory category were somehow becoming more prominent in writing. Primary colors rise and fall as a group as well.

blue, red, green, yellow, in English fiction, 1800-2000

In that post I focused on a group of categories (temperature, color, and wetness) that all seem to become more prominent from 1820 to 1940, and then start to decline. The pattern was so consistent that you might start to wonder whether it’s an artefact of some flaw in the data. Does every adjective go up from 1820 to 1940? Not at all. A lot of them (say, “melancholy”) peak roughly where the ones I’ve been graphing hit a minimum. And it’s possible to find many paired oppositions that correlate like hot/cold or wet/dry, but peak at a different point.

delicate, rough, in English fiction, from 1800 to 2000

“Delicate” and “rough” correlate loosely (with an interesting lag), but peak much earlier than words for temperature or color, somewhere between 1880 and 1900. Now, it’s fair to question whether “delicate” and “rough” are actually antonyms. Perhaps the opposite of “rough” is actually “smooth”? As we get away from the simplest sensory categories there’s going to be more ambiguity than there was with “wet” and “dry,” and the neat structural parallels I traced in my previous post are going to be harder to find. I think it’s possible, however, that we’ll be able to discover some interesting patterns simply by paying attention to the things that do in practice correlate with each other at different times. The history of diction seems to be characterized by a sequence of long “waves” where different conceptual categories gradually rise to prominence, and then decline.
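One way to make "correlates loosely, with a lag" more precise is to compute the correlation between two frequency series at several offsets and see where it peaks. The sketch below is an assumption about how one might do that, not a description of how these graphs were produced, and the two series are invented and are not drawn from the Google dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)

def lagged_correlation(xs, ys, lag):
    """Correlate xs[t] with ys[t + lag]; a positive lag means ys trails xs."""
    if lag >= 0:
        a, b = xs[:len(xs) - lag], ys[lag:]
    else:
        a, b = xs[-lag:], ys[:len(ys) + lag]
    return pearson(a, b)

def best_lag(xs, ys, max_lag):
    """Offset (in time steps) at which the two series correlate most strongly."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda lag: lagged_correlation(xs, ys, lag))

# Hypothetical frequencies per decade: the second series repeats the first two steps later.
word_a = [1, 2, 4, 7, 9, 7, 4, 2, 1, 1]
word_b = [1, 1, 1, 2, 4, 7, 9, 7, 4, 2]
lag = best_lag(word_a, word_b, 3)
```

In this toy case the correlation at zero lag is modest, but shifting the second series back by two steps makes the match nearly perfect.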

I should credit mmwm at the blog Beyond Rivalry for the clue that led to my next observation, which is that it’s not just certain sensory adjectives (like hot/cold/cool/warm) that rise to prominence from 1820 to 1940, but also a few nouns loosely related to temperature, like the seasons.

winter, summer, spring, autumn, in English fiction, 1820-2000

I’ve started this graph at 1820 rather than 1800, because the long s/f substitution otherwise creates noise at the very beginning. And I’ve chosen “autumn” rather than “fall” to avoid interference from the verb. But the pattern here is very similar to the pattern I described in my last post: there’s a low around 1820 and a high around 1940. (Looking at the data for fummer and fpring, I suspect that the frequency of all four seasons does increase as you go back before 1820.)

As I factor in some of this evidence, I’m no longer sure it’s adequate to characterize this trend generally as an increase in “concreteness” or “sensory vividness” — although that might be how Ernest Hemingway and D. H. Lawrence themselves would have imagined it. Instead, it may be necessary to describe particular categories that became more prominent in the early 20c (maybe temperature? color?) while others (perhaps delicacy/roughness?) began to decline. Needless to say, this is all extremely tentative; I don’t specialize in modernism, so I’m not going to try to explain what actually happened in the early 20c. We need more context to be confident that these patterns have significance, and I’ll leave the task of explaining their significance to people who know the literature more intimately. I’m just drawing attention to a few interesting patterns, which I hope might provoke speculation.

Finally, I should note that all of the changes I’ve graphed here, and in the last post, were based on the English fiction dataset. Some of these correlations are a little less striking in the main English dataset (although some are also more striking). I’m restricting myself to fiction right now to avoid cherry-picking the prettiest graphs.

The rise of a sensory style?

I ended my last post, on colors, by speculating that the best explanation for the rise of color vocabulary from 1820 to 1940 might simply be “a growing insistence on concrete and vivid sensory detail.” Here’s the graph once again to illustrate the shape of the trend.

blue, red, green, yellow, in the English fiction corpus, 1800-2000

It occurred to me that one might try to confirm this explanation by seeing what happened to other words that describe fairly basic sensory categories. Would words like “hot” and “cold” change in strongly correlated ways, as the names of primary colors did? And if so, would they increase in frequency across the same period from 1820 to 1940?

The results were interesting.

cold, hot, in the English fiction corpus, 1800-2000

“Hot” and “cold” track each other closely. There is indeed a low around 1820 and a peak around 1940. “Cold” increases by about 60%, “hot” by more than 100%.

cool, warm, in the English fiction corpus, 1800-2000

“Warm” and “cool” are also strongly correlated, increasing by more than 50%, with a low around 1820 and a high around 1940 — although “cool” doesn’t decline much from its high, probably because the word acquires an important new meaning related to style.

wet, dry, in the English fiction corpus, 1800-2000

“Wet” and “dry” correlate strongly, and they both double in frequency. Once again, a low around 1820 and a peak around 1940, at which point the trend reverses.

There’s a lot of room for further investigation here. I think I glimpse a loosely similar pattern in words for texture (hard/soft and maybe rough/smooth), but it’s not clear whether the same pattern will hold true for the senses of smell, hearing, or taste.

More crucially, I have absolutely no idea why these curves head up in 1820 and reverse direction in 1940. To answer that question we would need to think harder about the way these kinds of adjectives actually function in specific works of fiction. But it’s beginning to seem likely that the pattern I noticed in color vocabulary is indeed part of a broader trend toward a heightened emphasis on basic sensory adjectives — at least in English fiction. I’m not sure that we literary critics have an adequate name for this yet. “Realism” and “naturalism” can only describe parts of a trend that extends from 1820 to 1940.

More generally, I feel like I’m learning that the words describing different poles or aspects of a fundamental opposition often move up or down as a unit. The whole semantic distinction seems to become more prominent or less so. This doesn’t happen in every case, but it happens too often to be accidental. Somewhere, Claude Lévi-Strauss can feel pretty pleased with himself.


It’s tempting to use the ngram viewer to stage semantic contrasts (efficiency vs. pleasure). It can be more useful to explore cases of semantic replacement (liberty vs. freedom). But a third category of comparison, perhaps even more interesting, involves groups of words that parallel each other quite closely as the whole group increases or decreases in prominence.

One example that is conveniently easy to visualize involves colors.

blue, red, green, yellow, in the English corpus, 1800-2000

The trajectories of primary colors parallel each other very closely. They increase in frequency through the nineteenth century, peak in a period between 1900 and 1945, and then decline to a low around 1985, with some signs of recovery. (The recovery is more marked after 2000, but that data may not be reliable yet.) Blue increases most, by a factor of almost three, and green the least, by about 50%. Red and yellow roughly double in frequency.

Perhaps red increases because of red-baiting, and blue increases because jazz singers start to use it metaphorically? Perhaps. But the big picture here is that the relative prominence of different colors remains fairly stable (red being always most prominent), while they increase and decline significantly as a group. This is a bit surprising. Color seems like a basic dimension of human experience, and you wouldn’t expect its importance to fluctuate. (If you graph the numbers one, two, three, for instance, you get fairly flat lines all the way across.)
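
The distinction between "relative prominence" and movement "as a group" can be made concrete by separating each word's share of the group from the group's total frequency. This is a minimal sketch; the function name `group_total_and_shares` is hypothetical and the numbers are invented so that every color scales by the same factor. They are not values from the Google dataset.

```python
def group_total_and_shares(freqs):
    """Split a group of word-frequency series into a total and per-word shares.

    freqs maps each word to a list of frequencies, one per sampled year.
    Returns (totals, shares), where shares[word][i] is that word's
    fraction of the group total in year i.
    """
    words = list(freqs)
    n_years = len(freqs[words[0]])
    totals = [sum(freqs[w][i] for w in words) for i in range(n_years)]
    shares = {w: [freqs[w][i] / totals[i] for i in range(n_years)]
              for w in words}
    return totals, shares

# Invented toy values for three sampled years; NOT figures from the Google dataset.
toy = {
    "red":    [40.0, 80.0, 60.0],
    "blue":   [20.0, 40.0, 30.0],
    "green":  [30.0, 60.0, 45.0],
    "yellow": [10.0, 20.0, 15.0],
}

totals, shares = group_total_and_shares(toy)
print(totals)
print(shares["red"])
```

In this toy case the total doubles and then falls back while each share stays flat, which is the idealized version of the pattern described above. In the real data the shares clearly do drift (blue nearly triples while green gains about half), so the interesting question is how much of the movement the group total accounts for.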

What about technological change? Color photography is really too late to be useful. Maybe synthetic dyes? They start to arrive on the scene in the 1860s, which is also a little late, since the curves really head up around 1840, but it’s conceivable that a consumer culture with a broader range of artefacts brightly differentiated by color might play a role here. If you graph British usage, there’s even an initial peak in the 1860s and 70s that looks plausibly related to the advent of synthetic dye.

blue, red, green, yellow, in the British corpus, 1800-2000

On the other hand, if this is a technological change, it’s a little surprising that it looks so different in different national traditions. (The French and German corpora may not be reliable yet, but at this point their colors behave altogether differently.) Moreover, a hypothesis about synthetic dyes wouldn’t do much to explain the equally significant decline from the 1950s to the 1980s. Maybe the problem is that we’re only looking at primary colors. Perhaps in the twentieth century a broader range of words for secondary colors proliferated, and subtracted from the frequency of words like red and green?

lavender, pink, indigo, brown, gray, purple, in the English corpus, 1800-2000

This is a hard hypothesis to test, because there are a lot of different words for color, and you’d need to explore perhaps a hundred before you had a firm answer. But at first glance, it doesn’t seem very helpful, because a lot of words for minor colors exhibit a pattern that closely resembles that of the primary colors. Brown, gray, purple, and pink (the leaders in the graph above) all decline from 1950 to 1980. Even black and white (not graphed here) don’t help very much; they display a similar pattern of increase beginning around 1840 and decrease beginning around 1940, until the 1960s, when the racial meanings of the terms begin to clearly dominate other kinds of variation.

At the moment, I think we’re simply looking at a broad transformation of descriptive style that involves a growing insistence on concrete and vivid sensory detail. One word for this insistence might be “realism.” We ordinarily apply that word to fiction, of course, and it’s worth noting that the increase in color vocabulary does seem to begin slightly earlier in the corpus of fiction — as early perhaps as the 1820s.

blue, red, green, yellow, in English fiction, 1800-2000

But “realism,” “naturalism,” “imagism,” and so on are probably not adequate words for a transformation of diction that covers many different genres and proceeds for more than a century. (It proceeds fairly steadily, although I would really like to understand that plateau from 1860 to 1890.) More work needs to be done to understand this. But the example of color vocabulary already hints, I think, that broadly diachronic studies of diction may turn up literary phenomena that don’t fit easily into literary scholars’ existing grid of periods and genres. We may need to define a few new concepts.