My reaction to Stanley Fish’s third column on digital humanities was at first so negative that I thought it not worth writing about. But in the light of morning, there is something here worth discussing. Fish raises a neglected issue that I (and a bunch of other people cited at the end of this post) have been trying to foreground: the role of discovery in the humanities. He raises the issue symptomatically, by suppressing it, but the problem is too important to let that slide.
Fish argues, in essence, that digital humanists let the data suggest hypotheses for them instead of framing hypotheses that are then tested against evidence.
The usual way of doing this is illustrated by my example: I began with a substantive interpretive proposition … and, within the guiding light, indeed searchlight, of that proposition I noticed a pattern that could, I thought be correlated with it. I then elaborated the correlation.
The direction of my inferences is critical: first the interpretive hypothesis and then the formal pattern, which attains the status of noticeability only because an interpretation already in place is picking it out.
The direction is the reverse in the digital humanities: first you run the numbers, and then you see if they prompt an interpretive hypothesis. The method, if it can be called that, is dictated by the capability of the tool.
The underlying element of truth here is that all researchers — humanists and scientists alike — do need to separate the process of discovering a hypothesis from the process of testing it. Otherwise you run into what we unreflecting empiricists call “the problem of data dredging.” If you simply sweep a net through an ocean of data, and frame a conclusion based on whatever you catch, you’re not properly testing anything, because you’re implicitly testing an infinite number of hypotheses that are left unstated — and the significance of any single test is reduced when it’s run as part of a large battery.
That’s true, but it’s also a problem that people who do data mining are quite self-conscious about. It’s why I never stop linking to this xkcd comic about “significance.” And it’s why Matt Wilkens (mistargeted by Fish as an emblem of this interpretive sin) goes through a deliberately iterative process of first framing hypotheses about nineteenth-century geographical imagination and then testing them. (For instance, after noticing that certain states seem especially prominent in 19c American fiction, he tests whether this remains true after you compensate for differences in population size, and then proposes a pair of hypotheses that he suggests will need to be evaluated against additional “test cases.”)
More importantly, Fish profoundly misrepresents his own (traditional) interpretive procedure by pretending that the act of interpretation is wholly contained in a single encounter with evidence. On his account we normally begin with a hypothesis (which seems to have sprung, like Sin, fully-formed from our head), and test it against a single sentence.
In reality, of course, our “interpretive proposition” is often suggested by the same evidence that confirms it. Or — more commonly — we derive a hypothesis from one example, and then read patiently through dozens of books until we have gathered enough confirming evidence to write a chapter. This process runs into a different interpretive fallacy: if you keep testing a hypothesis until you’ve confirmed it, you’re not testing it at all. And it’s a bit worse than that, because in practice what we do now is go to a full-text search engine and search for terms that would go together if our assumptions were correct. (In the example Fish offers, this might be “bishops” and “presbyters.”) If you find three sentences where those terms coincide, you’ve got more than enough evidence to prop up an argument, using our richly humanistic (cough, anecdotal) conception of evidence. And of course a full-text search engine can find you three examples of just about anything. But we don’t have to worry about this, because search engines are not tools that dictate a method; they are transparent extensions of our interpretive sensibility.
The basic mistake that Fish is making is this: he pretends that humanists have no discovery process at all. For Fish, the interpretive act is always fully contained in an encounter with a single piece of evidence. How your “interpretive proposition” got framed in the first place is a matter of no consequence: some readers are just fortunate to have propositions that turn out to be correct. Fish is not alone in this idealized model of interpretation; it’s widespread among humanists.
Fish is resisting the assistance of digital techniques, not because they would impose scientism on the humanities, but because they would force us to acknowledge that our ideas do after all come from somewhere — whether a search engine or a commonplace book. But as Peter Stallybrass eloquently argued five years ago in PMLA (h/t Mark Sample) the process of discovery has always been collaborative, and has long — at least since early modernity — been embodied in specific textual technologies.
Stallybrass, Peter. “Against Thinking.” PMLA 122.5 (2007): 1580-1587.
Wilkens, Matthew. “Geolocation Extraction and Mapping of Nineteenth-Century U.S. Fiction.” DHCS 2011.
On the process of embodied play that generates ideas, see also Stephen Ramsay’s book Reading Machines (University of Illinois Press, 2011).