The Los Angeles Review of Books has an article in this week’s issue that’s getting a lot of airtime on Twitter, by Stephen Marche, called “Literature is not data: Against digital humanities.” Marche, a 2005 graduate of the University of Toronto with a doctorate in Early Modern Drama, is a journalist who writes for Esquire, The NY Times, the Wall St. Journal, the New Republic, Salon.com, the Globe and Mail, and the Toronto Star. In the LA Review article, he observes that “The information we have about the past is, in almost every case, fragmentary. There are always masses of data which are simply missing or which cannot be untangled… Literature is irredeemably broken and messy. Its brokenness and its messiness are part of its humanness.” This observation is adduced in the service of an argument that digital humanities treats all literature as though it were the same: “The algorithmic analysis of novels and of newspaper articles is necessarily at the limit of reductivism. The process of turning literature into data removes distinction itself. It removes taste. It removes all the refinement from criticism. It removes the history of the reception of works.” It should be pointed out, though, that all kinds of data are messy and incomplete. Contrary to the author’s example of baseball as a realm where you can have every stat for every game, there are aspects of the game no one thought to keep track of until one day they did, and there are games, years, and players for which we have very little or no data at all. So incompleteness is not unique to literature, nor does it preclude machine analysis. For that matter, not all of digital humanities is about machine analysis: some of it is about creating new works, some is about editing old ones in new media, some is about mapping and modeling, and so on.
And Marche’s main argument, that “insight remains handmade” is a point with which no digital humanist would disagree—though we would probably look to different kinds of evidence to inspire our insights.
Across all these types of digital humanities, then, support for the application of computational methods to research in the humanities or to the production of art requires some new skill sets and new academic backgrounds in both library and IT staff: when I was at Illinois, we addressed this under the heading of informatics, and I think that’s a useful term with which to conjure a certain kind of staff and a certain kind of services. “Informatics” is a term widely used in other parts of the world, but heard in the U.S. mostly as the latter half of “bio-informatics.” In the life sciences, or in the humanities, or in any other field, “informatics” is the application of information technology and information science to the data that constitutes the primary research material of that field. In Europe, digital humanities is sometimes called “cultural informatics,” and that’s a fine alternative term, in my view. Cultural informatics will require cultural informaticists (or cultural informaticians, depending on whether you want to be more like a lyricist or more like a magician). Those people, whatever we call them, will need to be educated in human information behaviors, in supporting the information needs of researchers, and in applying information technology to digital data. Those people might come out of computer science and engineering (more likely engineering, for reasons I can explain if you’re interested), or they might come from library and information science programs, or they may well be self-taught, usually by having learned things in order to do their own work. Wherever they come from, these informatics people will need to be educated in the domain they support: to do informatics in life sciences, you need to know a good deal about life sciences; to do informatics in the humanities, you need to know a good deal about the humanities.
Science and libraries have a relationship that’s complex in one sort of way; humanities and libraries have a relationship that’s complex in utterly different ways, and social sciences have their own row to hoe. Doing the humanities digitally is not an activity that takes place in a vacuum: fundamentally, it competes with other activities and priorities in an environment where many trends indicate that fewer resources will be available. In that environment, we can’t just shuffle the deck chairs: we have to throw some of the deck chairs overboard. So, in that environment, why would you choose to put your chips on digital humanities, or digital arts, and what do those chips actually amount to, when they are expressed as library or IT staff and services?
You might put your chips on DH because its research model offers a reasonable prospect of external funding to support the work of graduate students who collaborate with faculty on a project, and that’s a healthy trend.
You might put your chips on DH because it offers the prospect of laying to rest quantitative canards, or answering previously unanswerable quantitative questions.
You might put your chips on DH because it is fundamentally interested in the reinvention of scholarly communication, and that includes the reinvention of the library, and of campus IT.
You might put your chips on DH because it offers an academic and professional culture that has healthy norms of collaboration, co-authorship, mentoring, and service.
You might put your chips on DH because it offers a different role for librarians, as peers and collaborators. The best librarians and the best IT professionals are a force for change, because they understand and embody some values that universities badly need these days—collaboration, frugality, service, and the creation of new knowledge in new people.
But to enact those values tomorrow will not look like what we do today: as it does in other spheres of work, automation changes what people do in libraries and in IT, usually by taking over rote tasks, and (in a networked world) by making it easier for people to collaborate. For example, online teaching (and the support of online teaching) is going to require different practices and different skills than teaching in person: not different values or different intellectual principles, but different types of work. Support for online learning, instructional design, and flipped classroom experiences is the job of some combination of library and IT staff on every campus these days. The faculty of humanities and arts are going to need a lot of help to learn how to teach online, but in my experience, they are (by and large) ready to do that. They understand that both faculty and students benefit from the flexibility of the format, and faculty know that they will have to teach differently to teach as well online as they do on campus, and they also know that in some ways, they’ll be able to teach better online. They’re ready to experiment, and they need partners who understand the delivery of information services in new technological environments, with limited resources, targeted at the heart of the research and teaching mission of the modern university. Those partners sound to me like librarians and other human-centric information professionals.
The work one wants to end up doing, in this transition, is the work of information organization, data curation, collection development, interlibrary partnerships, IT innovation, information security, and the like. The work one wants to lose is work that can be done better, cheaper, and faster by the computer or by the crowd. Since computers cannot comprehend, much less appreciate, text, literary or otherwise, there is absolutely no danger, pace Marche, that the computer will end up with the job of producing interpretation, insight, or inspiration on its own. What it may do, though, is to keep track of each of the 5,000 words in Emily Dickinson’s remarkably small vocabulary, and tell you which words occur together and which ones don’t. What you make of that is your own business, as a humanist.
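To make that division of labor concrete, here is a minimal sketch in Python of the kind of bookkeeping I have in mind: counting which words occur together in the same passage and listing which pairs never do. Everything in it is an assumption for illustration. The function names are hypothetical, the three sample passages are invented placeholders rather than lines of Dickinson, and treating "occur together" as "appear in the same passage" is only one of several reasonable definitions.

```python
import re
from collections import Counter
from itertools import combinations

# Invented placeholder passages, NOT text by Dickinson; a real project
# would load a curated edition of the poems here.
PASSAGES = [
    "the bee and the clover",
    "the bee and the breeze",
    "a clover in the breeze",
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cooccurrence(passages):
    """Count, per passage, each distinct word and each pair of distinct words.

    Returns (vocab, pairs): vocab maps a word to the number of passages
    it appears in; pairs maps an alphabetically ordered (word, word)
    tuple to the number of passages containing both words.
    """
    vocab = Counter()
    pairs = Counter()
    for passage in passages:
        words = sorted(set(tokenize(passage)))
        vocab.update(words)
        pairs.update(combinations(words, 2))
    return vocab, pairs


def never_together(vocab, pairs):
    """List the word pairs from the vocabulary that share no passage."""
    return [p for p in combinations(sorted(vocab), 2) if p not in pairs]


if __name__ == "__main__":
    vocab, pairs = cooccurrence(PASSAGES)
    print("Vocabulary size:", len(vocab))
    print("Most frequent pairs:", pairs.most_common(3))
    print("Pairs that never co-occur:", never_together(vocab, pairs))
```

The program stops at counting. Whether it matters that two words keep company, or never meet, is a question the counts cannot answer, and that is the part that remains handmade.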
In libraries, humanists are your core constituency: they are a touchstone for the value of cultural heritage materials, but they’re not always good at collaborating. That’s something that is different in digital humanities, and something we can help others learn. Digital humanities folks collaborate not because they are innately nobler or more generous than other humanists, but because working with digital resources seems to lead one to addressing larger research questions across more primary material and from more disciplinary perspectives. Also, doing the humanities digitally involves one in partnerships with librarians and technologists who know things you don’t know, but which your project requires. And collaborative research projects offer interesting roles for students, at all levels: a kind of apprenticeship, in disciplinary terms, but also potential leadership roles, on the technical front, in project management, in grant-writing and grant administration, etc.
Digital Humanities and Digital Arts also raise very interesting problems around the preservation, migration, and use of data. Everybody now knows there’s a need for something called data curation, and humanists have some of the most complex kinds of data—not the most extensive, but (and here I agree with Marche) some of the messiest, most irregular, most elliptical. If you can curate that, you’ll probably be able to deal with a lot of other things. And even though the humanities don’t operate across the massive amounts of data generated by the instruments of some disciplines, we do at this point have millions of books with billions of pages and trillions of words in them, and that counts as large-scale in at least some dimensions.
Finally, I want to point out that my assignment—to talk about the impact of digital humanities and digital arts on libraries and IT—didn’t mention the Press. So, for extra credit, I’d like to spend a moment on university presses. Publishing services for humanities and arts, including open-access publishing, are increasingly becoming a library function, and presses that are humanities-oriented are increasingly likely to end up as part of the library. I think that’s great, and very appropriate. Like libraries and IT, presses are a cost-center: the irony of their situation in the university is that because they do have the potential to produce revenue by selling things—something that libraries and IT can’t do—they are expected to break even, or to make a profit, and when they (generally) don’t do that, they’re seen as a drag on the campus, chronically in debt, etc. But like the library and IT, the press function in a university is a necessary part of the life-cycle of scholarly information—the work that presses do in quality assurance, marketing, and distribution, the work with authors and with audiences, is quite different from what libraries do, and it’s necessary, but it’s never been organized as a local good, and because of the importance of the peer review component, local publication has always seemed suspect. These are superficial impediments to a new way of doing scholarly communication, though, and libraries and university presses working together need to organize that communication, because no one else will, except agents who are the economic adversaries of universities. Springer’s for sale, and if I could convince the AAU to pool its funds and offer the $2.5B that it’s likely to cost, I would: universities need to be in the scholarly publishing business, because they provide that business with its content, its audience, and its reason for being. 
If we can understand presses as an important part of that business, and fund them accordingly, there are some really interesting opportunities for presses in connection with digital humanities and digital arts.
So, I guess my assertion, presented to you for debate, would be that digital humanities and digital arts present libraries (and IT, and university presses) with an opportunity to be useful, in support of teaching and research—but also with a challenge, because seizing that opportunity requires a commitment of resources, which is a matter of priorities, and those priorities are a matter of local, institutional, industry, higher ed, and international strategy. Who is best positioned, in the university, to lead a conversation about those priorities? I would say libraries, academic technologists, and presses, working together. What’s the cost of not having that conversation? For starters, continuing to buy back our research from commercial entities. If the digital humanities can help to mitigate that situation, it will have done a service not only to its own practitioners, but to all faculty and students in all universities.