Medieval and Early Modern Digital Humanities Seminar

This Medieval and Early Modern Digital Humanities Seminar (officially called a Postgraduate Advanced Training Seminar, or PATS) was put on by the Australian and New Zealand Association for Medieval and Early Modern Studies (ANZAMEMS) on November 18, 2015, at the University of Canterbury, New Zealand. It was recorded and is available on YouTube.

It featured keynote speakers Professor Patricia Fumerton from the University of California Santa Barbara (US) and Professor Lyn Tribble from the University of Otago (NZ), as well as Dr. James Smithies from the University of Canterbury (NZ), and a panel of library and academic staff from the University of Canterbury (NZ). It was organized by Dr. Francis Yapp from the University of Canterbury’s Music Department, and I helped out with the live-streaming and social media presence. The following are notes from the day — I was also live-tweeting the event (#ANZAMEMS).

Materiality, Affordances, and the Digital – by Professor Lyn Tribble

Tribble chose to focus on the Early English Books Online (EEBO) database and discussed what the online page, compared with the print object, reveals and conceals. Interestingly, she said that she doesn’t consider herself a Digital Humanist [though this presentation was very much what I would call DH]. She noted that we count on computers to erase our mistakes and that one of the first major affordances of the computer was no longer needing white-out. A book is not a transparent container of texts. Her MLIS degree has led her to think about books as not just things you pour words into, but things that have a material substantiality of their own. This affects how she thinks of the digital as well.

New technologies began to denaturalize the book (N. Katherine Hayles’ How We Think, etc.). Affordance is a term coined by J. J. Gibson in The Ecological Approach to Visual Perception (1966, 1979) to account for the relationship between human beings, their tools, and their environments. For example, because we’re livecasting, the technology constrains how we can use the room and where we position ourselves. The same is true of lighting in theatres: it changes how we act (e.g. Shakespeare didn’t have a light board, so his company operated differently). An old technology like a cuneiform tablet is permanent, which affects what you write on it, compared with what you would write on a wax tablet that could be erased.

Actors used to receive their parts on a scroll containing just their own lines and cues. This didn’t allow them to see what else was going on, but it did allow them to quickly see the changes that occur with their character over time. It was a different way of preparing for a part. It worked well for people who were doing 5-6 plays a week at that time. It might seem weird to us, but it was embedded in the times (more memorization in education, etc.).

A database is built on older systems of knowledge, even though not everyone knows that history. The switch from the card catalog to the online catalog has produced some comical problems (e.g. a student typing the question “what is mitochondria” into the library search box). The online catalog assumes you know what a library card catalog is and how to use it. People enter the digital world at different points in time.

For the EEBO database, what does it mean to say it’s comprehensive? Lots of things are omitted from it. What about ephemera? Latin texts printed on the continent are not in it. It doesn’t really give a good picture of the print world at that time. There is something of a nationalistic idea behind things like Pollard & Redgrave and EEBO as they try to document what books were printed and where they’re held.

EEBO digitization: you can have any color you want as long as it’s black. Red color washes out on the scans. On EEBO, you get whatever copy was easiest to scan, even though there can be differences in print copies. Use with caution: not the same as a scholarly edition. Remember that microfilm was cutting-edge technology in the 1930s. Sometimes things were related on the film, but often not. You would run past Hamlet and other things all together (she recounted still remembering that toxic smell…).

She showed John Overholt’s one-jpeg answer on Twitter as to why you would still want a printed text when it’s on EEBO. It’s self-explanatory!

Screens flatten out information. They limit what you get and what you don’t get. You don’t get size, depth, etc. of a book from a screen. It’s a physical effort to read a big book, but you lose this sense once it is digitized. A socially conscious gentleman could keep a little book in his pocket (The Academie of Eloquence). The social embodiment of knowledge shows what its use might have been.

EEBO has massively changed scholarship in the field. It is owned by ProQuest and is an expensive, proprietary package. It is hard for small liberal-arts colleges in the U.S. to get access. She referenced EEBOgate and how the Renaissance Society of America (RSA) told members that EEBO had cancelled the deal/subscription because people were using it too much (!) and it feared it was losing out on institution subscriptions.

Every technology has its own characteristic errors. Page numbers are often wrong in Early Modern books. Likewise, OCR contains errors; she told an OCR cautionary tale in which “semen” was actually “seven”! The Lost Plays Database tries to show which plays are missing, using evidence of their existence, even though the plays themselves do not survive.

Q&A

EEBO-TCP has some transcriptions but these are not linked to the EEBO database. There is a tendency to look where the light is better. We should be aware of our own bias in our research. The search function is awful in EEBO. Part of the Twitter EEBO scandal was the thought that: These are our books. Why is EEBO charging all this money for them?? That’s why the suggestion for guerrilla harvesting came about.

Patricia Fumerton noted: We’re such a visual culture. We want images. People don’t go to the ESTC. They automatically go to EEBO and assume they’re seeing everything and don’t go to the ESTC to check against it. It’s a serious problem in scholarship. The British Library’s motto is “all the world’s knowledge.”

Tribble recommended having students in an Honours class go and compare a rare book in their library to its EEBO version to show them what they don’t know.

Alan Liu said that maybe the lossy reading of machine learning is actually closer to how humans read (with errors and such) and to the affordances of the human mind. What might the reading environment have been in the Early Modern period? What was the noise like?

Paper was expensive. They wouldn’t have gone back to re-print. What’s an acceptable level of error?

We may have lost the concept of fluent forgetting – when people didn’t have a copy of Hamlet in front of them, actors could get away with errors as long as they kept the meter and kept going. Nobody noticed.

It’s hard to discover what is and isn’t tacit knowledge or awareness for students and others. User testing should take things like this into account (how will it look on an iPad?). It’s easy to squirrel away information and make it seem like you have everything (look how much stuff I’ve downloaded!). When you had to pay per photocopy, you were more careful and selective. Now you risk losing the reverence you have for the text.

Serendipity is interesting. You go to get a book and end up getting surrounding books. [This concept keeps popping up at DH events.]


 

The Digital Recovery of Moving Media: EBBA and the Early English Broadside Ballad – by Professor Patricia Fumerton

Fumerton discussed the English Broadside Ballad Archive (EBBA), which she helped found, and shared lots of interesting history about the broadside ballads. The Roxburghe (1500 ballads) and Pepys (1800 ballads) collections were called veritable dung-hills in the 19th century by Francis James Child. Actually, ballads were the most popular form of print in the period. They were printed by the millions, affordable (a penny or halfpenny), disseminated even into the countryside, and mass marketed to high and low (but especially targeting middling and low). They were a marketing and consumer object, used as toilet paper, used to light pipes, and pasted on walls.

They had been largely neglected until recently. Folk studies in the U.S. focused on oral ballads. Finally, programs of study popped up at places like Harvard and UCLA. The last five years have seen an upsurge in books, dissertations, articles, and book chapters about the broadside ballads. Why now? The popular (mass marketed, low, street literature) is now popular, as is Ephemera Studies.

With market competition, EEBO and ECCO have put their own ballads up in a wildly haphazard way. With EBBA’s advanced search and “assemblage theory”, everything has to be individually tagged and marked because features occur in different ways on different ballads. You can search by different spellings. The keyword search is tailored to the genre of the broadside ballad, after she and other scholars looked through 1800 ballads. They feel it represents a good way of looking through the ballads. It also has definitions showing how words would have been understood at the time. They stopped doing keyword search on the Pepys because they didn’t feel it would help anyone search the images in any meaningful way. They’re now working on a digital tool that will match woodcut impressions (like finding all instances of a particular artichoke lady). All woodcuts get wormholes! You can tell where a woodcut came from and can date it by the wormholes.
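The idea of searching by different spellings can be illustrated with a small sketch. This is not EBBA’s actual search code: the function names, the handful of normalization rules, and the two ballad titles are all made up for illustration. It assumes that variant spellings (u/v, i/j, long s, final -ie and -e) can be collapsed to a shared search key, so that a modern query still finds an early modern spelling.

```python
import re
from collections import defaultdict


def normalize_spelling(word: str) -> str:
    """Collapse a few common early modern spelling variants to one search key."""
    key = word.lower()
    key = key.replace("ſ", "s")    # long s
    key = key.replace("vv", "w")   # doubled v printed for w
    key = key.replace("v", "u")    # u and v were used interchangeably
    key = key.replace("j", "i")    # so were i and j
    key = re.sub(r"ie$", "y", key)           # ladie -> lady
    key = re.sub(r"(?<=[^e])e$", "", key)    # drop a single final e: faire -> fair
    return key


def build_index(titles):
    """Map each normalized word to the set of titles that contain it."""
    index = defaultdict(set)
    for title in titles:
        for word in re.findall(r"[^\W\d_]+", title):
            index[normalize_spelling(word)].add(title)
    return index


def search(index, query):
    """Return the titles containing any spelling variant of the query word."""
    return sorted(index.get(normalize_spelling(query), set()))


# Hypothetical titles, invented for this example only.
titles = ["A Ballad of True Loue", "The Faire Ladie of London"]
index = build_index(titles)
print(search(index, "love"))
print(search(index, "lady"))
```

The keys are only for matching and are never shown to the reader, so it does not matter that “love” and “loue” both become the non-word “lou”. A real system would need many more rules and would have to guard against unrelated words collapsing to the same key.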

The hole in the scholarship is trying to see ballads as multimedia pieces. Fumerton described a couple of the woodcut images and the feminist and anti-feminist debates about women cropping their hair and wearing pants because the men aren’t able to “wear the pants”. A house full of prostitutes was depicted to critique the aristocracy who spent their money there. She played several audio clips from modern recordings of the ballads and showed how individuals can sing them in different ways. The same tune might be renamed, just as images are reused. EBBA opens up fresh possibilities for modern people to experience and create their own imaginary assemblages (Manuel De Landa) of early modern ballads like others did.

Q&A

Fumerton explained how the database shapes her work. EBBA meets as a team, and a digital programmer (off-site from UCSB) is always in the discussion, as are ethnomusicologists and singers. You will have a failed project if you try to tell a programmer what you want and have them try to make it. You will also have a failed project if you have them tell you what they’re going to make and have them make it for you. EBBA has gone through several iterations. There’s always new knowledge coming in through their apprenticeship graduate students. EBBA has been organic and grass-roots from the beginning.

She started out just wanting to teach a course on street literature and found there was nothing available on street literature. As a result, UCSB was the first UC to get EEBO and had it five years before other UCs. Check out her articles on the process of building EBBA. “How did you do it?” is a hot topic in the Digital Humanities. Even if your end project fails, the process of making it is extraordinarily important and valuable. The project embodies the spirit of collaboration and democracy where everyone has a voice.

We complain about the fragmentary nature of texts now with Internet keyword searches, but there have always been holes in knowledge. There is no “whole” and there never was.

Without the database, you wouldn’t be thinking this way about the ballads. But it’s not a bad thing, it’s a good thing.

The scientists understand the lab model immediately (and how this stuff can be recognized in academia) while the humanities are not always certain about it. Printing out your data work is actually a good way of impressing people. It comes out to a lot of pages.


 

Behind the Scenes: EBBA and Early Modern Making – by Professor Patricia Fumerton

Fumerton explained that EBBA’s digital archival process has been critiqued by McKitterick for being intrusively recreative instead of a facsimile. But all collectors in some way are manipulating the artifact and recreating it. Everything has gone through multiple hands and been processed/manipulated before it came to be digitized by EBBA. Reassemblage is nothing new, and we don’t know every stage and when things were changed.

Before, everyone would come up with different ways of classifying images. Even Iconclass, which is designed for high-end art, is extraordinarily complicated, and has 28,000 terms for identifying whole images and items in images, was difficult to use. So a machine learning tool can come in handy to determine feature points. Now, each image is understood as a Bag of Features instead of a complete image. This is similar to the Bag of Words concept in topic modeling. The trick is training humans to help the machine catalog correctly. Human cataloguing terms are based on the Getty Art & Architecture Thesaurus and Iconclass, using genre terms (narrative, landscape) and descriptive tags (man, horse, book). This increases image interoperability.
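To make the Bag of Features idea concrete, here is a minimal sketch. It is not EBBA’s tool, and everything in it is an assumption for illustration: the toy two-dimensional descriptors, the three-entry codebook of “visual words”, and the function names are invented. In practice the descriptors would come from a feature detector run over each woodcut image, and the codebook would be learned by clustering many descriptors. Each descriptor is assigned to its nearest visual word, and the image is then represented only by the counts, just as a Bag of Words keeps word counts and discards word order.

```python
from collections import Counter
from math import dist


def nearest_word(descriptor, codebook):
    """Return the index of the codebook entry closest to this descriptor."""
    return min(range(len(codebook)), key=lambda i: dist(descriptor, codebook[i]))


def bag_of_features(descriptors, codebook):
    """Count how many descriptors fall on each visual word, ignoring position."""
    counts = Counter(nearest_word(d, codebook) for d in descriptors)
    return [counts.get(i, 0) for i in range(len(codebook))]


def histogram_overlap(a, b):
    """Histogram intersection, scaled by the smaller bag (0.0 to 1.0)."""
    total = min(sum(a), sum(b))
    if total == 0:
        return 0.0
    return sum(min(x, y) for x, y in zip(a, b)) / total


# Toy data, invented for this example: three visual words and three "woodcuts".
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
woodcut_a = [(0.1, 0.0), (0.9, 0.1), (1.0, 0.2), (0.1, 0.9)]
woodcut_b = [(0.0, 0.1), (1.1, 0.0), (0.8, 0.1), (0.0, 1.2)]  # a near copy of a
woodcut_c = [(0.0, 0.9), (0.1, 1.0), (0.2, 1.1), (0.0, 0.1)]  # a different image

bag_a = bag_of_features(woodcut_a, codebook)
bag_b = bag_of_features(woodcut_b, codebook)
bag_c = bag_of_features(woodcut_c, codebook)

print(bag_a, bag_b, bag_c)
print(histogram_overlap(bag_a, bag_b), histogram_overlap(bag_a, bag_c))
```

Two impressions of the same woodcut should produce similar bags even when the descriptors differ slightly, which is what makes this representation useful for matching impressions across ballads.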

The goal is to make the tune recordings in EBBA more accessible to the musically challenged, like her. To do this, they will be transcribing the first stanza of every tune recording into modern notation in Sibelius, with text underlay. Users will be able to play the notes in a slowed-down MIDI version (like a fiddle).

The EMC Imprint is the Early Modern Center’s journal, which publishes multimedia literary and cultural studies, 1500-1800. It is trying to capitalize on what the web offers users, rather than just doing PDFs like other digital journals. It is free, open access, and peer-reviewed. As soon as you go to the site, you see it’s an active, experiential site; it has videos, including how-to videos such as one where they make their own paper. Fun fact: you can determine where paper was made by looking at the bugs in it!


 

 

How to Write a Digital Project Scope Document – by Dr. James Smithies

Smithies gave an overview of the key document in one of the Digital Humanities classes at the University of Canterbury called DIGI 403. It is called a Digital Project Scope Document and helps students shape their digital project and do project management.

It should take no more than 30 words to articulate the purpose. For New Zealand, the Treaty of Waitangi impact statement needs to be completed to show how the project will impact local indigenous communities such as Māori. Students are allowed to do off-line prototypes even though they can’t continue on with the project after graduation. They set their own delivery dates for all of the outputs, because all of the projects are different (a rough estimate is 10 key milestone dates). Students are responsible for communicating slippages in dates to the teacher as part of learning project management. It’s not about mastering HTML or PHP but about successfully designing and managing a project. It is better to show something that can actually work and is thought through. It’s a way to sit back and examine all of the possible outputs for your project. Tools are fantastic but also brittle, with limitations. Amazon’s server is a sandbox to play in and open to anyone. Omeka is designed by scholars, so features are baked in to be useful to them.


 

Roundtable Discussion with Anton Angelo (UC Library), Dr. James Smithies (DH), Joanna Condon (Macmillan Brown Library), and Dr. Chris Jones (History)

1: Is digitization enough? Is it an appropriate activity in itself?

Jones said that New Zealand has done a lot of work with digitization, but people haven’t always involved historians in the process to determine which texts are a priority to digitize versus other ones that can wait. Smithies replied that some of the disconnect might be because academia has been seen to eschew digitization (a “thou shalt not digitize” attitude). He believes humanists should take responsibility for the digital turn and not rely on outsourcing work to librarians or other people involved in digitization projects. What would be ideal is to have consultative conversations between librarians and humanists. Condon noted that it is hard to measure the impact of digitization, so it is often not prioritized. Alan Liu added that there should be more API outputs from humanities research so that people can use them to make things like fun apps that pull information from libraries (like ‘men wearing girdles’).

2: What are your positions on licensing?

Smithies said that open access is needed for the health of the scholarly ecosystem. Business models and legacy systems are still around. Jones told an interesting story of a loophole in a copyright agreement in the UK which allowed the university here to publish an image from the 1600s. Even on old material, publishers want to control content. Fumerton said that one of the big new challenges is proving sustainability. Funders want to know if a digital project can be sustained by the library. Liu added that we have to rethink what it means to “own” something, along with issues of curation. Examples include Anne Frank’s diary, where her father was added as co-author to extend copyright, and Open Access Week.

3: What has to happen for the Digital to get taken out of Digital Humanities? How will it become natural?

Jones believes that it will happen when digital natives become graduate students. There will be a natural evolution and historians will naturally merge their digital skills with the practice of history. It’s likely to happen first among medievalists and early modernists because we’ve been early adopters. Smithies said that mainstream humanists will start integrating digital tools into their work. But there’s a naivety about the extent of possibilities within this domain. As Willard McCarty has said, there’s so much out there that still needs digitization; we’ve hardly touched the bulk of it. The cutting edge will always be out in front of humanists; Digital Humanities incubates future possibilities. It is quite likely that historians in future years will have a colleague who’s a computer. What will advanced AI reading Shakespeare look like? [What a thought!]