A New Tool for Digital Manuscript Facsimiles: Introducing the Manicule Web Application

Aylin Malcolm, DM Postgraduate Subcommittee

Much of my work in digital manuscript studies has been informed by a simple question: is this something I can show to my parents? I am the only person among my family and childhood friends to pursue graduate studies in the humanities, and when others take an interest in my work, I try to provide resources that do not depend on specialized knowledge or institutional subscriptions. This question can also be framed in broader terms for scholars interested in public engagement: how can we make our research accessible and engaging for nonspecialists? How can scholars working on the material culture of previous periods demonstrate the relevance of such studies now? And how can digital resources enable us to learn from communities outside the traditional bounds of academia?

I recently confronted these questions while examining a late-fifteenth-century astronomical anthology, written in German and Latin close to the city of Nuremberg, and now identified as Philadelphia, University of Pennsylvania, Kislak Center for Special Collections, Rare Books and Manuscripts, LJS 445. This codex, which you can see in my video orientation below, is remarkable for its inclusion of material from three incunables, making it a clear example of the transmission of knowledge from print to manuscript.

For more videos like this one, see the Schoenberg Institute YouTube channel.

My own fascination with LJS 445 began when I opened it for the first time and saw a charming sketch of a man on the first page. Turning to the second folio, I was struck by its whimsical doodles of gardens and doors. What were these doing in a book dealing mostly with astronomical calculations and predictions about the Church?

Detail of fol. 2r of LJS 445.

My non-medievalist mother knew the answer immediately. “They’re children’s drawings,” she observed, pointing out the uneven writing and repetition of common motifs, such as trees. And turning to the 1997 catalogue description by Regina Cermann, I found that she was right: this book can be traced to two of the sons of a Nuremberg patrician, Georg Veit (1573-1606) and Veit Engelhard (1581-1656) Holtzschuher. Veit Engelhard left numerous marks in it, including the year “1589” (fols. 95v, 192r, and 222v), suggesting that he inscribed this book when he was around eight years old. Thus began my efforts to find out more about the contents and uses of this book, from its faithful copies of print editions to its battered and often mutilated constellation images. Perhaps my favourite discovery occurred as I was reading German genealogical records, when I came across an engraving of Veit Engelhard as an adult.

This digitized portrait of Holtzschuher is from the Herzog August Bibliothek Wolfenbüttel. It was also printed in Die Porträtsammlung der Herzog August Bibliothek Wolfenbüttel, vol. 11, ed. Peter Mortzfeld (Munich: K.G. Saur, 1989), no. A 100058, p. 266.

To make this remarkable manuscript more accessible to the public, I created a digital edition of it using Manicule, a web application built by Whitney Trettien and Liza Daly. Manicule, which is available on GitHub at https://github.com/wtrettien/manicule, allows scholars and students to create accessible, dynamic web editions of manuscripts and other rare books. It offers three modes of entry into a digitized text: a “Browse” function, whereby the viewer scrolls through pages of the facsimile alongside marginal notes written by the editor; a series of editor-curated “Tour Stops,” which provide commentaries on pages of particular note; and a “Structure” view, which draws on Dot Porter’s VisColl data model to depict the physical makeup of the manuscript, including missing, inserted, and conjoint leaves. Manicule can be downloaded and deployed on Mac OS systems using the instructions on the GitHub repository; Whitney is also available to provide advice and resolve issues.

The finished edition of LJS 445, available at aylinmalcolm.com/ljs445 under a Creative Commons Attribution 4.0 International License, is a true collaboration. In writing the text and creating the digital resource, I have built on the labours of many other researchers, including Regina Cermann; Whitney Trettien and Liza Daly; Dot Porter, whose tools for generating a collation model and image list are also available on the VisColl GitHub repository; and an entire digitization team at the University of Pennsylvania, from photographers to data managers and programmers. The result is also an evolving resource that can be adapted and augmented as new information about this manuscript emerges. Please feel free to contact me at malcolma[at]sas.upenn.edu if you have suggestions or queries, and I hope that you’ll enjoy exploring this unique manuscript.

COLLATE Collaboratory for annotation, indexing, and retrieval

From the project home page http://www.deutsches-filminstitut.de/collate/:

[COLLATE is] a Web-based collaboratory for archives, researchers and end-users working with digitized historic material. It… offers new ways of document-centered knowledge work to distributed user groups. European film heritage and censorship processes in the 1920s and 1930s were chosen as an example domain for the project. The developed COLLATE technologies, however, can easily be adapted to other application domains and usage contexts which are similarly information-intensive.

The current COLLATE collection of rare historic documents was provided by three major film and national archives from Germany, Austria and the Czech Republic. It consists of about 20000 digitized document pages describing film censorship procedures related to historic films and enriched context documentation including press material and digitized photos and film fragments. Members of these institutions – film historians and archivists – worked as pilot users, employing the COLLATE system for detailed cataloguing of the document collection and for in-depth content indexing and annotation of relevant sub-collections.

At the end of the project we established both an innovative Web-based collaboratory with a comfortable work environment for in-depth knowledge work with the material and a comprehensive, selected digitized collection of rare historic documents on European historic film that was interpreted and annotated by a multinational team of film experts.

Since the end of the project the achieved results have been maintained and made available to the public. The project partners plan to further promote the system, i.e. both the technologies and contents (first of all Fraunhofer IPSI as the coordinator and major technology developer and the Deutsches Filminstitut – DIF as coordinator of the content providers).

Source(s): Web-based solutions

DigiPal Launch Party

Date: Tuesday 7th October 2014
Time: 5.45pm until the wine runs out
Venue: Council Room, King’s College London, Strand WC2R 2LS
Co-sponsor: Centre for Late Antique & Medieval studies, KCL
Register your place at http://digipallaunch.eventbrite.co.uk 

After four years, the DigiPal project is finally coming to an end. To celebrate this, we are having a launch party at King’s College London on Tuesday, 7 October. The programme is as follows:

  • Welcome: Stewart Brookes and Peter Stokes
  • Giancarlo Buomprisco: “Shedding Some Light(box) on Medieval Manuscripts”
  • Elaine Treharne (via Skype)
  • Donald Scragg: “Beyond DigiPal”
  • Q & A with the DigiPal team

If you’re in the area then do register and come along for the talks and a free drink (or two) in celebration. Registration is free but is required to manage numbers and ensure that we have enough drink and nibbles to go around.

If you’re not familiar with DigiPal already, we have been developing new methods for the analysis of medieval handwriting. There’s much more detail about the project on our website, including one post of the DigiPal project blog which summarises the website and its functionality. Quoting from that, you can:

 Do have a look at the site and let us know what you think. And – just as importantly – do come and have a drink on us if you are in London on Tuesday!

The DigiPal Team


Source: From a description published in The XML Journal (http://www.sys-con.com/xml/wbg/CurrentSearch_Detail.cfm?ID=1119)

Anastasia is designed for handling large and highly-complex XML documents, where extremely precise control is required over the presentation of these. It can create output in any format, and it is optimized for HTML output direct to web browsers. Anastasia permits you to publish documents in identical form both on CD-ROM (Macintosh and Windows) and on the internet, from identical scripts. It includes full support for all valid XML and SGML documents, and a fully XML-aware search engine. Anastasia’s ease of use makes it suitable for small publishers with comparatively fewer computer resources, while its power fits it for large publishing enterprises. Anastasia is open source.

Home page: http://www.sd-editions.com/anastasia

Downloads: http://anastasia.sourceforge.net/


How to collect bibliographic references using cb2bib

While navigating the web you will often come across a useful bibliographic reference: in a newsgroup, while reading an online article, etc. If you want to add it to your collection of references, you can do that in a (semi-)automated way using a small but very handy utility, cb2bib. As you can read on the program’s home page, cb2bib “is a tool for rapidly extracting unformatted, or unstandardized bibliographic references from email alerts, journal Web pages, and PDF files.” The name stands for “clipboard to bibliography (entry)” and stems from the program’s modus operandi: cb2bib reads the text that the user has copied to the clipboard and compares it with a set of pre-existing patterns; if a match is detected, the clipboard text is converted directly into a bibtex entry on the basis of the matching pattern. Let’s see an example.
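To make the matching step concrete before turning to the program itself, here is a minimal Python sketch of the idea: try each known pattern in turn and convert the text using the first one that matches. The two patterns, the sample strings, and the function name first_match are invented for illustration; they are not taken from cb2bib, whose own patterns and matching logic may differ.

```python
import re

# Hypothetical patterns: (pattern name, field names, regular expression).
# The n-th field name labels the n-th capture group.
PATTERNS = [
    ("author-year", ["author", "year", "title"], r"^(.+) \((\d{4})\)\. (.+)\.$"),
    ("title-first", ["title", "author", "year"], r"^(.+) / (.+), (\d{4})$"),
]


def first_match(clipboard_text):
    """Return (pattern name, fields) for the first matching pattern, else None."""
    text = clipboard_text.strip()
    for name, fields, regex in PATTERNS:
        match = re.match(regex, text)
        if match:
            return name, dict(zip(fields, match.groups()))
    return None


# Invented sample texts, standing in for whatever was copied to the clipboard.
print(first_match("Doe, J. (2001). An Invented Title."))
print(first_match("An Invented Title / Doe, J., 2001"))
print(first_match("no reference here"))
```

Because the patterns are tried in order, more specific patterns would normally be listed before more general ones.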

Note: cb2bib is available for both Windows and Linux operating systems (you can download it here), but the following screenshots refer to the OS I normally use, i.e. Linux.


Installing cb2bib is very easy under Windows, and equally so under Linux if you are using an RPM-based distribution. If you use a Debian or Debian-derived Linux distro, or MacOS X, you might have to compile and install the software on your own. Comprehensive instructions are available at this URL.

A Simple Example

Open the Examples page on cb2bib’s website, select and then copy to the clipboard the second example, the one labelled as “PNAS Table of Contents Alert”.

Note: to copy text to the clipboard under Linux you can simply select it, or you can use the CTRL C key combination; under Windows press CTRL C.

As you can see from the following screenshot (Fig. 1), the selected text has been automatically converted to a structured bibliographic entry, which you can save now as a bibtex entry: just click on the icon next to last on the right (the one showing a floppy disk with a pencil over it), or press CTRL S, and the entry will be added to the file shown in the text field immediately above it.

Fig. 1 – A sample entry

By default this file is called references.bib and lives in the cb2bib folder, but you can change both the path and the file name; for instance, you might choose C:\Documents\collected-refs.bib.

cb2bib will also retrieve the abstract, add the relevant keywords to the entry, and even download the PDF version of the article if there is a URL pointing to it and access is free! All of this automatically.

Once you have nicely collected and/or modified your reference, click on the Save button (the second from right), or press CTRL S, to save it in the references file. Delete the cb2bib_query_tmp_pdf if present, or you won’t be able to download the PDF file for the article if there is a link in the next reference you are going to process.

To learn more about cb2bib’s features, read the very nice overview page here.

Configuring cb2bib

Before exploring cb2bib’s capabilities further, it may be a good idea to check the program settings: click on the second icon from the left, the one showing a wrench, and you will see the configuration window (Fig. 2).

Fig. 2 – Configuring cb2bib

It is split into about half a dozen tabs; it is important that you check the paths in the first one and enable the network queries in the third one (Fig. 3).

Fig. 3 – Network queries

Since the program’s author is especially interested in scientific publications, the bundled patterns are geared towards them, and you will probably have to modify the regexps.txt file to obtain automatic format detection and field formatting for other kinds of references. To do this you will have to understand the file structure, which isn’t too difficult, and add patterns for specific bibliographic styles. This is an example of a pattern which can identify and automatically format MLA-style entries:


cb2Bib 0.3.6  Pattern:
MLA-style article 1
author title journal volume number year pages
^(.+), "(.+)," (.+) (\d+):(\d+) \((\d\d\d\d)\), ([\d|\-|\s]+)\.$

Since this can be a time-consuming task, please share your regexps.txt files, so that everybody can benefit from your work and add or mix patterns into their own configuration.
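To see how the field line and the regular expression of a pattern like the MLA-style one above fit together, here is a minimal Python sketch that can also be used to try out a regular expression before adding it to regexps.txt. It rests on two assumptions: that every capture group is quantified with +, and that the n-th field name labels the n-th capture group. The sample reference and the helper names parse_mla and to_bibtex are invented for illustration and are not part of cb2bib.

```python
import re

# Field names and regular expression, following the MLA-style pattern above.
# Assumption: the n-th field name labels the n-th capture group.
FIELDS = "author title journal volume number year pages".split()
MLA_PATTERN = re.compile(
    r'^(.+), "(.+)," (.+) (\d+):(\d+) \((\d\d\d\d)\), ([\d|\-|\s]+)\.$'
)


def parse_mla(reference):
    """Return a dict of captured fields, or None if the text does not match."""
    match = MLA_PATTERN.match(reference.strip())
    if match is None:
        return None
    return dict(zip(FIELDS, match.groups()))


def to_bibtex(fields, key):
    """Format the captured fields as a bibtex @article entry."""
    lines = ["@article{" + key + ","]
    lines += ["  {} = {{{}}},".format(name, value) for name, value in fields.items()]
    lines.append("}")
    return "\n".join(lines)


# Invented reference, for illustration only.
SAMPLE = 'Doe, Jane, "An Invented Title," Journal of Examples 12:3 (2001), 45-67.'

print(to_bibtex(parse_mla(SAMPLE), "doe2001"))
```

A reference that deviates even slightly from the expected punctuation (for instance, one lacking the final full stop) will not match, which is why several patterns per bibliographic style are often needed.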

Import references from a PDF file

If you have one or more PDF documents holding a good number of bibliographic references, you can import them using cb2bib: click on the third icon from the left, then on “Select files” to choose the PDF file(s) (Fig. 4).

Fig. 4 – Extracting references from a PDF document

Once you have a list of files ready, click on “Process” to have the program read them and extract the references. If nothing happens, it’s because you haven’t specified a PDF importer in the last tab of the configuration window. Unfortunately, this feature still needs improvement: results can vary from a useful list of references to useless, mangled text.

Export bibtex references to HTML

There are many programs that allow you to export your collection of bibtex references to HTML. Many of these are simple command-line tools, like bibtex2html or bib2html; if you want, you can also export them to XML using bibtexml. But you could also take advantage of more sophisticated bibliographic software, like Tellico (Linux) or Pybliographer (Linux), which offers reference management, network queries, export to several different formats, and much more.
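For a handful of entries, the basic idea of such an export can be sketched in a few lines of Python. The sketch below is not how bibtex2html or bib2html work internally, and it is no substitute for them: it assumes one field per line, values delimited by braces, and the closing brace of each entry on a line of its own. The sample entry and the function names parse_bibtex and to_html are invented for illustration.

```python
import html
import re

# One entry: @type{key, ...fields... } with the closing brace on its own line.
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,(.*?)\n\}", re.DOTALL)
# One "name = {value}," line inside an entry.
FIELD_RE = re.compile(r"^\s*(\w+)\s*=\s*\{(.*)\}\s*,?\s*$", re.MULTILINE)


def parse_bibtex(text):
    """Return a list of entries, each with its type, key, and fields."""
    entries = []
    for kind, key, body in ENTRY_RE.findall(text):
        fields = {name.lower(): value for name, value in FIELD_RE.findall(body)}
        entries.append({"type": kind.lower(), "key": key, "fields": fields})
    return entries


def to_html(entries):
    """Render the entries as an HTML list, escaping special characters."""
    items = []
    for entry in entries:
        f = {name: html.escape(value) for name, value in entry["fields"].items()}
        items.append(
            "<li>{}. {}. <em>{}</em>, {}.</li>".format(
                f.get("author", ""), f.get("title", ""),
                f.get("journal", ""), f.get("year", ""),
            )
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


# Invented entry, for illustration only.
SAMPLE = """@article{doe2001,
  author = {Doe, Jane},
  title = {Scribes & Printers},
  journal = {Journal of Examples},
  year = {2001},
}
"""

print(to_html(parse_bibtex(SAMPLE)))
```

Real bibtex files also contain quoted values, string macros, and multi-line fields, which is where the dedicated tools mentioned above earn their keep.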