Maria Berggren

Providing access to a World Memory. A cataloguing project concerning the Swedenborg archives at the Royal Swedish Academy of Sciences

Emanuel Swedenborg (1688-1772) is one of the most internationally renowned Swedish authors. His reputation rests mainly on his fame as a Christian mystic and theologian: his theory of correspondences has inspired Blake, Balzac, Baudelaire and Strindberg, to mention only a few modern authors. He is also the founder of a religious communion, the Swedenborgian church.

As a young man, Swedenborg was a scientist of great ambition, well acquainted with the latest findings of contemporary chemistry, geology, anatomy and physiology.

Together with Christopher Polhem, self-taught scientist and inventor, he started Sweden’s first scientific journal, the Daedalus Hyperboreus, in 1716.

He published a large work on mineralogy and mining that won international recognition, the Opera philosophica et mineralia, printed in Dresden and Leipzig in 1734. The first tome, entitled Principia rerum naturalium, contains an exposition of Swedenborg’s philosophy of nature; to a large extent it is inspired by Cartesius’ vortex theory.

Swedenborg also planned two great works on the anatomy and physiology of the human body. Parts of them were published, but a vast amount of material remained in manuscript form.

At the suggestion of his relative, Carolus Linnaeus, Swedenborg was elected to the Royal Swedish Academy of Sciences in 1741. At Swedenborg’s death in 1772, his heirs donated his manuscripts to the Academy, where they remain today.

Since June 2005, the Emanuel Swedenborg Archives at the Royal Swedish Academy of Sciences (RSAS) have been included in UNESCO’s “Memory of the World” Register. This designation has inspired a project aiming at new measures for the long-term preservation of the collection and for making it accessible. The construction of a new catalogue in electronic (and printed) form is part of the project.

I have the privilege of working on this catalogue together with the permanent staff at the Centre of History of Science at the Academy.

 The Swedenborg Archives

The Swedenborg Archives today consist of:

    * a core of 110 bound volumes (codices), containing mainly manuscripts in Swedenborg’s hand but also first printings, which have sometimes been bound with the manuscripts; the autographs date from 1719 to 1771 and cover both the scientific and the theological period of Swedenborg’s authorship;
    * extra copies of printed works, acquired by the Academy in the late 19th and early 20th centuries;
    * correspondence and other documents connected with the collection (not in Swedenborg’s hand).

 A new catalogue

The new catalogue will in the first place be a tool for finding one’s way through the material, a means of identifying, collocating and selecting material in the collection. A more general digitization of the autographs may, however, take place in a second phase of the project. The format of the catalogue therefore has to be open enough to allow photographs to be added later.

An overall description of the archives will be encoded in the EAD (Encoded Archival Description) format, which is a well-known international standard. (It is also the standard applied within the digital Swedish national catalogue of manuscript collections, Ediffah, presented by Ingrid Svensson at this conference.) My colleague, Maria Asp, is working on this part of the project.

For the detailed item-level description of single volumes, particularly the autographs, we have decided on another model, namely the TEI (Text Encoding Initiative) encoding scheme.

Text Encoding Initiative and manuscript cataloguing

TEI is a consortium of institutions and projects within the humanities, which since the late 1980s has been developing and maintaining “guidelines” for text encoding. The TEI project originated in the United States and was initially sponsored by three scholarly societies: the Association for Computers and the Humanities, the Association for Computational Linguistics, and the Association for Literary and Linguistic Computing. Today it engages organizations and academic institutions from all over the world.

From the start, the TEI project made use of SGML as its standard markup language. Since the so-called P4 version of the guidelines, released in 2002, XML has been the standard. The latest version of the guidelines, the TEI P5, was released in November last year (cf.

The TEI guidelines are not in the form of imperative rules, but rather of a set of principles that can be adapted to the needs of specific projects. Thus, TEI is a standard and yet not.

The general flexibility of TEI means that rather few elements are altogether compulsory in a valid TEI file. One element that is obligatory, however, is the TEI header, which contains structured data about the data file itself and the encoded text; it describes “the text itself, its source, its encoding, and its revisions” (TEI P5 Guidelines, 2, 6). The TEI header has been described as a metadata format. As such it is compatible with ISBD and to some extent also with the AACR2 (cf. P. Caplan, Metadata Fundamentals for All Libraries, American Library Association, 2003, p. 66 ff., with references).
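The requirement of a header can be illustrated with a minimal sketch. It assumes the TEI P5 rule that the header must contain a file description (fileDesc) with a title statement, a publication statement and a source description; the sample documents and the function name missing_header_parts are invented for illustration, and the check is a simple structural test, not a substitute for validation against the TEI schema.

```python
import xml.etree.ElementTree as ET

NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Components that TEI P5 requires inside <fileDesc>.
REQUIRED_IN_FILEDESC = ("titleStmt", "publicationStmt", "sourceDesc")

# Invented sample: the smallest header that passes the check below.
MINIMAL = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Placeholder title</title></titleStmt>
      <publicationStmt><p>Placeholder publication statement</p></publicationStmt>
      <sourceDesc><p>Placeholder description of the source</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text><body><p>Placeholder text</p></body></text>
</TEI>"""

# Invented sample: a file that lacks the obligatory header.
NO_HEADER = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body><p>Placeholder text</p></body></text>
</TEI>"""


def missing_header_parts(xml_text):
    """Return the names of obligatory header components that are absent."""
    root = ET.fromstring(xml_text)
    header = root.find("tei:teiHeader", NS)
    if header is None:
        return ["teiHeader"]
    file_desc = header.find("tei:fileDesc", NS)
    if file_desc is None:
        return ["fileDesc"]
    return [name for name in REQUIRED_IN_FILEDESC
            if file_desc.find(f"tei:{name}", NS) is None]


print(missing_header_parts(MINIMAL))    # []
print(missing_header_parts(NO_HEADER))  # ['teiHeader']
```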

The msDesc element

The TEI encoding scheme consists of many different building blocks, “modules” (sets of elements and attributes) developed for different purposes. For example, there are modules that can be used for encoding verse, performance texts, dictionaries and language corpora. There is also a module developed especially for the bibliographic description of manuscripts, which “can be used to provide detailed descriptive information about handwritten primary sources” (TEI P5 Guidelines, 10, 1). A description of this kind may be part of the TEI header or form a stand-alone XML file.

The manuscript description module was developed within the framework of an EU project that focused on the cataloguing of medieval manuscripts, the so-called MASTER project (Manuscript Access through Standards for Electronic Records), and was later incorporated into the TEI encoding scheme.

Medieval manuscripts formed the point of departure, but the TEI manuscript description module is notably presented in the Guidelines as applicable to all types of manuscript material (TEI P5 Guidelines, 10, 1). As yet, however, there are relatively few projects in which the TEI encoding scheme has been used for cataloguing modern manuscripts, and no written studies that I know of in which such an application of the system has been evaluated. We are therefore entering fairly new territory.

Modern versus medieval manuscripts

Some general differences seem to apply to modern manuscripts (post-Gutenberg) as opposed to medieval manuscripts.

A medieval literary manuscript typically constitutes a document de reproduction et transmission, one of many links in the history of transmission of a certain text. A literary manuscript from the modern era, on the other hand, is normally a document de création, and bears witness to a text in the making (cf. A. Grésillon, Éléments de critique génétique, Paris, 1994, p. 78 f.). One might say that the medieval manuscript is typically polyform, while the modern manuscript is monoform (cf. R. Du Rietz, Den tryckta skriften, termer och begrepp, Uppsala, 1999, p. 25 ff.).

This difference is reflected in cataloguing practice. For example, in the traditional bibliographic description of a medieval manuscript, the first and last lines of a certain text, the incipit and explicit, are generally given. This seems quite sensible when dealing with, for example, Caesar’s De bello Gallico, the first lines of which can be quoted by any first-year student of Latin, or Vegetius’ Epitoma rei militaris. These are texts that were transmitted in a fairly fixed form during the medieval era.

But for a manuscript containing one of many autograph versions of a text not published in the author’s lifetime, a text that has not been circulated in handwritten form (which is typically the case with the Swedenborg manuscripts), the stating of incipits and explicits seems less meaningful.

This kind of manuscript calls for other principles of characterisation. For our purposes I have made a slight adaptation of the TEI model.

An adaptation of the TEI

An overall description of each volume is encoded by means of the TEI msDesc element. This description contains descriptive metadata referring to the whole volume, such as shelfmarks, collective titles, and a physical description of the volume as a whole. It also includes a general description of the contents of the volume, including a list of manuscript items, that is, single texts that can be found in the volume. (Text is of course a very difficult entity, but we will not go into that here.)

Each manuscript item (or “text”) is assigned a number and a title, and its location within the manuscript is stated. The individual msItem elements “point” by means of identifiers to transcriptions of text, which are stored within the text element.
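How such a record might hang together can be sketched as follows. This is an illustration under stated assumptions, not the encoding actually used in the project: the shelfmark, titles and headings are invented, and the choice of the corresp attribute (a general TEI linking attribute) to point from an msItem to the xml:id of a div in the text element is only one possible pointing mechanism. In TEI P5 the manuscript description element is named msDesc.

```python
import xml.etree.ElementTree as ET

NS = {"tei": "http://www.tei-c.org/ns/1.0"}
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Invented sample record: shelfmark, titles and headings are hypothetical.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Catalogue record (illustrative)</title></titleStmt>
      <publicationStmt><p>Unpublished sketch</p></publicationStmt>
      <sourceDesc>
        <msDesc>
          <msIdentifier>
            <settlement>Stockholm</settlement>
            <repository>Royal Swedish Academy of Sciences</repository>
            <idno>Codex X (hypothetical shelfmark)</idno>
          </msIdentifier>
          <msContents>
            <msItem n="1" corresp="#item1">
              <locus from="1r" to="12v"/>
              <title>First text (invented title)</title>
            </msItem>
            <msItem n="2" corresp="#item2">
              <locus from="13r" to="20v"/>
              <title>Second text (invented title)</title>
            </msItem>
          </msContents>
        </msDesc>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div xml:id="item1"><head>Caput I (invented heading)</head></div>
      <div xml:id="item2"><head>Index rerum (invented heading)</head></div>
    </body>
  </text>
</TEI>"""


def list_items(xml_text):
    """Return (number, title, from, to, headings) for every manuscript item."""
    root = ET.fromstring(xml_text)
    # Index every element that carries an xml:id, so pointers can be resolved.
    by_id = {el.get(XML_ID): el for el in root.iter() if el.get(XML_ID)}
    rows = []
    for item in root.iterfind(".//tei:msDesc/tei:msContents/tei:msItem", NS):
        locus = item.find("tei:locus", NS)
        # Follow the pointer ("#item1") to the selective transcription.
        target = by_id.get(item.get("corresp", "").lstrip("#"))
        heads = ([h.text for h in target.iterfind(".//tei:head", NS)]
                 if target is not None else [])
        rows.append((item.get("n"), item.findtext("tei:title", namespaces=NS),
                     locus.get("from"), locus.get("to"), heads))
    return rows


for row in list_items(SAMPLE):
    print(row)
```

The overall description stays in the header, while the transcribed structural features live in the text element; the identifier is the only link between the two.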

These transcriptions of text are “selective” and concern structural features like:

    * Title pages;
    * Headings (of parts, sections, chapters, entries etc.);
    * Lists and tables (as in the example shown).

The idea is that the encoding of structural features of the text may help the catalogue user to find the way through the volume. The encoded text will also be electronically searchable. The encoded text features can be described as representations of text; they constitute structural metadata.

Text versus document

The TEI encoding scheme was originally developed for the encoding of literary texts. Not surprisingly, the basic structure of the TEI encoding reflects a theory of texts rather than a theory of documents: the stress is on intellectual aspects of text.

The overriding hierarchy in a TEI-conformant XML file is that of textual “content objects” that nest inside each other: books → parts → chapters → sections, for example, or poems → stanzas → verses form such hierarchical structures. The physical structure of the document (the “carrier” of the text), on the other hand, is treated as secondary in the TEI encoding scheme. (Pagination, for example, forms a flattened hierarchy, expressed by empty elements.) For my purposes this is not altogether satisfactory, as my primary goal is to describe the physical document and the arrangement of the text within it. (For example, a manuscript page may contain several text fragments that form part of different text structures in the TEI sense and belong to different manuscript items.)
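The consequence for a page-oriented description can be shown with a small sketch. It assumes an invented fragment in which text divisions (div) nest as content objects while page breaks are empty pb elements; the labels A, B, 1r and 1v and the function name divs_by_page are hypothetical. Since the page is not a container in the encoding, the page-by-page view has to be reconstructed by walking the tree in document order.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Invented fragment: division A runs across a page break, and division B
# begins on the same page where A ends.
SAMPLE = """<text xmlns="http://www.tei-c.org/ns/1.0">
  <body>
    <pb n="1r"/>
    <div n="A"><head>Heading A</head><p>Start of A ...<pb n="1v"/>... end of A</p></div>
    <div n="B"><head>Heading B</head><p>All of B on the same page</p></div>
  </body>
</text>"""


def divs_by_page(xml_text):
    """Map each page label to the divisions that have text on that page."""
    pages = {}
    current_page = None

    def note(div_label, text):
        # Record the division if real (non-blank) text occurs on the current page.
        if div_label is None or current_page is None:
            return
        if text and text.strip() and div_label not in pages[current_page]:
            pages[current_page].append(div_label)

    def walk(el, div_label):
        nonlocal current_page
        if el.tag == TEI + "pb":
            # An empty milestone: from here on, text belongs to a new page.
            current_page = el.get("n")
            pages.setdefault(current_page, [])
            return
        if el.tag == TEI + "div":
            div_label = el.get("n")
        note(div_label, el.text)
        for child in el:
            walk(child, div_label)
            note(div_label, child.tail)  # text following the child, still inside el

    walk(ET.fromstring(xml_text), None)
    return pages


print(divs_by_page(SAMPLE))  # {'1r': ['A'], '1v': ['A', 'B']}
```

Page 1v here holds fragments of two different content objects, which is exactly the situation that the content-oriented hierarchy does not express directly.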


I have tried here to introduce the method for item-level, or volume-by-volume, cataloguing that I am trying to apply to the Swedenborg manuscripts.

In making use of the TEI encoding scheme, I have met with certain difficulties, for example in the two respects demonstrated in this paper:

    * firstly, the TEI manuscript description module was not originally worked out for describing modern manuscripts, but for the cataloguing of medieval manuscripts; this seems to call for certain adaptations when the model is applied to modern manuscript material;
    * and secondly, the overriding hierarchy in the TEI text module is founded on a theory of the hierarchical structure of textual content objects; the physical arrangement of text in the encoded source is less prominent in the encoding.

My main impression, however, is that the TEI encoding scheme is a powerful instrument with many possibilities. The modular structure of the encoding scheme, the large number of elements available, and the openness to modified solutions within specific projects are some of its strengths. A core of standardized features allows for the migration of data between systems. Furthermore, the broad-based international profile of the TEI community seems to guarantee a certain continuity, as far as continuity can be attained in a rapidly changing world.

Whether it is future-proof or not remains to be seen.