Transcription/Producing Metadata

October 26, 2020 • Federico Perelmuter

Metadata is a crucial element of an archival digitization project. It forms the bridges that join together the thousands of scanned images that make up an archive; it is also an opportunity to make connections between individual files within the archive visible in a way that is impossible without computation. It therefore also allows us to visualize the data within the archive, transforming a collection of individual cases into a unique and pluralistic dataset that can be analyzed by researchers.

Transcription, one crucial form of metadata, transforms thousands of often hard-to-read images into searchable text, likewise adding a new dimension to the archive through the power of computation. Authority records, another form of metadata, allow us to catalogue the individuals and organizations present in the archive, constructing something like a web of connections that transforms the archive into a true structure of cases, interlinked and, most importantly, searchable. This process is dangerous and must be carried out with the utmost ethical care: the potential for extreme violence to an already precarious archive must be continuously weighed against the benefits of the methods being deployed.

Several procedures are crucial here. What we carefully assess to be relevant textual components of the archive (words, forms, stamps, signatures, etc.) is transcribed into a blank document in text-processing software. Obviously, the crucial material components (the authors’ handwriting, the curves of their signatures, the page’s color) can never be fully translated; the transcription should therefore not be separated from the actual case file, so that the archive does not appear as a collection of textual fragments rather than a register of crucially material and embodied acts of resistance. The entire archive is held in a linked database, and the entities that populate it (the separate images, files, bundles, boxes, and collections) mimic the physical archive held in Guatemala; metadata is added atop, and in relationship to, each case. We also write case descriptions, which are often direct transcriptions of the narratives found within the cases but are sometimes summaries of the events described, written to make the cases easier to find. Importantly, we never change spellings, and we try to preserve the agency of the subjects who are asserting their experiences of terror within the archive. Because our project is guided by a post-custodial framework, we take very seriously the importance of communicating with the archive’s makers and true caretakers, the GAM themselves, and are continuously focused on carrying out their vision with the benefit of our resources and time.
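For readers who think in code, the linked structure described above (collections, boxes, bundles, files, and images, with metadata attached to each entity) can be pictured with a small sketch. This is only an illustration under our own assumptions, not the project's actual database: the class name `Entity`, the `LEVELS` list, the identifiers, and the `transcription` key are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical level names, ordered from outermost to innermost,
# following the hierarchy named in the text.
LEVELS = ["collection", "box", "bundle", "file", "image"]


@dataclass
class Entity:
    """One node in the archive hierarchy, linked to its parent."""
    identifier: str
    level: str
    parent: Optional["Entity"] = None
    # Metadata sits on top of the entity; the entity itself is unchanged.
    metadata: dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        if self.level not in LEVELS:
            raise ValueError(f"unknown level: {self.level}")
        # Assumption: a child must sit below its parent in the hierarchy.
        if self.parent is not None and (
            LEVELS.index(self.level) <= LEVELS.index(self.parent.level)
        ):
            raise ValueError(f"{self.level} cannot sit inside {self.parent.level}")

    def path(self) -> List[str]:
        """Return identifiers from the outermost container down to this entity."""
        parts = []
        node: Optional[Entity] = self
        while node is not None:
            parts.append(node.identifier)
            node = node.parent
        return list(reversed(parts))


# Placeholder identifiers, for illustration only.
collection = Entity("collection-1", "collection")
box = Entity("box-1", "box", parent=collection)
bundle = Entity("bundle-1", "bundle", parent=box)
case_file = Entity("file-1", "file", parent=bundle)
image = Entity("image-1", "image", parent=case_file)

# The transcription is stored on the image it came from, spelling unchanged,
# so the text is never separated from its place in the case file.
image.metadata["transcription"] = "verbatim text as written"

print(" / ".join(image.path()))
```

Keeping the transcription as metadata on the image, rather than in a separate table of texts, is one way to reflect the principle that the transcription should never circulate apart from the case file it belongs to.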

We strive to make the archive as accessible as possible through an extensive paratextual apparatus that makes visible, protects, and fortifies its breathtaking complexity without flattening or mortifying the life within; this computational structure is modeled after the structures already contained within the archive itself, ensuring the GAM has maximum control over its production and function. Metadata, perhaps the most distinctive feature of a digitized archive, is a crucial but risky path through which we hope to reach our goals.