- Full Description
The digital age has had a profound effect on our cultural heritage and the academic research that studies it. Staggering numbers of objects, many of them textual in nature, are being digitised to make them more readily accessible to both experts and laypersons. Besides a vast potential for more effective and efficient preservation, management, and presentation, digitisation offers opportunities to work with cultural heritage data in ways that were never feasible or even imagined. To explore and exploit these possibilities, an interdisciplinary approach is needed, bringing together experts from cultural heritage, the social sciences and humanities on the one hand, and information technology on the other. Due to the prevalence of textual data in these domains, language technology has a crucial role to play in this endeavour. Language technology can break through the 'Google barrier' by offering the potential to analyse texts at advanced levels, extracting information and knowledge at the level of the humanities or social sciences researcher, who wants to know about the who, what, where, and when, but also the how and the why. At the same time, cultural heritage data poses considerable challenges for existing language technology: technology aimed at 'generic' language has to face such disparate problems as historical language variation, OCR digitisation errors, and near-extinct academic expertise.
This book is primarily intended for researchers in information technology and language processing who would like a state-of-the-art overview of the whole breadth of the new and vibrant field of language technology for cultural heritage and its associated academic research in the humanities and social sciences. Researchers working in the target domains of cultural heritage, the social sciences and humanities will also find this book useful, as it provides an overview of how language technology can help them with their information needs. The book covers applications ranging from pre-processing and data cleaning, to the adaptation and compilation of linguistic resources, to personalisation, narrative analysis, visualisation and retrieval.
- Table of Contents
- Foreword by Willard McCarty.
- Language Technology for Cultural Heritage, Social Sciences and Humanities: Chances and Challenges. Caroline Sporleder, Antal van den Bosch and Kalliopi Zervanou.
- Part I Pre-Processing.
- Strategies for Reducing and Correcting OCR Errors. Martin Volk, Lenz Furrer and Rico Sennrich.
- Alignment between Text Images and their Transcripts for Handwritten Documents. Alejandro H. Toselli, Verónica Romero and Enrique Vidal.
- Part II Adapting NLP Tools to Older Language Varieties.
- A Diachronic Computational Lexical Resource for 800 Years of Swedish. Lars Borin and Markus Forsberg.
- Morphosyntactic Tagging of Old Icelandic Texts and Its Use in Studying Syntactic Variation and Change. Eiríkur Rögnvaldsson and Sigrún Helgadóttir.
- Part III Linguistic Resources for CH/SSH.
- The Ancient Greek and Latin Dependency Treebanks. David Bamman and Gregory Crane.
- A Parallel Greek-Bulgarian Corpus: A Digital Resource of the Shared Cultural Heritage. Voula Giouli, Kiril Simov and Petya Osenova.
- Part IV Personalisation.
- Authoring Semantic and Linguistic Knowledge for the Dynamic Generation of Personalized Descriptions. Stasinos Konstantopoulos, Vangelis Karkaletsis, Dimitrios Vogiatzis and Dimitris Bilidas.
- Part V Structural and Narrative Analysis.
- Automatic Pragmatic Text Segmentation of Historical Letters. Iris Hendrickx, Michel Généreux and Rita Marquilhas.
- Proppian Content Descriptors in an Integrated Annotation Schema for Fairy Tales. Thierry Declerck, Antonia Scheidel and Piroska Lendvai.
- Adapting NLP Tools and Frame-Semantic Resources for the Semantic Analysis of Ritual Descriptions. Nils Reiter, Oliver Hellwig, Anette Frank, Irina Gossmann, Borayin Maitreya Larios, Julio Rodrigues and Britta Zeller.
- Part VI Data Management, Visualisation and Retrieval.
- Information Retrieval and Visualization for the Historical Domain. Yevgeni Berzak, Michal Richter, Carsten Ehrler and Todd Shore.
- Integrating Wiki Systems, Natural Language Processing, and Semantic Technologies for Cultural Heritage Data Management. René Witte, Thomas Kappler, Ralf Krestel and Peter C. Lockemann.