The Dynamic Lexicon

The Dynamic Lexicon is an NEH-funded project to automatically create bilingual dictionaries (Greek/English and Latin/English) from parallel texts (source texts in Greek or Latin aligned with their English translations) together with the syntactic data encoded in treebanks. From these raw materials, we can construct a lexical entry that illustrates how a word is used not simply in all of Greek or Latin literature, but in any subset of that collection (e.g., Greek drama or the works of Cicero).
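As a rough illustration of the idea, and not the project's actual pipeline, the sketch below tallies the English words aligned to a source lemma and reports their relative frequencies, either across the whole collection or within a chosen subset. The data layout, the function name `translation_profile`, and the toy alignments are all assumptions invented for this example; they are not output of the Dynamic Lexicon.

```python
from collections import Counter

# Hypothetical word-aligned data: (author, source lemma, aligned English word).
# These rows are invented toy values, not real project output.
ALIGNED_PAIRS = [
    ("Thucydides", "δύναμις", "power"),
    ("Thucydides", "δύναμις", "force"),
    ("Thucydides", "δύναμις", "power"),
    ("Plato", "δύναμις", "faculty"),
    ("Plato", "δύναμις", "power"),
    ("Plato", "λόγος", "argument"),
]


def translation_profile(pairs, lemma, authors=None):
    """Relative frequency of each English equivalent aligned to `lemma`.

    If `authors` is given, only alignments from those authors are counted,
    which restricts the entry to a subset of the collection.
    """
    counts = Counter(
        english
        for author, source, english in pairs
        if source == lemma and (authors is None or author in authors)
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {english: n / total for english, n in counts.most_common()}


# Whole (toy) collection versus a single-author subset.
print(translation_profile(ALIGNED_PAIRS, "δύναμις"))
print(translation_profile(ALIGNED_PAIRS, "δύναμις", authors={"Plato"}))
```

Restricting the counts by author is the simplest stand-in for "any subset of the collection"; the same filter could just as well key on genre or period if that metadata were attached to each alignment.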

Dynamic Lexicon Entry for the Greek noun δύναμις.

While the automatically induced information naturally contains noise (e.g., the misclassification of ἔχις or the mistranslation of the second example sentence), it reveals larger patterns of usage consistent with traditional lexica. In particular, we have automatically induced three categories of information:

In addition, because the Greek/English and Latin/English parallel texts have been aligned at the level of individual sentences, we can supplement each lexical entry with several instances of the word's actual use in text, presenting not only the source text but also its automatically aligned translation.
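A minimal sketch of how such example sentences might be retrieved from sentence-aligned text follows. It assumes the corpus is available as (source sentence, translation) pairs and matches on inflected surface forms supplied by the caller; the names `SENTENCE_PAIRS` and `examples_for` are hypothetical, and a real system would match on lemmas from morphological analysis rather than raw tokens.

```python
# Hypothetical sentence-aligned pairs: (source sentence, English translation).
SENTENCE_PAIRS = [
    ("Gallia est omnis divisa in partes tres",
     "All Gaul is divided into three parts"),
    ("arma virumque cano", "I sing of arms and the man"),
    ("veni vidi vici", "I came, I saw, I conquered"),
]


def examples_for(forms, pairs, limit=3):
    """Return up to `limit` aligned pairs whose source side contains any of `forms`.

    Matching is on whitespace-separated, lowercased tokens (an assumption made
    to keep the sketch short); enclitics and punctuation are not handled.
    """
    wanted = {form.lower() for form in forms}
    hits = [
        (source, translation)
        for source, translation in pairs
        if wanted.intersection(source.lower().split())
    ]
    return hits[:limit]


# Look up two inflected forms of the Latin noun "pars".
for source, translation in examples_for({"pars", "partes"}, SENTENCE_PAIRS):
    print(source, "|", translation)
```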

Data

The published form of the Dynamic Lexicon includes automatically generated lexical entries along with the underlying intermediate analysis used to generate them (including word-level alignments between source texts and their translations, and automatic morphological tagging and syntactic analysis for the Greek and Latin originals). All data is licensed under a Creative Commons Attribution-ShareAlike license.

Publications

Bamman, David, and Gregory Crane (2011), "Measuring Historical Word Sense Variation," Proceedings of the 11th ACM/IEEE Joint Conference on Digital Libraries (JCDL 2011) [pdf]

Bamman, David, and Gregory Crane (2011), "The Ancient Greek and Latin Dependency Treebanks," in Caroline Sporleder, Antal van den Bosch and Kalliopi Zervanou (eds.), Language Technology for Cultural Heritage (Springer) [pdf]

Bamman, David, and Gregory Crane (2009), "Computational Linguistics and Classical Lexicography," Digital Humanities Quarterly 3.1 [html]

Bamman, David, and Gregory Crane (2008), "Building a Dynamic Lexicon from a Digital Library," Proceedings of the 8th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL 2008) (Pittsburgh) [preprint]

Acknowledgments

Grants from the National Endowment for the Humanities (PR-50013-08, "The Dynamic Lexicon: Cyberinfrastructure and the Automated Analysis of Historical Languages") and NEH/DOE/NERSC ("Large-Scale Learning and the Automatic Analysis of Historical Texts") provided support for this work. This research used resources of the National Energy Research Scientific Computing Center, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.