eTraces is a three-year collaborative project funded by the Federal Ministry of Education and Research, in which the GCDH is partnering with the Natural Language Processing Group at Leipzig University and the GESIS – Leibniz Institute for the Social Sciences in Bonn. The project uses state-of-the-art as well as newly developed text mining methods to analyse text re-use, ranging from so-called “winged words” (geflügelte Worte) to literal quotations, in German-language literature between 1500 and 1900 and in social science texts since 1909.
The GCDH is involved in the sub-project that focuses on the reception of the German Bible as evidenced by citations throughout the history of German literature. As its textual base, the sub-project uses the zeno.org corpus, which comprises more than 130 million words, over 1.8 million of them unique, and is now freely available via “TextGrid” (www.textgrid.de). A text mining approach will, for example, allow the validation (or falsification) of theories that posit the secularisation of German culture in general and of literature in particular, as well as the tendency of literature towards aesthetic autonomy. Does the frequency of quotations from the Bible diminish over time, as a theory of secularisation would suggest, or is there merely a shift in the particular books being quoted or in the way they are quoted? Does literature become more and more self-referential over time, as the thesis of aesthetic autonomy claims?
In the medium term, the sub-project aims to develop reliable text mining tools for literary scholarship that allow scholars to analyse very large corpora of texts. A further aim is to extend, by way of example, the historical and hermeneutical methods used in literary studies with quantitative, statistical and formal ones. The hope is to show that formal-algorithmic methods can be used to pose old research questions more precisely and to raise new ones for the first time.