Hackathon on Text Re-Use ‘Don’t leave your data problems at home!’
The Göttingen Centre for Digital Humanities will host a Hackathon targeted at students and researchers with a humanities background who wish to improve their computer skills by working with their own data-set.
Rather than teaching everything there is to know about algorithms, the Hackathon will assist participants with their specific data-related problem, so that they can take away the knowledge needed to tackle the issue(s) at hand. The focus of this Hackathon is automatic text re-use detection, and the event aims to engage participants in intensive collaboration. Participants will be introduced to state-of-the-art technologies in the field and shown the potential of text re-use detection. They will also acquire the knowledge needed to make sense of the output generated by text re-use detection algorithms, and will gain an understanding of which algorithms best fit certain types of textual data. Finally, participants will be introduced to some text re-use visualisations.
Organised by: Emily Franzini, Greta Franzini and Maria Moritz
Who can apply / Eligibility
To ensure that every participant gets the right amount of individual assistance, places are limited to a maximum of 20.
Deadline for application
May 15th, 2015. No deadline extensions granted.
How to apply
To apply for a place, please email your CV and a letter of motivation to the contact address below, in PDF format. The letter of motivation should not exceed one A4 page and should include details regarding:
Please use “Hackathon application 2015” as the email subject line of your application.
A small budget for students is available for travel reimbursements upon request.
The main purpose of the Göttingen Hackathon of July 2015 is to help participants uncover some of the mysteries that lie behind the software tools that allow us to do research with our data. For this reason, we will be ‘unpacking’ one such tool, TRACER, which is based on a framework made up of algorithms for different components. These are: Segmentation, Pre-processing, Featuring, Selection, Linking, Scoring and Post-processing. Participants will learn how each step works and how to feed their own data into the pipeline. It will also be possible for participants to run their data only through those components that help answer their own specific research questions. This exercise will explain the challenges that computer scientists face when transforming research questions about texts into algorithms able to generate coherent answers. It will also show how certain algorithms make sense only when used with certain types of data, so that participants can go home with an understanding of what works for them in terms of speed and quality of output.
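To give a feel for how the seven components fit together, here is a minimal Python sketch of a toy re-use detection pipeline. It is not TRACER's implementation: the function names, the word n-gram features, the frequency-based selection, the Jaccard score and the threshold are all assumptions chosen for illustration.

```python
import re
from itertools import product


def segment(text):
    """Segmentation: split a text into sentence-like units."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def preprocess(unit):
    """Pre-processing: lowercase the unit and keep only word tokens."""
    return re.findall(r"\w+", unit.lower())


def featurise(tokens, n=2):
    """Featuring: represent a unit as a set of word n-grams (assumed feature type)."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def select(features, seg_freq, max_freq):
    """Selection: drop features that occur in more than max_freq units."""
    return {f for f in features if seg_freq[f] <= max_freq}


def score(a, b):
    """Scoring: Jaccard overlap of two feature sets (assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def detect_reuse(text_a, text_b, n=2, max_freq=2, threshold=0.3):
    units_a, units_b = segment(text_a), segment(text_b)
    feats_a = [featurise(preprocess(u), n) for u in units_a]
    feats_b = [featurise(preprocess(u), n) for u in units_b]

    # Count in how many units each feature occurs, for the selection step.
    seg_freq = {}
    for feats in feats_a + feats_b:
        for f in feats:
            seg_freq[f] = seg_freq.get(f, 0) + 1
    feats_a = [select(f, seg_freq, max_freq) for f in feats_a]
    feats_b = [select(f, seg_freq, max_freq) for f in feats_b]

    results = []
    for i, j in product(range(len(units_a)), range(len(units_b))):
        # Linking: only consider pairs that share at least one feature.
        if not feats_a[i] & feats_b[j]:
            continue
        s = score(feats_a[i], feats_b[j])
        # Post-processing: discard weak candidates and rank the rest.
        if s >= threshold:
            results.append((units_a[i], units_b[j], s))
    return sorted(results, key=lambda r: r[2], reverse=True)


if __name__ == "__main__":
    source = "In the beginning God created the heaven and the earth. The weather was fine."
    target = "He wrote that in the beginning God created heaven and earth. Nothing else matters."
    for a, b, s in detect_reuse(source, target):
        print(f"{s:.2f}  {a!r}  <->  {b!r}")
```

Each function can be swapped out independently, which mirrors the idea of running data only through the components relevant to a given research question. Changing the feature type or the threshold will change which passages are reported, which is one reason the choice of algorithm depends on the kind of text being studied.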
We look forward to reading your application!
The eTrap Team
Project Website: http://etrap.gcdh.de/
Göttingen Centre for Digital Humanities (GCDH)
Papendiek 16, D-37073 Göttingen, Germany
DH research seminar
Nigar Babayeva (Marmara University Istanbul, Turkey/ visiting research fellow@GCDH): "Muslim Intellectuals of Russia in the Political Life of the Ottoman Empire during the first decades of 20th century"
The presentation will be about three Turkic intellectuals (Yusuf Akçura, Ali Hüseyinzade, Ahmet Ağaoğlu) who were born in the second half of the 19th century in Tsarist Russia to conservative Muslim families. After receiving a good secular and religious education, these three intellectuals were influenced by the ideas of French nationalism. In response to Russia's pan-Slavism and the forced Russification of its minorities, they developed the ideology of pan-Turkism as a counter-argument. Due to constant pressure from the Russian government and constitutional reforms in the Ottoman Empire, all three eventually moved to Istanbul and continued to develop their ideas. They were very active in bringing pan-Turkic consciousness to the wider Muslim public. At the end of WW1, with the disintegration of the Ottoman Empire, they were very influential in the process of establishing Turkey. Nigar will give a general overview of this development and of the link between the ideas of nationalism and the establishment of the republic.
Place: Göttingen Centre for Digital Humanities, Papendiek 16, seminar room 1
DH research seminar
Frank Fischer: World Literature According to Wikipedia
This talk will discuss methods that can help to understand how World Literature is represented across different language versions of Wikipedia. The corresponding paper is currently under review and was co-authored by Christoph Hube, Robert Jäschke, Gerhard Lauer, and Mads Rosendahl. The actual work was carried out using DBpedia, one of Wikipedia's semantically structured versions. The presentation will start at 4 o'clock sharp.
From 2013 to 2015, Frank Fischer was coordinator of the DH research collaboration at the GCDH, and is now part of the DARIAH-DE team over at the Göttingen State and University Library.
DH research seminar
Mauricio Nicolas Vergara (University of Padua, Italy/ visiting research fellow @GCDH): "Spatial Analysis and GIS-Modelling for the Tyrol Front of War"
Applications of GIS for Historical Research: Two Case studies from the Alpine Front in the First World War
In this presentation, Mauricio will introduce GIS and what makes it appealing to historical research. In particular, he will discuss why it is important to include the spatial dimension in military history studies, by presenting two cases from the Alpine Front in the First World War.
Place: Göttingen Centre for Digital Humanities, Papendiek 16, seminar room 1
DH research seminar
Michelle Rodzis/Uwe Sikora: Library of Neology
For further information on the project, please visit http://bdn-edition.de/
The seminar will take place at Heyne-Haus, Papendiek 16, 37073 Göttingen, seminar room 2
GCDH Open Workshop: Twitter as a Tool and Object of Research
Wed 03.06.2015, 14.00–18.00 at GCDH (Heyne-Haus, Papendiek 16, Göttingen), seminar room 2
This open workshop on Twitter as a tool for and an object of research draws on the rising relevance of Twitter for observing public opinion. While Germany has been a slow adopter, the studies presented will show how we can use Twitter to study our changing communication environment as well as relevant societal trends. Our four talks shed different light on Twitter as a tool for research and an object of research by focusing on entirely different questions:
Peta Mitchell is Vice Chancellor’s Research Fellow in the Creative Industries Faculty at Queensland University of Technology. Her fellowship project is focused on geocultural research and the new spatial turn, and her research has broadly focused on the geohumanities, including media geography, literary geography, and neogeography. Felix Victor Muench is a PhD student in Media and Communication at QUT, Brisbane, Australia. He holds a B.Sc. in Physics (LMU, Munich, Germany) and an M.A. in Journalism (LMU and German Journalist School, Munich, Germany), and has work experience in online media brand communication as an online media concept developer and strategist. His main fields of interest are network science methodologies and social media. They will talk about a new wave of social contagion research focused on mathematically modelling and visualising the spread of “contagious behaviour” on social media platforms such as Twitter and Facebook, and relate this research to its hidden heritage in Tardean contagion theory.
Dr Axel Bruns is a Professor in the Digital Media Research Centre at Queensland University of Technology in Brisbane, Australia, and was a Chief Investigator in the ARC Centre of Excellence for Creative Industries and Innovation (CCi). He will present data and results from an analysis of the structure and development of the German Twittersphere, focusing especially on the adoption of Twitter in Germany.
Robert Jäschke is a Professor of Knowledge-Based Systems at the Leibniz University Hannover. His research is focused on the development and integration of algorithms for community detection, ranking, and recommendations into collaborative tagging systems. Further topics of interest include citation and link analysis, entity matching and resolution, and social network analysis. He will talk about a research project focusing on the role of reciprocity in the Twitter following behaviour between PhD students and professors.
Marco Schmitt is a Post-Doc at the Göttingen Centre for Digital Humanities. His research is focused on the changes in scholarly communication and network research. He will talk about different scientific communication styles and how they relate to the possible observation of scientific communities on Twitter.
Göttingen Dialog in Digital Humanities - Mining for characterising patterns in literature using correspondence analysis: an experiment on French novels
Speaker: Francesca Frontini, Université Pierre et Marie Curie/ France
The talk presents and describes a bottom-up methodology for the detection of stylistic traits in the syntax of literary texts. The extraction of syntactic patterns is performed blindly by a sequential pattern mining algorithm, while the identification of significant and interesting features is performed later by using correspondence analysis and filtering for the most contributive patterns.
Location: Heyne-Haus, Papendiek 16, 37073 Göttingen, seminar room 2
Göttingen Dialog in Digital Humanities - Comparing Television Formats: Using Digital Tools for Cross-Cultural Analysis
Speaker: Edward Larkey, University of Maryland/U.S.
The talk summarizes the ongoing work of a research group on international television format adaptations in which digital tools are used to make cross-cultural comparisons of culturally specific similarities and differences between different national versions of the same scripted drama and comedy television series. Using the concepts of cultural proximity, hybridization, and transnational localization, these tools will enable the researchers to compile, correlate, and visualize quantitative and qualitative data on the narrative structure, sequences, and content, while correlating these with camera shots, angles, and movements, music and scoring, dialog, lighting, and other aspects. The goal is to define and depict a notion of television language that permits the inclusion of quantitative data into an interpretive analysis and comparison of two narratives. The research will yield taxonomic information on current trends in cultural globalization through television format trade indicating a more polycentric model gradually supplanting the previously predominant center-periphery model.
Location: Academy of Sciences Göttingen, Conference Room, Theaterstr. 7, 37073 Göttingen
Colloquium Talks in Digital Humanities
Prof. Fotis Jannidis, Prof. Jan Christoph Meister, Prof. Andrea Rapp, Prof. Maciej Eder
On Monday, 12 January, and Tuesday, 13 January 2015, the following talks will take place in the SUB Historical Building / Paulinerkirche, Papendiek 14, 37073 Göttingen:
09.15 – 10.05: Prof. Fotis Richard Jannidis
"Die Ordnung der Romane - eine korpusbasierte Gattungsanalyse" (The order of novels: a corpus-based genre analysis)
11.15 – 12.05: Prof. Jan Christoph Meister
"Digital Humanities in transdisziplinärer Perspektive" (Digital Humanities from a transdisciplinary perspective)
14.00 – 14.50: Prof. Andrea Rapp
"Ich danke Dir herzlich für Deinen lieben Brief": Aufbau, Erschließung und Auswertung eines digitalen Liebesbriefarchivs ("Thank you warmly for your dear letter": building, indexing and evaluating a digital archive of love letters)
16.00 – 16.50: Prof. Maciej Eder
"Counting literature at close and distant quarters"
Göttingen Dialog in Digital Humanities - Reconstructing a website’s lost past - Methodological issues concerning the history of www.unibo.it
Speaker: Federico Nanni, University of Bologna/Italy
In my presentation I intend to describe how born-digital documents could offer new insights into the recent history of the University of Bologna. My research is particularly focused on the methodological aspects of dealing with the “scarcity and abundance” (Rosenzweig, 2003) of born-digital sources. In my talk I will present how I retrieved and analyzed materials from different web and newspaper archives and how I intend to employ computational methods (Natural Language Processing techniques and Topic Models) in order to extract information from the university's digital library corpus.
Place: Heyne-Haus, Papendiek 16, 37073 Göttingen, seminar room 2
Conference "#DigitalHumanities in der Praxis"
After 3 years of gaining practical experience in the Digital Humanities, the Göttingen DH Research Collaboration holds a result-oriented conference to recap what's been going on in all the participating projects. (Contributions mostly in German.)
Details can be found here:
Göttingen Dialog in Digital Humanities - Automated Pattern Analysis in Gesture Research: Similarity Measuring in 3D Motion Capture Models of Communicative Action
Speakers: Daniel Schüller, Christian Beeck and Irene Mittelberg (RWTH Aachen/Germany)
The question of how to model similarity between gestures plays an important role in current studies in the domain of human communication. Most research into recurrent patterns in co-verbal gestures – manual communicative movements emerging spontaneously during conversation – is driven by qualitative analyses relying on observational comparisons between gestures. Due to the fact that these kinds of gestures are not bound to well-formedness conditions, however, we propose a quantitative approach consisting of a distance-based similarity model for gestures recorded and represented in motion capture data streams. To this end, we model gestures by flexible feature representations, namely gesture signatures, which are then compared via signature-based distance functions such as the Earth Mover's Distance and the Signature Quadratic Form Distance. Experiments on real conversational motion capture data evidence the appropriateness of the proposed approaches in terms of their accuracy and efficiency. Our contribution to gesture similarity research and gesture data analysis allows for new quantitative methods of identifying patterns of gestural movements in human face-to-face interaction, i.e., in complex multimodal data sets.
Place: State and University Library, Papendiek 14, lecture room
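The signature-based comparison mentioned in the abstract above can be illustrated with a small Python sketch of the Signature Quadratic Form Distance. This is not the speakers' code: it assumes that a gesture signature is a list of weighted centroids in a feature space (here 3D positions), and it uses a Gaussian similarity function with a hypothetical parameter `alpha`. The sample signatures are invented values, not motion capture data.

```python
import math


def gaussian_similarity(p, q, alpha=1.0):
    """Similarity of two centroids, decaying with squared Euclidean distance."""
    d2 = sum((a - b) ** 2 for a, b in zip(p, q))
    return math.exp(-alpha * d2)


def sqfd(sig_a, sig_b, alpha=1.0):
    """Signature Quadratic Form Distance between two signatures.

    Each signature is a list of (centroid, weight) pairs. The weights of
    the second signature are negated and the two signatures concatenated;
    the distance is the square root of the quadratic form w * A * w^T,
    where A holds the pairwise similarities of all centroids.
    """
    points = [c for c, _ in sig_a] + [c for c, _ in sig_b]
    weights = [w for _, w in sig_a] + [-w for _, w in sig_b]
    total = 0.0
    for wi, pi in zip(weights, points):
        for wj, pj in zip(weights, points):
            total += wi * wj * gaussian_similarity(pi, pj, alpha)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(total, 0.0))


if __name__ == "__main__":
    # Invented example signatures: two similar movements and one different one.
    wave_a = [((0.0, 1.0, 0.0), 0.5), ((0.2, 1.2, 0.0), 0.5)]
    wave_b = [((0.05, 1.0, 0.0), 0.5), ((0.25, 1.2, 0.0), 0.5)]
    point = [((1.0, 0.0, 0.5), 1.0)]
    print("wave_a vs wave_b:", sqfd(wave_a, wave_b))
    print("wave_a vs point: ", sqfd(wave_a, point))
```

Because signatures may contain different numbers of centroids, this kind of distance can compare gestures of different complexity without forcing them into fixed-length vectors.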
Göttingen Dialog in Digital Humanities - Release of the MySQL-based implementation of the CTS protocol
In a project called "A Library of a Billion Words", we needed an implementation of the CTS protocol that is capable of handling a text collection containing at least 1 billion words. Because the existing solutions did not work at this scale or were still in development, I started a reimplementation of the CTS protocol using methods that MySQL provides. Last year we published a paper that introduced a prototype with the core functionalities, which was not yet compliant with the CTS specifications.
The purpose of this talk is to describe and evaluate the MySQL-based implementation, now that it fulfils version 5.0 of the specifications, and to mark it as finished and ready to use.
Further information, online instances of CTS for all described datasets, and binaries can be accessed via the project's website (www.urncts.de).
This time we meet in the library of the Seminar für Deutsche Philologie, Jacob-Grimm-Haus, Käte-Hamburger-Weg 3, 37073 Göttingen. We're looking forward to the talk and discussions.
Spring School: 3D Modeling and Reconstruction with Blender & Unity
Mon–Fri, 2–6 March 2015, in Göttingen, Germany.