About
When: 30.03.2017 - 31.03.2017
Where: GESIS-Leibniz-Institut für Sozialwissenschaften, Unter Sachsenhausen 6-8, 50667 Cologne, Germany
EXCITE is a collaborative activity of the GESIS – Leibniz Institute for the Social Sciences and the Institute for Web Science and Technologies (WeST) that started in September 2016. The project develops a tool chain implementing the following steps: extraction of text from the source documents, identification of individual references in the text, segmentation of those references, matching of reference strings against bibliographic databases, and export of the matched references in usable formats and services. Special attention will be paid to the overall optimization of individual components of the citation extraction.
Our first community meeting is planned as a “noon to noon” event and aims to bring together experts in reference extraction, text mining, and machine learning to explore the possibilities in the project. We plan to have scientific presentations by invited speakers on the first day and hands-on sessions on the second day. For the second day, we will release a test corpus (PDF files of scientific papers and manually annotated data) for developers.
Agenda
Day One (Thursday, 30.03.2017)
Time | Title | Speaker | Slides | Video |
11:00 | Arrival (Room: West II) | |||
12:00 | Welcome and Introduction | Steffen Staab, WeST; Philipp Mayr, GESIS | ||
12:20 | Information Extraction out of Born-Digital Scientific Articles | Roman Kern, TU Graz | Link | Link |
12:40 | Advanced citation matching and large-scale full-text analysis | Nees Jan van Eck, Leiden U | Link | Link |
13:00 | Lunch Break (Cafeteria) | |||
14:20 | APIs for third parties to extract and deposit output executions of automated extraction pipelines (via videoconferencing) | Min-Yen Kan, NU Singapore | Link | Link |
14:40 | Extracting references from scientific articles in CERMINE system | Dominika Tkaczyk, U Warsaw | Link | Link |
15:00 | Coffee Break (Cafeteria) | |||
15:30 | CitEc to CitEcCyr. A stab at distributed citation systems. (via videoconferencing) | Link 1, Link 2 | ||
15:50 | EXCITE project: Status report | Behnam Ghavimi, GESIS; Martin Körner, WeST; Heinrich Hartmann | ||
16:10 | Processing of in-text References: Towards a Semantic Analysis | Marc Bertin, U Toulouse | Link | Link |
16:30 | Citations in Utopia Documents | David Thorne, U Manchester | Link | Link |
16:50 | Coffee Break (Cafeteria) | |||
17:20 | Research around the Tagging System BibSonomy | Andreas Hotho, U Würzburg | ||
17:50 | LOC-DB: A Linked Open Citation Database provided by Libraries. Motivation and Challenges. | Kai Eckert, HDM Stuttgart; Anne Lauscher, HDM Stuttgart; Akansha Bhardwaj, DFKI | ||
18:20 | Record Linkage between CiteSeerX and Web of Science (via videoconferencing) | Lee Giles, Penn State U | ||
18:50 | Break | |||
20:00 | Dinner at Gaffel am Dom (paid by participants) | |||
22:00 | Socializing | |||
23:00 | End |
Day Two (Friday, 31.03.2017)
Time | Title | ||
9:00 | Second Day Kickoff (Room: West II) | ||
9:15 | Extraction Result Discussion Group | Gold Standard Discussion Group | Collaboration Discussion Group |
11:15 | Coffee Break (Cafeteria) | ||
11:30 | Extraction Result Discussion Group | Gold Standard Discussion Group | Collaboration Discussion Group |
12:30 | Closing Talks (Room: West II) | ||
13:00 | End |
Gold Standard
One part of the discussions during the second workshop day will revolve around a gold standard that we are currently building. The current version can be found on GitHub. Note that it is a work in progress. The corresponding PDFs can be found (for now) here.
Arrival and Accommodation
GESIS Cologne is located near the Cologne central train station. Further information on traveling to GESIS by air, rail, intercity bus, or car can be found on the GESIS website.
Special GESIS rates are also available for accommodation. More information can be found in this list.