Title of Talk

Abstract Wikipedia and Vastly Multilingual Natural Language Generation

Abstract

Abstract Wikipedia is an initiative from the Wikimedia Foundation to generate Wikipedia articles in multiple languages from an abstract (i.e. language-neutral) source. The goal has been set at 20 million articles in over 300 languages, guaranteed to be in synchrony with up-to-date information and thereby with each other. This is by far the largest Natural Language Generation (NLG) project of all time. Grammatical Framework (GF), with 40 languages and specialized domains such as science, law, and e-commerce, is orders of magnitude smaller. Nevertheless, GF has served as inspiration for Abstract Wikipedia, and pilot projects have started to scale it up to the task. Research is needed on NLG techniques, language resources, processing algorithms, and interaction with human authors. This talk will outline a possible way to build up Abstract Wikipedia by starting with simple text-robot-like techniques and proceeding to more sophisticated NLG.

Bio

Aarne Ranta, professor of Computer Science, Department of Computer Science and Engineering, University of Gothenburg and Chalmers University of Technology

In his PhD thesis in 1990, Aarne Ranta investigated the use of constructive type theory in linguistics. Although the original focus was on formal semantics, type theory also suggested a model for interlingual grammars, in which several languages are related to a common formal structure. In 1997, Ranta joined the Xerox Research Centre in Grenoble to develop this idea within a project entitled Multilingual Authoring. The result was Grammatical Framework (GF), which today has a worldwide community that has built applications for over 40 languages. In Gothenburg since 1999, Ranta has worked on GF and on the closely related fields of functional programming, compiler construction, and databases. Ranta is also co-founder and CEO of Digital Grammars AB, which applies GF in commercial projects.

 
