Latest News
Workshop concluded. Check the Final Program for all presentations and the results of the afternoon discussion.
Workshop Description
While it is agreed that interlingual transfer is the ultimate goal in Machine Translation (MT), much work still needs to be done to build interlingual representations for MT systems. It is difficult to determine whether a representation is a good one or, in the absence of a gold standard, at least a useful one.
Evaluation of interlingual representations involves several levels of measurement. The representation can be measured in ontological terms and through coverage, depth, complexity, and resulting graph structure. The representation and accompanying tools can be measured by how consistently data can be analyzed into the representation, as evaluated through inter-annotator agreement. The representation can also be measured by applying the resulting structure to a task, in this case MT. Here, a given text is first analyzed into an interlingual (IL) representation. Output is then generated from the IL representation, for example sentences that can be compared with the original text. Each of these evaluation strategies is complex, as each involves more than one source of variation.
In this workshop, we explore the problem of evaluating interlingual representations in the MT context. For the morning portion of the workshop, we invite submissions related to the problem of evaluating interlingual representations and the resulting text. For the afternoon session, we encourage participation in the task presented next.
The Workshop Task
At the Fifth Interlingua Workshop, held in October 2002, the focus was on inter-coder reliability in coding thematic roles. Participants were provided with a dependency structure for each of 11 sentences. Each word was then to be assigned a thematic role from a list of thematic roles previously provided and defined by the workshop organizers. At the Sixth Interlingua Workshop, held in October 2003, the participants marked up and compared events, objects, and states in a multilingual corpus of a UNESCO Courier article in fifteen languages (plus English).
Although participants will be invited to write a short paper for the workshop, the primary aim is to determine an upper limit on the validity of an Interlingua for translation purposes. This year's task will involve an exercise of Manual Interlingual Translation. There are two phases to the task: Task A(nalysis) and Task G(eneration).
Task A
For Task A, each participant is to provide four items: (1) a foreign language text, (2) one or more English translations, (3) an interlingual representation of the foreign language text, and (4) a description of the Interlingua used. The document of interest should not be more than 300 words, counted in words of the English translation. Participants who do not have access to parallel text for the language of their interest should contact Nizar Habash (habash@cs.columbia.edu) for help locating such text.
Task G (with report)
In Task G, participants will receive the Interlingua and Interlingua description submitted by other participants. The result of Task G is an English translation created from the Interlingua. Participants will provide a (joint) written report for the workshop on the process and results of their analysis and generation. These reports will be presented during the morning session of the workshop. The afternoon will be devoted to a general discussion of the task and an examination of Interlingua utility, Manual Translation Quality (à la some automatic metric such as BLEU), cross-linguistic variation, and variation across multiple English versions of the same text. Pairs of participants who achieve the best Manual Translation Quality will receive a valuable prize and the admiration and envy of their colleagues.
Submission Guidelines
For the paper-only portion of the workshop, participants should send their paper in Word or PDF format via email by Friday July 23, 2004 to Nizar Habash (habash@cs.columbia.edu). Include contact information for the authors, title, abstract, and full text of 4-6 pages. A workshop URL will be created for the dissemination of ongoing information.
[Extended to September 7th]
Accepted workshop papers will be published by AMTA, and authors will be asked to follow AAAI formatting instructions for their final copy. These instructions can be found at http://www.aaai.org/Publications/Templates/aaai.pdf and a template can be downloaded from http://www.aaai.org/Publications/Templates/Author-kit.zip. Note that the initial submission need not conform to these guidelines.
Open Task G
The open Task G is an exercise in Manual Interlingual Translation with the primary aim of determining an upper limit on the validity of an Interlingua for translation purposes. The task involves generating from interlinguas produced by all the Task A participants. Below is a table linking the samples and instructions submitted by participants in Task A. The last column specifies the minimum required text to generate from for each submission.
Please submit your English output to habash@cs.columbia.edu by September 15, 2004.
Workshop Banquet
The workshop banquet will be on the night before the workshop (since the workshop is on the day after AMTA is over). The banquet will be held at Bistro Francais: http://www.washingtonian.com/dining/Profiles/BistroFr.html
The expected cost is around $50 per person total. Please RSVP to habash@cs.columbia.edu by Wednesday September 22, 2004.
Final Program
Workshop Organizers
Dr. Nizar Habash, Center for Computational Learning Systems
Dr. Bonnie Dorr, Computer Science Department
Dr. Eduard Hovy, Director of the Natural Language Group, Information Sciences Institute, University of Southern California, hovy@isi.edu