
Columbia Statistical Generation Day

October 9, 2001
Columbia University, New York City

Natural Language Generation (NLG) is an important field within Natural Language Processing (NLP) that enables systems to communicate with their users through both spoken and written language. Until recently, NLG relied solely on symbolic techniques. However, the requirements of new applications (e.g., text summarization, question answering, and machine translation) pose a number of challenges for traditional NLG, particularly in robustness and coverage. As in many other fields of NLP, incorporating statistical methods offers the potential to address some of these challenges.

We are pleased to announce the Columbia Statistical Generation Day, held at Columbia University. The goal of this gathering is to present existing and emerging research directions in statistical language generation through talks and to explore future directions through a panel discussion. The Day will also provide a forum for exchanging ideas on how statistical methods used for other research tasks can help with the generation task.

We will also address two central issues in empirical methods: data acquisition and evaluation. With many domain-specific applications using generation, is there a way to facilitate the development of generic data acquisition schemes that can be used across domains? On what terms should generation systems be compared? Is it possible to create a common benchmark for evaluating generation systems? How can we define the notion of a baseline?

The Columbia Statistical Generation Day will run from 9am to 6pm on October 9, 2001. Participation is free and open to interested researchers and students. Refreshments and lunch will be available. If you plan to attend, please notify us by sending email to Matthew Schlager and cc'ing Regina Barzilay.

The slides of the SGD talks are available here