6998 Section 1, NLP for the Web

Spring 2010



R 6:00-8:00 pm


Mudd 327

Then CS Conference Room






Kathy McKeown

Office Hours:

Tuesday 4-5 pm, Wednesday 1-2 pm


kathy [at] cs.columbia.edu









Teaching Assistant:

Yves Petinot




ypetinot [at] cs.columbia.edu

Office Hours:

Thurs 12-1

Thurs 8-9


NLP Lab, CEPSR (7th floor)



Class Description

Given the large amount of unstructured information on the web, whether written or spoken, natural language processing has the potential to greatly improve how that information is accessed and harvested. In this class, we will focus on applications of natural language processing that either have already been developed or are currently topics of research. Some of these applications aim to make it easier for end users to navigate the web (e.g., summarization and question answering), while others aim to make it possible to process information on the web more accurately (e.g., paraphrasing and entailment). The class will cover the following topics:

This is a seminar-style class that will focus on reading research papers related to the class topics. Classes will alternate between presentation and discussion; a list of questions for discussion will be provided before each class.

Students will be required to help present two classes. For one class, the student will be a presenter, responsible for presenting one or more of that day's papers. For another class, the student will be a discussant, responsible for raising questions related to the papers of the day. The discussant must prepare a list of questions, which will be circulated to the class ahead of time. The questions should touch on synergies between the papers, contradictions across papers, comparisons between papers, or future directions.

In addition, students will design and carry out a semester-long project. A list of possible projects will be provided by the professor, but students may also propose projects of their own, subject to the professor's approval. Throughout the semester, students will submit incremental versions of their project. There will be no midterm or final exams.

Prerequisite: students must have taken at least one of Artificial Intelligence, Natural Language Processing, Machine Learning, or Search Engine Technology.