Natural Language Processing (CS 4705), Fall 2003

Time: TuTh 1:10-2:25 Place: 702 Hamilton
Professor: Julia Hirschberg Office Hours: TBA; Th 2:30-3:30, CEPSR 705
Email:  julia@cs.columbia.edu Phone:  212-939-7114
Teaching Assistant:  Jackson Liscombe Office Hours:  M/W  2-2:50; 3-3:50, CEPSR 702 
Email: jaxin@cs.columbia.edu  Phone:  212-939-7111 

Announcements || News || Academic Integrity || Description
Links to Resources || Requirements || Syllabus || Text || Thanks

Description:

This course provides an introduction to the field of computational linguistics, also known as natural language processing (NLP): the creation of computer programs that can understand, generate, and learn natural language. We will study the three major subfields of NLP: syntax (the structure of an utterance), semantics (the truth-functional meaning of an utterance), and pragmatics/discourse (the context-dependent meaning of an utterance). The course will introduce both knowledge-based and statistical methods for NLP, and will illustrate the use of such methods in a variety of text- and speech-based application areas.

Text:

Speech and Language Processing by Jurafsky and Martin. It should be available from Papyrus Books, as well as from Amazon and other online providers. It should also be on reserve in the Engineering Library. Please check the online errata for each chapter of the text as you read it; a link to the errata will be provided by each chapter assignment below. The authors are planning a new edition, so if you find an undocumented error, please let Professor Hirschberg know and she will pass the information along to them.

Requirements:

Three homework assignments, a midterm and a final exam. Each student in the course is allowed a total of 4 late days on homeworks with no questions asked. Homeworks are due by midnight of the due date.

Homework 1 submission procedure.

Academic Integrity:

Copying or paraphrasing someone's work (code included), or permitting your own work to be copied or paraphrased, even if only in part, is not allowed, and will result in an automatic grade of 0 for the entire assignment or exam in which the copying or paraphrasing was done. Your grade should reflect your own work. If you believe you are going to have trouble completing an assignment, please talk to the instructor or TA in advance of the due date.

Announcements:

In the news / On the web:

Syllabus:

 

Week | Class     | Topic                                         | Reading                  | Assignments
1    | Sep 2     | Introduction and Course Overview              | Ch 1                     |
     | Sep 4     | Regular Expressions and Automata              | Ch 2                     | Class participation opportunity I
2    | Sep 9     | Morphology and FSTs                           | Ch 3; more errata        | Homework 1 assigned
     | Sep 11    | Phonetics, Phonology and Text-to-Speech       | Ch 4 (4.4 opt; omit 4.7) |
3    | Sep 16    | Word Pronunciation and Spelling               | Ch 5 (5.9 opt)           | Homework 1 Part 1 due
     | Sep 18    | N-grams                                       | Ch 6                     |
4    | Sep 23    | Automatic Speech Recognition                  | Ch 7                     |
     | Sep 25    | Word Classes and POS Tagging                  | Ch 8                     |
5    | Sep 30    | CFGs for English                              | Ch 9                     | Guest Lecturer: Dr. Owen Rambow; Homework 1 Part 2 due
     | Oct 2     | Parsing with CFGs                             | Ch 10:1-3                | Homework 2 assigned
6    | Oct 7     | No Class                                      |                          | Optional reading: Bangalore & Joshi '99, Abney '96
     | Oct 9     | The Earley Algorithm                          | Ch 10:4-6                |
7    | Oct 14    | Statistical Parsing                           | Ch 12                    | Guest Lecturer: Dr. Robert Carpenter
     | Oct 16    |                                               |                          | Midterm Examination
8    | Oct 21    | Meaning Representations and Semantic Analysis | Ch 14-15 (15.1-3 opt)    |
     | Oct 23    | Lexical Semantics                             | Ch 16                    |
9    | Oct 28    | Word Sense Disambiguation                     | Ch 17.1-2, TBA           |
     | Oct 30    | Information Retrieval                         | Ch 17.3-5                | Homework 2 due
10   | Nov 4     | No Class                                      |                          | Holiday: Election Day
     | Nov 6     | Reference Resolution                          | Ch 18.1,4                | Homework 3 assigned
11   | Nov 11    | Text Coherence and Discourse Structure        | Ch 18.2-3,5              |
     | Nov 13    | Intonation and Discourse                      | Ch 4.7                   |
12   | Nov 18    | Information Status                            | TBA                      |
     | Nov 20    | Conversational Implicature                    | TBA                      |
13   | Nov 25    | Dialogue Systems                              | Ch 19                    |
     | Nov 27    |                                               |                          | Thanksgiving Holiday
14   | Dec 2     | NL Generation                                 | Ch 20                    |
     | Dec 4     | Machine Translation                           | Ch 21                    | Homework 3 due
15   | Dec 9-11  |                                               |                          | Study Days
     | Dec 12-19 |                                               |                          | Final Exams

Links to Resources (cf. also resources available from the text homepage):

General:

Places to look up definitions and descriptions of terminology:

Chapters 1 and 2:

Try out one of the many versions of Eliza on the web.

Chapter 3:

AT&T Labs - Research Finite State Machine Library

Later Chapters:

Chapter 19:

Thanks:

To James Martin, Diane Litman, Johanna Moore and Regina Barzilay, whose course materials have been very helpful in the preparation of this course, and to Ani Nenkova for her useful comments.
