CS 4705: Introduction to Natural Language Processing, Fall 2004


MW  1:10-2:25


Mudd 545


Julia Hirschberg

Office Hours: 

M 2:30-3:30; Th 3:30-4:30, CEPSR 705





Teaching Assistant: 

Sameer Maskey

Office Hours: 

MW 4-5





Announcements || Academic Integrity ||  Contributions || Description
Links to Resources ||
Requirements || Syllabus || Text



This course provides an introduction to the field of computational linguistics, also known as natural language processing (NLP): the creation of computer programs that can understand, generate, and learn natural language. We will study the three major subfields of NLP: syntax (the structure of an utterance), semantics (the truth-functional meaning of an utterance), and pragmatics/discourse (the context-dependent meaning of an utterance). The course will introduce both linguistic (knowledge-based) and statistical approaches to language processing, and will illustrate the use of such methods in a variety of text- and speech-based application areas, including spoken dialogue systems, speech recognition and synthesis, machine translation, and language summarization.


Text: Speech and Language Processing by Jurafsky and Martin. It will be available from the Morningside Bookshop (formerly Papyrus Books), as well as from Amazon and other online providers. It should also be on reserve in the Engineering Library. Please check the online errata for each chapter of the text as you read it.


Requirements: three homework assignments, a midterm, and a final exam. Graduate students will have one additional assignment. Each student in the course is allowed a total of 5 late days on homework assignments with no questions asked; after that, points will be deducted for late submission unless you have a note from your doctor. Do not use these up early! Save them for real emergencies. Homework is due by midnight on the due date.

All students are required to have a Computer Science Account for this class. To sign up for one, go to the CRF website and then click on "Apply for an Account".

Homework submission procedure.

Academic Integrity:

Copying or paraphrasing someone's work (code included), or permitting your own work to be copied or paraphrased, even if only in part, is not allowed, and will result in an automatic grade of 0 for the entire assignment or exam in which the copying or paraphrasing was done. Your grade should reflect your own work. If you believe you are going to have trouble completing an assignment, please talk to the instructor or TA in advance of the due date.










Sep 8

Introduction and Course Overview




Sep 13

Regular Expressions and Automata

Ch 1-2



Sep 15

Morphology and FSTs

Ch 3

 Homework 1 Assigned (nb: Homework submission procedure)


Sep 20

Phonetics, Phonology and Text-to-Speech

Ch 4



Sep 22

N-grams and Machine Learning

Ch 6

 Guest Speaker:  Sameer Maskey


Sep 27

Word Pronunciation and Spelling

Ch 5



Sep 29

Automatic Speech Recognition

Ch 7



Oct 4

Word Classes and POS Tagging

Ch 8

 Guest Speaker:  Martin Jansche


Oct 6

CFGs for English

Ch 9

 Guest Speaker: Owen Rambow


Oct 11

Basic Parsing with CFGs

Ch 10:1-3

Homework 1 due


Oct 13

Parsing Problems and Some Solutions

Ch 10:4-6; 11:0-3



Oct 18

Probabilistic and Lexicalized Parsing

Ch 12

 Be sure to replace figure 12.3 with new version


Oct 20


 Sample midterm

Midterm Examination; Grad assignment paper list due


Oct 25

Meaning Representations and Semantic Analysis

Ch 14-15 (15.1-3 optional)

 Homework 2 Assigned.


Oct 27

Lexical Semantics 

Ch 16



Nov 1





Nov 3

Word Sense Disambiguation

Ch 17.1-2, TBA



Nov 8

Robust Semantics and Information Retrieval

Ch 17.3-5



Nov 10

YALE Review

 Nov 12


Nov 15

Text Coherence and Discourse Structure

Ch 18.2-3,5; Grosz&Sidner86



Nov 17

Reference Resolution

Ch 18.1,4

Guest Lecturer: Ani Nenkova

Homework 2 First Report due

Nov 22

Information Status

Prince92

Nov 24

Information Status 2




Nov 25



Thanksgiving Holiday


Nov 29

Spoken Dialogue Systems

Ch 19


Dec 1

Intonation in TTS Systems

 Ch 4.7



Dec 6

New Approaches to Story Modeling for Understanding, Generation and Summarization

Sengers, Smith97, and optional: NYT

Guest Lecturer: David Elson

Dec 8

Machine Translation

Ch 20

Guest Lecturer: Nizar Habash 

Homework 2 Final Report due


Dec 13

Summing Up: NLP Research and Applications


Grad assignment report due


Dec 20



Final Examination

Links to Resources (cf. also resources available from the text homepage):


Places to look up definitions and descriptions of terminology:

Chapters 1 and 2:

Try out one of the many versions of Eliza on the web.


AT&T Labs - Research Finite State Machine Library

Later Chapters:

Chapter 19:
