Spoken Language Processing (CS 4706), Spring 2008


MW  2:40-3:55


Mudd 1127


Julia Hirschberg

Office Hours: 

M 4-6


julia [at] cs.columbia.edu



Teaching Assistant:

Fadi Biadsy

Office Hours:

W 12-2


fadi [at] cs.columbia.edu



This course introduces students to research on spoken language within computational linguistics, also known as natural language processing (NLP). We will study the different 'meanings' that can be conveyed by the way speakers produce sentences, techniques for analyzing spoken language, methods for developing speech technologies, and real-world applications of such technologies, such as text-to-speech systems, speech recognizers, spoken dialogue systems, and detectors for various types of emotional speech. NB: This course can be counted as a PhD elective in Advanced AI. It is a requirement for the MS NLP Track. There are no official prerequisites for this course except Data Structures or equivalent, and no prior knowledge of NLP will be assumed.


Required readings: Chapters from the second edition of Speech and Language Processing by Jurafsky and Martin (J&M), available in draft form as a reader from the Village Copier on Amsterdam & 119th Street.

Recommended readings: Acoustic & Auditory Phonetics by Keith Johnson (Chapter 1 is available online) and all other readings marked with ‘*’.

Course Requirements:

Midterm and final; N lab homeworks.  The Speech Lab is available for use in homeworks on a signup basis.

60% Homeworks
20% Midterm Exam
20% Final Exam
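
As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes every score is on a 0-100 scale and that all homeworks count equally within the 60% homework component; neither assumption is stated in this syllabus, and the function name course_grade is purely illustrative.

```python
# Weights taken from the breakdown above.
WEIGHTS = {"homeworks": 0.60, "midterm": 0.20, "final": 0.20}


def course_grade(homework_scores, midterm, final):
    """Combine component scores (each 0-100) into a weighted course score.

    Assumption: homeworks are weighted equally, so the homework
    component is the plain average of the homework scores.
    """
    if not homework_scores:
        raise ValueError("at least one homework score is required")
    hw_average = sum(homework_scores) / len(homework_scores)
    return (WEIGHTS["homeworks"] * hw_average
            + WEIGHTS["midterm"] * midterm
            + WEIGHTS["final"] * final)


if __name__ == "__main__":
    # Homework average 85, midterm 75, final 88.
    print(round(course_grade([90, 80, 100, 70, 85], 75, 88), 1))
```

Under these assumptions, a homework average of 85 contributes 51 points, so homework performance dominates the final score.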

Late policy:
Each student starts the semester with 5 late days. A late day is used whenever homework is turned in after the due date and time. For example, if homework is due at 2:40 pm on Wednesday, anything turned in after 2:40 pm on Wednesday but before 2:40 pm on Thursday uses up one late day. Once the ‘free’ late days are exhausted, homework submitted after the due date will be penalized 10% per late day (e.g., homework turned in 3 days late will be penalized by 30%).
Late days can be used for all homeworks.
(Note:  Weekdays and weekends all count equally in the late day calculation.)
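
The late-day arithmetic above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the official grading procedure: it assumes any part of a 24-hour period counts as a full late day, that free late days are applied to homeworks in the order they are submitted, and that the 10% penalty is taken off the earned score and capped at 100%. The function names are hypothetical.

```python
import math
from datetime import datetime

FREE_LATE_DAYS = 5        # from the policy above
PENALTY_PER_DAY = 0.10    # 10% per late day once free days are used up


def late_days(due, submitted):
    """Late days used by one submission; weekends count like weekdays.

    Assumption: any fraction of a 24-hour period counts as a full day.
    """
    seconds_late = (submitted - due).total_seconds()
    if seconds_late <= 0:
        return 0
    return math.ceil(seconds_late / 86400)


def apply_late_policy(submissions, free_days=FREE_LATE_DAYS):
    """Adjust scores for lateness.

    submissions: list of (raw_score, due, submitted) in submission order.
    Returns (adjusted_scores, free_days_remaining).
    """
    remaining = free_days
    adjusted = []
    for raw_score, due, submitted in submissions:
        days = late_days(due, submitted)
        covered = min(days, remaining)      # free late days go first
        remaining -= covered
        penalized_days = days - covered
        factor = max(0.0, 1.0 - PENALTY_PER_DAY * penalized_days)
        adjusted.append(raw_score * factor)
    return adjusted, remaining


if __name__ == "__main__":
    due = datetime(2008, 2, 13, 14, 40)          # Wednesday, 2:40 pm
    print(late_days(due, datetime(2008, 2, 13, 15, 0)))   # same afternoon
    print(late_days(due, datetime(2008, 2, 16, 9, 0)))    # Saturday morning
```

For example, under these assumptions a homework turned in 6 days late by a student with all 5 free days left would use up the free days and lose 10% for the one remaining day.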

Academic Integrity:

Copying or paraphrasing someone's work (code included), or permitting your own work to be copied or paraphrased, even if only in part, is not allowed, and will result in an automatic grade of 0 for the entire assignment or exam in which the copying or paraphrasing was done. Your grade should reflect your own work. If you believe you are going to have trouble completing an assignment, please talk to the professor in advance of the due date.

Announcements: See some cool Praat manipulations below under Feb 4.


Praat - Praat resources


Help using ToBI - ToBI Annotation Environments

Text-to-Speech Links and more...

Text-to-Song synthesis





Readings and Assignments

Reports and HW



Jan 23

It's not what you said, it's how you said it

Hirschberg03 [ps] [pdf]




Jan 28

From Sounds to Language

J&M 7.1-7.3




Jan 30

Acoustics of Speech

J&M  7.4; *Johnson, Ch. 1-2




Feb 4

Tools for Speech Analysis

Praat tutorial 1; Praat tutorial 2 (some good contours: 1, 2)

HW1: Using Praat (assigned)



Feb 6

Studying Intonation:  How do people ask questions?

Wilson93; Hedberg&Sosa02; Syrdal&Jilka04; *Dohertyetal04




Feb 11

Representing Intonational Variation

J&M 8.3.0-8.3.4




Feb 13

 ToBI and ToBI Labeling

 ToBI labeling conventions; Pierrehumbert&Hirschberg90

Listen to the ToBI examples

HW1 due
HW2: ToBI (assigned)



Feb 18

ToBI Labeling (continued)




Feb 20

Speech Generation

 J&M 8 (all); TTS-history




Feb 25

Text Normalization

J&M 8.1




Feb 27

Predicting Accents and Phrasing

J&M 8.3.4-8.3.7; Pan99, *Sun02, Rosenberg07

Guest Speaker:  Andrew Rosenberg



Mar 3

Modeling Pronunciation

J&M 8.2; Fackrell&Skut04

HW2 due
HW3: Data Collection (assigned)



Mar 5

Information Status: Focus and Given/New

Nakatani99, GBrown83, *Bard99, Prince92, Dahan02




Mar 10

Speech Recognition and Understanding

J&M 9 (all)




Mar 12

Speech Recognition and Understanding (continued)





Mar 17-21

Spring Break





Mar 24

Speech Disfluencies

J&M 10.6; Hindle83; Nakatani&Hirschberg94; Bear92





Mar 26

Sentence and Topic Segmentation

J&M 10.6; Shriberg00, Choi00, *Utiyama01

HW3 due; HW4: TTS-oncampus, TTS-cvn (assigned)



Mar 31

Spoken Dialogue Systems: Overview

J&M 24; *Bell&Gustafson00




Apr 2

Managing Dialogue

J&M 24.1.2; 24.5.1-2





Apr 7

Dialogue Acts and Information State

J&M 24.5.3; Hirschbergetal04, Rosset&Lamel04




Apr 9

Confirmation Strategies and SDS Evaluation

Walkeretal97, Goldberg03




Apr 14

Entrainment in SDS

Brennan96; Roth05




Apr 16

Speech Data Mining and Distillation

Maskeyetal04, Koumpis&Renals05

HW4 due; HW5: ASR (assigned)



Apr 21

ASR for SDS (HTK Toolkit)


Guest Lecturer:  Fadi Biadsy



Apr 23

Emotional Speech

Cowie00, *Pereira00, Schroeder01, *Bosch00, Burkhardt00, Ang02, *Gobl&Chasaide03




Apr 28

Deceptive Speech

 DePauloetal83, Frank92, *Mehrabian77, Streeteretal71




Apr 30

Charismatic Speech

 Boss76, Tuppen74, Weber47




May 5

Summing Up


HW5 due



May 6-8

Study Days





TBA (May 9-16)

Final Exam


Covers the entire course


