Advanced Natural Language Processing (CS 6998), Spring 2004: Speech Research and Technologies
Time: | W 4:10-6:00 | Place: | 825 MUD |
Professor: | Julia Hirschberg | Office Hours: | Tu 2-3; Th 12:30-1:30, CEPSR 705 |
Email: | julia@cs.columbia.edu | Phone: | 212-939-7114 |
Teaching Assistant: | Sameer Maskey | Office Hours: | Tu 2:30-4; Th 12-1, CEPSR 720 |
Email: | smaskey@cs.columbia.edu | Phone: | 212-939-7116 |
Announcements || Academic Integrity || Description
Resources || Requirements || Syllabus || Readings
This course introduces students to research on spoken language in computational linguistics, also known as natural language processing (NLP). We will examine approaches to analyzing the speech signal that are of particular interest to NLP research; look at several speech technologies in some detail, including approaches to recognizing and generating speech; and consider several important application areas for these technologies, including spoken dialogue systems and speech data mining. There are no formal prerequisites for the course (i.e., no knowledge of speech or signal processing is assumed), but some knowledge of, or serious interest in, NLP is assumed. The class format will be lecture and discussion. NB: This course can be counted as a PhD elective in Advanced AI.
Acoustic & Auditory Phonetics by Keith Johnson is available from Papyrus. Speech and Language Processing by Jurafsky and Martin will be a useful reference for those with no formal background in NLP. The early chapters of either of the following paperbacks will also be useful: The Speech Chain by Peter Denes and Elliot Pinson (on reserve in the Psychology Library) or Elements of Acoustic Phonetics by Peter Ladefoged. Used copies of all of these should be available in local bookstores, as well as from Amazon and other online providers. Most course readings will be available either on the web or as in-class handouts. NB: '*' means that the reading is optional.
The course requirements are class participation and a term project. Class participation will include a) bringing 3 discussion questions to class each week, based on the readings; and b) helping with one class during the semester. The project (done alone or in collaboration) will address one of the topics covered in the course or some other topic in spoken language, and will be defined by each class participant in consultation with the professor. Each project will involve a) a project description; b) periodic progress reports; c) a class presentation of project results; and d) delivery of the final project. Project formats may include literature reviews, data collection and analysis, experiments, and/or systems or system components.
Copying or paraphrasing someone's work (code included), or permitting your own work to be copied or paraphrased, even if only in part, is not allowed, and will result in an automatic grade of 0 for the entire assignment or exam in which the copying or paraphrasing was done. Your grade should reflect your own work. If you believe you are going to have trouble completing an assignment, please talk to the professor in advance of the due date.
Help using xwaves
Help using ToBI
Getting wavesurfer
Week | Date | Topic | Readings and Assignments |
1 | Jan 21 | Introduction to the Course | |
2 | Jan 28 | Interpreting Speech Variation | Hirschberg03, ToBI labeling conventions, and see ToBI examples |
3 | Feb 4 | Analyzing the Speech Signal | Handout; Recommended: Johnson (Chs 1,2) or Denes&Pinson (Chs 1,3,4); TTS exercises (extended to Feb 20) |
4 | Feb 11 | Speech Generation: From Concept and from Text | HLT96-ch5, TTS systems, Guest Speaker: Martin Jansche |
5 | Feb 18 | Meanings of Intonational Contours (Agus G) | Pierrehumbert&Hirschberg '90, Liberman&Sag '75, Hirschberg&Ward '92 (all handouts) |
6 | Feb 25 | Predicting Accents and Phrasing | Pan99, Sun02, Koehn00, Rambow01 |
7 | Mar 3 | Tools for Speech Analysis | Guest Lecturer: Jean-Philippe Goldman; see also Praat tutorial |
8 | Mar 10 | Information Status: Focus and Given/New (Ani N) | *Nakatani99, GBrown83, Bard99, Prince92, Dahan02 |
9 | Mar 17 | Spring Break | |
10 | Mar 25 | Speech Recognition and Understanding | HLT96-ch1 (Sameer will hold a TTS clinic in the Speech Lab, CEPSR, 7th floor; do the recognition reading as background for weeks 11-15) |
11 | Mar 31 | Speech Acts and Topic Segmentation (Eric, Corey) | Jurafsky98, Shriberg00, Nickerson&Chu-Carroll99, *Shriberg98 |
12 | Apr 7 | Speech Disfluencies and Turntaking (Michael Mu, Sarah, Judd, Aron) | Turn-taking in Conversational Analysis (follow links), *Sacksetal74, Bear92, *Brennan&Schober99, Hindle83 |
13 | Apr 14 | Spoken Dialogue Systems (Vera, Andy, David) | Walkeretal97, Goldberg03, Bell&Gustafson00, [pdf] Krahmer01 |
14 | Apr 21 | Speech Search, Data Mining, and Summarization (Aaron H, Kyle, Matthew) | SCANMail demo, Furui02, Barzilay00, Hearst99 |
15 | Apr 28 | Emotional Speech (Michael Ma, Wayne, Jared) | Cowie00, Pereira00, Schroeder01, Bosch00, *Burkhardt00, *Ang02 |
16 | May 5 | No Class | |
17 | May 12 | Presentation of Term Projects in CEPSR 415, 3-6:30pm |