CS4706 - Spring 2008
Homework 2 - "ToBI Labeling"


Due: Monday, Mar 3, 2008, by 2:40pm.
Submit in Courseworks

IMPORTANT: Submit all your files in one compressed file named "hw2-YourUni", with the corresponding extension (.gz, .zip, .rar, etc.). Example: "hw2-ag2251.zip".
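
If you prefer to script the packaging step, the following is a minimal sketch, assuming Python 3 is available. The helper name `make_submission` and the example file names in the usage note are illustrative and are not part of the assignment; any tool that produces a correctly named archive is fine.

```python
import zipfile
from pathlib import Path


def make_submission(uni, files, out_dir="."):
    """Bundle the given files into hw2-<uni>.zip and return the archive path.

    `uni` is your UNI (e.g. "ag2251"); `files` is a list of paths to include.
    Both are supplied by the caller; nothing here is prescribed by the course.
    """
    archive = Path(out_dir) / f"hw2-{uni}.zip"
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            f = Path(f)
            # Store each file flat, without its directory prefix.
            zf.write(f, arcname=f.name)
    return archive
```

For example, `make_submission("ag2251", ["bdc.TextGrid", "games.TextGrid"])` would write "hw2-ag2251.zip" to the current directory, provided those two files exist there.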

Solve all the exercises using Praat, which can be downloaded from http://www.fon.hum.uva.nl/praat/. Please remember that you should do these exercises on your own; getting the 'right' answer is less important than working out something you can justify as consistent with the ToBI manual, if asked. :-)
You can use the following documentation and software:

(1) Boston Directions Corpus (25 points)

About the corpus: The Boston Directions Corpus comprises monologues produced by non-professional speakers, who were asked to perform a series of direction-giving tasks. For example, they had to explain simple routes such as getting from one station to another on the subway.

The file bdc.wav is an extract from one such task, performed by a female speaker. Using the file bdc.TextGrid, in which the words tier has already been transcribed, mark the pitch accents with a "*" in the tones tier and the number 4 breaks in the breaks tier.

Submit the new TextGrid file.

Extra credit: Try to label as many of the following as you can: type of pitch accents (L*, H*, etc.), phrase accents (L-, H-, etc.), boundary tones (L%, H%, etc.) (all of these in the tones tier), and number 3 breaks (in the breaks tier).

(2) Games Corpus (25 points)

About the corpus: The Games Corpus consists of a series of dialogues between non-professional speakers, who were asked to play three collaborative computer-based games. There was no visual contact between the players, and the games required detailed descriptions of the objects on the screen and their positions.

The file games.wav is an extract from one of the games tasks, produced by a female speaker. Using the file games.TextGrid, in which the words tier has already been transcribed, mark the pitch accents with a "*" in the tones tier and the number 4 breaks in the breaks tier.

Submit the new TextGrid file.

Extra credit: Try to label as many of the following as you can: type of pitch accents (L*, H*, etc.), phrase accents (L-, H-, etc.), boundary tones (L%, H%, etc.) (all of these in the tones tier), and number 3 breaks (in the breaks tier).

(3) Recording speech (30 points)

Record the phrase "spoken language" with your own voice, following these intonation contours (you can decide how many words to accent, but all accented words should have the specified pitch accent):

  a)   L* H- H%
  b)   H* L- L%
  c)   H* H- L%
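
Each contour above is read left to right as a pitch accent, a phrase accent, and a boundary tone. As a rough illustration of that notation (not a substitute for the ToBI manual), the sketch below classifies space-separated labels by their diacritic. The function names are hypothetical, and the rule is a simplification that assumes each label is written separately, as in the three contours listed.

```python
def classify_tone(label):
    """Classify a single ToBI tone label by its diacritic.

    Simplified rule: '*' marks a pitch accent, '%' a boundary tone,
    and a trailing '-' a phrase accent.
    """
    if "*" in label:
        return "pitch accent"
    if "%" in label:
        return "boundary tone"
    if label.endswith("-"):
        return "phrase accent"
    raise ValueError(f"unrecognized tone label: {label!r}")


def describe_contour(contour):
    """Split a contour such as 'L* H- H%' and classify each label."""
    return [(label, classify_tone(label)) for label in contour.split()]


for contour in ("L* H- H%", "H* L- L%", "H* H- L%"):
    print(contour, "->", describe_contour(contour))
```

All three contours share the same structure and differ only in which tone (L or H) fills each slot, which is what your recordings should make audible.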

Produce a full ToBI transcription of each of the three recordings. Submit the wav and TextGrid files.

(4) Speech analysis (20 points)

In HW1, you modified the file thermometer.wav to convey different meanings. Use the ToBI conventions to explain the differences in intonation between the original file and your two versions. (2 paragraphs each)