CS4705 MIDTERM, FALL 2002

1) Give an example of each of the following: (15 points)
   a wh-word
   a determiner
   a pronoun
   a proper noun
   an auxiliary verb
   a modal verb
   a phrasal verb (verb plus particle)
   a manner adverb
   a temporal adverb
   a locative adverb
   a conjunction
   a relative clause
   an NP
   a PP
   a VP

2) Identification of terms: Give 1-3 sentence identifications for 5 of the following: (15 pts)
   Closed class words
   Unification parsing
   Treebank
   (linguistic) Head
   Long distance dependency
   Affixation
   Subcategorization frame
   Minimum redundancy hypothesis (for lexical representation)

3) Short answer: Answer 2 of the following. (20 pts)
   What is the difference between mass and count nouns? Give 2 examples of each.
   What is the difference between derivational and inflectional morphology, e.g. in English? Give 2 examples of each.
   What is the difference between a deterministic and a non-deterministic finite state automaton?

4) Short exercises: Do 2 of the following exercises. (20 pts)
   Create a finite state transducer that translates the emphatic sheep language 'baa*!' into the quizzical cow language 'moo*?'.
   Write a grammar rule and an associated subcategorization frame to enforce subject-verb agreement on person and number.
   What are left-recursive grammar rules? What type of parsers are they a problem for? Give an example of a left-recursive grammar fragment. Turn this into a grammar *without* left recursion.

5) Essay questions: Write a 2-3 paragraph answer to 2 of the following questions. (30 pts)
   Discuss at least three sources of ambiguity in natural language. Which do you think are the most difficult to deal with?
   Describe the strengths and weaknesses of Bottom-Up vs. Top-Down parsing. How does the Earley algorithm combine the two? What is the role of the 'dot' in this parsing technique? Left corners?
   How do probabilistic parsing approaches such as the PCYK parser improve over (non-probabilistic) CFG parsers such as the original Earley algorithm? What are their drawbacks?

6) Extra credit: (10 pts each)
   Describe the algorithm used in Brill tagging (TBL).
   Describe Kimmo Koskenniemi-style Two Level Morphological parsing.