Advaith's Pages
----------------
Advaith Siddharthan
Postdoctoral Research Scientist
Columbia University

Research Interests

I am a Postdoctoral Research Scientist at the Natural Language Processing Group, Computer Science Department, Columbia University. My advisors are Kathy McKeown and Owen Rambow. I obtained my PhD from the Natural Language and Information Processing Group at the Computer Lab, University of Cambridge, where my supervisor was Ann Copestake.

My current research interests are multi-document and multilingual news summarization, text simplification and re-generation, open-domain referring expression generation and the semantic annotation of multilingual corpora. My PhD thesis was on syntactic simplification and text cohesion, and focused on the discourse issues that arise when syntactically rewriting text. I am currently involved in two projects: multilingual news summarization and semantic annotation of large multilingual corpora.

Publications

    2005

  • Advaith Siddharthan and Kathleen McKeown. Improving Multilingual Summarization: Using Redundancy in the Input to Correct MT errors. To appear in Proceedings of Human Language Technology / Empirical Methods in Natural Language Processing Conference (HLT/EMNLP 2005), Vancouver, Canada.
    paper(pdf)

  • Ani Nenkova, Advaith Siddharthan and Kathleen McKeown. Automatically Learning Cognitive Status for Multi-Document Summarization of Newswire. To appear in Proceedings of Human Language Technology / Empirical Methods in Natural Language Processing Conference (HLT/EMNLP 2005), Vancouver, Canada.
    paper(pdf)

    2004

  • Advaith Siddharthan. Syntactic Simplification and Text Cohesion. To appear in the Journal of Language and Computation, Kluwer Academic Publishers, the Netherlands.
    draft paper(pdf)

  • Advaith Siddharthan, Ani Nenkova and Kathleen McKeown. Syntactic Simplification for Improving Content Selection in Multi-Document Summarization. In the Proceedings of the 20th International Conference on Computational Linguistics (COLING 2004), Geneva, Switzerland.
    paper(pdf)

  • Advaith Siddharthan and Ann Copestake. Generating Referring Expressions in Open Domains. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL 2004), Barcelona, Spain.
    paper(pdf)

  • David Farwell, Stephen Helmreich, Florence Reeder, Bonnie Dorr, Nizar Habash, Eduard Hovy, Lori Levin, Keith Miller, Teruko Mitamura, Owen Rambow and Advaith Siddharthan. Interlingual Annotation of Multilingual Text Corpora. In Proceedings of the Workshop on Frontiers in Corpus Annotation, NAACL/HLT'04, Boston, MA.
    paper(pdf)

  • Bonnie Dorr, Lori Levin, Owen Rambow, David Farwell, Rebecca Green, Nizar Habash, Stephen Helmreich, Eduard Hovy, Keith Miller, Teruko Mitamura, Florence Reeder and Advaith Siddharthan. Semantic Annotation and Lexico-Syntactic Paraphrase. In Proceedings of the Workshop on Building Lexical Resources from Semantically Annotated Corpora, LREC'04, Lisbon, Portugal.
    paper(pdf)

  • Florence Reeder, David Farwell, Teruko Mitamura, Stephen Helmreich, Bonnie Dorr, Nizar Habash, Eduard Hovy, Lori Levin, Keith Miller, Owen Rambow and Advaith Siddharthan. Semantic Annotation for Interlingual Representation of Multilingual Texts. In Proceedings of the Workshop on Beyond Named Entity Recognition: Semantic labeling for NLP, LREC'04, Lisbon, Portugal.
    paper(pdf)

  • Florence Reeder, Bonnie Dorr, David Farwell, Nizar Habash, Stephen Helmreich, Eduard Hovy, Lori Levin, Teruko Mitamura, Keith Miller, Owen Rambow and Advaith Siddharthan. Interlingual Annotation for MT Development. In Proceedings of the 6th Conference of the Association for Machine Translation in the Americas (AMTA-2004), Georgetown University, Washington DC.
    paper(pdf)

  • Sasha Blair-Goldensohn, Dave Evans, Vassilios Hatzivassiloglou, Kathleen McKeown, Ani Nenkova, Rebecca Passonneau, Barry Schiffman, Andrew Schlaikjer, Advaith Siddharthan and Sergey Sigelman. Columbia University at DUC 2004. In Proceedings of the Document Understanding Workshop (DUC 2004) at HLT/NAACL 2004, Boston, MA, pages 23-30.
    paper(pdf)

    2003

  • Advaith Siddharthan. Syntactic Simplification and Text Cohesion. PhD Thesis, November 2003, or Technical Report TR-597, University of Cambridge, August 2004.
    technical report (.pdf, single spacing) / thesis (.ps.gz, 1.5 spacing)

  • Advaith Siddharthan. Preserving Discourse Structure when Simplifying Text. In Proceedings of the European Natural Language Generation Workshop (ENLG), 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2003), pages 103-110.
    paper(pdf) / slides(pdf)

  • Advaith Siddharthan. Resolving Pronouns Robustly: Plumbing the Depths of Shallowness. In Proceedings of the Workshop on Computational Treatments of Anaphora, 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2003), pages 7-14.
    paper(pdf) / slides(pdf)

    2002

  • Advaith Siddharthan. Resolving Attachment and Clause Boundary Ambiguities for Simplifying Relative Clause Constructs. In Proceedings of the Student Research Workshop, 40th Meeting of the Association for Computational Linguistics (ACL 2002), pages 60-65.
    paper(pdf) / slides(pdf)

  • Advaith Siddharthan and Ann Copestake. Generating Anaphora for Simplifying Text. In Proceedings of the 4th Discourse Anaphora and Anaphor Resolution Colloquium (DAARC 2002), pages 199-204.
    paper(pdf) / slides(pdf)

  • Advaith Siddharthan. An Architecture for a Text Simplification System. In Proceedings of the Language Engineering Conference 2002 (LEC 2002), pages 64-71.
    paper(pdf) / slides(pdf)

  • Advaith Siddharthan. Resolving Relative Clause Attachment Ambiguities using Machine Learning Techniques and WordNet Hierarchies. In Proceedings of the 5th National Colloquium for Computational Linguistics in the UK (CLUK 2002), pages 45-49.
    paper(pdf) / slides(pdf)

Book Reviews

  • Advances in Automatic Text Summarization
    Edited by Inderjeet Mani and Mark T. Maybury
    MIT Press, 1999.
    ISBN 0-262-13359-8
    442 pp., 150 illus.
    £32.95(cloth)
    draft of review
    In Natural Language Engineering Volume 7, Issue 3, September 2001, ISSN 1351-3249, pp 271-274.

  • Foundations of Statistical Natural Language Processing
    Manning, Christopher D. and Schütze, Hinrich.
    MIT Press 2000.
    ISBN 0-262-13360-1.
    620 pp.
    $64.95/£44.95(cloth)
    draft of review
    In Natural Language Engineering, Volume 8, Issue 1, March 2002, pp 91-92.

  • Building Natural Language Generation Systems
    Reiter, Ehud and Dale, Robert.
    Cambridge University Press 2000.
    ISBN 0-521-62036-8.
    270 pp., 128 line diagrams.
    $64.95/£37.50(Hardback)
    draft of review
    In Natural Language Engineering Volume 7, Issue 3, September 2001, ISSN 1351-3249, pp 271-274.