Research

I'm a member of the Natural Language Processing group at Columbia, advised by Kathy McKeown, and have been involved with the Machine Learning group and the Center for Computational Learning Systems. My current research focuses on structured prediction approaches to problems such as text alignment and text-to-text generation. I'm particularly interested in inference strategies that incorporate multiple structural representations of text. This work is relevant to applications like summarization, question answering and machine translation.

Other projects that I'm currently involved in include approaches for automatically defining and surveying scientific concepts, as well as graph-based models for summarizing web pages. I've previously worked on problems like redundancy reduction in text, semi-parametric density estimation, selectional preference discovery, time-series clustering, unsupervised syntactic language models, adaptive topic-based language modeling for speech recognition and semi-automated corpus annotation.

Refereed publications

Patents and other publications

Datasets

Miscellany

The papers that I covered for my candidacy exam on text-to-text generation are available here.

My Erdős number is at most 4 but my Bacon number is still woefully undefined.