Smaranda Muresan


Former PhD student in the Natural Language Processing Group, Computer Science Department at Columbia University.

My new web page is here.

I have moved to the University of Maryland Institute for Advanced Computer Studies (UMIACS) as a Postdoctoral Research Associate. I am working with Philip Resnik and the Machine Translation group.

My Curriculum Vitae


Research Interests


Publications


DISSERTATION: Learning Constraint-based Grammars from Representative Examples: Theory and Applications

(abstract, TOC, full version)

Projects

My research is part of the PERSIVAL (PErsonalized Retrieval and Summarization of Image, Video And Language Resource) project, a DLI2 project funded by NSF that tailors search, presentation, and summarization of online medical literature and consumer health information to the end user, whether patient or healthcare provider.

In my thesis, I designed, implemented, and evaluated a relational learning framework for inducing constraint-based grammars using a domain ontology as background knowledge. The framework is general: I have shown that it can cover large fragments of natural language and that it is useful for acquiring domain knowledge from text. In particular, I have focused on acquiring terminological knowledge in the medical domain.

Understanding and sharing terminology, both by systems and by humans, are important aspects of communication. Many domains, including the medical domain, evolve rapidly, with new concepts being defined in textual resources such as online articles and web documents. Static dictionaries and glossaries are therefore not enough to keep the information up to date. I designed, implemented, and evaluated DEFINDER ([1], [2]), a system for extracting definitions from online medical articles. I applied my grammar induction tool and inference mechanisms to the definitional corpus extracted by DEFINDER to build a terminological knowledge base. For multiple definitions of the same term extracted from different sources, I implemented a merging algorithm that identifies similarities, differences, and contradictions. Contradictions might be used as an indicator of potentially unreliable source documents.
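The idea of merging several definitions of one term can be illustrated with a toy sketch. This is not the thesis implementation: it assumes each definition has already been reduced to a flat set of attribute-value pairs (a hypothetical representation), and the function name `merge_definitions` and the sample values are invented for illustration only.

```python
def merge_definitions(def_a, def_b):
    """Compare two definitions of the same term.

    Each definition is assumed (hypothetically) to be a dict mapping
    an attribute name to a single value.
    """
    shared = def_a.keys() & def_b.keys()

    # Same attribute, same value in both sources.
    similarities = {k: def_a[k] for k in shared if def_a[k] == def_b[k]}

    # Same attribute, conflicting values: a possible sign of an unreliable source.
    contradictions = {k: (def_a[k], def_b[k]) for k in shared if def_a[k] != def_b[k]}

    # Attributes mentioned by only one of the two sources.
    differences = {
        "only_in_a": {k: def_a[k] for k in def_a.keys() - shared},
        "only_in_b": {k: def_b[k] for k in def_b.keys() - shared},
    }
    return similarities, differences, contradictions


# Invented sample data, not taken from any real source.
source_1 = {"is_a": "disorder", "affects": "heart", "onset": "sudden"}
source_2 = {"is_a": "disorder", "affects": "heart", "onset": "gradual",
            "treated_by": "medication"}

similarities, differences, contradictions = merge_definitions(source_1, source_2)
print(contradictions)
```

In this toy setting, a contradiction is simply a shared attribute with two different values; a real system would need a semantic notion of conflict rather than string inequality.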


Teaching

Spring 2001 - Head TA for Programming Languages and Translators (instructor: Prof. Chris Okasaki)


smara@cs.columbia.edu