

Human Language Technology conference / North American chapter of the Association for Computational Linguistics annual meeting


May 2-7, 2004
The Park Plaza Hotel,
one block from the
Boston Common



Student Research Workshop

May 2, 2004
Paper submission deadline: February 8

Student researchers are invited to submit their work to the upcoming HLT/NAACL 2004 Student Research Workshop. The main mission of the workshop is to provide feedback on students' work in progress. Original and unpublished research is invited on all aspects of speech, information retrieval, and computational linguistics, but we especially encourage research at the intersection of two or three of these areas.

5th SIGDIAL Workshop on Discourse and Dialogue

Friday April 30 and Saturday May 1, 2004
This workshop will be held at MIT. http://sigdial04.eml-research.de
Paper submission deadline: January 12

Continuing a series of successful workshops in Hong Kong, Aalborg, Philadelphia, and Sapporo, this workshop spans the ACL and ISCA SIGdial interest area of discourse and dialogue. This series provides a regular forum for the presentation of research in this area to both the larger SIGdial community and researchers outside this community. The workshop is organized by SIGdial, which is sponsored jointly by ACL and ISCA.

CoNLL-2004: Eighth Conference on Computational Natural Language Learning

Thursday and Friday May 6 and 7, 2004
Paper submission deadline: February 4

CoNLL is an international conference for discussion and presentation of research on natural language learning. We invite submission of papers about natural language learning topics, including, but not limited to:

  • Computational models of human language acquisition
  • Computational models of the evolution of language
  • Machine learning methods applied to natural language processing tasks (speech processing, phonology, morphology, syntax, semantics, discourse processing, language engineering applications)
  • Symbolic learning methods (Rule Induction and Decision Tree Learning, Lazy Learning, Inductive Logic Programming, Analytical Learning, Transformation-based Error-driven Learning)
  • Biologically-inspired methods (Neural Networks, Evolutionary Computing)
  • Statistical methods (Bayesian Learning, HMM, maximum entropy, SNoW, Support Vector Machines)
  • Reinforcement Learning
  • Active learning, ensemble methods, meta-learning
  • Computational Learning Theory analysis of language learning
  • Empirical and theoretical comparisons of language learning methods
  • Models of induction and analogy in Linguistics

Workshop on Pragmatics of Question Answering

Thursday and Friday May 6 and 7, 2004
Paper submission deadline: January 26

Open-domain Question Answering (QA) has seen substantial advances in the past few years. Factual questions are answered with increasing accuracy; multiple forms of definition questions are processed correctly, and list questions retrieve sequences of answers with good recall from large text collections. Evaluations in the Text REtrieval Conference (TREC) QA track, as well as ARDA's Advanced Question Answering for Intelligence (AQUAINT) program, enable these advances in QA. These results and some of the research that made them possible were discussed in various workshops on QA topics organized at ACL (2001 and 2003), COLING (2002), LREC (2002), EACL (2003) and the AAAI Spring Symposium series (2002 and 2003). In the past year, with the emergence of scenario-based questions, several forms of pragmatic processing have started to influence the architecture of QA systems. This processing calls for handling multiple interactions with a QA system in the context of a given scenario, the question decomposition required by such contexts, and the use of the context and its interaction with the user's background. These are just a few of the new features required in QA systems that process complex questions.

Document Understanding Conference 2004

Thursday and Friday May 6 and 7, 2004
Paper submission deadline (only from DUC participants): April 26

Text summarization has enjoyed a rebirth, as can be seen from the number of summarization meetings held recently. Summarization is of interest to both the NLP community and the IR community, each of which has made significant contributions to this rebirth. In 2001 SIGIR hosted a workshop that was the first official meeting of DUC (Document Understanding Conference), a new evaluation for text summarization. The DUC workshop has continued to grow, with 21 sites worldwide taking part in DUC 2003, which was held as a workshop at the HLT/NAACL meeting in Edmonton. The Boston workshop will present the results of the 2004 DUC evaluation, along with papers by many of the DUC participants. Additionally, there will be open discussion of a new roadmap to guide further DUC evaluations over the next couple of years.

Workshop on Frontiers in Corpus Annotation

Thursday May 6, 2004
Paper submission deadline: January 27

Corpus annotation has taken a pivotal role in computational linguistics. As corpora become available with new sorts of annotation, new tasks are born and new approaches are spawned to solve old problems. The first treebanks made new types of statistical parsing possible. Newer treebanks make it possible for treebank-based parsers and related programs to provide more detailed output: we are seeing a resurgence of multistage parsing, this time with a statistical bent. Similarly, the annotation of corpora with part of speech, named entity, coreference and sense disambiguation has resulted in new tasks and extensions of old tasks. Corpus annotation has also served as a bridge between knowledge-based and statistical approaches. A model of research is emerging in which the target analysis (the corpus annotation) is knowledge-based, but the means of deriving that analysis are statistical. Corpus annotation is providing a means for researchers with seemingly disparate research agendas to work together in a way that simply was not possible before.

Workshop on Computational Lexical Semantics

Thursday May 6, 2004
Paper submission deadline: February 2

Lexical semantics is the study of the semantic properties of words in context, and it is at the core of NLP and many of its applications. Recently, there has been a renewed interest in text semantics, fueled in part by the complexity of some major research initiatives, such as Question Answering, Text Summarization, Machine Translation, Information Extraction, Reasoning, and others. The aim of this workshop is to bring together researchers from academia, government, and industry interested in text understanding, lexical semantics, knowledge representation, question answering, information retrieval, machine translation, and speech processing. We invite papers reporting on recent advances and new perspectives in computational lexical semantics.

Second International Workshop on Scalable Natural Language Understanding (ScaNaLU 2004)

Thursday May 6, 2004
Paper submission deadline: February 5

There is a growing need for systems that can understand and generate natural language in applications that require substantial amounts of knowledge as well as reasoning capabilities. Most currently implemented systems for natural language understanding (NLU) are decoupled from any reasoning processes, which makes them narrow and brittle. Furthermore, they do not appear to be scalable, in the sense that the techniques used in such systems do not appear to generalize to more complex applications. While significant work has been done in developing theoretical underpinnings of systems that use knowledge and reasoning (e.g., development of models of linguistic interpretation using abductive reasoning, intention recognition, formal models of dialogue, formal models of lexical and utterance meaning, and utterance planning), it has often proved difficult to utilize such theories in robust working systems. Another major barrier has been the vast amount of linguistic and world knowledge needed. There is now significant progress in compiling the required knowledge, using manual, statistical and hybrid techniques. Yet even as these resources become available, we still lack some key conceptual and computational frameworks that will form the foundation for effective scalable natural language systems.

Workshop on Interdisciplinary Approaches to Speech Indexing and Retrieval

Thursday May 6, 2004
Paper submission deadline: February 9

For nearly a decade, speech indexing and retrieval have been a focus of research in two largely independent communities, one at the intersection of speech recognition and information retrieval, a second at the intersection of information science and digital libraries. Much has been learned, but there has been remarkably little crossover between the two communities. As a result, we know a lot about the technical requirements for audio indexing of broadcast materials, but the state of the art for many other types of spoken word collections still depends on manual indexing or on automated harvesting of associated text and metadata.

Higher-Level Linguistic and Other Knowledge for Automatic Speech Processing

Thursday May 6, 2004
Paper submission deadline: January 21

The theme of this workshop is the use of higher-level linguistic and other types of knowledge for automatic speech processing, especially, but not limited to, speech recognition (ASR). Most current state-of-the-art speech recognizers do not explicitly use linguistic information (with the exception of pronunciation dictionaries), relying mainly on information encoded in statistical N-gram language models. Higher-level linguistic processes such as prosody, syntax, semantics, and pragmatics are obviously important, but such information is typically harder to label, model, and integrate into the standard computational frameworks (such as hidden Markov models). In addition, high-level meta-information, such as personal information stored in a database or dialogue and pragmatic coherence constraints, can also play important roles. All these sources of information can potentially compensate for acoustic confusability resulting from noisy environments and unexpected channel and speaker mismatch, which are very challenging issues for automatic speech recognizers. Furthermore, high-level information is typically crucial when the ultimate goal is to interpret the spoken input (i.e., the same sequence of words can mean different things depending on prosodic and syntactic features, as well as pragmatic constraints). Speaker recognition is another field that has recently recognized the importance of higher-level linguistic features, due to the fact that speakers exhibit idiosyncratic prosodic, lexico-syntactic, and pragmatic patterns ("conversational biometrics").

Spoken Language Understanding for Conversational Systems

Friday May 7, 2004
Paper submission deadline: January 26

The success of a conversational system depends on a synergistic integration of technologies such as speech recognition, spoken language understanding (SLU), dialog modeling, natural language generation, speech synthesis and user interface design. In this workshop, we will address the SLU component of a conversational system and its relation to the speech recognizer and the dialog model. In particular, we aim to bring together techniques that address the issue of robustness of SLU to speech recognition errors, language variability and dysfluencies in speech with issues of representation that provide greater flexibility to the dialog model.

Linking Biological Literature, Ontologies and Databases: Tools for Users

Thursday May 6, 2004
Paper submission deadline: January 16

This workshop will bring together researchers from the fields of bioinformatics, natural language processing, ontologies, data mining, and information retrieval. Our focus will be on tools that can provide improved access and cross-indexing for the biomedical literature, databases and ontologies. We strongly encourage presentation of approaches that support end users and user-defined tasks. Biological databases have become increasingly important resources in this field. These databases contain a mix of data types, including sequence data (DNA and protein sequences), structured data such as molecular weights or GC content, and annotations in terms of controlled vocabularies and, increasingly, ontologies such as the Gene Ontology http://www.geneontology.org/, as well as free text data in comment fields. Many biological databases are manually curated, that is, constructed by PhD biologists who read the literature and encode the information contained in the literature in the appropriate fields of the database that they are building.



Last modified: Thu Mar 25 09:29:37 2004