MADA


MADA is a full morphological tagger for Modern Standard Arabic developed by Nizar Habash and Owen Rambow.  A web page describing it (along with pointers to papers) can be found here.

We distribute MADA (along with the TOKAN component, which handles tokenization) free of charge for educational, research, and in-house uses.  However, MADA uses the Buckwalter Arabic Morphological Analyzer (BAMA-2), which is distributed by the Linguistic Data Consortium (LDC).  We do not distribute BAMA-2, and we cannot help you obtain it.  Unfortunately, if you cannot get BAMA-2 from the LDC, you cannot use MADA.  Note: BAMA-1 may also work with MADA, but we have not tested it.  Please let us know of your experience if you do use BAMA-1.

MADA was created for Linux.  It is probably possible to get it to work on other platforms.

To obtain MADA, follow these steps:

  1. Fill out this form and fax it to Giselle Garcia, +1 212 870 1285.  You can also scan the form and email it to gg2166<at>columbia.edu.  Please indicate if you wish to receive a fully signed version back by fax (and give a fax number).  Be sure your email address is legible on the form!
  2. Once we have your form, we will send you a pointer to a gzipped tar file.  There will be installation instructions, which also explain how to incorporate the BAMA-2 release from the LDC.

If you have any questions, please contact Owen Rambow (<last-name><at>ccls.columbia.edu).