MADA
MADA is a full morphological tagger for Modern Standard Arabic
developed by Nizar Habash and Owen Rambow. A web page describing it
(along with pointers to papers)
can be found here.
We distribute MADA (along with the TOKAN component which handles tokenization) free of charge for educational, research, and
in-house uses. However, MADA uses the Buckwalter
Arabic Morphological Analyzer (BAMA-2), which is distributed by the
Linguistic Data Consortium (LDC). We do not distribute
BAMA-2, and we cannot help you obtain it. If you cannot get
BAMA-2 from the LDC, you unfortunately cannot use MADA. Note: BAMA-1 may
also work with MADA, but we have not tested it. Please let us know of your
experience if you do use BAMA-1.
MADA was developed for Linux. It can probably be made to
work on other platforms.
To obtain MADA, follow these steps:
- Fill out this
form and fax it to Giselle Garcia, +1 212 870 1285. You can
also scan the form and email it to gg2166<at>columbia.edu.
Please indicate if you wish to receive a fully signed version back by
fax (give a fax number). Be sure your email address is legible on
the form!
- Once we have your form, we will send you a pointer to a gzipped
tar file. There will be installation instructions, which also
explain how to incorporate the BAMA-2 release from the LDC.
If you have any questions, please contact Owen Rambow
(<last-name><at>ccls.columbia.edu).