Enhancing Arabic NLP: A Comparative Study of AI-Driven Text Preprocessing Tools
Abstract
Arabic is the first language of more than 300 million people. Several of its features, including multiple derivational forms, an effectively unlimited vocabulary, and diacritics, make it one of the more complex languages to process. Preprocessing is an essential step in preparing Arabic text for Natural Language Processing (NLP). This article presents a comparative study of several preprocessing tools for Arabic text. It explains the challenges of preprocessing Arabic and the techniques used in each tool. The authors followed PRISMA for reporting systematic reviews: they began by screening 200 articles and ended up including 30. An in-depth review of these articles shows that tools such as AMIRA, CAMeL, and NLP packages add value in text preprocessing. Most of the papers identify ambiguity in Arabic orthography and dialectal variation as the main challenges in Arabic NLP.
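To make the preprocessing step concrete, the following is a minimal Python sketch of two common normalization operations: stripping diacritics and unifying alef variants. It is an illustration only and does not reproduce the method of AMIRA, CAMeL, or any tool reviewed in the article. The function name `normalize_arabic` and the choice of normalization steps are assumptions made for this example.

```python
import re

# Arabic diacritics (tashkeel): U+064B (fathatan) through U+0652 (sukun),
# plus the superscript alef U+0670.
DIACRITICS = re.compile("[\u064B-\u0652\u0670]")

# Tatweel (kashida) is a purely typographic elongation character.
TATWEEL = "\u0640"

# Alef with hamza above, hamza below, and madda.
ALEF_VARIANTS = re.compile("[\u0623\u0625\u0622]")
BARE_ALEF = "\u0627"


def normalize_arabic(text: str) -> str:
    """Hypothetical helper: apply a few common, lossy normalization steps."""
    text = DIACRITICS.sub("", text)            # remove short-vowel marks
    text = text.replace(TATWEEL, "")           # remove elongation
    text = ALEF_VARIANTS.sub(BARE_ALEF, text)  # unify alef forms
    return text


if __name__ == "__main__":
    vocalized = "\u0643\u064E\u062A\u064E\u0628\u064E"  # "kataba" with fatha marks
    print(normalize_arabic(vocalized))
```

Steps like these are lossy: removing diacritics and merging alef forms reduces sparsity but also collapses words that differ only in those marks, which is one source of the orthographic ambiguity the abstract mentions.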