Jordan University of Science and Technology

A framework for retrieving Arabic documents based on queries written in Arabic slang language


Authors:  Mohammed Q Shatnawi, Muneer Bani Yassein, Reem Mahafza

Abstract:  
Due to the widespread use of the internet, large amounts of information and documents are available in many languages. Arabic is one of the important languages among them in terms of both usage and structure. Search engines such as Google and Yahoo support searching in Arabic, yet they fail to return good results when slang terms are used in the query. Arabic also presents difficulties of its own. The main goal of this research is to improve Arabic text-based search when queries contain Arabic slang terms. This research proposes a framework that lets users write queries in their slang language and retrieve the relevant documents that have been posted in both forms, slang and classical. The framework is designed and implemented based on a context-free grammar that maps the user's slang queries to the equivalent classical ones. On a classical dataset, results showed a 3% improvement in the average values of precision, recall, and F-measure when using classical-based queries rather than slang-based ones. Using slang-based queries gives a 13% improvement in the average values of these measures on a slang dataset and a 7% improvement on a hybrid dataset.
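
For readers unfamiliar with the three measures reported above, the following is a minimal sketch of how precision, recall, and F-measure are commonly computed for a single query. It is not the authors' evaluation code: the function name, the document identifiers, and the use of the balanced F-measure (F1) are assumptions made for illustration only.

```python
def precision_recall_f1(retrieved, relevant):
    """Return (precision, recall, F1) for one query.

    retrieved: document ids returned by the search system (hypothetical ids)
    relevant:  document ids judged relevant for the query (hypothetical ids)
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved

    # Guard against empty sets so the function never divides by zero.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0

    # Balanced F-measure: harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative data only, not taken from the study's datasets.
p, r, f = precision_recall_f1(["d1", "d2", "d3", "d4"], ["d2", "d4", "d5"])
print(f"precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

Averaging these per-query values over a set of queries gives average figures of the kind the abstract compares across the classical, slang, and hybrid datasets.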