Jordan University of Science and Technology

A Survey on MapReduce Implementations

Authors: Amer Al-Badarneh, Amr Mohammad, Salah Harb

MapReduce, a distinguished and successful platform for parallel data processing, is gaining significant momentum in both academia and industry as the volume of data to capture, transform, and analyse grows rapidly. Although MapReduce is used in many applications to analyse large-scale data sets, there is still considerable debate among scientists and researchers about its efficiency, performance, and usability for supporting more classes of applications. This survey presents a comprehensive review of various implementations of the MapReduce framework. We first give an overview of the MapReduce programming model. We then describe the technical aspects of the most successful implementations of the MapReduce framework reported in the literature and discuss their main strengths and weaknesses. Finally, we conclude with a comparison of MapReduce implementations and a discussion of open issues and challenges in enhancing MapReduce.