Vol. 8, No. 4, November 2019.
ISSN: 2217-8309
eISSN: 2217-8333


TEM Journal



Association for Information Communication Technology Education and Science

A Hybrid Model for Near-Duplicate Image Detection in MapReduce Environment


Nadiah Yusof, Amirah Ismail, Nazatul Aini Abd Majid


© 2019 Amirah Ismail, published by UIKTEN. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. (CC BY-NC-ND 4.0)


Citation Information: TEM Journal. Volume 8, Issue 4, Pages 1252-1258, ISSN 2217-8309, DOI: 10.18421/TEM84-21, November 2019.


Received: 16 August 2019.

Revised:   02 November 2019.
Accepted:  09 November 2019.
Published: 30 November 2019.




It has been shown that large-scale image datasets are difficult to handle in content-based image retrieval (CBIR), as current CBIR strategies may struggle to process them. In addition, near-duplicate images consume storage space that could otherwise be used for other, unique images. To address these problems, MapReduce has been used to speed up the filtering of near-duplicate images. However, accuracy in detecting near-duplicate images is still lacking. This study finds that image feature extraction by means of Principal Component Analysis (PCA), a technique based on the matrix representation of an image, can improve similarity detection. The PCA approach nevertheless needs to be enhanced, because it falls short in extracting features from images of Songket motifs. Therefore, this study proposes a new hybrid model that integrates PCA with MapReduce for image feature extraction and clustering of large-scale image datasets in a cloud environment. The study employs a qualitative experimental design model with three iterative phases: first, analysis and design; second, development; and third, testing and evaluation. This study, however, focuses only on the analysis and design phase. The empirical phase is followed by the design of the algorithm and model based on the results of the literature review. The expected results of this study are a proposed model that extracts the principal components of a large-scale image dataset using PCA, and a reduction in the time needed to filter the images in a MapReduce environment.
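The abstract does not give the algorithm itself, so the following is only a minimal sketch of how PCA and a MapReduce-style computation might be combined for near-duplicate filtering. It is not the authors' model. All function names, the toy feature vectors, the threshold value, and the use of a single principal component (found by power iteration) are assumptions made for illustration. The map step computes partial statistics per partition, the reduce step sums them, and images whose projections onto the leading component lie close together are reported as candidate near-duplicates.

```python
from functools import reduce
from itertools import combinations
import math


def map_partition(vectors):
    """Map step (hypothetical): partial count, sum, and sum of outer products."""
    d = len(vectors[0])
    total = [0.0] * d
    outer = [[0.0] * d for _ in range(d)]
    for v in vectors:
        for i in range(d):
            total[i] += v[i]
            for j in range(d):
                outer[i][j] += v[i] * v[j]
    return len(vectors), total, outer


def reduce_stats(a, b):
    """Reduce step (hypothetical): add two partial statistics element-wise."""
    n = a[0] + b[0]
    total = [x + y for x, y in zip(a[1], b[1])]
    outer = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a[2], b[2])]
    return n, total, outer


def covariance(stats):
    """Mean vector and covariance matrix from the reduced statistics."""
    n, total, outer = stats
    mean = [t / n for t in total]
    d = len(mean)
    cov = [[outer[i][j] / n - mean[i] * mean[j] for j in range(d)]
           for i in range(d)]
    return mean, cov


def top_component(cov, iters=200):
    """Leading eigenvector of cov by power iteration.

    Assumes the start vector is not orthogonal to the leading eigenvector.
    """
    d = len(cov)
    start_norm = math.sqrt(sum((i + 1) ** 2 for i in range(d)))
    w = [(i + 1) / start_norm for i in range(d)]
    for _ in range(iters):
        nxt = [sum(cov[i][j] * w[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            return w  # degenerate covariance: keep the previous estimate
        w = [x / norm for x in nxt]
    return w


def near_duplicates(vectors, mean, w, threshold):
    """Pairs of indices whose 1-D PCA projections differ by at most threshold."""
    scores = [sum((x - m) * c for x, m, c in zip(v, mean, w)) for v in vectors]
    return [(i, j) for i, j in combinations(range(len(vectors)), 2)
            if abs(scores[i] - scores[j]) <= threshold]


# Toy 4-dimensional feature vectors (illustrative values, not real image data).
images = [
    [10, 10, 200, 200],
    [11, 10, 199, 201],   # intended near-duplicate of image 0
    [200, 190, 20, 10],
    [90, 100, 100, 110],
]
partitions = [images[:2], images[2:]]  # stands in for data split across workers

stats = reduce(reduce_stats, map(map_partition, partitions))
mean, cov = covariance(stats)
w = top_component(cov)
pairs = near_duplicates(images, mean, w, threshold=5.0)
print(pairs)
```

Because the partial statistics are simple sums, they can be computed independently on each partition and combined in any order, which is what makes the covariance step fit the MapReduce pattern. A single component is used here only for brevity; two unrelated images can share the same one-dimensional projection, so a practical system would likely keep several components and compare distances in that reduced space.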


Keywords – Image, Image Retrieval, Near-Duplicate, PCA, Geometric, MapReduce.





