Vol. 7, No. 3, August 2018. ISSN: 2217-8309
eISSN: 2217-8333


TEM Journal



Association for Information Communication Technology Education and Science

Analysis of Apache Logs Using Hadoop and Hive


Aleksandar Velinov, Zoran Zdravev


© 2018 Aleksandar Velinov, published by UIKTEN. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. (CC BY-NC-ND 4.0)


Citation Information: TEM Journal. Volume 7, Issue 3, Pages 645-650, ISSN 2217-8309, DOI: 10.18421/TEM73-22, August 2018.


Received: 09 March 2018
Accepted: 29 June 2018
Published: 27 August 2018




In this paper we analyze Apache web logs using the Cloudera Hadoop distribution, with Hive for querying the log data. We used publicly available web logs from the NASA Kennedy Space Center server. HDFS (Hadoop Distributed File System) served as the log store: the Apache web logs were copied into HDFS from the local file system. We analyzed the total number of hits, the number of unique IPs, the most common hosts that made requests to the NASA server in Florida, and the most common types of errors. We also examined how the execution time relates to the number of rows in the logs.
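The aggregations named in the abstract (total hits, unique hosts, most common hosts, most common errors) can be illustrated with a minimal sketch. This is not the authors' method, which relies on Hive queries over data stored in HDFS; it is a plain-Python stand-in that runs in memory. It assumes the logs follow the Apache Common Log Format, and the sample lines, host names, and the `summarize` helper are hypothetical and invented for illustration only.

```python
import re
from collections import Counter

# Assumed layout (Apache Common Log Format):
# host ident user [timestamp] "request" status bytes
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)$'
)


def summarize(lines):
    """Hypothetical helper: compute the aggregates listed in the abstract."""
    total_hits = 0
    hosts = Counter()
    errors = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match is None:
            continue  # skip rows that do not match the assumed format
        total_hits += 1
        hosts[match.group("host")] += 1
        status = int(match.group("status"))
        if status >= 400:  # treat 4xx and 5xx responses as errors
            errors[status] += 1
    return {
        "total_hits": total_hits,
        "unique_hosts": len(hosts),
        "top_hosts": hosts.most_common(3),
        "errors": errors.most_common(),
    }


# Invented sample rows, not taken from the NASA data set.
SAMPLE_LINES = [
    'host-a.example.com - - [01/Jul/1995:00:00:01 -0400] "GET /index.html HTTP/1.0" 200 1024',
    'host-b.example.com - - [01/Jul/1995:00:00:02 -0400] "GET /missing.html HTTP/1.0" 404 -',
    'host-a.example.com - - [01/Jul/1995:00:00:03 -0400] "GET /images/logo.gif HTTP/1.0" 200 512',
    'not a log line',
]

summary = summarize(SAMPLE_LINES)
print(summary)
```

A single pass with two counters keeps the sketch short. On log volumes of the size that motivate Hadoop, the same grouping and counting would be expressed as queries over a distributed table rather than an in-memory loop.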


Keywords – Logs, Hadoop, Hive, analysis.





