Vol.13, No.4, November 2024.
ISSN: 2217-8309
eISSN: 2217-8333

 

TEM Journal

 

TECHNOLOGY, EDUCATION, MANAGEMENT, INFORMATICS

Association for Information Communication Technology Education and Science


Proposed Approach for Overcoming the Impact of Unbalanced Distribution in Predicting Students' Performance

 

Gabrijela Dimić, Ljiljana Pecić

 

© 2024 Gabrijela Dimić, published by UIKTEN. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. (CC BY-NC-ND 4.0)

 

Citation Information: TEM Journal. Volume 13, Issue 4, Pages 2839-2849, ISSN 2217-8309, DOI: 10.18421/TEM134-20, November 2024.

 

Received: 26 May 2024.

Revised: 23 September 2024.
Accepted: 08 November 2024.
Published: 27 November 2024.

 

Abstract:

 

The paper presents a method for mitigating the impact of an unbalanced distribution of multidimensional class features on the accuracy of grade prediction. For the case study, an educational dataset named APOD was created by integrating data from heterogeneous sources, and the input features and the multidimensional class feature were defined. The effectiveness of the Synthetic Minority Over-Sampling Technique (SMOTE) in handling data imbalance was explored using several classification methods. Three experiments were carried out to determine which algorithm performed best with respect to the minority class distribution. SMOTE with automatic minority class detection and a 100% sampling factor considerably improved model performance for four of the five classifiers tested. The primary objective of the study is to address the problem of predicting students' final grades when a small dataset causes data imbalance. Small datasets represent some classes with too few instances, which results in unreliable models that predict student success poorly. The proposed approach to applying SMOTE is based on an algorithm that identifies minority classes using a predetermined minimum number of samples per class. This approach enables the development of precise models for predicting students' final test results, even with small educational datasets. The contribution of the research lies in achieving greater accuracy in predicting students' final grades, regardless of dataset size and the presence of minority classes.
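
The idea summarized in the abstract (detect minority classes by a minimum sample count, then oversample them with SMOTE at a 100% factor) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `min_samples` threshold rule, the neighbour count `k`, and the toy data are all assumptions made for illustration, and the abstract does not say which tool or parameters beyond the 100% factor were used.

```python
import random
from collections import Counter, defaultdict
from math import dist


def find_minority_classes(labels, min_samples):
    """Return classes with fewer than `min_samples` instances (assumed rule)."""
    counts = Counter(labels)
    return [label for label, n in counts.items() if n < min_samples]


def smote_class(samples, percentage=100, k=5, rng=None):
    """Create synthetic points for one class by interpolating towards neighbours."""
    rng = rng or random.Random(0)
    if len(samples) < 2:
        return []  # interpolation needs at least one same-class neighbour
    n_new = len(samples) * percentage // 100  # 100% doubles the class size
    synthetic = []
    for i in range(n_new):
        idx = i % len(samples)
        base = samples[idx]
        # k nearest same-class neighbours of `base`, excluding `base` itself
        others = [s for j, s in enumerate(samples) if j != idx]
        neighbours = sorted(others, key=lambda s: dist(base, s))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # random position on the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


def oversample(X, y, min_samples, percentage=100, k=5, seed=0):
    """Append SMOTE samples for every class detected as a minority class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(tuple(row))
    X_out, y_out = [tuple(row) for row in X], list(y)
    for label in find_minority_classes(y, min_samples):
        new_rows = smote_class(by_class[label], percentage, k, rng)
        X_out.extend(new_rows)
        y_out.extend([label] * len(new_rows))
    return X_out, y_out
```

For example, calling `oversample(X, y, min_samples=3)` on a dataset where one class has only two instances would add two synthetic instances of that class and leave the other classes unchanged. The sketch handles numeric features only; nominal features would need a different distance and interpolation rule.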

 

Keywords – Classification, SMOTE, unbalanced distribution, machine learning, educational data mining.

 


 


Copyright © 2024 UIKTEN
Copyright licence: All articles are licensed under the Creative Commons CC BY-NC-ND 4.0 licence