International Journal of Academic Research in Business and Social Sciences

A Thematic Review on Predicting Student Performance Using Data Mining

Open access
Student performance, a key criterion for academic placement and success, can be predicted and potentially improved using data mining techniques. However, more studies are needed that examine the patterns and methods used to forecast student performance in this context. This study presents a thematic review of the literature published between 2016 and 2022 to highlight prevailing trends and identify significant gaps in this rapidly growing research domain. The review finds that the number of related publications peaked in 2021, with a clear focus on higher education institutions. Two factors recurred across these studies: students' academic records and demographic information, both widely recognized as significant determinants of academic success. This observation reaffirms previous research asserting the substantial role of academic records in predicting student performance. Our analysis expands the scope of prior literature reviews by incorporating studies from the Mendeley and ScienceDirect databases. We found a notable research gap in predicting performance in Malaysian secondary schools using locally relevant datasets, and further work in this niche could benefit that geographical context. The review also identified a need to investigate other variables that may affect student performance, which could improve both understanding and predictive accuracy. Consequently, this study emphasizes the importance of continuing research in this field, integrating varied datasets, geographical contexts, and potential predictors to better anticipate and address student performance across educational settings.
