Predictive model based on logistic regression for student dropout
Keywords:
logistic regression, student dropout, machine learning, SMOTE, higher education
Abstract
Student dropout is a significant problem in higher education, with both social and academic consequences. This study presents the design of a predictive model for identifying students at risk of dropping out. The model is based on logistic regression, a machine learning technique that models the relationship between a set of independent variables and the probability of a binary event. A public dataset containing academic and sociodemographic variables was used. The SMOTE oversampling technique was applied to balance the classes, and the resulting model achieved a prediction accuracy of 73%. The proposed model is envisioned as a technological tool that could be adapted, using real data from the Faculty of Computer Engineering at the Technological University of Havana “José Antonio Echeverría” (Cujae), to help prevent student dropout and optimize educational resources.
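As an illustration of the two techniques named in the abstract, the following is a minimal sketch in plain Python. It is not the study's implementation: the two features (a normalized grade average and a fraction of failed courses), the toy values, and all function names are assumptions made up for this example, and the oversampling step is a simplified SMOTE-style interpolation between minority-class neighbours rather than a library implementation.

```python
import math
import random

def smote_like(minority, n_new, k=2, seed=0):
    """Simplified SMOTE-style oversampling: each synthetic point lies on the
    segment between a minority sample and one of its k nearest minority
    neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: math.dist(p, x))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    """Estimated probability of the positive class (dropout)."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Hypothetical toy data: [normalized grade average, fraction of failed courses].
stay = [[0.90, 0.00], [0.80, 0.10], [0.85, 0.05], [0.70, 0.10],
        [0.75, 0.20], [0.95, 0.00], [0.80, 0.00], [0.65, 0.15]]
drop = [[0.30, 0.60], [0.40, 0.50], [0.20, 0.70]]  # minority class

# Oversample the minority class until both classes have the same size.
synthetic = smote_like(drop, n_new=len(stay) - len(drop))
X = stay + drop + synthetic
y = [0] * len(stay) + [1] * (len(drop) + len(synthetic))

w, b = fit_logistic(X, y)
p_at_risk = predict_proba(w, b, [0.25, 0.65])  # low grades, many failures
p_safe = predict_proba(w, b, [0.90, 0.00])     # high grades, no failures
print(f"P(dropout | at-risk profile) = {p_at_risk:.2f}")
print(f"P(dropout | safe profile)    = {p_safe:.2f}")
```

In practice, oversampling would be applied only to the training split, so that synthetic samples do not leak into the data used to measure accuracy.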