Application of Machine Learning to Predict and Explain University Academic Performance (original title: Aplicación de Machine Learning para predecir y explicar el rendimiento académico universitario).
In education, academic performance reflects the outcomes of evaluation processes and is tied to students’ learning achievements. Identifying the factors that influence performance early allows timely interventions to prevent course repetition and student dropout. This study therefore applied machine learning models to predict and explain academic performance, focusing on students with a history of failing at least one course. It used a quantitative approach with a non-experimental, ex post facto design, based on a population of 12,211 university students. Data came from two sources: a 32-item questionnaire, linked to the student enrollment system, covering sociodemographic, socioeconomic, emotional, institutional-academic, self-efficacy, and self-control aspects; and an institutional database with seven academic variables. Three supervised classification algorithms were trained: Random Forest, XGBoost, and CatBoost. The SHAP method was used to interpret the model’s outputs. Data processing and analysis were conducted in Python in the Google Colab environment. CatBoost performed best, achieving 70% recall for the “failed” class. The most influential indicators were faculty, academic program, academic level or cycle, emotional state, teacher support, and previous academic performance. The study concludes that academic failure is influenced primarily by institutional-academic variables, followed by emotional, sociodemographic, and socioeconomic factors, and highlights interpretable machine learning (SHAP) as a tool to support educational decision-making. [ABSTRACT FROM AUTHOR]
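The headline metric in the abstract is recall for the “failed” class, that is, the share of students who actually failed that the model flagged. The following is a minimal sketch of that calculation in plain Python. The function name `recall_for_class` and the label lists are hypothetical and made up for illustration; they are not the study's code or data.

```python
from typing import Sequence


def recall_for_class(y_true: Sequence[str], y_pred: Sequence[str], positive: str) -> float:
    """Recall = TP / (TP + FN) for the class named `positive`."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # True positives: actual `positive` cases the model also labeled `positive`.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    # False negatives: actual `positive` cases the model missed.
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError(f"no true instances of class {positive!r}")
    return tp / (tp + fn)


# Made-up labels for illustration only (assumption, not data from the study):
# 10 students who failed and 10 who passed.
y_true = ["failed"] * 10 + ["passed"] * 10
# The hypothetical model catches 7 of the 10 failures and raises 2 false alarms.
y_pred = ["failed"] * 7 + ["passed"] * 3 + ["passed"] * 8 + ["failed"] * 2

failed_recall = recall_for_class(y_true, y_pred, "failed")
print(f"recall (failed) = {failed_recall:.2f}")
```

Recall on the “failed” class is a natural choice when the goal is early intervention, since a missed at-risk student (a false negative) is typically costlier than a false alarm.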
Copyright of Comunicar is the property of Oxbridge Publishing House.