Cost prediction for work and human development education (ETDH) programs in Colombia: a machine learning and efficiency analysis approach
| dc.date.created | 2025 | |
| dc.description.abstract | In many emerging economies, technical and vocational programmes bear a disproportionate share of the responsibility for expanding opportunities; however, the way their costs are structured remains largely opaque. In Work and Human Development Education (ETDH) in Colombia, institutions, prospective students, and policy makers must make decisions in a context where tuition fees are visible but the underlying cost structures are not. This study addresses that gap by building a cost prediction model for ETDH programmes, using administrative data from the Information System for Work and Human Development Education (SIETDH). Following the CRISP-DM framework and combining efficiency analysis with machine learning, we analyse fifteen variables grouped into nine categories and estimate several predictive models. Three algorithms (K-Nearest Neighbors (KNN), Random Forest, and XGBoost) were compared using K-fold cross-validation. XGBoost offers the best performance, with a coefficient of determination (R²) of 0.3879 and a mean absolute percentage error (MAPE) of 15.2%, clearly outperforming traditional linear regression. The model highlights five variables as the main drivers of programme costs: duration, quality certifications, geographic location, performance area, and institutional characteristics. Rather than treating these results as a purely technical achievement, we interpret them as a way to bring transparency to cost structures in a sector that serves many students with limited financial margins. Institutional leaders can use the model to support resource allocation and pricing decisions; prospective students gain a clearer basis for assessing costs against benefits; and policy makers obtain empirical inputs for designing more transparent, evidence-based financing strategies, provided that the model's results complement, rather than replace, broader educational and equity considerations. | |
| dc.description.abstractenglish | In many emerging economies, technical and vocational programmes carry a disproportionate share of the responsibility for expanding opportunities, yet the way their costs are formed remains largely opaque. For Work and Human Development Education (ETDH) in Colombia, institutions, prospective students, and policy makers must make decisions in a context where tuition fees are visible but the underlying cost structures are not. This study addresses that gap by building a cost prediction model for ETDH programmes using administrative data from the Information System for Work and Human Development Education (SIETDH). Following the CRISP-DM framework, and combining efficiency analysis with machine learning, we analyse fifteen variables grouped into nine categories and estimate several predictive models. Three algorithms (K-Nearest Neighbors, Random Forest, and XGBoost) are compared using K-fold cross-validation. XGBoost delivers the best performance, with a coefficient of determination (R²) of 0.3879 and a mean absolute percentage error (MAPE) of 15.2%, clearly improving on traditional linear regression. The model highlights five variables as key drivers of programme costs: duration, quality certifications, geographic location, performance area, and institutional characteristics. Rather than treating these results as a purely technical achievement, we interpret them as a way to make cost structures more transparent in a sector that serves many students with limited financial margins. Institutional leaders can use the model to support resource allocation and pricing decisions; prospective students gain a clearer basis for cost–benefit assessments; and policy makers obtain empirical inputs for designing more transparent and evidence-based financing strategies, provided that model outputs complement, rather than replace, broader educational and equity considerations. | |
| dc.format.extent | 22 pages | |
| dc.format.mimetype | application/pdf | |
| dc.identifier.uri | https://hdl.handle.net/20.500.12010/38729 | |
| dc.language.iso | en | |
| dc.relation.references | Guzmán Rincón, A.; Barragán Moreno, S.; Cala-Vitery, F. Rural Population and COVID-19: A Model for Assessing the Economic Effects of Drop-Out in Higher Education. Frontiers in Education 2021, 6, 1–14. | |
| dc.relation.references | Vidal, J.; Gilar-Corbí, R.; Pozo-Rico, T.; Castejón, J.L.; Sánchez-Almeida, T. Predictors of University Attrition: Looking for an Equitable and Sustainable Higher Education. Sustainability 2022, 14(3), 1187. | |
| dc.relation.references | Tayefeh Hashemi, S.; Ebadati, O.M.; Kaur, H. Cost Estimation and Prediction in Construction Projects: A Systematic Review on Machine Learning Techniques. SN Applied Sciences 2020, 2(10), 1703. | |
| dc.relation.references | Contreras Cueva, A.B.; Bolancé Losilla, C. Aplicación de los modelos probabilísticos para analizar la eficiencia de las universidades. Revista de Métodos Cuantitativos para la Economía y la Empresa 2007, 4, 97–118. | |
| dc.relation.references | Visbal-Cadavid, D.; Mendoza Mendoza, A.; Quintero Hoyos, I. Prediction of Efficiency in Colombian Higher Education Institutions with Data Envelopment Analysis and Neural Networks. Pesquisa Operacional 2019, 39(2), 261–275. | |
| dc.relation.references | Locatelli, R.L.; dos Santos, R.B.; Ramalho, W.; Cunha, G.R. Dinâmica de custos de uma instituição de ensino: Modelo, cálculo da inflação interna e simulações. Revista de Educação PUC-Campinas 2019, 24(1), 81–95. | |
| dc.relation.references | Gallegos Talavera, T.Y.; Altamirano, L.A.; Altamirano Salazar, W.A. El impacto de la inclusión financiera, la pobreza y el desempleo en el acceso a la educación superior del Ecuador: Un modelo de financiamiento. Revista Conectividad 2023, 5(1), 45–62. | |
| dc.relation.references | Visbal Cadavid, D.A.; Mendoza, A.; Pedraza, S. Predicting the Efficiency of Colombian Higher Education Institutions with Data Envelopment Analysis and Data Mining. In Proceedings of the 16th LACCEI International Multi-Conference for Engineering, Education, and Technology; LACCEI: Boca Raton, FL, USA, 2017; pp. 1–8. | |
| dc.relation.references | Visbal Cadavid, D.A.; Mendoza, A.; Orjuela Pedraza, S.J. Predicción de la eficiencia de las instituciones de educación superior colombianas con análisis envolvente de datos y minería de datos. Revista Educación en Ingeniería 2017, 12(24), 55–62. | |
| dc.relation.references | Mideros, A. A Cost-Effectiveness Analysis of Social Transfers on Human Capital Accumulation. Problemas del Desarrollo. Revista Latinoamericana de Economía 2021, 52(205), 57–86. | |
| dc.relation.references | Johnes, G.; Johnes, J. Higher Education Institutions’ Costs and Efficiency: Taking the Decomposition a Further Step. Economics of Education Review 2009, 28(1), 107–113. | |
| dc.relation.references | Izadi, H.; Johnes, G.; Oskrochi, R.; Crouchley, R. Stochastic Frontier Estimation of a CES Cost Function: The Case of Higher Education in Britain. Economics of Education Review 2002, 21(1), 63–71. | |
| dc.relation.references | Johnes, G. Costs and Industrial Structure in Contemporary British Higher Education. The Economic Journal 1997, 107(442), 727–737. | |
| dc.relation.references | Glass, J.C.; McKillop, D.G.; Hyndman, N. Efficiency in the Provision of University Teaching and Research: An Empirical Analysis of UK Universities. Journal of Applied Econometrics 1995, 10(1), 61–72. | |
| dc.relation.references | Thanassoulis, E.; Kortelainen, M.; Johnes, G.; Johnes, J. Costs and Efficiency of Higher Education Institutions in England: A DEA Analysis. Journal of the Operational Research Society 2011, 62(7), 1282–1297. | |
| dc.relation.references | Gralka, S. Persistent Inefficiency in the Higher Education Sector: Evidence from Germany. Education Economics 2018, 26(4), 373–392. | |
| dc.relation.references | Ferro, G. Higher Education Efficiency Frontier Analysis: A Review of Variables to Consider. Journal on Efficiency and Responsibility in Education and Science (ERIES) 2020, 13(1), 5–19. | |
| dc.relation.references | Romero, C.; Ventura, S. Educational Data Mining and Learning Analytics: An Updated Survey. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 2020, 10(3), e1355. | |
| dc.relation.references | Delen, D. A Comparative Analysis of Machine Learning Techniques for Student Retention Management. Decision Support Systems 2010, 49(4), 498–506. | |
| dc.relation.references | Jayaprakash, S.M.; Moody, E.W.; Lauría, E.J.M.; Regan, J.R.; Baron, J.D. Early Alert of Academically At-Risk Students: An Open Source Analytics Initiative. Journal of Learning Analytics 2014, 1(1), 6–47. | |
| dc.relation.references | Tomašević, N.; Gvozdenović, N.; Vraneš, S. An Overview and Comparison of Supervised Data Mining Techniques for Student Exam Performance Prediction. Computers & Education 2013, 63, 202–217. | |
| dc.relation.references | Waheed, H.; Hassan, S.-U.; Aljohani, N.R.; Hardman, J.; Alelyani, S.; Nawaz, R. Predicting Academic Performance of Students from VLE Big Data Using Deep Learning Models. Computers in Human Behavior 2020, 104, 106189. | |
| dc.relation.references | Atchoarena, D.; Delluc, A.M. Revisiting Technical and Vocational Education and Training in Sub-Saharan Africa; International Institute for Educational Planning, UNESCO: Paris, France, 2001. | |
| dc.relation.references | UNESCO. Strategy for Technical and Vocational Education and Training (TVET) 2016–2021; United Nations Educational, Scientific and Cultural Organization: Paris, France, 2016. | |
| dc.relation.references | OECD; ECLAC; CAF. Latin American Economic Outlook 2015: Education, Skills and Innovation for Development; Organisation for Economic Co-operation and Development: Paris, France, 2015. | |
| dc.relation.references | UNESCO. Unleashing the Potential: Transforming Technical and Vocational Education and Training; UNESCO Publishing: Paris, France, 2015. | |
| dc.relation.references | Inter-American Development Bank (IDB). Skills for the 21st Century in Latin America and the Caribbean; Inter-American Development Bank: Washington, DC, USA, 2014. | |
| dc.relation.references | World Bank. Skills Development in Sub-Saharan Africa; World Bank: Washington, DC, USA, 2004. | |
| dc.relation.references | Creemers, B.P.M.; Kyriakides, L. The Dynamics of Educational Effectiveness: A Contribution to Policy, Practice and Theory in Contemporary Schools; Routledge: London, UK, 2008. | |
| dc.relation.references | Teddlie, C.; Reynolds, D. The International Handbook of School Effectiveness Research; Falmer Press: London, UK, 2000. | |
| dc.relation.references | Muijs, D.; Reynolds, D. Effective Teaching: Evidence and Practice, 3rd ed.; SAGE: London, UK, 2011. | |
| dc.relation.references | Scheerens, J.; Bosker, R.J. The Foundations of Educational Effectiveness; Pergamon: Oxford, UK, 1997. | |
| dc.relation.references | Muijs, D.; Kyriakides, L.; van der Werf, G.; Creemers, B.P.M.; Timperley, H.; Earl, L. State of the Art—Teacher Effectiveness and Professional Learning. School Effectiveness and School Improvement 2014, 25(2), 231–256. | |
| dc.subject | Cost prediction | |
| dc.subject | Machine learning | |
| dc.subject | Educational analytics | |
| dc.subject | ETDH programs | |
| dc.subject | XGBoost | |
| dc.subject | Resource optimization | |
| dc.subject | CRISP-DM methodology | |
| dc.subject | Technical education | |
| dc.subject.keyword | Cost prediction | |
| dc.subject.keyword | Machine learning | |
| dc.subject.keyword | Educational analytics | |
| dc.subject.keyword | ETDH programs | |
| dc.subject.keyword | XGBoost | |
| dc.subject.keyword | Resource optimization | |
| dc.subject.keyword | CRISP-DM methodology | |
| dc.subject.keyword | Technical education | |
| dc.subject.lemb | Education for work | |
| dc.subject.lemb | Data mining - Education | |
| dc.subject.lemb | Educational planning - Decision making | |
| dc.title | Cost prediction for work and human development education (ETDH) programs in Colombia: a machine learning and efficiency analysis approach | |
| dc.type.coar | http://purl.org/coar/resource_type/c_6501 |
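
The abstract reports that the candidate models were compared with K-fold cross-validation and scored by R² and MAPE. The sketch below illustrates that evaluation loop only; it is not the authors' pipeline. It assumes a hand-written K-Nearest Neighbors regressor (chosen because it needs nothing beyond the Python standard library; Random Forest and XGBoost are not implemented here), and the data are synthetic stand-ins, not SIETDH records. All function names are hypothetical.

```python
import math
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (targets must be non-zero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def knn_predict(train_X, train_y, x, k):
    """Average the targets of the k training rows closest to x (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda row: math.dist(row[0], x))[:k]
    return sum(target for _, target in nearest) / len(nearest)


def k_fold_indices(n, n_folds, seed=0):
    """Shuffle the indices 0..n-1 and split them into n_folds disjoint test folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def cross_validate_knn(X, y, k=5, n_folds=5, seed=0):
    """Return one (R2, MAPE) pair per fold for the KNN regressor."""
    scores = []
    for test_idx in k_fold_indices(len(X), n_folds, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        y_true = [y[i] for i in test_idx]
        y_pred = [knn_predict(train_X, train_y, X[i], k) for i in test_idx]
        scores.append((r2_score(y_true, y_pred), mape(y_true, y_pred)))
    return scores


# Synthetic, made-up data (an assumption for illustration): two numeric
# features and a positive "cost" target with Gaussian noise.
rng = random.Random(42)
X = [(rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0)) for _ in range(200)]
y = [100.0 + 50.0 * a + 20.0 * b + rng.gauss(0.0, 5.0) for a, b in X]

scores = cross_validate_knn(X, y, k=5, n_folds=5)
mean_r2 = sum(s[0] for s in scores) / len(scores)
mean_mape = sum(s[1] for s in scores) / len(scores)
print(f"mean R2 = {mean_r2:.3f}, mean MAPE = {mean_mape:.2f}%")
```

Averaging the per-fold scores gives one R² and one MAPE per model, which is the kind of figure the abstract quotes; repeating the loop with a different regressor in place of `knn_predict` would allow the same side-by-side comparison.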
Files
Original bundle
- Name: 0__Predictive_Cost_Modeling_for_Work_and_Human_Development_Education.pdf
- Size: 958.16 KB
- Format: Adobe Portable Document Format
- Description: Thesis
