Resilience of LLMs to Prompt Injection in the Context of the Colombian Electrical Technical Regulatory Framework

dc.contributor.advisorGalpin, Ixent
dc.contributor.advisorRiascos, Javier
dc.creatorMora Martínez, Erick Giovanni
dc.date.accessioned2026-01-15T14:21:24Z
dc.date.created2026-01-09
dc.description.abstractEvaluamos la resiliencia de tres LLM comerciales: Gemini 2.5 Pro, gpt-4o y DeepSeek Reasoner, frente a ataques override y role-playing usando un sistema RAG reproducible sobre un caso aplicable en el marco regulatorio y técnico eléctrico (RETIE), con 504 respuestas (168/modelo). Medimos Precisión Normativa (PN), Adhesión a la Inyección (AI) y Resiliencia Simple (SR=PN×AI), y reportamos un Weighted Resilience Score (WRS) que pondera por impacto la combinación de los ataques. Observamos WRS similares para gpt-4o (47.76) y DeepSeek (47.32), y menor para Gemini (40.17). No obstante, los modos de fallo difieren: gpt-4o y Gemini ceden con engaño sutil (AI=0.5) manteniendo PN relativamente alta, mientras DeepSeek, cuando falla, lo hace con mayor severidad. Nuestros resultados sugieren que no basta con medir la resiliencia; es clave analizar cómo fallan los modelos, especialmente en dominios regulados.
dc.description.abstractenglishWe evaluate the resilience of three commercial LLMs: Gemini 2.5 Pro, gpt-4o, and DeepSeek Reasoner, against override and role-playing attacks using a reproducible RAG system on a case applicable to the electrical technical regulation (RETIE), analyzing 504 responses (168/model). We measure Normative Precision (NP), Injection Adherence (IA), and Simple Resilience (SR=NP×IA), and report a Weighted Resilience Score (WRS) that weights the combined impact of the attacks. We observe similar WRS values for gpt-4o (47.76) and DeepSeek (47.32), and a lower score for Gemini (40.17). However, failure modes differ: gpt-4o and Gemini yield to subtle deception (IA=0.5) while maintaining relatively high NP, whereas DeepSeek, when it fails, does so with greater severity. Our results suggest that measuring resilience is not enough; it is crucial to analyze how models fail, especially in regulated domains.
dc.format.extent19 pages
dc.format.mimetypeapplication/pdf
dc.identifier.urihttps://hdl.handle.net/20.500.12010/38790
dc.language.isoes
dc.relation.referencesAmjad, F., Korótko, T., Rosin, A.: Review of LLMs applications in electrical power and energy systems. IEEE Access 13, 150951–150969 (2025) https://doi.org/10.1109/ACCESS.2025.3599922
dc.relation.referencesBernadić, A., Kujundžić, G., Primorac, I.: Large language models in power systems: Enhancing control and decision-making. International Journal of Innovative Solutions in Engineering 1(1), 10–17 (2025) https://doi.org/10.47960/3029-3200.2025.1.1.10
dc.relation.referencesRuan, J., Liang, G., Zhao, H., Liu, G., Sun, X., Qiu, J., Xu, Z., Wen, F., Dong, Z.Y.: Applying large language models to power systems: Potential security threats. IEEE Transactions on Smart Grid 15(3), 3333–3336 (2024) https://doi.org/10.1109/TSG.2024.3373256
dc.relation.referencesHeverin, T., Benjamin, V., Braca, E., Carter, I., Kanchwala, H., Khojasteh, N., Landow, C., Luo, Y., Ma, C., Magarelli, A., Mirin, R., Moyer, A., Simpson, K., Skawinski, A.: Systematically analysing prompt injection vulnerabilities in diverse llm architectures. Technical report, The Baldwin School (2025)
dc.relation.referencesWorld Economic Forum: Global cybersecurity outlook 2025. Technical report, World Economic Forum (January 2025)
dc.relation.referencesYip, D.W., Esmradi, A., Chan, C.F.: A novel evaluation framework for assessing resilience against prompt injection attacks in large language models (2023) https://doi.org/10.1109/CSDE59766.2023.10487667
dc.relation.referencesZhang, C., Jin, M., Yu, Q., Liu, C., Xue, H., Jin, X.: Goal-guided generative prompt injection attack on large language models. arXiv preprint arXiv:2404.07234 (2024)
dc.relation.referencesMinisterio de Minas y Energía: Reglamento Técnico de Instalaciones Eléctricas - RETIE. https://www.minenergia.gov.co/es/misional/energia-electrica-2/reglamentos-tecnicos/reglamento-t%C3%A9cnico-de-instalaciones-el%C3%A9ctricas-retie/
dc.relation.referencesLi, M.Q., Fung, B.C.M.: Security concerns for large language models: A survey. arXiv preprint arXiv:2505.18889 (2025)
dc.relation.referencesMomcilovic, T.B., Balta, D., Buesser, B., Zizzo, G., Purcell, M.: Developing assurance cases for adversarial robustness and regulatory compliance in llms. arXiv preprint arXiv:2410.05304 (2024)
dc.relation.referencesJones, N., Whaiduzzaman, M., Jan, T., Adel, A., Alazab, A., Alkreisat, A.: A cia triad-based taxonomy of prompt attacks on large language models. Future Internet 17 (2025) https://doi.org/10.3390/fi17030113
dc.relation.referencesPingua, B., Murmu, D., Kandpal, M., Rautaray, J., Mishra, P., Barik, R.K., Saikia, M.J.: Mitigating adversarial manipulation in llms: a prompt-based approach to counter jailbreak attacks (prompt-g). Zenodo (2024) https://doi.org/10.5281/zenodo.13501821
dc.relation.referencesKondamani, A.: RAG Poisoning: An Emerging Threat in AI Systems. https://medium.com/nfactor-technologies/rag-poisoning-an-emerging-threat-in-ai-systems-660f9ff279f9 (2024)
dc.relation.referencesZou, W., Geng, R., Wang, B., Jia, J.: Poisoned rag: Knowledge corruption attacks to retrieval-augmented generation of large language models. arXiv preprint arXiv:2402.07867 (2024)
dc.relation.referencesZhang, Y., Li, Q., Du, T., Zhang, X., Zhao, X., Feng, Z., Yin, J.: Hijackrag: Hijacking attacks against retrieval-augmented large language models. arXiv preprint arXiv:2410.22832 (2024)
dc.relation.referencesAn, B., Zhang, S., Dredze, M.: Rag llms are not safer: A safety analysis of retrieval-augmented generation for large language models. arXiv preprint arXiv:2504.18041 (2025)
dc.relation.referencesSiino, M., Falco, M., Croce, D., Rosso, P.: Exploring llms applications in law: A literature review on current legal nlp approaches. IEEE Access (2023) https://doi.org/10.1109/ACCESS.2023.0322000
dc.relation.referencesZadenoori, M.A., Dąbrowski, J., Alhoshan, W., Zhao, L., Ferrari, A.: Large language models (llms) for requirements engineering (re): A systematic literature review. arXiv preprint arXiv:2509.11446 (2025)
dc.relation.referencesBandurin, D., Matevosyan, N.: Large Language Models: Applications, Limitations and Potential Risks for Power Grids. https://utilityanalytics.com/large-language-models-grid-analytics/ (2024)
dc.relation.referencesNazi, Z.A., Peng, W.: Large Language Models in Healthcare and Medical Domain: A Review. Multidisciplinary Digital Publishing Institute (MDPI) (2024). https://doi.org/10.3390/informatics11030057
dc.relation.referencesWang, J., Zhao, H., Yang, Z., Shu, P., Chen, J., Sun, H., Liang, R., Li, S., Shi, P., Ma, L., Liu, Z., Liu, Z., Zhong, T., Zhang, Y., Ma, C., Zhang, X., Zhang, T., Ding, T., Ren, Y., Liu, T., Jiang, X., Zhang, S.: Legal evalutions and challenges of large language models. arXiv preprint arXiv:2411.10137 (2024)
dc.relation.referencesMeo, S.A., Abukhalaf, F.A., Eltoukhy, R.A., Sattar, K.: Exploring the role of deepseek-r1, chatgpt-4, and google gemini in medical education: How valid and reliable are they? Pakistan Journal of Medical Sciences 41, 1887–1892 (2025) https://doi.org/10.12669/pjms.41.7.12183
dc.relation.referencesHaldar, R., Wang, Z., Song, Q., Lin, G., Xing, Y.: Llm safety alignment is divergence estimation in disguise. arXiv preprint arXiv:2502.00657 (2025)
dc.relation.referencesWirth, R., Hipp, J.: Crisp-dm: Towards a standard process model for data mining. In: Proceedings of the 4th International Conference on the Practical Applications of Knowledge Discovery and Data Mining (2000)
dc.relation.referencesMinisterio de Minas y Energía: Libro 1 - disposiciones generales. Technical report, Ministerio de Minas y Energía (April 2024)
dc.relation.referencesMinisterio de Minas y Energía: Libro 2 - productos objeto del retie. Technical report, Ministerio de Minas y Energía (April 2024)
dc.subjectLLM
dc.subjectRAG
dc.subjectPrompt Injection
dc.subjectHuman Judgment
dc.subjectOverride
dc.subjectRole-Playing
dc.subject.keywordLLM
dc.subject.keywordRAG
dc.subject.keywordPrompt Injection
dc.subject.keywordHuman Judgment
dc.subject.keywordOverride
dc.subject.keywordRole-Playing
dc.subject.lembArtificial intelligence - Computer security
dc.subject.lembLanguage models - Performance testing
dc.subject.lembElectrical systems - Technical standards
dc.titleResiliencia de LLMs ante inyección de prompts en el contexto del marco técnico regulatorio eléctrico colombiano
dc.type.coarhttp://purl.org/coar/resource_type/c_baaf

Files

Original bundle

Name: Resiliencia de LLMs Ante Inyección de Prompts.pdf
Size: 1.6 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 3.28 KB
Format: Item-specific license agreed upon to submission

Name: FOR-EFE-GDB-008_AUTORIZACION_DE_PUBLICACION_DE_TESIS_O_TRABAJO_DE_GRADO_DE_FORMA_CONFIDENCIAL_EM_IG_JRO.pdf
Size: 228.23 KB
Format: Adobe Portable Document Format
Description: Restricted document