Resiliencia de LLMs ante inyección de prompts en el contexto del marco técnico regulatorio eléctrico colombiano
| dc.contributor.advisor | Galpin, Ixent | |
| dc.contributor.advisor | Riascos, Javier | |
| dc.creator | Mora Martínez, Erick Giovanni | |
| dc.date.accessioned | 2026-01-15T14:21:24Z | |
| dc.date.created | 2026-01-09 | |
| dc.description.abstract | Evaluamos la resiliencia de tres LLM comerciales: Gemini 2.5 Pro, gpt-4o y DeepSeek Reasoner, frente a ataques override y role-playing usando un sistema RAG reproducible sobre un caso aplicable en el marco regulatorio y técnico eléctrico (RETIE), con 504 respuestas (168/modelo). Medimos Precisión Normativa (PN), Adhesión a la Inyección (AI) y Resiliencia Simple (SR=PN×AI), y reportamos un Weighted Resilience Score (WRS) que pondera por impacto la combinación de los ataques. Observamos WRS similares para gpt-4o (47.76) y DeepSeek (47.32), y menor para Gemini (40.17). No obstante, los modos de fallo difieren: gpt-4o y Gemini ceden con engaño sutil (AI=0.5) manteniendo PN relativamente alta, mientras DeepSeek, cuando falla, lo hace con mayor severidad. Nuestros resultados sugieren que no basta con medir la resiliencia; es clave analizar cómo fallan los modelos, especialmente en dominios regulados. | |
| dc.description.abstractenglish | We evaluate the resilience of three commercial LLMs (Gemini 2.5 Pro, gpt-4o, and DeepSeek Reasoner) against override and role-playing attacks, using a reproducible RAG system on a case drawn from the electrical technical regulatory framework (RETIE) and analyzing 504 responses (168 per model). We measure Normative Precision (NP), Injection Adherence (IA), and Simple Resilience (SR=NP×IA), and report a Weighted Resilience Score (WRS) that weights the combination of attacks by impact. We observe similar WRS values for gpt-4o (47.76) and DeepSeek (47.32), and a lower score for Gemini (40.17). However, the failure modes differ: gpt-4o and Gemini yield to subtle deception (IA=0.5) while maintaining relatively high NP, whereas DeepSeek, when it fails, does so with greater severity. Our results suggest that measuring resilience is not enough; it is crucial to analyze how models fail, especially in regulated domains. | |
| dc.format.extent | 19 pages | |
| dc.format.mimetype | application/pdf | |
| dc.identifier.uri | https://hdl.handle.net/20.500.12010/38790 | |
| dc.language.iso | es | |
| dc.relation.references | Amjad, F., Korótko, T., Rosin, A.: Review of LLMs applications in electrical power and energy systems. IEEE Access 13, 150951–150969 (2025) https://doi.org/10.1109/ACCESS.2025.3599922 | |
| dc.relation.references | Bernadić, A., Kujundžić, G., Primorac, I.: Large language models in power systems: Enhancing control and decision-making. International Journal of Innovative Solutions in Engineering 1(1), 10–17 (2025) https://doi.org/10.47960/3029-3200.2025.1.1.10 | |
| dc.relation.references | Ruan, J., Liang, G., Zhao, H., Liu, G., Sun, X., Qiu, J., Xu, Z., Wen, F., Dong, Z.Y.: Applying large language models to power systems: Potential security threats. IEEE Transactions on Smart Grid 15(3), 3333–3336 (2024) https://doi.org/10.1109/TSG.2024.3373256 | |
| dc.relation.references | Heverin, T., Benjamin, V., Braca, E., Carter, I., Kanchwala, H., Khojasteh, N., Landow, C., Luo, Y., Ma, C., Magarelli, A., Mirin, R., Moyer, A., Simpson, K., Skawinski, A.: Systematically analysing prompt injection vulnerabilities in diverse llm architectures. Technical report, The Baldwin School (2025) | |
| dc.relation.references | World Economic Forum: Global cybersecurity outlook 2025. Technical report, World Economic Forum (January 2025) | |
| dc.relation.references | Yip, D.W., Esmradi, A., Chan, C.F.: A novel evaluation framework for assessing resilience against prompt injection attacks in large language models (2023) https://doi.org/10.1109/CSDE59766.2023.10487667 | |
| dc.relation.references | Zhang, C., Jin, M., Yu, Q., Liu, C., Xue, H., Jin, X.: Goal-guided generative prompt injection attack on large language models. arXiv preprint arXiv:2404.07234 (2024) | |
| dc.relation.references | Ministerio de Minas y Energía: Reglamento Técnico de Instalaciones Eléctricas - RETIE. https://www.minenergia.gov.co/es/misional/energia-electrica-2/reglamentos-tecnicos/reglamento-t%C3%A9cnico-de-instalaciones-el%C3%A9ctricas-retie/ | |
| dc.relation.references | Li, M.Q., Fung, B.C.M.: Security concerns for large language models: A survey. arXiv preprint arXiv:2505.18889 (2025) | |
| dc.relation.references | Momcilovic, T.B., Balta, D., Buesser, B., Zizzo, G., Purcell, M.: Developing assurance cases for adversarial robustness and regulatory compliance in llms. arXiv preprint arXiv:2410.05304 (2024) | |
| dc.relation.references | Jones, N., Whaiduzzaman, M., Jan, T., Adel, A., Alazab, A., Alkreisat, A.: A cia triad-based taxonomy of prompt attacks on large language models. Future Internet 17 (2025) https://doi.org/10.3390/fi17030113 | |
| dc.relation.references | Pingua, B., Murmu, D., Kandpal, M., Rautaray, J., Mishra, P., Barik, R.K., Saikia, M.J.: Mitigating adversarial manipulation in llms: a prompt-based approach to counter jailbreak attacks (prompt-g). Zenodo (2024) https://doi.org/10.5281/zenodo.13501821 | |
| dc.relation.references | Kondamani, A.: RAG Poisoning: An Emerging Threat in AI Systems. https://medium.com/nfactor-technologies/rag-poisoning-an-emerging-threat-in-ai-systems-660f9ff279f9 (2024) | |
| dc.relation.references | Zou, W., Geng, R., Wang, B., Jia, J.: Poisoned rag: Knowledge corruption attacks to retrieval-augmented generation of large language models. arXiv preprint arXiv:2402.07867 (2024) | |
| dc.relation.references | Zhang, Y., Li, Q., Du, T., Zhang, X., Zhao, X., Feng, Z., Yin, J.: Hijackrag: Hijacking attacks against retrieval-augmented large language models. arXiv preprint arXiv:2410.22832 (2024) | |
| dc.relation.references | An, B., Zhang, S., Dredze, M.: Rag llms are not safer: A safety analysis of retrieval-augmented generation for large language models. arXiv preprint arXiv:2504.18041 (2025) | |
| dc.relation.references | Siino, M., Falco, M., Croce, D., Rosso, P.: Exploring llms applications in law: A literature review on current legal nlp approaches. IEEE Access (2023) https://doi.org/10.1109/ACCESS.2023.0322000 | |
| dc.relation.references | Zadenoori, M.A., Dąbrowski, J., Alhoshan, W., Zhao, L., Ferrari, A.: Large language models (llms) for requirements engineering (re): A systematic literature review. arXiv preprint arXiv:2509.11446 (2025) | |
| dc.relation.references | Bandurin, D., Matevosyan, N.: Large Language Models: Applications, Limitations and Potential Risks for Power Grids. https://utilityanalytics.com/large-language-models-grid-analytics/ (2024) | |
| dc.relation.references | Nazi, Z.A., Peng, W.: Large Language Models in Healthcare and Medical Domain: A Review. Multidisciplinary Digital Publishing Institute (MDPI) (2024). https://doi.org/10.3390/informatics11030057 | |
| dc.relation.references | Wang, J., Zhao, H., Yang, Z., Shu, P., Chen, J., Sun, H., Liang, R., Li, S., Shi, P., Ma, L., Liu, Z., Liu, Z., Zhong, T., Zhang, Y., Ma, C., Zhang, X., Zhang, T., Ding, T., Ren, Y., Liu, T., Jiang, X., Zhang, S.: Legal evalutions and challenges of large language models. arXiv preprint arXiv:2411.10137 (2024) | |
| dc.relation.references | Meo, S.A., Abukhalaf, F.A., Eltoukhy, R.A., Sattar, K.: Exploring the role of deepseek-r1, chatgpt-4, and google gemini in medical education: How valid and reliable are they? Pakistan Journal of Medical Sciences 41, 1887–1892 (2025) https://doi.org/10.12669/pjms.41.7.12183 | |
| dc.relation.references | Haldar, R., Wang, Z., Song, Q., Lin, G., Xing, Y.: Llm safety alignment is divergence estimation in disguise. arXiv preprint arXiv:2502.00657 (2025) | |
| dc.relation.references | Wirth, R., Hipp, J.: Crisp-dm: Towards a standard process model for data mining. In: Proceedings of the 4th International Conference on the Practical Applications of Knowledge Discovery and Data Mining (2000) | |
| dc.relation.references | Ministerio de Minas y Energía: Libro 1 - disposiciones generales. Technical report, Ministerio de Minas y Energía (April 2024) | |
| dc.relation.references | Ministerio de Minas y Energía: Libro 2 - productos objeto del retie. Technical report, Ministerio de Minas y Energía (April 2024) | |
| dc.subject | LLM | |
| dc.subject | RAG | |
| dc.subject | Prompt Injection | |
| dc.subject | Human Judgment | |
| dc.subject | Override | |
| dc.subject | Role-Playing | |
| dc.subject.keyword | LLM | |
| dc.subject.keyword | RAG | |
| dc.subject.keyword | Prompt Injection | |
| dc.subject.keyword | Human Judgment | |
| dc.subject.keyword | Override | |
| dc.subject.keyword | Role-Playing | |
| dc.subject.lemb | Artificial intelligence - Computer security | |
| dc.subject.lemb | Language models - Performance testing | |
| dc.subject.lemb | Electrical systems - Technical standards | |
| dc.title | Resiliencia de LLMs ante inyección de prompts en el contexto del marco técnico regulatorio eléctrico colombiano | |
| dc.type.coar | http://purl.org/coar/resource_type/c_baaf |
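
The metrics named in the abstract can be illustrated with a minimal sketch. It assumes NP and IA are per-response scores in [0, 1] and computes SR = NP × IA as the abstract states. The weighted aggregate is only a hypothetical stand-in: this record does not give the WRS formula, its weights, or its scale, so the impact weights and the 0-100 scaling below are assumptions. All names (`Response`, `simple_resilience`, `weighted_score`) and sample values are illustrative and are not results from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    attack: str   # attack type label, e.g. "override" or "role-playing"
    np_: float    # Normative Precision, assumed to lie in [0, 1]
    ia: float     # Injection Adherence score, assumed to lie in [0, 1]


def simple_resilience(r: Response) -> float:
    """SR = NP x IA, as defined in the abstract."""
    return r.np_ * r.ia


def weighted_score(responses, weights):
    """Hypothetical impact-weighted mean of SR, scaled to 0-100.

    This is NOT the study's WRS formula; weights and scaling are assumed.
    """
    total = sum(weights[r.attack] for r in responses)
    if total == 0:
        raise ValueError("weights sum to zero")
    weighted = sum(weights[r.attack] * simple_resilience(r) for r in responses)
    return 100.0 * weighted / total


# Made-up sample data, not taken from the study.
sample = [
    Response("override", np_=1.0, ia=0.5),
    Response("role-playing", np_=0.5, ia=1.0),
    Response("role-playing", np_=1.0, ia=0.0),
]
hypothetical_weights = {"override": 2.0, "role-playing": 1.0}
print(weighted_score(sample, hypothetical_weights))
```

Keeping SR per response separate from the aggregate makes it possible to inspect individual failures as well as the summary score, which is the kind of failure-mode analysis the abstract argues for.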
Files
Original bundle
- Name: Resiliencia de LLMs Ante Inyección de Prompts.pdf
- Size: 1.6 MB
- Format: Adobe Portable Document Format
License bundle
- Name: license.txt
- Size: 3.28 KB
- Format: Item-specific license agreed upon to submission

- Name: FOR-EFE-GDB-008_AUTORIZACION_DE_PUBLICACION_DE_TESIS_O_TRABAJO_DE_GRADO_DE_FORMA_CONFIDENCIAL_EM_IG_JRO.pdf
- Size: 228.23 KB
- Format: Adobe Portable Document Format
- Description: Restricted document
