Fine-Grained emotion classification in Reddit mental health posts using LLMs
| dc.contributor.advisor | Galpin, Ixent | |
| dc.creator | Orjuela Llanos, Valery Pamela | |
| dc.date.accessioned | 2025-06-16T22:15:32Z | |
| dc.date.available | 2025-06-16T22:15:32Z | |
| dc.date.created | 2025-06-16 | |
| dc.description.abstract | Este estudio evalúa el discurso sobre salud mental en Reddit a través del análisis de emociones basado en IA. A partir de un conjunto de datos exhaustivo que incluye 7899 publicaciones y 54755 comentarios sobre 10 afecciones relacionadas con la salud mental, este estudio combina técnicas de PNL como BERTopic y el análisis léxico, con el uso de GPT-4o-mini para clasificar las emociones. Los resultados indican claras diferencias emocionales entre publicaciones y comentarios: mientras que los títulos suelen expresar urgencia (p. ej., "ayuda"), los comentarios muestran empatía (p. ej., "amor"). GPT-4o-mini coincidió con las anotaciones de emociones humanas con una precisión del 52,6 %, obteniendo un mejor rendimiento con las emociones positivas (58 %). El análisis temático reveló que los diagnósticos clínicos predominaban en las publicaciones, pero los comentarios contenían diálogos de apoyo. Este trabajo ilustró el potencial de los LLM como herramienta para el monitoreo de la salud mental y el bienestar, a la vez que indica posibles inconvenientes con respecto a la clasificación ambigua de emociones. El estudio también ofrece información práctica para el desarrollo de herramientas de IA en el ámbito de la salud mental y describe direcciones para el trabajo futuro. | spa |
| dc.description.abstractenglish | This study examines mental health discourse on Reddit through AI-based emotion analysis. Drawing on a comprehensive dataset of 7,899 posts and 54,755 comments covering 10 mental health-related conditions, it combines NLP techniques such as BERTopic and lexical analysis with GPT-4o-mini for emotion classification. The results point to clear emotional differences between posts and comments: post titles often express urgency (e.g., 'help'), whereas comments show empathy (e.g., 'love'). GPT-4o-mini matched human emotion annotations with an accuracy of 52.6%, performing better on positive emotions (58%). Thematic analysis revealed that clinical diagnoses dominated the posts, whereas the comments contained supportive dialogue. The work illustrates the potential of LLMs as a tool for monitoring mental health and well-being, while also pointing to limitations in classifying ambiguous emotions. The study also offers practical insights for developing AI tools in the mental health space and outlines directions for future work. | spa |
| dc.format.extent | 16 pages | spa |
| dc.format.mimetype | application/pdf | spa |
| dc.identifier.uri | https://hdl.handle.net/20.500.12010/36877 | |
| dc.language.iso | eng | spa |
| dc.subject | Modelos de lenguaje grandes (LLM) | |
| dc.subject | Salud mental | |
| dc.subject | Procesamiento del lenguaje natural (PLN) | |
| dc.subject | Análisis de emociones | spa |
| dc.subject.keyword | Large language models (LLMs) | |
| dc.subject.keyword | Mental health | |
| dc.subject.keyword | Natural language processing (NLP) | |
| dc.subject.keyword | Emotion analysis | spa |
| dc.subject.lemb | Salud mental - Aspectos sociales - Redes sociales. | |
| dc.subject.lemb | Inteligencia artificial - Aplicaciones en medicina | |
| dc.subject.lemb | Procesamiento del lenguaje natural (Informática). | |
| dc.title | Fine-Grained emotion classification in Reddit mental health posts using LLMs | spa |
| dc.type.coar | http://purl.org/coar/resource_type/c_bdcc | spa |
Files
Original bundle
1 - 1 of 1
- Name:
- TesisMaestria_ValeryOrjuela.pdf
- Size:
- 6 MB
- Format:
- Adobe Portable Document Format
- Description:
- Restricted document
