Fine-grained emotion classification in Reddit mental health posts using LLMs

Orjuela Llanos, Valery Pamela

dc.contributor.advisor	Galpin, Ixent
dc.creator	Orjuela Llanos, Valery Pamela
dc.date.accessioned	2025-06-16T22:15:32Z
dc.date.available	2025-06-16T22:15:32Z
dc.date.created	2025-06-16
dc.description.abstractenglish	This study evaluates mental health discourse on Reddit through AI-based emotion analysis. Drawing on a comprehensive dataset of 7,899 posts and 54,755 comments covering 10 mental health conditions, it combines NLP techniques such as BERTopic and lexical analysis with GPT-4o-mini for emotion classification. The results show clear emotional differences between posts and comments: titles often express urgency (e.g., 'help'), whereas comments show empathy (e.g., 'love'). GPT-4o-mini matched human emotion annotations with an accuracy of 52.6%, performing better on positive emotions (58%). Thematic analysis revealed that clinical diagnoses dominated the posts, while the comments contained supportive dialogue. This work illustrates the potential of LLMs as a tool for monitoring mental health and well-being, while also pointing to limitations in classifying ambiguous emotions. The study also offers practical guidance for developing AI tools in the mental health space and outlines directions for future work.	spa
dc.format.extent	16 pages	spa
dc.format.mimetype	application/pdf	spa
dc.identifier.uri	https://hdl.handle.net/20.500.12010/36877
dc.language.iso	eng	spa
dc.subject.keyword	Large language models (LLMs)
dc.subject.keyword	Reddit
dc.subject.keyword	Mental health
dc.subject.keyword	Natural language processing (NLP)
dc.subject.keyword	Emotion analysis	spa
dc.subject.lemb	Mental health - Social aspects - Social networks
dc.subject.lemb	Artificial intelligence - Medical applications
dc.subject.lemb	Natural language processing (Computer science)
dc.title	Fine-grained emotion classification in Reddit mental health posts using LLMs	spa
dc.type.coar	http://purl.org/coar/resource_type/c_bdcc	spa

Files

Original bundle

Name: TesisMaestria_ValeryOrjuela.pdf
Size: 6 MB
Format: Adobe Portable Document Format
Description: Restricted document

Collections

Maestría en Ingeniería y Analítica de Datos (Master's in Engineering and Data Analytics)