Embedding-based sentiment analysis of customer reviews on MercadoLibre
| dc.contributor.advisor | Galpin, Ixent | |
| dc.creator | Steward Tibaduiza, Larry | |
| dc.date.accessioned | 2026-01-15T16:41:40Z | |
| dc.date.created | 2025-02-07 | |
| dc.description.abstract | This article analyzes the use of natural language processing (NLP) techniques and machine learning models to evaluate product reviews on MercadoLibre, focusing on embeddings such as Word2Vec, GPT-3, GloVe, and MPNet. Reviews are collected through web scraping and structured to train models that predict a rating from 1 to 5 reflecting customer perception. The results show that GPT-3 embeddings outperform the others, especially when combined with Support Vector Machines, highlighting their ability to interpret natural language in classification tasks. The study concludes that these tools are effective at transforming textual data into valuable inputs for decision-making in e-commerce. | |
| dc.format.extent | 10 pages | |
| dc.format.mimetype | application/pdf | |
| dc.identifier.uri | https://hdl.handle.net/20.500.12010/38794 | |
| dc.language.iso | en | |
| dc.subject | Embeddings | |
| dc.subject | Vectorization | |
| dc.subject | Scraping | |
| dc.subject | NLP | |
| dc.subject.keyword | Embeddings | |
| dc.subject.keyword | Vectorization | |
| dc.subject.keyword | Data extraction | |
| dc.subject.keyword | NLP | |
| dc.subject.lemb | Natural language processing (Computer science) | |
| dc.subject.lemb | E-commerce -- Data analysis | |
| dc.subject.lemb | Machine learning -- Applications in text mining | |
| dc.title | Embedding-based sentiment analysis of customer reviews on MercadoLibre | |
| dc.type.coar | http://purl.org/coar/resource_type/c_6501 |
