Sunday, August 28, 2022

Bootstrapping semi-supervised annotation method for potential suicidal messages

Suicide is a tragedy that deeply affects families, communities, and countries. Based on the standardized worldwide rate of suicides per number of inhabitants, approximately 903,450 suicides and 18,069,000 non-fatal suicide attempts are expected in 2022, affecting people of all ages, countries, races, beliefs, social and economic status, and sexes. Because users of social networks publish their suicidal intentions, research has begun in this field to detect such users and encourage them not to take their own lives. This study focused on defining a semi-supervised method to populate the Life Corpus, using a bootstrapping technique that starts from initial supervised samples, to automatically detect and classify texts related to suicide and depression extracted from social networks and forums. The experiments used two different classifiers: a Support Vector Machine (SVM), with Bag of Words (BoW) features, with and without Term Frequency/Inverse Document Frequency (TF-IDF) term weighting and with or without stopwords, and Rasa, with its default feature extraction system. The experiments were run on five data collections: Life, Reddit, Life+Reddit, Life_en, and Life_en + Reddit. With the semi-supervised method, the Life Corpus grew from 102 to 273 samples by adding texts from the social network Reddit, using the Life+Reddit+BoW_Embeddings combination with the SVM classifier, which achieved a macro F1 of 0.80. These texts were in turn manually evaluated by annotators, with a Cohen's Kappa agreement of 0.86.
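The bootstrapping idea described above can be illustrated with a minimal self-training loop. This is only a sketch under stated assumptions, not the method used in the paper: a toy bag-of-words centroid classifier stands in for the SVM and Rasa classifiers, and the names `CentroidClassifier`, `bootstrap`, the `threshold` value, and the sample texts are all hypothetical. In each round, the model is trained on the labeled set, confident predictions on unlabeled texts are added as new labeled samples, and the loop stops when nothing new is accepted.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberately simple assumption)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)  # Counter yields 0 for missing terms
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class CentroidClassifier:
    """Toy stand-in for the SVM: one summed bag-of-words vector per class."""

    def fit(self, texts, labels):
        self.centroids = {}
        for text, label in zip(texts, labels):
            self.centroids.setdefault(label, Counter()).update(tokenize(text))
        return self

    def predict_with_margin(self, text):
        """Return (best label, similarity gap to the runner-up class)."""
        bow = Counter(tokenize(text))
        ranked = sorted(
            ((cosine(bow, c), label) for label, c in self.centroids.items()),
            reverse=True,
        )
        best_score, best_label = ranked[0]
        runner_up = ranked[1][0] if len(ranked) > 1 else 0.0
        return best_label, best_score - runner_up


def bootstrap(seed_texts, seed_labels, unlabeled, threshold=0.2, max_rounds=5):
    """Grow the labeled set with confident predictions on unlabeled texts."""
    texts, labels, pool = list(seed_texts), list(seed_labels), list(unlabeled)
    for _ in range(max_rounds):
        model = CentroidClassifier().fit(texts, labels)
        accepted, remaining = [], []
        for text in pool:
            label, margin = model.predict_with_margin(text)
            if margin >= threshold:
                accepted.append((text, label))
            else:
                remaining.append(text)
        if not accepted:  # nothing confident enough: stop early
            break
        for text, label in accepted:
            texts.append(text)
            labels.append(label)
        pool = remaining
    return texts, labels, pool


# Hypothetical seed data, invented for illustration only.
seed_texts = [
    "i want to die tonight",
    "i feel hopeless and want to end it",
    "great football match tonight",
    "i love this new pizza recipe",
]
seed_labels = ["risk", "risk", "neutral", "neutral"]
unlabeled = ["i feel hopeless", "new pizza recipe", "xyz qwerty"]

texts, labels, pool = bootstrap(seed_texts, seed_labels, unlabeled)
```

In a real setting the automatically labeled samples would still need manual review, as the abstract reports was done with human annotators.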

https://www.sciencedirect.com/science/article/pii/S2214782922000264 

Assessment of supervised classifiers for the task of detecting messages with suicidal ideation

 

According to the World Health Organization (WHO), close to 800,000 people worldwide die by suicide each year, and many more attempt it. Consequently, the WHO recognizes suicide as a global public health priority that affects not only rich countries but also poor and middle-income countries. This study systematically analyzes 28 supervised classifiers, using different features of the Life Corpus, to detect messages with suicidal ideation and depression and to determine whether these classifiers can be used in an automatic online prevention system.

The Life Corpus, used in this research, is a bilingual text corpus (English and Spanish) oriented to the detection of suicidal ideation. The corpus was built by retrieving texts from several social networks, and its quality was measured using agreement between annotators. The experiments determined that the best-performing classifier was KStar with the corpus features POS-SYNSETS-NUM, which achieved the best results with a ROC Area of 0.81036 and an F-measure of 0.7148. The research thus fulfilled its objective of discovering which supervised classifiers and which features are most suitable for the automatic classification of messages with suicidal ideation using the Life Corpus.
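For readers unfamiliar with the two metrics reported above, here is a minimal sketch of how an F-measure and a ROC Area can be computed for a binary classifier. It assumes binary labels and a single positive class; the abstract does not state how the paper averaged its scores, so this is an illustration of the standard definitions, not a reproduction of the paper's evaluation. The function names and the sample data are hypothetical.

```python
def f_measure(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def roc_area(y_true, scores, positive=1):
    """ROC Area as the probability that a random positive sample scores
    higher than a random negative one (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = suicidal ideation) and classifier scores.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

f1 = f_measure(y_true, y_pred)
auc = roc_area(y_true, scores)
```

The F-measure depends on the chosen cut-off, while the ROC Area summarizes ranking quality across all cut-offs, which is one reason both are usually reported together.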

In addition, given the imbalance of the results, a new measure was developed, called the Two-dimensional Accuracy and Recovery Index (GDP). In unbalanced systems it can provide better results than the usual measures for assessing the quality of the results (F-measure, ROC Area), making it possible to increase the number of detected messages at risk of suicidal ideation at the cost of receiving more messages that are not related to suicide, or vice versa.

https://www.sciencedirect.com/science/article/pii/S2405844020312561 

Saturday, August 27, 2022

The World's Knowledge in Your Hands. The Web and Desktop Research Tools Every Researcher Should Know

I want to share with you my new book, "El conocimiento del mundo en tus manos - Las herramientas de investigación Web y de escritorio que todo investigador debe conocer" (The World's Knowledge in Your Hands: The Web and Desktop Research Tools Every Researcher Should Know), available on Amazon. I consider it a valuable guide for school students, undergraduate and postgraduate university students, and anyone who wishes to enter the world of research. It follows a methodology for gaining in-depth knowledge of the topic to be investigated, supported by online software tools, both free and paid, with which to generate knowledge and bibliographic references that support academic documents such as coursework, theses, books, or scientific articles.

The book is written in inclusive language and contains a large number of images, so that it is suitable for all kinds of readers. It covers the following sections:

  • Non-specialized search engines such as Google, Bing, Yahoo, and YouTube.
  • MOOC course platforms such as Miriadax, Edx, Coursera, among others.
  • Academic document search platforms such as Google Scholar, Scopus, and Web of Science.
  • Web platforms for systematic literature reviews in the context of software engineering, such as Parsifal.
  • The bibliographic reference managers Mendeley and Zotero.
  • Use of academic references in text editors such as Microsoft Word and Overleaf.
  • Main scientific databases.
  • Research groups.
  • Websites of interest.

Online book


Artificial Intelligence 2023

 1. Bard by Google  2. GPT-4 by OpenAI  3. Claude by Anthropic AI  4. Sparrow by DeepMind  5. MusicLM by Google  6. Phenaki by Google  7. Op...