The Accuracy Analysis of Different Machine Learning Classifiers for Detecting Suicidal Ideation and Content

Authors

  • Divya Dewangan Research Scholar, Shri Shankaracharya Technical Campus, Bhilai, Chhattisgarh, India
  • Smita Selot Professor and Head, Shri Shankaracharya Technical Campus, Bhilai, Chhattisgarh, India
  • Sreejit Panicker Training Head, Techment Technology, New STPI, Bhilai, Chhattisgarh, India

DOI:

https://doi.org/10.51983/ajes-2023.12.1.3694

Keywords:

Risk of Self-Harm/Suicide, Mental Health, Machine Learning Algorithms, Social Media, Frequency Based Featuring, Prediction Based Featuring

Abstract

Suicide is the act of intentionally causing one’s own death, and suicidal ideation refers to thoughts of, or preoccupation with, ending one’s own life. Previous studies have examined verbal and written communication related to suicide, including suicide notes, online discussions, and social media posts, to identify linguistic and content markers that may support early detection and intervention. The primary purpose of this study is to detect signs of suicide or self-harm risk in social media users by evaluating several frequency-based and prediction-based feature extraction methods in combination with baseline machine learning classifiers. The classifiers evaluated are Decision Tree, K-Nearest Neighbors, Random Forest, Multinomial Naïve Bayes, and Support Vector Machine (SVM). Our experimental results show that FastText embeddings combined with an SVM model perform best, reaching an accuracy of 93.76% and outperforming the other baselines. The aim of this work is to compare these algorithms and identify the one best suited to the task.
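As an illustration of what a frequency-based feature combined with a baseline classifier can look like, the following is a minimal sketch using only the Python standard library. It is not the authors' pipeline: it pairs a hand-written TF-IDF weighting with a small K-Nearest Neighbors vote, and it does not cover FastText embeddings or SVM. All function names, the smoothed IDF formula, the choice of k, and the example sentences are assumptions made for illustration; the sentences are invented toy data, not drawn from the study's dataset.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real pipeline would clean the text further.
    return text.lower().split()


def fit_idf(docs):
    # Smoothed inverse document frequency learned from the training texts.
    n = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(text, idf):
    # Sparse, L2-normalised TF-IDF vector; terms unseen in training are dropped.
    counts = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    # Both vectors are unit length, so the dot product is the cosine similarity.
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def knn_predict(text, train_vecs, train_labels, idf, k=3):
    # Majority vote among the k most similar training texts.
    query = tfidf_vector(text, idf)
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: cosine(query, pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented toy data (1 = possible risk signal, 0 = no risk signal).
train_texts = [
    "i feel hopeless and want to disappear",
    "nothing matters anymore i feel hopeless",
    "i cannot go on i feel so alone",
    "had a great day at the park with friends",
    "excited about the new movie this weekend",
    "cooking dinner with friends was great fun",
]
train_labels = [1, 1, 1, 0, 0, 0]
test_texts = ["i feel alone and hopeless", "great weekend with friends"]
test_labels = [1, 0]

idf = fit_idf(train_texts)
train_vecs = [tfidf_vector(t, idf) for t in train_texts]
predictions = [knn_predict(t, train_vecs, train_labels, idf) for t in test_texts]
print("accuracy:", accuracy(test_labels, predictions))
```

A comparative study in the spirit of the abstract would repeat the same train and evaluate loop for each feature method and classifier pair on a properly held-out split, then compare the resulting accuracy values.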

Published

26-05-2023

How to Cite

Dewangan, D., Selot, S., & Panicker, S. (2023). The Accuracy Analysis of Different Machine Learning Classifiers for Detecting Suicidal Ideation and Content. Asian Journal of Electrical Sciences, 12(1), 46–56. https://doi.org/10.51983/ajes-2023.12.1.3694