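Pengembangan Algoritma Deteksi Emosi Melalui Analisis Suara untuk Aplikasi Konseling Digital (Development of an Emotion Detection Algorithm Through Voice Analysis for Digital Counseling Applications)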

Ari Putra Wibowo; Gunawan Prayitno

doi:10.59031/jnts.v3i2.792

Authors

Ari Putra Wibowo, Institut Widya Pratama Pekalongan
Gunawan Prayitno, STMIK PESAT NABIRE

Keywords:

Counseling, Digitalization, Emotion Detection, Mental Health, Speech

Abstract

Mental health issues such as depression, anxiety, and stress continue to increase globally and are recognized as critical factors that influence social functioning, productivity, and overall quality of life. Conventional mental health services are often limited by barriers including high cost, geographical distance, and persistent stigma that discourage individuals from seeking timely help. The digital era provides an alternative through the integration of technology into mental health counseling, offering greater accessibility, flexibility, and anonymity. Nevertheless, a key limitation of many digital counseling platforms lies in their inability to fully capture and respond to the emotional nuances of users during interactions. This study aims to address that gap by developing a speech-based emotion detection framework designed to be integrated into digital counseling environments. The proposed methodology includes the collection and preprocessing of speech samples, feature extraction using acoustic parameters, and training machine learning models to classify emotions in real time. Experimental results demonstrate that this approach significantly improves the accuracy of emotion detection, enabling digital counseling systems to provide more adaptive and personalized support. Beyond counseling, the research highlights the broader applicability of speech emotion recognition in education, telemedicine, and interactive digital assistants, all of which benefit from improved sensitivity to human emotions. These findings underscore the potential of artificial intelligence to strengthen digital mental health interventions, ensuring services that are not only more efficient and inclusive but also capable of fostering long-term emotional well-being in diverse populations.
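The pipeline outlined in the abstract (preprocessing, acoustic feature extraction, classification) can be illustrated with a minimal sketch. The abstract does not name the features, the model, or the dataset, so everything below is an assumption made for illustration: the sample rate, the two frame-level features (RMS energy and zero-crossing rate), the nearest-centroid classifier, the synthetic tones standing in for speech, and the hypothetical labels `low_arousal` and `high_arousal`. It is not the authors' implementation.

```python
import math
import random
from statistics import mean

SAMPLE_RATE = 8000  # Hz; assumed for this sketch, not taken from the paper


def frame_signal(samples, frame_len=200, hop=100):
    """Preprocessing: split a signal into overlapping fixed-length frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def rms_energy(frame):
    """Root-mean-square amplitude of one frame (a loudness proxy)."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose sign differs."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1)


def extract_features(samples):
    """Average the frame-level features into one vector per clip."""
    frames = frame_signal(samples)
    return (mean(rms_energy(f) for f in frames),
            mean(zero_crossing_rate(f) for f in frames))


def synth_tone(freq_hz, amplitude, rng, duration_s=0.25):
    """Synthetic stand-in for a speech clip: a sine with jittered pitch and level."""
    freq = freq_hz * rng.uniform(0.9, 1.1)
    amp = amplitude * rng.uniform(0.9, 1.1)
    n = int(SAMPLE_RATE * duration_s)
    return [amp * math.sin(2 * math.pi * freq * t / SAMPLE_RATE) for t in range(n)]


def train_centroids(labelled):
    """Fit a nearest-centroid model from (features, label) pairs."""
    by_label = {}
    for feats, label in labelled:
        by_label.setdefault(label, []).append(feats)
    return {label: tuple(mean(col) for col in zip(*rows))
            for label, rows in by_label.items()}


def classify(features, centroids):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda lab: math.dist(features, centroids[lab]))


# Hypothetical classes: (base frequency in Hz, base amplitude)
PROFILES = {"low_arousal": (120.0, 0.2), "high_arousal": (300.0, 0.8)}

rng = random.Random(0)
train = [(extract_features(synth_tone(f, a, rng)), label)
         for label, (f, a) in PROFILES.items() for _ in range(5)]
centroids = train_centroids(train)

predicted = classify(extract_features(synth_tone(300.0, 0.8, rng)), centroids)
print(predicted)
```

Because the features are computed per short frame, the same functions could in principle be applied to a sliding window over incoming audio, which is one way a real-time classifier might be arranged. A practical system would likely use richer features and a trained model on recorded speech.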
