Decoding silent speech: a machine learning perspective on data, methods, and frameworks

Chowdhury, Adiba Tabassum; Newaz, Mehrin; Saha, Purnata; AbuHaweeleh, Mohannad Natheef; Mohsen, Sara; Bushnaq, Diala; Chabbouh, Malek; Aljindi, Raghad; Pedersen, Shona; Chowdhury, Muhammad E. H.

Author	Chowdhury, Adiba Tabassum
Author	Newaz, Mehrin
Author	Saha, Purnata
Author	AbuHaweeleh, Mohannad Natheef
Author	Mohsen, Sara
Author	Bushnaq, Diala
Author	Chabbouh, Malek
Author	Aljindi, Raghad
Author	Pedersen, Shona
Author	Chowdhury, Muhammad E. H.
Date available	2025-04-13T04:46:18Z
Publication date	2025
Publication name	Neural Computing and Applications
Source	Scopus
Identifier	http://dx.doi.org/10.1007/s00521-024-10456-z
ISSN	9410643
URI	http://hdl.handle.net/10576/64159
Abstract	At the nexus of signal processing and machine learning (ML), silent speech recognition (SSR) has emerged as a game-changing technology that allows for communication without audible voice. This study offers a thorough overview of SSR, tracing its evolution from early waveform analysis to the most recent ML methods. We start by examining current SSR techniques that use ML and determining the essential conditions for efficient SSR systems. After that, we look at the datasets and data collection techniques currently employed in SSR research, highlighting the difficulties posed by the variety of articulatory movements and the scarcity of data. Examining state-of-the-art SSR frameworks, the paper covers important topics such as signal processing, feature extraction, ML techniques for decoding, and the optimization and assessment of SSR model performance. We emphasize how deep learning (DL) and ML models have evolved to increase SSR resilience and accuracy. The procedures proposed in the field are examined, with an emphasis on sophisticated feature extraction and classification methods. Modern SSR techniques are compared in terms of performance, highlighting the advantages and disadvantages of different models. Ethical issues are also discussed, especially those pertaining to privacy and consent. This work covers the integration of multimodal information (visual cues, electromyography signals, and neuroimaging data) to improve SSR systems. We investigate the roles of transfer learning and domain adaptation in handling cross-subject variability. Lastly, the study offers suggestions and future prospects for SSR research, providing practitioners, engineers, and academics with a road map. As SSR continues to push the frontiers of human-machine interaction, our study aims to increase our collective understanding of the technological advances and societal effects of SSR in the ML age.
Sponsor	Open Access funding provided by the Qatar National Library. This work was made possible by Undergraduate Research Experience Program grant # UREP29-043-3-012 from Qatar National Research Fund (QNRF) and is also supported by Qatar University Student Grant: QUST-1-CENG-2024-1722. The statements made herein are solely the responsibility of the authors. The open-access publication cost is covered by Qatar National Library.
Language	en
Publisher	Springer Science and Business Media Deutschland GmbH
Subject	Deep learning; Machine learning; Silent speech; Speech decoding; Speech recognition; Waves to words
Title	Decoding silent speech: a machine learning perspective on data, methods, and frameworks
Type	Article Review
dc.accessType	Open Access

Files in this item

Name: s00521-024-10456-z.pdf
Size: 1.923Mb
Format: PDF

This item appears in the following collections

Electrical Engineering [2817 items]
Medicine Research [1673 items]
