Decoding silent speech: a machine learning perspective on data, methods, and frameworks

Chowdhury, Adiba Tabassum; Newaz, Mehrin; Saha, Purnata; AbuHaweeleh, Mohannad Natheef; Mohsen, Sara; Bushnaq, Diala; Chabbouh, Malek; Aljindi, Raghad; Pedersen, Shona; Chowdhury, Muhammad E. H.

Author	Chowdhury, Adiba Tabassum
Author	Newaz, Mehrin
Author	Saha, Purnata
Author	AbuHaweeleh, Mohannad Natheef
Author	Mohsen, Sara
Author	Bushnaq, Diala
Author	Chabbouh, Malek
Author	Aljindi, Raghad
Author	Pedersen, Shona
Author	Chowdhury, Muhammad E. H.
Available date	2025-04-13T04:46:18Z
Publication Date	2025
Publication Name	Neural Computing and Applications
Resource	Scopus
Identifier	http://dx.doi.org/10.1007/s00521-024-10456-z
ISSN	0941-0643
URI	http://hdl.handle.net/10576/64159
Abstract	At the nexus of signal processing and machine learning (ML), silent speech recognition (SSR) has evolved as a game-changing technology that allows for communication without audible voice. This study offers a thorough overview of SSR, tracing its evolution from early waveform analysis to the most recent ML methods. We start by examining current SSR techniques using ML and determining the essential conditions for efficient SSR systems. After that, we look at the datasets and data collection techniques currently employed in SSR research, highlighting the difficulties posed by the variety of articulatory movements and the scarcity of data. Examining state-of-the-art SSR frameworks, the paper covers important topics such as signal processing, feature extraction, ML techniques for decoding, and the optimization and assessment of SSR model performance. We emphasize how deep learning (DL) and ML models have evolved to increase SSR resilience and accuracy. The field's proposed procedures are examined, with an emphasis on sophisticated feature extraction and classification methods. Modern SSR techniques are compared in terms of performance, highlighting the advantages and disadvantages of different models. There is also discussion of ethical issues, especially those pertaining to privacy and consent. The integration of multimodal information (visual cues, electromyography signals, and neuroimaging data) to improve SSR systems is covered in this work. We investigate the functions of transfer learning and domain adaptation in handling cross-subject variability. Lastly, the study offers suggestions and future prospects for SSR research, providing practitioners, engineers, and academics with a road map. As SSR continues to push the frontiers of human-machine interaction, our study aims to increase our collective understanding of the technological advances and societal effects of SSR in the ML age.
Sponsor	Open Access funding provided by the Qatar National Library. This work was made possible by Undergraduate Research Experience Program grant # UREP29-043-3-012 from Qatar National Research Fund (QNRF) and is also supported by Qatar University Student Grant: QUST-1-CENG-2024-1722. The statements made herein are solely the responsibility of the authors. The open-access publication cost is covered by Qatar National Library.
Language	en
Publisher	Springer Science and Business Media Deutschland GmbH
Subject	Deep learning; Machine learning; Silent speech; Speech decoding; Speech recognition; Waves to words
Title	Decoding silent speech: a machine learning perspective on data, methods, and frameworks
Type	Article Review
dc.accessType	Open Access

Files in this item

Name: s00521-024-10456-z.pdf
Size: 1.923Mb
Format: PDF

This item appears in the following Collection(s)

Electrical Engineering [2817 items]
Medicine Research [1673 items]
