Enhancing Influenza Detection through Integrative Machine Learning and Nasopharyngeal Metabolomic Profiling: A Comprehensive Study
Author | Sumon, Md. S. |
Author | Hossain, Md S. |
Author | Al-Sulaiti, Haya |
Author | Yassine, Hadi M. |
Author | Chowdhury, Muhammad E. H. |
Available date | 2025-04-23T05:28:10Z |
Publication Date | 2024 |
Publication Name | Diagnostics |
Resource | Scopus |
Identifier | http://dx.doi.org/10.3390/diagnostics14192214 |
ISSN | 2075-4418 |
Abstract | Background/Objectives: Nasal and nasopharyngeal swabs are commonly used for detecting respiratory viruses, including influenza, which significantly alters host cell metabolites. This study aimed to develop a machine learning model to identify biomarkers that differentiate between influenza-positive and -negative cases using clinical metabolomics data. Methods: A publicly available dataset of 236 nasopharyngeal samples screened via liquid chromatography-quadrupole time-of-flight (LC/Q-TOF) mass spectrometry was used. Among these, 118 samples tested positive for influenza (40 A H1N1, 39 A H3N2, 39 Influenza B), while 118 were negative controls. A stacking-based model was proposed using the top 20 selected features. Thirteen machine learning models were initially trained, and the top three were combined using predicted probabilities to form a stacking classifier. Results: The ExtraTrees stacking model outperformed other models, achieving 97.08% accuracy. External validation on a prospective cohort of 96 symptomatic individuals (48 positive and 48 negative for influenza) showed 100% accuracy. SHAP values were used to enhance model explainability. Metabolites such as Pyroglutamic Acid (retention time: 0.81 min, m/z: 84.0447) and its in-source fragment ion (retention time: 0.81 min, m/z: 130.0507) showed minimal impact on influenza-positive cases. On the other hand, metabolites with a retention time of 10.34 min and m/z 106.0865, and a retention time of 8.65 min and m/z 211.1376, demonstrated significant positive contributions. Conclusions: This study highlights the effectiveness of integrating metabolomics data with machine learning for accurate influenza diagnosis. The stacking-based model, combined with SHAP analysis, provided robust performance and insights into key metabolites influencing predictions. |
Sponsor | This study was supported by a collaborative grant from Qatar University (grant no. QUCG-BRC-24/25-463). The statements made herein are solely the responsibility of the authors. |
Language | en |
Publisher | Multidisciplinary Digital Publishing Institute (MDPI) |
Subject | influenza diagnosis; metabolomics; model explainability; nasopharyngeal swabs; stacking machine learning |
Type | Article |
Issue Number | 19 |
Volume Number | 14 |
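
The stacking approach summarized in the abstract (select the top 20 features, feed the predicted probabilities of three base models into an ExtraTrees meta-learner) might be sketched as follows. This is a minimal illustration with scikit-learn, not the authors' implementation: the data are synthetic, and the feature selector (`SelectKBest` with `f_classif`), the three base learners, the scaling step, and all hyperparameters are assumptions, since the record does not name them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    ExtraTreesClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_stacking_model(n_features=20, seed=0):
    """Pipeline: scale -> keep top-k features -> stack three base models."""
    # Hypothetical base learners; the record does not say which three of
    # the thirteen trained models were selected.
    base_learners = [
        ("rf", RandomForestClassifier(n_estimators=100, random_state=seed)),
        ("lr", LogisticRegression(max_iter=1000)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
    ]
    stack = StackingClassifier(
        estimators=base_learners,
        # ExtraTrees as meta-learner, as named in the abstract.
        final_estimator=ExtraTreesClassifier(n_estimators=100, random_state=seed),
        # The meta-learner is trained on out-of-fold predicted probabilities.
        stack_method="predict_proba",
        cv=5,
    )
    # f_classif (ANOVA F-score) is an assumed ranking criterion.
    selector = SelectKBest(f_classif, k=n_features)
    return make_pipeline(StandardScaler(), selector, stack)


# Synthetic stand-in for the 236-sample metabolomics matrix (assumption).
X, y = make_classification(
    n_samples=236, n_features=100, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

model = build_stacking_model()
model.fit(X_train, y_train)
proba = model.predict_proba(X_test)
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

Placing the selector inside the pipeline means the top-20 ranking is learned from the training split only, which avoids leaking held-out samples into feature selection. The printed accuracy describes the synthetic data and says nothing about the figures reported in the abstract.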
This item appears in the following Collection(s)
- Biomedical Research Center Research [776 items]
- Biomedical Sciences [784 items]
- Electrical Engineering [2821 items]