A Comprehensive Machine Learning Approach for COVID-19 Target Discovery in the Small-Molecule Metabolome
Author | Sumon, Md Shaheenur Islam |
Author | Hossain, Md Sakib Abrar |
Author | Al-Sulaiti, Haya |
Author | Yassine, Hadi M. |
Author | Chowdhury, Muhammad E.H. |
Available date | 2025-04-22T08:56:00Z |
Publication Date | 2025-01-11 |
Publication Name | Metabolites |
Identifier | http://dx.doi.org/10.3390/metabo15010044 |
Citation | Sumon, M. S. I., Hossain, M. S. A., Al-Sulaiti, H., Yassine, H. M., & Chowdhury, M. E. H. (2025). A Comprehensive Machine Learning Approach for COVID-19 Target Discovery in the Small-Molecule Metabolome. Metabolites, 15(1), 44. |
Abstract | Background/Objectives: Respiratory viruses, including Influenza, RSV, and COVID-19, cause various respiratory infections. Distinguishing these viruses relies on diagnostic methods such as PCR testing. Challenges stem from overlapping symptoms and the emergence of new strains. Advanced diagnostics are crucial for accurate detection and effective management. This study leveraged nasopharyngeal metabolome data to predict respiratory virus scenarios including control vs. RSV, control vs. Influenza A, control vs. COVID-19, control vs. all respiratory viruses, and COVID-19 vs. Influenza A/RSV. Method: We proposed a stacking-based ensemble technique, integrating the top three best-performing ML models from the initial results to enhance prediction accuracy by leveraging the strengths of multiple base learners. Key techniques such as feature ranking, standard scaling, and SMOTE were used to address class imbalances, thus enhancing model robustness. SHAP analysis identified crucial metabolites influencing positive predictions, thereby providing valuable insights into diagnostic markers. Results: Our approach not only outperformed existing methods but also revealed top dominant features for predicting COVID-19, including Lysophosphatidylcholine acyl C18:2, Kynurenine, Phenylalanine, Valine, Tyrosine, and Aspartic Acid (Asp). Conclusions: This study demonstrates the effectiveness of leveraging nasopharyngeal metabolome data and stacking-based ensemble techniques for predicting respiratory virus scenarios. The proposed approach enhances prediction accuracy, provides insights into key diagnostic markers, and offers a robust framework for managing respiratory infections. |
Sponsor | This study was supported by the collaborative grants from Qatar University: QUCG-BRC-24/25-463. Open access publication is covered by the Qatar National Library. |
Language | en |
Publisher | Multidisciplinary Digital Publishing Institute (MDPI) |
Subject | COVID-19; diagnostic markers; machine learning; metabolomics; respiratory viruses |
Type | Article |
Issue Number | 1 |
Volume Number | 15 |
ESSN | 2218-1989 |
This item appears in the following Collection(s)
- Electrical Engineering [2821 items]
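
The abstract names SMOTE as the technique used to address class imbalance. As a rough illustration of the idea only, the sketch below generates synthetic minority-class samples by interpolating between a sample and one of its nearest minority neighbours. It is a simplified, standard-library version written for this note, not the authors' implementation and not the API of any particular library. The function name `smote`, its parameters, and the toy data are all hypothetical.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Simplified SMOTE: for each synthetic point, pick a random minority sample,
    pick one of its k nearest minority neighbours, and place the new point at
    a random position on the segment joining the two.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest neighbours of `base`, excluding itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical toy data (assumption): 6 control vs. 3 infected samples,
# each described by 2 made-up metabolite features.
controls = [[0.1, 0.2], [0.0, 0.3], [0.2, 0.1], [0.1, 0.0], [0.3, 0.2], [0.2, 0.3]]
infected = [[1.0, 1.1], [1.2, 0.9], [0.9, 1.0]]

# Oversample the minority class until both classes have the same size.
new_points = smote(infected, n_new=len(controls) - len(infected), k=2)
balanced_infected = infected + new_points
print(len(controls), len(balanced_infected))
```

In a real pipeline the oversampling would normally be applied to the training folds only, so that synthetic samples never leak into the data used for evaluation.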