Advisor: Abbasi, Saddam Akber
Advisor: Abdallah, Atiyeh
Author: Shajahan, Tahsin Raahila
Available date: 2023-10-01T10:12:42Z
Publication Date: 2023-06
URI: http://hdl.handle.net/10576/48144
Abstract: Body composition is critical for health outcomes and has been studied in various populations and in conditions such as obesity and diabetes. Qatar Biobank collected anthropometric and biomedical data from individuals across all age groups. Body fat and lean mass are important measures of body composition that help identify several health risks, including those related to cardiovascular health and nutrition. Machine learning (ML) algorithms in Python were used to predict Total Fat Percentage (TFP) and Total Lean Mass (TLM). All the variables in the dataset were used to test different ML algorithms on the TFP variable. Based on performance metrics such as R2, Mean Absolute Error (MAE) and Root Mean Square Error (RMSE), linear regression, support vector regression (SVR) and extreme gradient boosting (XGBoost) performed well. Subsequently, further analysis of these models was performed using feature selection methods (forward, backward, stepwise and information gain) at multiple cross-validation (CV) levels. We found that backward selection with 10-fold CV on the SVR model predicted TFP best, with R2 of 86.7% (train) and 80.2% (test), and MAE of 0.025 (train) and 0.030 (test). Some of the best variables selected via this model are testosterone, urea, gender, body mass index (BMI) and bone mineral density (BMD). Next, TLM was analyzed using the three models selected earlier for TFP. Linear regression and SVR predicted TLM well, while XGBoost performed poorly. Since backward selection with 10-fold CV produced good results for TFP, the same procedure was applied to these models for feature selection. Based on the results obtained, we conclude that the linear regression model after feature selection predicts TLM best, with R2 of 83.7% (train) and 82.9% (test), and MAE of 0.313 for both train and test. Some of the best variables explaining TLM are gender, age, BMI, cholesterol and BMD.
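The three performance metrics named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the thesis code: the function name `regression_metrics` and the sample values are hypothetical and chosen only to show how R2, MAE and RMSE are computed from observed and predicted values.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, MAE and RMSE for paired observed/predicted values.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    mean_true = sum(y_true) / n
    # Residual sum of squares: squared error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
        "rmse": math.sqrt(ss_res / n),
    }


# Made-up values, not data from Qatar Biobank.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Computing the same metrics on both the training split and the held-out test split, as the abstract reports, shows how much performance drops on unseen data.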
Language: en
Subject: Machine Learning; Body Fat
Title: Assessment and Prediction of Body Fat Composition Using A Variety of Machine Learning Algorithms
Type: Master Thesis
Department: Applied Sciences

