ON VARIABLE SELECTION WITH THE PRESENCE OF MISSING DATA IN LONGITUDINAL PANEL STUDIES
Abstract
Longitudinal data are valuable in many disciplines because they reveal developmental patterns over time. In practice, however, such data often combine a high-dimensional set of covariates with pervasive missing values caused by individual nonresponse and dropout. Response measurements in longitudinal studies are correlated within subjects, and this correlation must be handled adequately, here with the linear mixed model (LMM), to obtain valid inferences and standard errors. LMMs provide an effective and flexible way to accommodate two types of parameters, for between-subject variation and within-subject correlation. The two-stage adaptive LASSO method adopted for variable selection gave promising results in LMMs, and joint-modeling multiple imputation for handling missingness gave consistent estimates of the parameters and variance components. Previous research has discussed variable selection criteria and missing-data handling in longitudinal studies separately. This thesis therefore proposes a computationally efficient algorithm that combines multiple imputation with penalized variable selection using the stacked (homogeneous) approach. The homogeneous algorithm showed better estimation and selection properties.
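The stacked idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm: it drops the random effects and fits an ordinary linear model, replaces joint-modeling multiple imputation with a simple random draw from observed values, and assumes each stacked row is weighted by 1/M. All function names, the penalty value `lam`, and the simulated data are hypothetical choices made for illustration only.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used by LASSO coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def weighted_lasso(X, y, w, lam, pen, n_iter=50):
    """Minimise 0.5*sum_i w_i*(y_i - x_i.b)^2 + lam*sum_j pen_j*|b_j|."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals when beta = 0
    denom = [sum(w[i] * X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Weighted correlation of column j with its partial residual.
            rho = sum(w[i] * X[i][j] * (resid[i] + X[i][j] * beta[j])
                      for i in range(n))
            new = soft_threshold(rho, lam * pen[j]) / denom[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def impute_once(X, rng):
    """Stand-in imputation (assumption): fill each missing entry (None)
    with a random observed value from the same column."""
    p = len(X[0])
    observed = [[row[j] for row in X if row[j] is not None] for j in range(p)]
    return [[row[j] if row[j] is not None else rng.choice(observed[j])
             for j in range(p)] for row in X]

def stacked_adaptive_lasso(X, y, M=5, lam=40.0, seed=0):
    """Stack M imputed copies, weight each row 1/M, and run a two-stage
    adaptive LASSO so one common set of variables is selected."""
    rng = random.Random(seed)
    Xs, ys = [], []
    for _ in range(M):
        Xs.extend(impute_once(X, rng))
        ys.extend(y)
    w = [1.0 / M] * len(ys)
    p = len(X[0])
    # Stage 1: unpenalised fit on the stacked data gives initial estimates.
    init = weighted_lasso(Xs, ys, w, 0.0, [1.0] * p)
    # Stage 2: penalise each coefficient inversely to its initial size.
    pen = [1.0 / (abs(b) + 1e-8) for b in init]
    beta = weighted_lasso(Xs, ys, w, lam, pen)
    selected = [j for j, b in enumerate(beta) if b != 0.0]
    return beta, selected

def demo(seed=0):
    """Simulated example: only covariates 0 and 1 affect the response,
    and about 10% of covariate values are set to missing."""
    rng = random.Random(seed)
    n, p = 400, 4
    X_full = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [2.0 * r[0] - 1.5 * r[1] + rng.gauss(0, 0.5) for r in X_full]
    X = [[None if rng.random() < 0.1 else v for v in r] for r in X_full]
    return stacked_adaptive_lasso(X, y, seed=seed)
```

Calling `demo()` returns the coefficient vector and the indices of the selected covariates. Because a single penalized fit is run on the stacked data, every imputed copy shares the same selected set, which is the sense in which the approach is homogeneous.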
DOI/handle
http://hdl.handle.net/10576/32125
Collections
- Mathematics, Statistics & Physics [33 items]