Performance of variable and function selection methods for estimating the nonlinear health effects of correlated chemical mixtures: A simulation study.
Lazarevic, N, Knibbs, LD, Sly, Peter and Barnett, AG 2020, Performance of variable and function selection methods for estimating the nonlinear health effects of correlated chemical mixtures: A simulation study, Stat Med, vol. 39, no. 27, pp. 3947-3967, doi: 10.1002/sim.8701.
Statistical methods for identifying harmful chemicals in a correlated mixture often assume linearity in exposure-response relationships. Nonmonotonic relationships are increasingly recognized (eg, for endocrine-disrupting chemicals); however, the impact of nonmonotonicity on exposure selection has not been evaluated. In a simulation study, we assessed the performance of Bayesian kernel machine regression (BKMR), Bayesian additive regression trees (BART), Bayesian structured additive regression with spike-slab priors (BSTARSS), generalized additive models with double penalty (GAMDP) and thin plate shrinkage smoothers (GAMTS), multivariate adaptive regression splines (MARS), and lasso penalized regression. We simulated realistic exposure data based on pregnancy exposure to 17 phthalates and phenols in the US National Health and Nutrition Examination Survey using a multivariate copula. We simulated data sets of size N = 250 and compared methods across 32 scenarios, varying by model size and sparsity, signal-to-noise ratio, correlation structure, and exposure-response relationship shapes. We compared methods in terms of their sensitivity, specificity, and estimation accuracy. In most scenarios, BKMR, BSTARSS, GAMDP, and GAMTS achieved moderate to high sensitivity (0.52-0.98) and specificity (0.21-0.99). BART and MARS achieved high specificity (≥0.90), but low sensitivity in low signal-to-noise ratio scenarios (0.20-0.51). Lasso was highly sensitive (0.71-0.99), except for quadratic relationships (≤0.27). Penalized regression methods that assume linearity, such as lasso, may not be suitable for studies of environmental chemicals hypothesized to have nonmonotonic relationships with outcomes. Instead, BKMR, BSTARSS, GAMDP, and GAMTS are attractive methods for flexibly estimating the shapes of exposure-response relationships and selecting among correlated exposures.
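The two building blocks named in the abstract, simulating correlated exposures and scoring exposure selection by sensitivity and specificity, can be illustrated with a minimal sketch. This is not the authors' simulation code. The abstract does not say which copula family was used, so the Gaussian copula below is an assumption, as are the log-normal marginals, the correlation value of 0.6, and the function names `simulate_exposures` and `selection_metrics`.

```python
import numpy as np


def simulate_exposures(n, corr, rng):
    """Draw n rows of correlated, right-skewed exposures.

    Assumption: a Gaussian copula with standard log-normal marginals,
    for which the copula transform reduces to exponentiating
    correlated standard normals.
    """
    p = corr.shape[0]
    # Correlated standard normals carry the dependence structure.
    z = rng.multivariate_normal(np.zeros(p), corr, size=n)
    # Exponentiate to obtain positive, right-skewed concentrations.
    return np.exp(z)


def selection_metrics(selected, truly_active):
    """Return (sensitivity, specificity) of an exposure selection.

    selected:     boolean flags, True if the method selected the exposure.
    truly_active: boolean flags, True if the exposure affects the outcome.
    Both classes must be non-empty, otherwise the ratios are undefined.
    """
    selected = np.asarray(selected, dtype=bool)
    truly_active = np.asarray(truly_active, dtype=bool)
    true_pos = np.sum(selected & truly_active)
    true_neg = np.sum(~selected & ~truly_active)
    sensitivity = true_pos / truly_active.sum()
    specificity = true_neg / (~truly_active).sum()
    return float(sensitivity), float(specificity)


# Illustrative usage with hypothetical values.
rng = np.random.default_rng(0)
corr = np.array([[1.0, 0.6], [0.6, 1.0]])
exposures = simulate_exposures(250, corr, rng)
sens, spec = selection_metrics(
    selected=[True, False, True, False],
    truly_active=[True, True, False, False],
)
```

Here sensitivity is the fraction of truly active exposures that a method selects, and specificity is the fraction of inactive exposures that it correctly leaves out. In a simulation study these would be averaged over many simulated data sets per scenario.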