Multiple imputation in a longitudinal cohort study: a case study of sensitivity to imputation methods
journal contribution
posted on 2014-11-01, 00:00 authored by Helena Romaniuk, George C Patton, John B Carlin

Multiple imputation has entered mainstream practice for the analysis of incomplete data. We have used it extensively in a large Australian longitudinal cohort study, the Victorian Adolescent Health Cohort Study (1992-2008). Although we have endeavored to follow best practices, there is little published advice on this, and we have not previously examined the extent to which variations in our approach might lead to different results. Here, we examined sensitivity of analytical results to imputation decisions, investigating choice of imputation method, inclusion of auxiliary variables, omission of cases with excessive missing data, and approaches for imputing highly skewed continuous distributions that are analyzed as dichotomous variables. Overall, we found that decisions made about imputation approach had a discernible but rarely dramatic impact for some types of estimates. For model-based estimates of association, the choice of imputation method and decisions made to build the imputation model had little effect on results, whereas estimates of overall prevalence and prevalence stratified by subgroup were more sensitive to imputation method and settings. Multiple imputation by chained equations gave more plausible results than multivariate normal imputation for prevalence estimates but appeared to be more susceptible to numerical instability related to a highly skewed variable.
Journal
American journal of epidemiology
Volume
180
Issue
9
Pagination
920 - 932
Publisher
Oxford University Press
Location
Oxford, Eng.
ISSN
0002-9262
eISSN
1476-6256
Language
eng
Publication classification
C1 Refereed article in a scholarly journal
Copyright notice
2014, The Author
Keywords
longitudinal cohort study; missing data; multiple imputation; sensitivity analysis; Amphetamine-Related Disorders; Logistic Models; Longitudinal Studies; Marijuana Smoking; Probability; Victoria; epidemiology; Science & Technology; Life Sciences & Biomedicine; Public, Environmental & Occupational Health; MISSING-DATA; CANNABIS USE; MENTAL-HEALTH; SUBSTANCE USE; STRATEGIES; DISORDERS; EQUATIONS; SMOKING