Background: Studies using data from longitudinal health surveys of older adults usually
assume that the data are missing completely at random (MCAR) or missing at random (MAR).
Subsequent analyses therefore use multiple imputation or likelihood-based methods to handle
missing data. However, little existing research examines whether the data actually meet the
MCAR/MAR assumptions before the analysis is performed.
Methods: This study first summarized seven commonly used statistical methods for testing
the missingness mechanism and discussed the conditions under which each applies. Then, using
two-wave longitudinal data from the Health and Retirement Study (HRS; waves 2014-2015 and
2016-2017; n=18,747), this study applied different approaches to test the missingness
mechanism of several demographic and health variables.
Results: The data did not meet the MCAR assumption despite a very low nonresponse rate.
Health measures met the MAR assumption. Demographic variables provided good auxiliary
information for health variables. Ridout’s logistic regression model was applicable across
a wide range of scenarios.
Conclusion: Our findings supported the MAR assumption for the demographic and health
variables in HRS, and therefore provide statistical justification for HRS researchers to use
imputation or likelihood-based methods to deal with missing data. However, researchers
are encouraged to test the missingness mechanism of the specific variables/data when using a
new dataset, and to choose appropriate methods depending on the research goal and the nature
of the data. Statistical packages that implement these tests are urgently needed to facilitate
the application of methods for testing the missingness mechanism in social and behavioral
research.
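
Illustration: The logistic-regression idea behind testing for MCAR can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure used in this study: it regresses a 0/1 missingness indicator for one variable on a single fully observed covariate and runs a 1-df likelihood-ratio test of the slope. The function name `missingness_lr_test` and the toy data are hypothetical and are not HRS data. A small p-value is evidence against MCAR (missingness depends on an observed variable); a large p-value does not by itself establish MCAR.

```python
import math


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def _loglik(b0, b1, x, r):
    # Bernoulli log-likelihood of the missingness indicator r given covariate x.
    total = 0.0
    for xi, ri in zip(x, r):
        p = _sigmoid(b0 + b1 * xi)
        total += ri * math.log(p) + (1 - ri) * math.log(1.0 - p)
    return total


def missingness_lr_test(x, r, iters=50):
    """Hypothetical helper: fit logit P(r = 1) = b0 + b1 * x by Newton-Raphson
    and return (b1, LR statistic, p-value) for H0: b1 = 0.

    x: fully observed covariate (standardize continuous covariates first)
    r: 1 if the target variable is missing for that respondent, else 0
    Assumes r contains both 0s and 1s and that x does not perfectly separate them.
    """
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, ri in zip(x, r):
            p = _sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += ri - p              # score for the intercept
            g1 += (ri - p) * xi       # score for the slope
            h00 += w                  # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-10:
            break

    # Intercept-only model: the fitted probability is the overall missing rate.
    pbar = sum(r) / len(r)
    ll_null = _loglik(math.log(pbar / (1.0 - pbar)), 0.0, x, r)
    stat = max(0.0, 2.0 * (_loglik(b0, b1, x, r) - ll_null))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return b1, stat, p_value


if __name__ == "__main__":
    # Toy data (made up): a binary covariate and a missingness indicator
    # that is missing more often when the covariate equals 1.
    x = [0] * 50 + [1] * 50
    r = [1] * 5 + [0] * 45 + [1] * 25 + [0] * 25
    slope, stat, p = missingness_lr_test(x, r)
    print(f"slope={slope:.3f}  LR={stat:.2f}  p={p:.2g}")
```

In practice the same test would be run with several observed covariates at once using an established statistics library; the single-covariate version is used here only to keep the sketch dependency-free.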