In this document we describe a project to link records from the 1940 U.S. Census to records
for respondents to the Health and Retirement Study (HRS).
The project is part of a larger effort to conduct parallel linkages to the 1940 Census for
respondents to the HRS, the Panel Study of Income Dynamics (PSID), the Wisconsin
Longitudinal Study (WLS), the National Social Life, Health, and Aging Project (NSHAP), and
the National Health and Aging Trends Study (NHATS). In each cohort study, many sample
members were alive at the time of the 1940 federal census and were thus enumerated
(along with their families and household members). These five ongoing longitudinal studies
are central components of America’s data infrastructure for interdisciplinary research on
aging and the life course; physical and mental health, disability, and well-being; later-life
work, economic well-being, and retirement; end-of-life issues; and many other topics.
Adding information about sample members from the 1940 Census will expand the utility of
all five projects and will enable important research on the effects of early-life social,
economic, environmental, contextual, and other factors on subsequent life outcomes.
Broadly, the project described in this document involved (1) preparing and formatting data
files containing respondents’ identifying information; (2) deploying machine learning
algorithms to mechanically link project records to the 1940 U.S. Census; (3) hand-linking
records that could not be machine-linked and hand-verifying a portion of those that could;
and (4) documenting the new measures and making them available as part of the HRS’s
restricted-access dissemination systems in a manner consistent with HRS respondents’
privacy rights. In this document we describe the linking procedures, explain the structure of
the resulting linked files and how they can be accessed, and provide information about
linkage rates and the reliability and validity of the links.