Feasibility and reliability of automated coding of occupation in the Health and Retirement Study

Title: Feasibility and reliability of automated coding of occupation in the Health and Retirement Study
Publication Type: Report
Year of Publication: 2018
Authors: McFall, B. Helppie; Sonnega, A.
Series Title: Michigan Retirement Research Center Working Paper Series
Document Number: WP 2018-392
Institution: Survey Research Center, Institute for Social Research, University of Michigan
City: Ann Arbor
Keywords: Meta-analyses, Survey Methodology

Due to advances in computing power and the increase in coverage of longitudinal datasets in the Health and Retirement Study (HRS) that provide information about detailed occupations, demand has increased among researchers for improved occupation and industry data. The detailed data are currently hard to use because they were coded at different times, and the codeframes are, therefore, not consistent over time. Additionally, the HRS gathers new occupation and industry information from respondents every two years, and coding of new data at each wave is costly and time-consuming. In this project, we tested the NIOSH Industry and Occupation Computerized Coding System (NIOCCS) to see if it could improve processes for coding data from the HRS. We tested results from NIOCCS against results from a human coder for multiple datasets. NIOCCS does reasonably well compared to coding results from a highly trained, professional occupation and industry coder, with kappa inter-rater reliability on detailed codes of just under 70 percent and agreement rates on broader codes of around 80 percent; however, code rates for NIOCCS for the datasets tested ranged from 60 percent to 72 percent, as compared to a professional coder’s ability to code those same datasets that ranged from 95 percent to 100 percent. In its current form, we find that NIOCCS is a tool that might be best used to reduce the number of cases human coders must code, either in coding historical data to a consistent codeframe or in coding data from future HRS waves. However, it is not yet ready to fully replace human coders.
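
The three measures named in the abstract (code rate, agreement rate, and kappa) can be illustrated with a minimal sketch. This is not the authors' procedure: the function names and the occupation codes are made up for illustration, `None` is assumed to mark a case the coder left uncoded, and agreement and Cohen's kappa are assumed to be computed only over cases that both the automated system and the human coder coded.

```python
from collections import Counter


def code_rate(codes):
    """Share of cases that received any code (None marks an uncoded case)."""
    return sum(c is not None for c in codes) / len(codes)


def agreement_and_kappa(auto, human):
    """Percent agreement and Cohen's kappa over cases coded by both sources."""
    # Assumption: cases left uncoded by either source are excluded.
    pairs = [(a, h) for a, h in zip(auto, human)
             if a is not None and h is not None]
    n = len(pairs)

    # Observed agreement: share of jointly coded cases with identical codes.
    observed = sum(a == h for a, h in pairs) / n

    # Chance agreement: for each code, the product of the two coders'
    # marginal shares, summed over all codes.
    auto_counts = Counter(a for a, _ in pairs)
    human_counts = Counter(h for _, h in pairs)
    expected = sum(auto_counts[c] * human_counts[c] for c in auto_counts) / n**2

    # Undefined when expected == 1 (both coders always use one single code).
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical codes, invented for this example only.
auto = ["2310", "4700", None, "2310", "5700", None, "4700", "3255"]
human = ["2310", "4700", "4760", "2310", "5700", "9130", "4720", "3255"]

observed, kappa = agreement_and_kappa(auto, human)
print(f"automated code rate: {code_rate(auto):.2f}")
print(f"human code rate:     {code_rate(human):.2f}")
print(f"agreement:           {observed:.2f}")
print(f"kappa:               {kappa:.2f}")
```

Kappa comes out below raw agreement because it discounts matches expected by chance, which is why the two figures are reported separately.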

Citation Key: 10010