|Title||Feasibility and reliability of automated coding of occupation in the Health and Retirement Study|
|Year of Publication||2018|
|Authors||McFall, B Helppie, Sonnega, A|
|Series Title||Michigan Retirement Research Center Working Paper Series|
|Document Number||WP 2018-392|
|Institution||Survey Research Center, Institute for Social Research, University of Michigan|
|Keywords||Meta-analyses, Survey Methodology|
Advances in computing power and the growing coverage of longitudinal datasets in the Health and Retirement Study (HRS) that provide information about detailed occupations have increased demand among researchers for improved occupation and industry data. The detailed data are currently hard to use because they were coded at different times, so the codeframes are not consistent over time. Additionally, the HRS gathers new occupation and industry information from respondents every two years, and coding new data at each wave is costly and time-consuming. In this project, we tested the NIOSH Industry and Occupation Computerized Coding System (NIOCCS) to see whether it could improve processes for coding data from the HRS. We compared results from NIOCCS against results from a human coder for multiple datasets. NIOCCS does reasonably well compared with a highly trained, professional occupation and industry coder, with kappa inter-rater reliability on detailed codes of just under 70 percent and agreement rates on broader codes of around 80 percent; however, code rates for NIOCCS on the datasets tested ranged from 60 percent to 72 percent, whereas a professional coder's code rates on the same datasets ranged from 95 percent to 100 percent. In its current form, we find that NIOCCS might be best used to reduce the number of cases human coders must code, either in coding historical data to a consistent codeframe or in coding data from future HRS waves. However, it is not yet ready to fully replace human coders.
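The abstract reports two kinds of measures: a code rate (the share of cases a coder assigns any code to) and a kappa statistic for agreement between coders. The sketch below illustrates how such measures could be computed. It assumes Cohen's kappa, restricted to cases that both coders coded; the abstract does not say which kappa variant or which case selection the authors used. The function names and the toy codes are hypothetical and are not taken from the paper or from NIOCCS.

```python
from collections import Counter


def code_rate(codes):
    """Share of cases that received any code (None marks an uncoded case)."""
    return sum(c is not None for c in codes) / len(codes)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa over the cases that both raters coded (an assumption)."""
    pairs = [(a, b) for a, b in zip(rater_a, rater_b)
             if a is not None and b is not None]
    n = len(pairs)
    # Observed agreement: fraction of cases where the two codes match.
    observed = sum(a == b for a, b in pairs) / n
    # Chance agreement: product of each rater's marginal share, per code.
    freq_a = Counter(a for a, _ in pairs)
    freq_b = Counter(b for _, b in pairs)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        # Both raters used one identical code throughout; kappa is undefined,
        # so this sketch returns 1.0 by convention.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical toy data: placeholder labels, not real occupation codes.
human = ["occ_A", "occ_B", "occ_A", "occ_C", "occ_B", "occ_A"]
auto = ["occ_A", "occ_B", None, "occ_C", "occ_A", None]

print(f"human code rate: {code_rate(human):.2f}")
print(f"auto code rate:  {code_rate(auto):.2f}")
print(f"kappa on jointly coded cases: {cohen_kappa(human, auto):.2f}")
```

In this toy example the automated coder leaves two of six cases uncoded, so its code rate is lower than the human coder's, while kappa describes agreement only on the four cases both coded. Keeping the two measures separate mirrors the abstract's distinction between how often a coder produces a code and how often the two coders agree.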